Nandish Jayaram created MADLIB-1003:
---------------------------------------
Summary: Partition expression in path function fails on multiple conditions
Key: MADLIB-1003
URL: https://issues.apache.org/jira/browse/MADLIB-1003
Project: Apache MADlib
Issue Type: Bug
Components: Module: Utilities
Reporter: Nandish Jayaram
The path function in utilities is supposed to accept expressions for the
partition_expr parameter, but two aspects of the partition expression are
not currently handled:
1) If the partition expression contains more than one condition, path()
fails. For instance, using the input table from the install check file for
path, the following function call results in an error:
SELECT madlib.path(
    '"Weblog"',                -- Name of the table
    '"Path_output"',           -- Table name to store the path results
    'user_id, age_group > 1, income_group > 1',  -- Partition expression to group the data table
    'event_timestamp ASC',     -- Order expression to sort the tuples of the data table
    'I:="Click_event"=0 AND purchase_event=0,
     Click:="Click_event"=1 AND purchase_event=0,
     Conv:=purchase_event=1',  -- Definition of various symbols used in the pattern definition
    'I(click){1}(CONV){1}',    -- Definition of the path pattern to search for
    'COUNT(*)'                 -- Aggregate/window functions to be applied on the matched paths
    ,TRUE
);
ERROR: spiexceptions.DuplicateColumn: column "?column?" specified more than once
CONTEXT: Traceback (most recent call last):
PL/Python function "path", line 23, in <module>
return path.path(**globals())
PL/Python function "path", line 276, in path
PL/Python function "path"
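PostgreSQL gives every unnamed expression in a SELECT list the default column name "?column?", so two unaliased conditions such as age_group > 1 and income_group > 1 would collide if the partition expression is copied into a statement that creates a table. The following Python sketch shows one possible way to avoid the collision by giving each top-level partition item a generated alias. It is not MADlib's actual fix; split_top_level, alias_partition_exprs and the __part_col_ prefix are hypothetical names used for illustration only.

```python
def split_top_level(expr):
    """Split a string on commas that are outside parentheses and quotes."""
    parts, current, depth, quote = [], [], 0, None
    for ch in expr:
        if quote:
            # Inside a quoted identifier or literal: copy until the quote closes.
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == "(":
            depth += 1
            current.append(ch)
        elif ch == ")":
            depth -= 1
            current.append(ch)
        elif ch == "," and depth == 0:
            # A top-level comma ends the current partition item.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def alias_partition_exprs(partition_expr, prefix="__part_col_"):
    """Return each partition item as '(expr) AS <prefix><n>' (hypothetical naming)."""
    exprs = split_top_level(partition_expr)
    return ["({0}) AS {1}{2}".format(e, prefix, i)
            for i, e in enumerate(exprs, 1)]


if __name__ == "__main__":
    for item in alias_partition_exprs("user_id, age_group > 1, income_group > 1"):
        print(item)
```

With aliases like these, every column in the generated SELECT list has a distinct name, regardless of how many conditions the caller supplies.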
2) A condition or column name in the partition expression cannot be renamed
using "AS". For example, the function call shown below produces the
following error:
SELECT madlib.path(
    '"Weblog"',                -- Name of the table
    '"Path_output"',           -- Table name to store the path results
    'user_id as uid',          -- Partition expression to group the data table
    'event_timestamp ASC',     -- Order expression to sort the tuples of the data table
    'I:="Click_event"=0 AND purchase_event=0,
     Click:="Click_event"=1 AND purchase_event=0,
     Conv:=purchase_event=1',  -- Definition of various symbols used in the pattern definition
    'I(click){1}(CONV){1}',    -- Definition of the path pattern to search for
    'COUNT(*)'                 -- Aggregate/window functions to be applied on the matched paths
    ,TRUE
);
ERROR: spiexceptions.SyntaxError: syntax error at or near "AS"
CONTEXT: Traceback (most recent call last):
PL/Python function "path", line 23, in <module>
return path.path(**globals())
PL/Python function "path", line 114, in path
PL/Python function "path"
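The syntax error suggests that the partition expression is placed verbatim into a clause where "AS" is not allowed, such as PARTITION BY; this is an assumption, since the failing query text is not available in this report. One possible approach, sketched below in Python, is to separate each item into its expression and its optional alias, so that the alias is used only in the SELECT list. The helper split_alias and its regular expression are hypothetical and are not taken from the MADlib source.

```python
import re

# Trailing "AS alias", where alias is a plain or double-quoted identifier.
# Anchoring the alias at the end of the item keeps "CAST(x AS int)" intact.
_ALIAS_RE = re.compile(
    r'^(?P<expr>.+?)\s+AS\s+(?P<alias>"[^"]+"|[A-Za-z_][A-Za-z0-9_]*)$',
    re.IGNORECASE | re.DOTALL,
)


def split_alias(item):
    """Return (expression, alias) for one partition item; alias may be None."""
    item = item.strip()
    match = _ALIAS_RE.match(item)
    if match:
        return match.group("expr").strip(), match.group("alias")
    return item, None


if __name__ == "__main__":
    expr, alias = split_alias("user_id as uid")
    # The bare expression is safe to use in PARTITION BY.
    partition_item = expr
    # The alias, when present, belongs only in the SELECT list.
    select_item = expr if alias is None else "{0} AS {1}".format(expr, alias)
    print(partition_item)
    print(select_item)
```

This simple pattern only recognizes an alias introduced with an explicit "AS" keyword; items without one are returned unchanged.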
--
This message was sent by Atlassian JIRA (v6.3.4#6332)