Github user njayaram2 commented on a diff in the pull request:
https://github.com/apache/incubator-madlib/pull/49#discussion_r67951017
--- Diff: src/ports/postgres/modules/utilities/test/sessionize.sql_in ---
@@ -87,6 +89,23 @@ SELECT
assert(
relative_error(array_agg(CASE WHEN original_session_id NOTNULL
THEN original_session_id ELSE 0 END), array_agg(CASE WHEN session_id NOTNULL
THEN session_id ELSE 0 END)) < 1e-6,
'wrong output in sessionization')
-FROM sessionize_output;
+FROM sessionize_output_v;
+SELECT * FROM sessionize_output_v;
+
+SELECT sessionize(
+ 'eventlog_installchk', -- Name of the input table
+ 'sessionize_output_t', -- Name of the output table
+ 'user_id<102000', -- Partition expression to group the data
+ 'event_timestamp', -- Order expression to sort the tuples of the data table
+ '180', -- Max time that can elapse between consecutive rows to be considered part of the same session
+ '*,user_id<102000', -- Select all columns in the input table, along with the partition expression and session id columns
--- End diff --
I give each expression an explicit alias, namely the expression text wrapped
in double quotes, as mentioned in one of my comments above. Because every
expression gets its own alias and the code does not rely on PostgreSQL
assigning a default name to an expression, it handles cases with two or more
expressions. I will add clearer comments and more test cases to install check.
Reproducing the example from the previous comment here for context.
If my output_cols parameter is:
' "user id"<100, revenue>20 '
I parse this to rename each expression:
"user id"<100 is named as "user id<100"
revenue>20 is named as "revenue>20"
This ends up creating a select statement as follows:
SELECT "user id"<100 AS "user id<100", revenue>20 AS "revenue>20"
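The renaming step described above can be sketched roughly as follows. This is only an illustration in Python, not the code in the pull request: the helper names split_top_level and alias_expressions are hypothetical, and the rule of dropping inner double quotes when building the alias is an assumption inferred from the example above.

```python
def split_top_level(cols):
    """Split a column list on commas outside double quotes and parentheses."""
    parts, buf, depth, in_quotes = [], [], 0, False
    for ch in cols:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
            elif ch == ',' and depth == 0:
                # A top-level comma ends the current expression.
                parts.append(''.join(buf).strip())
                buf = []
                continue
        buf.append(ch)
    tail = ''.join(buf).strip()
    if tail:
        parts.append(tail)
    return parts


def alias_expressions(output_cols):
    """Give every expression an explicit alias built from its own text."""
    aliased = []
    for expr in split_top_level(output_cols):
        if expr == '*':
            aliased.append(expr)  # nothing to rename
            continue
        # Assumption: drop inner double quotes so the alias is a single
        # valid quoted identifier, e.g. "user id"<100 -> "user id<100".
        alias = expr.replace('"', '')
        aliased.append('{0} AS "{1}"'.format(expr, alias))
    return ', '.join(aliased)


select_list = alias_expressions(' "user id"<100, revenue>20 ')
print('SELECT ' + select_list)
```

Since each alias comes from the expression text itself, two different expressions in the same list get different output column names, whatever default name PostgreSQL would have assigned.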