[
https://issues.apache.org/jira/browse/MADLIB-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178356#comment-15178356
]
Frank McQuillan commented on MADLIB-917:
----------------------------------------
Test c)
{code:sql}
SELECT madlib.path(
    'weblog',                 -- Name of the table
    'path_output',            -- Table name to store the path results
    'DATE(event_timestamp)',  -- Partition expression to group the data table
    'event_timestamp ASC',    -- Order expression to sort the tuples of the data table
    -- Definition of various symbols used in the pattern definition
    'IMPR:=click_event=0 AND purchase_event=0, CLICK:=click_event=1 AND purchase_event=0, CONV:=purchase_event=1',
    -- Definition of the path pattern to search for
    '(IMPR){1}(CLICK){1}(CONV){1}',
    -- Aggregate/window functions to be applied on the matched paths
    'SUM(margin) as sum_of_margin, SUM(revenue) as sum_of_revenue',
    TRUE                      -- Persist matches
);
{code}
produces
{code:sql}
    date    | sum_of_margin | sum_of_revenue
------------+---------------+----------------
 2012-04-16 |            77 |            456
{code}
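As a rough cross-check of the semantics this test exercises, the sketch below re-implements the idea in plain Python, outside the database. It is not MADlib's implementation. The one-letter symbols, the helpers `symbol_for` and `path_sums`, and the sample rows are all hypothetical, and it assumes non-overlapping, left-to-right regex matches. It shows rows being mapped to symbols, the pattern being matched within each ordered partition, and the aggregates being summed over the rows of all matches in a partition rather than per match.

```python
import re
from collections import defaultdict


def symbol_for(row):
    """Map a row to a one-letter stand-in for IMPR / CLICK / CONV."""
    click, purchase = row["click_event"], row["purchase_event"]
    if click == 0 and purchase == 0:
        return "I"  # IMPR
    if click == 1 and purchase == 0:
        return "C"  # CLICK
    if purchase == 1:
        return "V"  # CONV
    return "."      # row matches no symbol definition


def path_sums(rows, pattern="ICV"):
    """Return {partition: (sum_of_margin, sum_of_revenue)} over matched rows."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["day"]].append(row)   # partition expression

    result = {}
    for day, part in partitions.items():
        part.sort(key=lambda r: r["ts"])     # order expression
        symbols = "".join(symbol_for(r) for r in part)
        # Club together the rows of every match found in this partition.
        matched = [
            r
            for m in re.finditer(pattern, symbols)
            for r in part[m.start():m.end()]
        ]
        if matched:
            result[day] = (
                sum(r["margin"] for r in matched),
                sum(r["revenue"] for r in matched),
            )
    return result


def make_row(day, ts, click, purchase, margin, revenue):
    return {"day": day, "ts": ts, "click_event": click,
            "purchase_event": purchase, "margin": margin, "revenue": revenue}


# Hypothetical data, NOT the attached e-commerce data set.
SAMPLE_ROWS = [
    make_row("d1", 1, 0, 0, 1, 10),      # IMPR   (match 1)
    make_row("d1", 2, 1, 0, 2, 20),      # CLICK  (match 1)
    make_row("d1", 3, 0, 1, 3, 30),      # CONV   (match 1)
    make_row("d1", 4, 0, 0, 100, 1000),  # IMPR, not part of any match
    make_row("d1", 5, 0, 0, 4, 40),      # IMPR   (match 2)
    make_row("d1", 6, 1, 0, 5, 50),      # CLICK  (match 2)
    make_row("d1", 7, 0, 1, 6, 60),      # CONV   (match 2)
    make_row("d2", 1, 0, 0, 7, 70),      # IMPR, partition with no match
    make_row("d2", 2, 1, 0, 8, 80),      # CLICK
]

if __name__ == "__main__":
    print(path_sums(SAMPLE_ROWS))
```

With the sample rows, partition `d1` contains two matches and one unmatched impression. The two matches contribute to a single pair of sums, and the unmatched row and the whole of `d2` are left out.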
> Path - window functions (multiple matches per partition, 1 window per match)
> ----------------------------------------------------------------------------
>
> Key: MADLIB-917
> URL: https://issues.apache.org/jira/browse/MADLIB-917
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Utilities
> Affects Versions: v1.9
> Reporter: Frank McQuillan
> Assignee: Rahul Iyer
> Fix For: v1.9
>
> Attachments: Ecommerce data set for path test 3.csv, path query3.sql
>
>
> Story
> As a user, I want to define symbols so that I can define a regular expression
> of symbols to identify sequences of events that I care about.
> Partition:
> 1) Multiple matches per partition in this story.
> 2) Note that the match in the data might not span the whole partition, that
> is, that matched rows could just be a subset of the rows in the partition.
> Window:
> 1) Limited to 1 window per partition.
> Other:
> 1) Club rows together in the case where there are multiple matches per
> partition, when doing aggregate/window functions. E.g., if doing sum of a
> revenue column, then sum all rows from all matches (as opposed to a separate
> sum for each match).