[ https://issues.apache.org/jira/browse/MADLIB-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157430#comment-15157430 ]

ASF GitHub Bot commented on MADLIB-917:
---------------------------------------

GitHub user iyerr3 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/21

    Path: Allow multiple matches in single partition

    JIRA: MADLIB-917
    
    The path query was rewritten to use array functions. This was necessary
    because multiple matches produce a set of matches for each group. This
    set value has to be unnested, which was not possible in GPDB because the
    GROUP BY placed the executor in a context that does not allow
    set-returning functions (SRFs). The solution was to avoid that unnest by
    using array functions for all computations and unnesting only at the
    end, without the GROUP BY.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iyerr3/incubator-madlib feature/path_multiple_matches

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/21.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21
    
----
commit 28d84ff91d4c20d2bfb68bdd051332a645324a39
Author: Rahul Iyer <[email protected]>
Date:   2015-12-24T22:15:36Z

    Path: Allow multiple matches in single partition
    
    JIRA: MADLIB-917
    
    The path query was rewritten to use array functions. This was necessary
    because multiple matches produce a set of matches for each group. This
    set value has to be unnested, which was not possible in GPDB because the
    GROUP BY placed the executor in a context that does not allow
    set-returning functions (SRFs). The solution was to avoid that unnest by
    using array functions for all computations and unnesting only at the
    end, without the GROUP BY.

----


> Path - window functions (multiple matches per partition, 1 window per match)
> ----------------------------------------------------------------------------
>
>                 Key: MADLIB-917
>                 URL: https://issues.apache.org/jira/browse/MADLIB-917
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Utilities
>    Affects Versions: v1.9
>            Reporter: Frank McQuillan
>            Assignee: Rahul Iyer
>             Fix For: v1.9
>
>         Attachments: Ecommerce data set for path test 3.csv, path query3.sql
>
>
> Story
> As a user, I want to define symbols so that I can define a regular expression 
> of symbols to identify sequences of events that I care about.
> Partition:
> 1) Multiple matches per partition in this story.
> 2) Note that the match in the data might not span the whole partition, that 
> is, that matched rows could just be a subset of the rows in the partition.
> Window:
> 1) Limited to 1 window per partition.
> Other:
> 1) Club rows together in the case where there are multiple matches per 
> partition, when doing aggregate/window functions.  E.g., if doing sum of a 
> revenue column, then sum all rows from all matches (as opposed to a separate 
> sum for each match).
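
The "club rows together" behaviour described in the story can be sketched
outside the database. The following is only an illustration in Python, not
the MADlib implementation: the row layout, the single-character symbols, and
the names ROWS, matched_rows and total_revenue are all assumptions made for
this example. Each partition's symbols are joined into one string, every
non-overlapping regular-expression match is located, and the revenue of all
rows covered by any match is summed into a single value per partition.

```python
import re
from itertools import groupby
from operator import itemgetter

# Hypothetical event rows: (partition key, event order, symbol, revenue).
# Each symbol is assumed to be a single character so that one character
# in the symbol string corresponds to exactly one row.
ROWS = [
    ("u1", 1, "A", 0.0),
    ("u1", 2, "B", 10.0),
    ("u1", 3, "C", 0.0),
    ("u1", 4, "A", 0.0),
    ("u1", 5, "B", 5.0),
    ("u2", 1, "B", 7.0),
    ("u2", 2, "C", 0.0),
]


def matched_rows(rows, pattern):
    """Return, per partition, the rows covered by any match of `pattern`."""
    result = {}
    # Sort by partition key, then by event order within the partition.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        part = list(group)
        symbols = "".join(row[2] for row in part)
        hits = []
        # finditer yields every non-overlapping match, so a partition can
        # contribute several matches, and a match need not span all rows.
        for match in re.finditer(pattern, symbols):
            hits.extend(part[match.start():match.end()])
        if hits:
            result[key] = hits
    return result


def total_revenue(rows, pattern):
    """Sum revenue over all matched rows of a partition (one sum, not one per match)."""
    return {
        key: sum(row[3] for row in hits)
        for key, hits in matched_rows(rows, pattern).items()
    }
```

With the sample rows above, the pattern "AB" matches twice in partition u1
(rows 1-2 and rows 4-5) and not at all in u2, so total_revenue(ROWS, "AB")
yields one combined sum for u1 and omits u2 entirely.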



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
