[ 
https://issues.apache.org/jira/browse/TINKERPOP-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218783#comment-15218783
 ] 

ASF GitHub Bot commented on TINKERPOP-1225:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/285

    TINKERPOP-1225: Do a "rolling reduce" for GroupXXXStep in OLAP.

    https://issues.apache.org/jira/browse/TINKERPOP-1225
    
    We can now do mid-barrier reductions with `group()` in OLAP. This is 
huge: if you have a reducer in your `by()`-valueTraversal, 
the stream is continually reduced to limit memory consumption. It's slightly 
more expensive in terms of time for small data, but for large data there are no 
worries about OME with `group()` (in both OLTP and OLAP).
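
    As a rough illustration of the idea only (plain Python, not TinkerPop code; the function names `group_then_reduce` and `group_rolling_reduce` and the sample data are made up for this sketch), compare buffering every value per key and reducing once at the end with folding each value into a running per-key result as it arrives:

```python
from functools import reduce
from operator import add


def group_then_reduce(pairs, reducer):
    """Buffer all values per key, then reduce each buffer once at the end.

    Memory grows with the total number of values seen.
    """
    buffered = {}
    for key, value in pairs:
        buffered.setdefault(key, []).append(value)
    return {key: reduce(reducer, values) for key, values in buffered.items()}


def group_rolling_reduce(pairs, reducer):
    """Fold each value into its key's running result as soon as it arrives.

    Memory grows only with the number of distinct keys.
    """
    running = {}
    for key, value in pairs:
        if key in running:
            running[key] = reducer(running[key], value)
        else:
            running[key] = value
    return running


# Hypothetical (label, age) pairs standing in for a stream of traversers.
pairs = [("person", 29), ("software", 1), ("person", 27), ("person", 32)]

buffered_result = group_then_reduce(pairs, add)
rolling_result = group_rolling_reduce(pairs, add)
print(buffered_result == rolling_result)
```

    Both functions return the same map here; the rolling version just never holds more than one value per key. This sketch assumes the reducer gives the same answer whether it is applied incrementally or all at once, as `add` does.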
    
    CHANGELOG
    
    ```
    * `GroupStep` and `GroupSideEffectStep` make use of mid-traversal reducers 
to limit memory consumption in OLAP.
    ``` 
    
    VOTE +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1225

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #285
    
----
commit 8eb53815acfcec05316dcf39bd91d9e3a43d971a
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-30T20:24:38Z

    GroupStep and GroupSideEffectStep now make use of mid-traversal barriers to do 
data reduction on the fly in order to limit the memory footprint and reduce 
the chances of OME. OLTP always did this, but now OLAP (which needs it more) 
does it. This is epic. Also fixed a minor bug in ReducingBarrierStep. Added 
some more GroupTest test cases -- one that does a groupCount() instead of a 
group() just to make sure things are working as expected.

----


> Do a "rolling reduce" for GroupXXXStep in OLAP.
> -----------------------------------------------
>
>                 Key: TINKERPOP-1225
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1225
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>
> {{GroupXXXStep}} in OLTP is able to process traversers up to the first 
> barrier in the reduction. 99% of the time, the first barrier is the last 
> barrier and thus, you get a nice lazy computation which limits the memory 
> footprint.
> Unfortunately, we don't have this luxury in OLAP, at least not yet. However, 
> the work that [~spmallette] did to get {{GroupBiOperator}} to serialize with 
> traversals might make it possible for us to merge barriers in the reduction 
> and thus have OLAP and OLTP {{GroupXXXStep}} behave analogously.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
