[ https://issues.apache.org/jira/browse/TINKERPOP-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218783#comment-15218783 ]
ASF GitHub Bot commented on TINKERPOP-1225:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/285

    TINKERPOP-1225: Do a "rolling reduce" for GroupXXXStep in OLAP.

    https://issues.apache.org/jira/browse/TINKERPOP-1225

    We can now do mid-barrier reductions with `group()` in OLAP. This is
    huge: if you have a reducer in your `by()`-valueTraversal, the stream
    is constantly reduced to limit memory consumption. It's more expensive
    in terms of time (not by much) for small data, but for large data there
    are no more worries about out-of-memory errors (OME) with `group()`
    (in both OLTP and OLAP).

    CHANGELOG

    ```
    * `GroupStep` and `GroupSideEffectStep` make use of mid-traversal reducers to limit memory consumption in OLAP.
    ```

    VOTE +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1225

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #285

----
commit 8eb53815acfcec05316dcf39bd91d9e3a43d971a
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-30T20:24:38Z

    GroupStep and GroupSideEffectStep now make use of mid-traversal
    barriers to do data reduction on the fly in order to limit the memory
    footprint and reduce the chances of OME. OLTP always did this, but now
    OLAP (which needs it more) does it. This is epic. Also fixed a minor
    bug in ReducingBarrierStep. Added some more GroupTest test cases,
    including one that does a groupCount() instead of a group() just to
    make sure things are working as expected.

----

> Do a "rolling reduce" for GroupXXXStep in OLAP.
> -----------------------------------------------
>
>                 Key: TINKERPOP-1225
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1225
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>
> {{GroupXXXStep}} in OLTP is able to process traversers up to the first
> barrier in the reduction. 99% of the time, the first barrier is the last
> barrier and thus, you get a nice lazy computation which limits the memory
> footprint.
> Unfortunately, we don't have this luxury in OLAP. However, the work
> that [~spmallette] did to get {{GroupBiOperator}} to serialize with
> traversals might make it possible for us to merge barriers in the reduction
> and thus have OLAP and OLTP {{GroupXXXStep}} behave analogously.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
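
For readers unfamiliar with the idea, the "rolling reduce" discussed in
this issue can be sketched in plain Java. This is only an illustration
under stated assumptions, not TinkerPop's implementation: the class
RollingReduceSketch and its methods are hypothetical names, and a simple
sum stands in for the reducer that a traversal such as
`g.V().group().by(label).by(count())` would carry in its `by()`. The first
method buffers every value per key and reduces at the end; the second
folds each value into a running result, so only one value per key is held
at any time.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration; none of these names exist in TinkerPop.
public class RollingReduceSketch {

    // Small helper that builds one (group key, value) pair.
    static Map.Entry<String, Long> item(String key, long value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Buffer-then-reduce: every value is kept until the stream ends,
    // so memory grows with the number of incoming items.
    static Map<String, Long> bufferThenReduce(List<Map.Entry<String, Long>> stream,
                                              BinaryOperator<Long> reducer) {
        Map<String, List<Long>> buffered = new HashMap<>();
        for (Map.Entry<String, Long> e : stream) {
            buffered.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, List<Long>> e : buffered.entrySet()) {
            // Each buffered list holds at least one value, so get() is safe here.
            result.put(e.getKey(), e.getValue().stream().reduce(reducer).get());
        }
        return result;
    }

    // Rolling reduce: each value is merged into the running result for
    // its key as it arrives, so memory grows only with the number of keys.
    static Map<String, Long> rollingReduce(List<Map.Entry<String, Long>> stream,
                                           BinaryOperator<Long> reducer) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : stream) {
            result.merge(e.getKey(), e.getValue(), reducer);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> stream = Arrays.asList(
                item("person", 1L), item("software", 1L),
                item("person", 1L), item("person", 1L));
        Map<String, Long> buffered = bufferThenReduce(stream, Long::sum);
        Map<String, Long> rolling = rollingReduce(stream, Long::sum);
        System.out.println("buffered: " + buffered);
        System.out.println("rolling:  " + rolling);
        System.out.println("same result: " + buffered.equals(rolling));
    }
}
```

Both methods return the same map for an associative reducer such as a sum
or a count; the difference is only in how much intermediate state is held,
which is the memory concern the pull request describes.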