[ https://issues.apache.org/jira/browse/TINKERPOP-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196178#comment-15196178 ]
ASF GitHub Bot commented on TINKERPOP-1209:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/264

    TINKERPOP-1209 & TINKERPOP-1210: OrderXXXStep Updates

    https://issues.apache.org/jira/browse/TINKERPOP-1209
    https://issues.apache.org/jira/browse/TINKERPOP-1210

    In OLAP, if you have a pattern like `order()...limit(x)`, `OrderLimitStrategy` will make each worker order and limit its own partition before the partitions are merged. This greatly reduces the amount of data reaching the master traversal, as `order()...limit()` is a common traversal pattern in OLAP.

    CHANGELOG

    ```
    * Fixed a hash code bug in `OrderGlobalStep` and `OrderLocalStep`.
    * Added `OrderLimitStrategy` which will ensure that partitions are limited before being merged in OLAP.
    * `ComparatorHolder` now separates the traversal from the comparator. (*breaking*)
    ```

    UPGRADE

    ```
    ComparatorHolder API Change
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Providers that either have their own `ComparatorHolder` implementation or reason on `OrderXXXStep` will need to update their code. `ComparatorHolder` now returns `List<Pair<Traversal,Comparator>>`. This has greatly reduced the complexity of comparison-based steps like `OrderXXXStep`. However, it is a breaking API change. The update is trivial; it just requires some awareness.
    ```

    VOTE +1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1209

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #264

----
commit 813ba24da510b3770988d62916c2016265323129
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:10:26Z

    ComparatorHolder now has getComparators() return a List of Pair<Traversal,Comparator>. This was what we needed and this allowed me to gut lots of code and it's so much more intuitive and will make it so pre-order/limits will be possible in OLAP. Unfortunately, this is breaking for vendors that reason on (or have) an OrderXXXStep. The update is trivial and, in fact, a lot less if/else for them.

commit 3ee1548a2b9c0b1c1e00d13a024a388965f5e846
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:12:59Z

    TraversalComparator is gutted -- check out that if/else nest that we no longer have to propagate through. Thank the heavens.

commit a67567e0ce627a5fa313d4be0a0f5b9f6ad65584
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T17:57:54Z

    Added OrderLimitStrategy which finds order()...limit(x) patterns. It then tells OrderStep to order-then-limit. This is a potentially massive optimization in OLAP where, if you do order().limit(5), the max number of traversers coming to the master traversal is 5 * numberOfWorkers instead of the full set of traversers. Added OrderBiOperator which is a Memory reducer which handles this in OLAP. Added test cases to make this pretty. Added this as a default strategy in the GlobalCache. Currently OrderLimitStrategy is only for OLAP -- we could make it for OLTP, but we would have to write our own custom Collections.sort() that has a size limit.

commit dc0348717115a3572b5b70ef1d8f969c505c2bbf
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T20:18:43Z

    Added more test cases. Fixed an old equality issue in OrderGlobalStepTest and OrderLocalStepTest cc/ @dkuppitz. Added more test cases to ensure OrderLimitStrategy is behaving correctly. OrderBiOperator now uses JavaSerializer so Giraph and Spark are happy. I think this is good to go. Perhaps one more test case using GratefulDead graph would be good.
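
The order-then-limit idea from the commit above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the pull request: `OrderLimitSketch`, `orderAndLimit`, and `merge` are hypothetical names, and ordinary lists stand in for the per-worker partitions. It relies on the fact that the first `x` elements of the merged order are always contained in the union of each partition's first `x` elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: plain Java lists stand in for worker partitions.
final class OrderLimitSketch {

    // Sort one partition and keep at most `limit` leading elements.
    static <T> List<T> orderAndLimit(List<T> partition, Comparator<T> comparator, int limit) {
        List<T> sorted = new ArrayList<>(partition);
        sorted.sort(comparator);
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    // Reducer: combine two already-limited partitions, then order and limit again.
    static <T> List<T> merge(List<T> left, List<T> right, Comparator<T> comparator, int limit) {
        List<T> combined = new ArrayList<>(left);
        combined.addAll(right);
        return orderAndLimit(combined, comparator, limit);
    }

    public static void main(String[] args) {
        Comparator<Integer> ascending = Comparator.naturalOrder();
        List<Integer> worker1 = orderAndLimit(Arrays.asList(9, 3, 7, 1), ascending, 2);
        List<Integer> worker2 = orderAndLimit(Arrays.asList(8, 2, 6, 4), ascending, 2);
        // Only 2 * 2 = 4 elements reach the merge instead of all 8.
        System.out.println(merge(worker1, worker2, ascending, 2)); // prints [1, 2]
    }
}
```

With a limit of 2 and two workers, the merge sees at most four elements regardless of partition size, which mirrors the `5 * numberOfWorkers` bound described for `order().limit(5)`.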
----

> ComparatorHolder should return a Pair<Traversal,Comparator>.
> ------------------------------------------------------------
>
>                 Key: TINKERPOP-1209
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1209
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>              Labels: breaking
>
> Right now {{ComparatorHolder}} has a method:
> {code}
> List<Comparator> getComparators()
> {code}
> This should really be:
> {code}
> List<Pair<Traversal<?,E>,Comparator<E>>> getComparators()
> {code}
> By doing this, we will be able to order during the {{Memory}}-reduction in
> Gremlin OLAP. We will be able to create values that look like this:
> {code}
> [[32, "marko"], v[1]]
> [[12, "stephen"], v[7]]
> [[67, "daniel"], v[8]]
> ...
> {code}
> Then there will be an {{OrderBiOperator}} that will have a
> {{List<Comparator>}} that, for the example above, is size 2. It will then be
> able to use the already computed traversal ends to sort the vertices.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
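
The sorting scheme described in the quoted issue, where each vertex travels with its already computed traversal ends and a list of comparators is applied key by key, might look roughly like the following. It is a minimal sketch in plain Java: `PrecomputedKeyOrderSketch`, `Row`, `chained`, `natural`, and `order` are hypothetical names chosen for illustration, strings such as `"v[1]"` stand in for vertices, and none of this is the TinkerPop `OrderBiOperator` implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rows carry precomputed keys, so sorting never re-runs a traversal.
final class PrecomputedKeyOrderSketch {

    // One row: the already computed keys plus the element they were computed from.
    static final class Row {
        final List<Object> keys;
        final String element;

        Row(List<Object> keys, String element) {
            this.keys = keys;
            this.element = element;
        }
    }

    // Comparator i is applied to key i; the first non-zero result decides the order.
    static Comparator<Row> chained(List<Comparator<Object>> comparators) {
        return (a, b) -> {
            for (int i = 0; i < comparators.size(); i++) {
                int result = comparators.get(i).compare(a.keys.get(i), b.keys.get(i));
                if (result != 0) return result;
            }
            return 0;
        };
    }

    // Natural ordering for keys that are mutually Comparable (assumed by this sketch).
    @SuppressWarnings("unchecked")
    static Comparator<Object> natural() {
        return (x, y) -> ((Comparable<Object>) x).compareTo(y);
    }

    // Sort the rows by their keys and return only the elements.
    static List<String> order(List<Row> rows, List<Comparator<Object>> comparators) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(chained(comparators));
        List<String> elements = new ArrayList<>();
        for (Row row : sorted) elements.add(row.element);
        return elements;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(Arrays.<Object>asList(32, "marko"), "v[1]"),
                new Row(Arrays.<Object>asList(12, "stephen"), "v[7]"),
                new Row(Arrays.<Object>asList(67, "daniel"), "v[8]"));
        // Two comparators, one per key, matching the size-2 list in the issue text.
        System.out.println(order(rows, Arrays.asList(natural(), natural()))); // prints [v[7], v[1], v[8]]
    }
}
```

Because the keys are stored next to each element, the comparison step only reads values and can run inside a reducer without access to the graph.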