GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/264

    TINKERPOP-1209 & TINKERPOP-1210: OrderXXXStep Updates

    https://issues.apache.org/jira/browse/TINKERPOP-1209
    https://issues.apache.org/jira/browse/TINKERPOP-1210
    
    In OLAP, if a traversal contains the pattern `order()...limit(x)`, 
`OrderLimitStrategy` makes each worker order and limit its own partition before 
the partitions are merged. At most `x` traversers per worker then reach the 
master traversal instead of the full set, which matters because 
`order()...limit()` is a common traversal pattern in OLAP.
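
The order-then-limit idea described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in this pull request: it uses no TinkerPop classes, each inner list stands in for one worker's partition, and the names `OrderLimitSketch`, `orderAndLimit`, and `mergeOrderLimit` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; none of these names come from TinkerPop.
class OrderLimitSketch {

    // Order one partition and keep only its first `limit` elements.
    static <T> List<T> orderAndLimit(List<T> partition, Comparator<T> comparator, long limit) {
        return partition.stream()
                .sorted(comparator)
                .limit(limit)
                .collect(Collectors.toList());
    }

    // Limit every partition first, merge the survivors, then order and limit once more.
    static <T> List<T> mergeOrderLimit(List<List<T>> partitions, Comparator<T> comparator, long limit) {
        List<T> merged = new ArrayList<>();
        for (List<T> partition : partitions) {
            merged.addAll(orderAndLimit(partition, comparator, limit));
        }
        // merged now holds at most limit * partitions.size() elements.
        return orderAndLimit(merged, comparator, limit);
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(9, 1, 7, 3),
                Arrays.asList(8, 2, 6),
                Arrays.asList(5, 4, 0));
        // Prints [0, 1]
        System.out.println(mergeOrderLimit(partitions, Comparator.<Integer>naturalOrder(), 2));
    }
}
```

The result equals ordering and limiting the full data set, because any element in the global top `x` must also be in the top `x` of its own partition.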
    
    CHANGELOG
    
    ```
    * Fixed a hash code bug in `OrderGlobalStep` and `OrderLocalStep`.
    * Added `OrderLimitStrategy` which will ensure that partitions are limited 
before being merged in OLAP.
    * `ComparatorHolder` now separates the traversal from the comparator. 
(*breaking*)
    ```
    
    UPGRADE
    
    ```
    ComparatorHolder API Change
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    Providers that either have their own `ComparatorHolder` implementation or 
reason on `OrderXXXStep` will need to update their code. `ComparatorHolder` now 
returns `List<Pair<Traversal,Comparator>>`. This has greatly reduced the 
complexity of comparison-based steps like `OrderXXXStep`. However, it's a 
breaking API change. The update is trivial, but providers need to be aware of 
it.
    ```
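
To illustrate the list-of-pairs shape mentioned in the upgrade note, here is a minimal, self-contained sketch. It deliberately avoids TinkerPop classes, so it is an assumption about the general idea rather than the actual API: a `java.util.function.Function` stands in for the `Traversal`, and `TraversalComparatorPair`, `Person`, `chain`, and `ageDescendingThenName` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration; none of these names come from TinkerPop.
class ComparatorPairsSketch {

    // Example element type, used only for this illustration.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Stand-in for Pair<Traversal,Comparator>: `traversal` maps an element to
    // the value to compare, and `comparator` orders those values.
    static final class TraversalComparatorPair<S, C> {
        final Function<S, C> traversal;
        final Comparator<C> comparator;

        TraversalComparatorPair(Function<S, C> traversal, Comparator<C> comparator) {
            this.traversal = traversal;
            this.comparator = comparator;
        }

        int compare(S a, S b) {
            return comparator.compare(traversal.apply(a), traversal.apply(b));
        }
    }

    // Walk the pairs in order; the first non-zero comparison decides.
    static <S> Comparator<S> chain(List<TraversalComparatorPair<S, ?>> pairs) {
        return (a, b) -> {
            for (TraversalComparatorPair<S, ?> pair : pairs) {
                int result = pair.compare(a, b);
                if (result != 0) {
                    return result;
                }
            }
            return 0;
        };
    }

    // Two pairs: age descending, then name ascending as a tie-breaker.
    static Comparator<Person> ageDescendingThenName() {
        List<TraversalComparatorPair<Person, ?>> pairs = new ArrayList<>();
        pairs.add(new TraversalComparatorPair<Person, Integer>(p -> p.age, Comparator.reverseOrder()));
        pairs.add(new TraversalComparatorPair<Person, String>(p -> p.name, Comparator.naturalOrder()));
        return chain(pairs);
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>(Arrays.asList(
                new Person("marko", 29), new Person("vadas", 27),
                new Person("josh", 32), new Person("peter", 29)));
        people.sort(ageDescendingThenName());
        for (Person p : people) {
            System.out.println(p.name + " " + p.age);
        }
    }
}
```

With every pair exposing the same two parts, the comparison loop needs no special cases for "comparator only" versus "traversal plus comparator", which is one plausible reading of why the change removes if/else nesting.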
    
    VOTE +1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1209

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #264
    
----
commit 813ba24da510b3770988d62916c2016265323129
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:10:26Z

    ComparatorHolder now has getComparators() return a List of 
Pair<Traversal,Comparator>. This was what we needed and this allowed me to gut 
lots of code. It's so much more intuitive and will make pre-order/limits 
possible in OLAP. Unfortunately, this is breaking for vendors that reason on 
(or have) an OrderXXXStep. The update is trivial and, in fact, means a lot less 
if/else for them.

commit 3ee1548a2b9c0b1c1e00d13a024a388965f5e846
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:12:59Z

    TraversalComparator is gutted -- check out that if/else nest that we no 
longer have to propagate through. Thank the heavens.

commit a67567e0ce627a5fa313d4be0a0f5b9f6ad65584
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T17:57:54Z

    Added OrderLimitStrategy, which finds order()...limit(x) patterns. It then 
tells OrderStep to order-then-limit. This is a potentially massive optimization 
in OLAP: if you do order().limit(5), the max number of traversers coming to the 
master traversal is 5 * numberOfWorkers instead of the full set of traversers. 
Added OrderBiOperator, which is a Memory reducer that handles this in OLAP. 
Added test cases to make this pretty. Added this as a default strategy in the 
GlobalCache. Currently OrderLimitStrategy is only for OLAP -- we could make it 
for OLTP, but we would have to write our own custom Collections.sort() that 
has a size limit.

commit dc0348717115a3572b5b70ef1d8f969c505c2bbf
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T20:18:43Z

    Added more test cases. Fixed an old equality issue in OrderGlobalStepTest 
and OrderLocalStepTest cc/ @dkuppitz. Added more test cases to ensure 
OrderLimitStrategy is behaving correctly. OrderBiOperator now uses 
JavaSerializer so Giraph and Spark are happy. I think this is good to go. 
Perhaps one more test case using GratefulDead graph would be good.

----
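
One of the commits above notes that an OLTP version would need a sort with a size limit. A common way to get that effect, shown here purely as a hedged sketch and not as anything this pull request implements, is a bounded heap that never holds more than `limit` elements. The names `BoundedSortSketch` and `smallest` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; none of these names come from TinkerPop.
class BoundedSortSketch {

    // Return the first `limit` items in `comparator` order while keeping
    // at most `limit` items in memory at any time.
    static <T> List<T> smallest(Iterable<T> items, Comparator<T> comparator, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        // Reversed comparator: the head of the heap is the worst item kept so far.
        PriorityQueue<T> heap = new PriorityQueue<>(limit, comparator.reversed());
        for (T item : items) {
            if (heap.size() < limit) {
                heap.add(item);
            } else if (comparator.compare(item, heap.peek()) < 0) {
                // The new item beats the current worst, so replace it.
                heap.poll();
                heap.add(item);
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(comparator);
        return result;
    }

    public static void main(String[] args) {
        // Prints [1, 3, 5]
        System.out.println(smallest(Arrays.asList(5, 3, 9, 1, 7), Comparator.<Integer>naturalOrder(), 3));
    }
}
```

For n items this does O(n log limit) work and uses O(limit) memory, compared with sorting all n items and discarding most of them.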

