[ 
https://issues.apache.org/jira/browse/TINKERPOP-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196178#comment-15196178
 ] 

ASF GitHub Bot commented on TINKERPOP-1209:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/264

    TINKERPOP-1209 & TINKERPOP-1210: OrderXXXStep Updates

    https://issues.apache.org/jira/browse/TINKERPOP-1209
    https://issues.apache.org/jira/browse/TINKERPOP-1210
    
    In OLAP, when a traversal contains the pattern `order()...limit(x)`, 
`OrderLimitStrategy` makes each partition of the order (one per worker) order 
and limit its own traversers before they are merged. Because 
`order()...limit()` is a common traversal pattern in OLAP, this greatly reduces 
the amount of data reaching the master traversal. 
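
The order-then-limit-then-merge idea can be sketched in plain Java. This is only an illustration, not TinkerPop code: `OrderLimitSketch`, `orderAndLimit`, and `merge` are hypothetical names, and each inner list stands in for the traversers held by one worker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class OrderLimitSketch {
    // Sort one partition and keep only its first `limit` elements.
    static <T> List<T> orderAndLimit(List<T> partition, Comparator<T> cmp, long limit) {
        return partition.stream().sorted(cmp).limit(limit).collect(Collectors.toList());
    }

    // Limit every partition first, then order and limit the merged candidates once more.
    static <T> List<T> merge(List<List<T>> partitions, Comparator<T> cmp, long limit) {
        List<T> candidates = new ArrayList<>();
        for (List<T> partition : partitions) {
            candidates.addAll(orderAndLimit(partition, cmp, limit));
        }
        return orderAndLimit(candidates, cmp, limit);
    }

    public static void main(String[] args) {
        // Three hypothetical workers, each holding part of the data.
        List<List<Integer>> workers = Arrays.asList(
            Arrays.asList(9, 1, 7, 3),
            Arrays.asList(8, 2, 6),
            Arrays.asList(5, 4, 0));
        System.out.println(merge(workers, Comparator.naturalOrder(), 3)); // prints [0, 1, 2]
    }
}
```

In this sketch the merge step never sees more than `limit * numberOfPartitions` elements, however large each partition is.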
    
    CHANGELOG
    
    ```
    * Fixed a hash code bug in `OrderGlobalStep` and `OrderLocalStep`.
    * Added `OrderLimitStrategy` which will ensure that partitions are limited 
before being merged in OLAP.
    * `ComparatorHolder` now separates the traversal from the comparator. 
(*breaking*)
    ```
    
    UPGRADE
    
    ```
    ComparatorHolder API Change
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    Providers that either have their own `ComparatorHolder` implementation or 
reason on `OrderXXXStep` will need to update their code. 
`ComparatorHolder.getComparators()` now returns 
`List<Pair<Traversal,Comparator>>`. This has greatly reduced the complexity of 
comparison-based steps like `OrderXXXStep`. However, it is a breaking API 
change. The update is trivial, but providers need to be aware of it.
    ```
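
As a rough illustration of why pairing the traversal with its comparator simplifies ordering, here is a self-contained Java sketch. It does not use the TinkerPop API: `By` is a hypothetical stand-in for `Pair<Traversal,Comparator>`, a plain `Function` plays the role of the traversal, and `Person`, `chain`, and `demo` exist only for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

class ComparatorPairSketch {
    // Hypothetical stand-in for Pair<Traversal, Comparator>: `by` plays the role
    // of the traversal that maps an element to the value being compared.
    static final class By<E, K> {
        final Function<E, K> by;
        final Comparator<K> comparator;

        By(Function<E, K> by, Comparator<K> comparator) {
            this.by = by;
            this.comparator = comparator;
        }

        // Compare two elements by the values the "traversal" produces for them.
        Comparator<E> asElementComparator() {
            return Comparator.comparing(by, comparator);
        }
    }

    // Chain the pairs in list order: later pairs only break ties left by earlier ones.
    static <E> Comparator<E> chain(List<By<E, ?>> pairs) {
        Comparator<E> result = (a, b) -> 0;
        for (By<E, ?> pair : pairs) {
            result = result.thenComparing(pair.asElementComparator());
        }
        return result;
    }

    // Made-up sample element type.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Order sample people by age descending, then by name ascending.
    static List<String> demo() {
        List<By<Person, ?>> pairs = new ArrayList<>();
        pairs.add(new By<Person, Integer>(p -> p.age, Comparator.reverseOrder()));
        pairs.add(new By<Person, String>(p -> p.name, Comparator.naturalOrder()));

        List<Person> people = new ArrayList<>();
        people.add(new Person("marko", 29));
        people.add(new Person("stephen", 35));
        people.add(new Person("daniel", 29));
        people.sort(chain(pairs));

        List<String> names = new ArrayList<>();
        for (Person p : people) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [stephen, daniel, marko]
    }
}
```

Each list entry carries everything needed for one ordering criterion, so the consumer can walk the list uniformly without branching on the kind of comparator.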
    
    VOTE +1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1209

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #264
    
----
commit 813ba24da510b3770988d62916c2016265323129
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:10:26Z

    ComparatorHolder now has getComparators() return a List of 
Pair<Traversal,Comparator>. This was what we needed; it allowed me to gut 
lots of code, it's much more intuitive, and it will make pre-order/limits 
possible in OLAP. Unfortunately, this is breaking for vendors that reason on 
(or have) an OrderXXXStep. The update is trivial and, in fact, means a lot 
less if/else for them.

commit 3ee1548a2b9c0b1c1e00d13a024a388965f5e846
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T16:12:59Z

    TraversalComparator is gutted -- check out that if/else nest that we no 
longer have to propagate through. Thank the heavens.

commit a67567e0ce627a5fa313d4be0a0f5b9f6ad65584
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T17:57:54Z

    Added OrderLimitStrategy, which finds order()...limit(x) patterns and 
tells OrderStep to order-then-limit. This is a potentially massive optimization 
in OLAP: if you do order().limit(5), the max number of traversers coming to the 
master traversal is 5 * numberOfWorkers instead of the full set of traversers. 
Added OrderBiOperator, a Memory reducer that handles this in OLAP. Added test 
cases to make this pretty. Added this as a default strategy in the GlobalCache. 
Currently OrderLimitStrategy is only for OLAP -- we could make it for OLTP, but 
we would have to write our own custom Collections.sort() that has a size limit.

commit dc0348717115a3572b5b70ef1d8f969c505c2bbf
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-15T20:18:43Z

    Added more test cases. Fixed an old equality issue in OrderGlobalStepTest 
and OrderLocalStepTest cc/ @dkuppitz. Added more test cases to ensure 
OrderLimitStrategy is behaving correctly. OrderBiOperator now uses 
JavaSerializer so Giraph and Spark are happy. I think this is good to go. 
Perhaps one more test case using the GratefulDead graph would be good.

----


> ComparatorHolder should return a Pair<Traversal,Comparator>.
> ------------------------------------------------------------
>
>                 Key: TINKERPOP-1209
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1209
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>              Labels: breaking
>
> Right now {{ComparatorHolder}} has a method:
> {code}
> List<Comparator> getComparators()
> {code}
> This should really be:
> {code}
> List<Pair<Traversal<?,E>,Comparator<E>>> getComparators()
> {code}
> By doing this, we will be able to order during the {{Memory}}-reduction in 
> Gremlin OLAP. We will be able to create values that look like this:
> {code}
> [[32, "marko"], v[1]]
> [[12, "stephen"], v[7]]
> [[67, "daniel"], v[8]]
> ...
> {code}
> Then there will be an {{OrderBiOperator}} that will have a 
> {{List<Comparator>}} that, for the example above, is of size 2. It will then 
> be able to use the already computed traversal ends to sort the vertices.
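
A minimal Java sketch of sorting on already computed traversal ends, using the rows from the example above. It is not the actual `OrderBiOperator`: `Row`, `NATURAL`, `byKeys`, and `sortElements` are hypothetical names, and vertices are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PrecomputedKeySortSketch {
    // One row per traverser: the already computed traversal ends plus the element itself.
    static final class Row {
        final List<Object> keys;
        final String element;

        Row(String element, Object... keys) {
            this.keys = Arrays.asList(keys);
            this.element = element;
        }
    }

    // Natural ordering over untyped keys; assumes keys in one column are mutually comparable.
    @SuppressWarnings("unchecked")
    static final Comparator<Object> NATURAL =
        (x, y) -> ((Comparable<Object>) x).compareTo(y);

    // Compare rows column by column, using the i-th comparator on the i-th key.
    static Comparator<Row> byKeys(List<Comparator<Object>> comparators) {
        return (a, b) -> {
            for (int i = 0; i < comparators.size(); i++) {
                int c = comparators.get(i).compare(a.keys.get(i), b.keys.get(i));
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        };
    }

    // Sort a copy of the rows and return only the elements, in sorted order.
    static List<String> sortElements(List<Row> rows, List<Comparator<Object>> comparators) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(byKeys(comparators));
        List<String> out = new ArrayList<>();
        for (Row row : copy) {
            out.add(row.element);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("v[1]", 32, "marko"),
            new Row("v[7]", 12, "stephen"),
            new Row("v[8]", 67, "daniel"));
        // Two comparators, one per precomputed key.
        System.out.println(sortElements(rows, Arrays.asList(NATURAL, NATURAL))); // prints [v[7], v[1], v[8]]
    }
}
```

Because the keys travel with each element, the sort never has to re-run a traversal; it only compares values that are already in the row.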



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
