[
https://issues.apache.org/jira/browse/TINKERPOP-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840347#comment-15840347
]
ASF GitHub Bot commented on TINKERPOP-1617:
-------------------------------------------
GitHub user okram opened a pull request:
https://github.com/apache/tinkerpop/pull/549
TINKERPOP-1617: Create a SingleIterationStrategy which will do its best to
rewrite OLAP traversals to not message pass.
https://issues.apache.org/jira/browse/TINKERPOP-1617
Various traversals can be rewritten using `local()` so that the
`GraphComputer` avoids a message pass and can thus complete the
computation in a single scan of the graph. Traversals that benefit
include:
```
g.V().out().id() --> g.V().local(out().id())
g.V().out().id().count() --> g.V().local(out().id()).count()
g.V().out().id().dedup().count()
g.V().inE().values("weight") // realize that in-edges are hosted by the out-vertex
g.V().inE().values("weight").sum()
g.V().both().count()
g.V().inE().count()
g.V().as("a").outE().inV().as("b").id().dedup("a", "b").by(T.id).count()
```
Finally, the traversal that sparked this PR:
```
g.V().in().id().select("articleNumber").dedup().count() // requires one message pass
==translatesTo==>
g.V().local(in().id().select("articleNumber")).dedup().count() // requires no message passing
```
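To make the idea behind these rewrites concrete, here is a minimal toy model in Python. It is not TinkerPop code: the graph data and the function names `count_with_message_pass` and `count_local` are made up for illustration, and the model assumes each vertex can read the ids of its adjacent vertices from its own edges. It compares a two-iteration evaluation of `g.V().out().id().count()` with a one-scan evaluation in the spirit of `g.V().local(out().id()).count()`.

```python
from collections import defaultdict

# Toy directed graph (hypothetical data): vertex id -> ids of its out-neighbours.
OUT_EDGES = {1: [2, 3, 4], 2: [], 3: [], 4: [3, 5], 5: [], 6: [3]}


def count_with_message_pass(out_edges):
    """Model out().id().count(): traversers move to the adjacent vertex first."""
    messages_sent = 0
    inbox = defaultdict(int)
    # Iteration 1: every vertex sends one traverser along each out-edge.
    for neighbours in out_edges.values():
        for neighbour in neighbours:
            inbox[neighbour] += 1
            messages_sent += 1
    # Iteration 2: each receiving vertex emits its own id once per traverser.
    ids = []
    for vertex, arrived in inbox.items():
        ids.extend([vertex] * arrived)
    return len(ids), messages_sent


def count_local(out_edges):
    """Model local(out().id()).count(): ids are read off the locally held edges."""
    ids = []
    # Single scan: the adjacent vertex id is assumed to be stored on each edge.
    for neighbours in out_edges.values():
        ids.extend(neighbours)
    return len(ids), 0


if __name__ == "__main__":
    print(count_with_message_pass(OUT_EDGES))
    print(count_local(OUT_EDGES))
```

Both functions return the same count; only the number of messages differs, which is the property the rewrite relies on.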
`SingleIterationStrategy` plays well with `SparkSingleIterationStrategy`,
which determines whether it is necessary to `cache()` and/or `partition()` the
graph. If the traversal can be accomplished without a message pass (i.e. in a
single iteration), performance improves greatly because RDD partitions are
processed sequentially and each can be dropped once it has been processed.
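The memory argument can be sketched with another toy model in Python. This is not Spark or TinkerPop code: `partitions`, `single_scan_count`, and `two_iteration_count` are hypothetical names, and "resident" simply counts how many partitions the model has to hold at once. The assumption is that a single-iteration job can reduce each partition and release it, while a job with a message pass must keep partitions around to revisit them.

```python
def partitions():
    """Yield hypothetical graph partitions one at a time (vertex -> out-neighbours)."""
    yield {1: [2, 3, 4], 2: []}
    yield {3: [], 4: [3, 5]}
    yield {5: [], 6: [3]}


def single_scan_count(parts):
    """One iteration: each partition is reduced to a number and then released."""
    total = 0
    peak_resident = 0
    for part in parts:
        peak_resident = max(peak_resident, 1)  # only the current partition is held
        total += sum(len(neighbours) for neighbours in part.values())
    return total, peak_resident


def two_iteration_count(parts):
    """Two iterations: partitions are retained so they can be visited again."""
    cached = list(parts)  # stands in for caching the graph between iterations
    inbox = {}
    for part in cached:  # iteration 1: send messages along out-edges
        for neighbours in part.values():
            for neighbour in neighbours:
                inbox[neighbour] = inbox.get(neighbour, 0) + 1
    total = 0
    for part in cached:  # iteration 2: each vertex consumes its inbox
        for vertex in part:
            total += inbox.get(vertex, 0)
    return total, len(cached)


if __name__ == "__main__":
    print(single_scan_count(partitions()))
    print(two_iteration_count(partitions()))
```

In this model both variants compute the same total, but the single-scan variant never holds more than one partition, whereas the two-iteration variant holds all of them.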
VOTE +1.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/tinkerpop TINKERPOP-1617
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/549.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #549
----
> Create a SingleIterationStrategy which will do its best to rewrite OLAP
> traversals to not message pass.
> -------------------------------------------------------------------------------------------------------
>
> Key: TINKERPOP-1617
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1617
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.2.3
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
>
> The traversal:
> {code}
> g.V().out().id().count()
> {code}
> requires a message pass from {{out()}}. We shouldn't do this. Instead, if we
> wrap the pre-barrier stage in a {{local()}}, we have:
> {code}
> g.V().local(out().id()).count()
> {code}
> ...which doesn't require a message pass and has the same semantics. This will
> help open up numerous OLAP-type traversals to single-pass/non-caching scans.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)