[
https://issues.apache.org/jira/browse/TINKERPOP-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369701#comment-15369701
]
ASF GitHub Bot commented on TINKERPOP-1254:
-------------------------------------------
Github user twilmes commented on the issue:
https://github.com/apache/tinkerpop/pull/358
I made a small update to `ReferencePath` to create new label Sets when a
patch is detached. This had been causing issues where the first set of labels
for a path where being shared across `MutablePaths` after detachment. A label
would be removed from one, and therefore all of the traversers that had that
label `Set` in their path, would be affected.
The `PathProcessors` are now respecting keepLabels null and labels are not
dropped if `PrunePathStrategy` is not applied.
**PrunePathStrategy on**
```
g.V().match(
as("a").in("sungBy").as("b"),
as("a").in("sungBy").as("c"),
as("b").out("writtenBy").as("d"),
as("c").out("writtenBy").as("e"),
as("d").has("name", "George_Harrison"),
as("e").has("name", "Bob_Marley")).select("a").count().profile()
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 808
808 44.217 99.97
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 1
1 0.004 0.01
MatchStartStep(a) 808
808 43.517
VertexStep(IN,[sungBy],vertex) 501
499 20.323
MatchEndStep(b) (profiling ignored)
0.000
MatchStartStep(a) 2
2 0.006
VertexStep(IN,[sungBy],vertex) 156
156 2.129
MatchEndStep(c) (profiling ignored)
0.000
MatchStartStep(b) 501
499 5.126
VertexStep(OUT,[writtenBy],vertex) 509
504 3.423
MatchEndStep(d) (profiling ignored)
0.000
MatchStartStep(c) 156
156 1.083
VertexStep(OUT,[writtenBy],vertex) 157
157 1.029
MatchEndStep(e) (profiling ignored)
0.000
MatchStartStep(d) 509
266 1.685
HasStep([name.eq(George_Harrison)]) 2
2 0.002
MatchEndStep (profiling ignored)
0.000
MatchStartStep(e) 157
57 0.391
HasStep([name.eq(Bob_Marley)]) 1
1 0.001
MatchEndStep (profiling ignored)
0.000
SelectOneStep(a) 1
1 0.003 0.01
CountGlobalStep 1
1 0.003 0.01
>TOTAL -
- 44.228 -
```
**PrunePathStrategy off**
```
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 808
808 7.565 99.84
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 1
1 0.007 0.10
MatchStartStep(a) 808
808 5.726
VertexStep(IN,[sungBy],vertex) 501
499 9.532
MatchEndStep(b) (profiling ignored)
0.000
MatchStartStep(a) 2
2 0.007
VertexStep(IN,[sungBy],vertex) 156
156 1.803
MatchEndStep(c) (profiling ignored)
0.000
MatchStartStep(b) 501
499 3.221
VertexStep(OUT,[writtenBy],vertex) 509
504 2.297
MatchEndStep(d) (profiling ignored)
0.000
MatchStartStep(c) 156
156 1.192
VertexStep(OUT,[writtenBy],vertex) 157
157 0.910
MatchEndStep(e) (profiling ignored)
0.000
MatchStartStep(d) 509
504 1.533
HasStep([name.eq(George_Harrison)]) 2
2 0.002
MatchEndStep (profiling ignored)
0.000
MatchStartStep(e) 157
157 1.425
HasStep([name.eq(Bob_Marley)]) 1
1 0.000
MatchEndStep (profiling ignored)
0.000
SelectOneStep(a) 1
1 0.003 0.04
CountGlobalStep 1
1 0.001 0.02
>TOTAL -
- 7.577 -
```
I am running the integration tests now and will respond back when they
complete.
> Support dropping traverser path information when it is no longer needed.
> ------------------------------------------------------------------------
>
> Key: TINKERPOP-1254
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.1.1-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Ted Wilmes
>
> The most expensive traversals (especially in OLAP) are those that can not be
> "bulked." There are various reasons why two traversers at the same object can
> not be bulked, but the primary reason is {{PATH}} or {{LABELED_PATH}}. That
> is, when the history of the traverser is required, the probability of two
> traversers having the same history is low.
> A key to making traversals more efficient is to do as a much as possible to
> remove historic information from a traverser so it can get bulked. How does
> one do this?
> {code}
> g.V.as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
> {code}
> The {{LABELED_PATH}} of "a" and "b" are required up to the {{where()}} and at
> which point, at {{both()}}, they are no longer required. It would be smart to
> support:
> {code}
> traverser.dropLabels(Set<String>)
> traverser.dropPath()
> {code}
> We would then, via a {{TraversalOptimizationStrategy}} insert a step between
> {{where()}} and {{both()}} called {{PathPruneStep}} which would be a
> {{SideEffectStep}}. The strategy would know which labels were no longer
> needed (via forward lookahead) and then do:
> {code}
> public class PathPruneStep {
> final Set<String> dropLabels = ...
> final boolean dropPath = ...
> public void sideEffect(final Traverser<S> traverser) {
> final Traverser<S> start = this.starts.next();
> if(this.dropPath) start.dropPath();
> else start.dropLabels(labels);
> }
> }
> {code}
> Again, the more we can prune historic path data no longer needed, the higher
> the probability of bulking. Think about this in terms of {{match()}}.
> {code}
> g.V().match(
> a.out.b,
> b.out.c,
> c.neq.a,
> c.out.b,
> ).select("a")
> {code}
> All we need is "a" at the end. Thus, once a pattern has been passed and no
> future patterns require that label, drop it!
> This idea is related to TINKERPOP-331, but I don't think we should deal with
> manipulating the species. Thus, I think 331 is too "low level."
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)