[
https://issues.apache.org/jira/browse/TINKERPOP3-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marko A. Rodriguez closed TINKERPOP3-772.
-----------------------------------------
Resolution: Fixed
So this works. It also uncovered a pretty nasty bug around {{Path.addLabel()}},
which has since been fixed as well. With
{{TraverserRequirement.LABELED_PATH}}, the following traversal is now fast because it uses bulking.
{code}
g.V().repeat(out()).times(5).as("a").out("writtenBy").as("b").select("a","b").by("name").count()
{code}
Moreover, the likelihood of bulking is greatly increased: if two traversers
have the same value for a particular labeled step (not their entire path
history!), then they will be bulked. We can start to think about
{{LazyBarrierStrategy}} working its magic in {{MatchStep}} -- crazy.
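To make the bulking argument concrete, here is a minimal toy model in plain Java. It does not use TinkerPop; the {{Visit}} class, the {{bulk()}} helper, and the sample walks are all made up for illustration. Walks merge when they sit at the same current value and carry the same recorded history, and the only difference between the two modes is whether unlabeled steps are recorded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LabeledPathBulking {

    /** Hypothetical visited element: its value and step label (null if unlabeled). */
    static final class Visit {
        final String label;
        final String value;
        Visit(String label, String value) { this.label = label; this.value = value; }
    }

    /**
     * Merges walks into bulked traversers. Two walks merge when they have the same
     * current value and the same recorded history. Returns merge key -> bulk.
     */
    static Map<String, Long> bulk(List<List<Visit>> walks, boolean labeledOnly) {
        Map<String, Long> bulks = new LinkedHashMap<>();
        for (List<Visit> walk : walks) {
            List<String> history = new ArrayList<>();
            for (Visit v : walk) {
                if (!labeledOnly) {
                    history.add(v.value);                  // full path: record every step
                } else if (v.label != null) {
                    history.add(v.label + "=" + v.value);  // labeled path: labeled steps only
                }
            }
            String current = walk.get(walk.size() - 1).value;
            bulks.merge(current + "|" + history, 1L, Long::sum);
        }
        return bulks;
    }

    /** Three made-up walks that differ only in their unlabeled first vertex. */
    static List<List<Visit>> sampleWalks() {
        List<List<Visit>> walks = new ArrayList<>();
        for (String start : Arrays.asList("v1", "v2", "v3")) {
            walks.add(Arrays.asList(
                    new Visit(null, start),
                    new Visit("a", "song"),
                    new Visit("b", "artist")));
        }
        return walks;
    }

    public static void main(String[] args) {
        System.out.println("full path:    " + bulk(sampleWalks(), false).size() + " traversers");
        System.out.println("labeled path: " + bulk(sampleWalks(), true).size() + " traversers");
    }
}
```

With these sample walks, the full-path mode keeps three separate traversers, while the labeled-path mode collapses them into one traverser with a bulk of 3.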
We need to optimize the {{ImmutablePath.hashCode()}} and
{{ImmutablePath.equals()}} methods to make it faster. Right now, it's about
4-to-1 in speed compared to a path-less traverser (the first number in each pair is the runtime, the second is the resulting count):
{code}
[1099.4620842999998, 547143359] // w/o select()
[3884.4588014, 547143359] // w/ select
{code}
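One possible direction for the {{hashCode()}}/{{equals()}} optimization, sketched under assumptions: since the path is immutable, the hash can be computed once at construction, and {{equals()}} can reject on hash or size before walking the entries. The {{Node}} class below is hypothetical and is not TinkerPop's {{ImmutablePath}}.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class CachedHashPath {

    /** Hypothetical immutable linked path entry; NOT TinkerPop's ImmutablePath. */
    static final class Node {
        final Node previous;        // null for the first entry
        final Object value;
        final Set<String> labels;
        private final int size;     // number of entries up to and including this one
        private final int hash;     // computed once, since the path never changes

        Node(Node previous, Object value, Set<String> labels) {
            this.previous = previous;
            this.value = value;
            this.labels = Collections.unmodifiableSet(new HashSet<>(labels));
            this.size = previous == null ? 1 : previous.size + 1;
            int h = previous == null ? 17 : previous.hash;
            h = 31 * h + Objects.hashCode(value);
            h = 31 * h + this.labels.hashCode();
            this.hash = h;
        }

        /** Returns a new path with one more entry; this path is left untouched. */
        Node extend(Object value, Set<String> labels) {
            return new Node(this, value, labels);
        }

        @Override
        public int hashCode() {
            return hash;            // O(1) instead of walking the whole path
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Node)) return false;
            Node a = this;
            Node b = (Node) o;
            if (a.hash != b.hash || a.size != b.size) return false; // cheap rejection
            while (a != null && b != null) {
                if (a == b) return true;                            // shared tail
                if (!Objects.equals(a.value, b.value) || !a.labels.equals(b.labels)) return false;
                a = a.previous;
                b = b.previous;
            }
            return a == null && b == null;
        }
    }

    public static void main(String[] args) {
        Node p1 = new Node(null, "v1", Collections.singleton("a")).extend("v2", Collections.emptySet());
        Node p2 = new Node(null, "v1", Collections.singleton("a")).extend("v2", Collections.emptySet());
        System.out.println(p1.equals(p2) && p1.hashCode() == p2.hashCode());
    }
}
```

Whether this would actually close the gap measured above is untested; it only illustrates the kind of change the optimization might involve.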
This is in {{3.1.0-incubating}} because the GryoMapper has a new ID (for the
new traverser), {{Path.addLabel()}} no longer exists (it's now
{{Path.extend(Set<String>)}} -- that was the nasty bug), and
{{Traverser.addLabels(Set<String>)}} was added (related to the nasty bug). When we
do {{Traverser.setPath()}}, I will probably get rid of
{{Traverser.addLabels()}}, but that is a later ticket for {{3.1.0-incubating}}.
> TraverserRequirement.LABELED_PATH
> ---------------------------------
>
> Key: TINKERPOP3-772
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-772
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.0.0-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.1.0-incubating
>
>
> Path computations are "all or nothing" right now. That is, if a step requires
> path data (history), then the complete history of the traverser is recorded.
> However, there are only a few steps that actually require full history. These
> are:
> * {{PathStep}}
> * {{SimplePath}}
> * {{CyclicPath}}
> All other steps that use path history require only those path steps that are
> labeled. For instance:
> * {{MatchStep}}
> * {{SelectStep}}
> * {{WhereStep}}
> * {{DedupGlobalStep}}
> * ...
> Thus, we should add a new {{TraverserRequirement}} called {{LABELED_PATH}}.
> This requirement says, "only Path.extend() if the step is labeled." This will
> increase the probability of bulking and will allow regions of a traversal
> that don't require paths to not generate that data (and thus, reduce the
> memory footprint).
> Another idea we should consider, which I believe would be added as a
> {{FinalizationStrategy}}, is {{DropPathStep}}. This would say: "after this
> point, the following labels are no longer needed so when the traverser comes
> here, delete the history for labels x, y, z..." This will allow us to prune
> path data to again, increase the probability of bulking.
> There is one odd duck step that {{LABELED_PATH}} should help, but doesn't:
> * {{OtherVertexStep}}
> There is no label, but we only need the path history of the last vertex. I
> suspect "hidden path labels" are what we need (and I have proposed them in another
> ticket for optimizing {{match()}}).
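The {{DropPathStep}} idea from the description can be illustrated with a small toy, again in plain Java and not tied to TinkerPop. The {{dropLabels()}} helper and the label names are hypothetical; the point is only that two labeled histories that differ in a no-longer-needed label become equal, and therefore bulkable, once that label is pruned.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class DropPathSketch {

    /** Returns a copy of the labeled history without the labels that are no longer needed. */
    static Map<String, Object> dropLabels(Map<String, Object> labeledHistory, Set<String> noLongerNeeded) {
        Map<String, Object> pruned = new LinkedHashMap<>(labeledHistory);
        pruned.keySet().removeAll(noLongerNeeded);
        return pruned;
    }

    /** Builds a made-up labeled history: label "x" varies, label "a" is always "song". */
    static Map<String, Object> history(Object xValue) {
        Map<String, Object> h = new LinkedHashMap<>();
        h.put("x", xValue);
        h.put("a", "song");
        return h;
    }

    public static void main(String[] args) {
        Map<String, Object> first = history(1);
        Map<String, Object> second = history(2);
        Set<String> drop = Collections.singleton("x");
        System.out.println("before pruning equal: " + first.equals(second));
        System.out.println("after pruning equal:  "
                + dropLabels(first, drop).equals(dropLabels(second, drop)));
    }
}
```

Before pruning the two histories differ in "x"; after dropping "x" they compare equal, which is the condition under which such traversers could be bulked.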