[ 
https://issues.apache.org/jira/browse/TINKERPOP3-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marko A. Rodriguez closed TINKERPOP3-772.
-----------------------------------------
    Resolution: Fixed

So this works. It also uncovered a pretty nasty bug around {{Path.addLabel()}}, 
which has since been fixed as well. With 
{{TraverserRequirement.LABELED_PATH}}, the following traversal is now fast 
because it uses bulking.

{code}
g.V().repeat(out()).times(5).as("a").out("writtenBy").as("b").select("a","b").by("name").count()
{code}

Moreover, the likelihood of bulking is greatly increased: if two traversers 
have the same value for a particular labeled step (not their entire path 
history!), then they will be bulked. We can start to think about 
{{LazyBarrierStrategy}} working its magic in {{MatchStep}} -- crazy.
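
To make the bulking argument concrete, here is a toy sketch in plain Java. It is not TinkerPop code: the class, the method names, and the vertex ids are all made up for illustration, and a "walk" is just the list of vertices one traverser visited. It only shows that keying traversers on their labeled positions merges more of them than keying on the full history.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: every name here is hypothetical, none of it is TinkerPop code.
final class LabeledPathBulkingSketch {

    // Each walk lists the vertices one traverser visited; the last entry is
    // where it currently sits. Walks are assumed to be non-empty.
    static List<List<String>> demoWalks() {
        return Arrays.asList(
                Arrays.asList("v1", "v3", "v6"),
                Arrays.asList("v2", "v3", "v6"),
                Arrays.asList("v2", "v4", "v6"));
    }

    // Merge traversers that share a current location and the same values at
    // the retained path positions. The map value is the resulting bulk.
    static Map<List<String>, Long> bulk(List<List<String>> walks, Set<Integer> retained) {
        Map<List<String>, Long> bulks = new LinkedHashMap<>();
        for (List<String> walk : walks) {
            List<String> key = new ArrayList<>();
            for (int position : new TreeSet<>(retained)) {
                if (position >= 0 && position < walk.size()) {
                    key.add(walk.get(position));
                }
            }
            key.add(walk.get(walk.size() - 1)); // current location
            bulks.merge(key, 1L, Long::sum);
        }
        return bulks;
    }

    public static void main(String[] args) {
        Set<Integer> fullHistory = new TreeSet<>(Arrays.asList(0, 1, 2));
        Set<Integer> labeledOnly = new TreeSet<>(Arrays.asList(1)); // only position 1 is labeled
        System.out.println(bulk(demoWalks(), fullHistory).size()); // 3: nothing merges
        System.out.println(bulk(demoWalks(), labeledOnly).size()); // 2: the two v3 walks merge
    }
}
```

With the full history retained, all three walks stay distinct. With only the labeled position retained, the two walks that passed through v3 collapse into one traverser with a bulk of 2.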

We still need to optimize the {{ImmutablePath.hashCode()}} and 
{{ImmutablePath.equals()}} methods to make this faster. Right now, a traverser 
that carries a path is roughly 3.5 to 4 times slower than a path-less 
traverser.

{code}
[1099.4620842999998, 547143359] // w/o select()
[3884.4588014, 547143359] // w/ select
{code}
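
One generic way to make {{hashCode()}} and {{equals()}} cheaper on an immutable, linked structure is to compute the hash once at construction and use it as a fast negative check in {{equals()}}. The sketch below is only that general technique under my own assumptions; {{CachedHashPathSketch}} is a hypothetical class and says nothing about how {{ImmutablePath}} is or will be implemented.

```java
import java.util.Objects;

// Hypothetical immutable linked path, used only to illustrate hash caching.
final class CachedHashPathSketch {
    private final CachedHashPathSketch previous; // null for the first element
    private final Object value;
    private final int hash;                      // computed once, never recomputed

    private CachedHashPathSketch(CachedHashPathSketch previous, Object value) {
        this.previous = previous;
        this.value = value;
        // Reuse the parent's cached hash so each extension is O(1).
        this.hash = 31 * (previous == null ? 1 : previous.hash) + Objects.hashCode(value);
    }

    static CachedHashPathSketch of(Object value) {
        return new CachedHashPathSketch(null, value);
    }

    CachedHashPathSketch extend(Object next) {
        return new CachedHashPathSketch(this, next);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CachedHashPathSketch)) return false;
        CachedHashPathSketch a = this;
        CachedHashPathSketch b = (CachedHashPathSketch) other;
        if (a.hash != b.hash) return false;      // cheap rejection before walking
        while (a != null && b != null) {
            if (a == b) return true;             // shared tail, the rest is identical
            if (!Objects.equals(a.value, b.value)) return false;
            a = a.previous;
            b = b.previous;
        }
        return a == null && b == null;           // equal only if both ended together
    }
}
```

Because the structure is immutable, the cached hash can never go stale, and two paths that share a tail can stop comparing as soon as they reach the shared node.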

This is in {{3.1.0-incubating}} because the GryoMapper has a new ID (for the 
new traverser), {{Path.addLabel()}} no longer exists (it's now 
{{Path.extend(Set<String>)}} -- that was the nasty bug), and 
{{Traverser.addLabels(Set<String>)}} was added (related to the nasty bug). 
When we do {{Traverser.setPath()}}, I will probably get rid of 
{{Traverser.addLabels()}}, but that is a later ticket for {{3.1.0-incubating}}.

> TraverserRequirement.LABELED_PATH
> ---------------------------------
>
>                 Key: TINKERPOP3-772
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-772
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.0.0-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>             Fix For: 3.1.0-incubating
>
>
> Path computations are "all or nothing" right now. That is, if a step requires 
> path data (history), then the complete history of the traverser is recorded. 
> However, there are only a few steps that actually require full history. These 
> are:
>   * {{PathStep}}
>   * {{SimplePath}}
>   * {{CyclicPath}}
> All other steps that use path history require only those path steps that are 
> labeled. For instance:
>   * {{MatchStep}}
>   * {{SelectStep}}
>   * {{WhereStep}}
>   * {{DedupGlobalStep}}
>   * ...
> Thus, we should add a new {{TraverserRequirement}} called {{LABELED_PATH}}. 
> This requirement says, "only Path.extend() if the step is labeled." This will 
> increase the probability of bulking and will allow regions of a traversal 
> that don't require paths to not generate that data (and thus, reduce the 
> memory footprint).
> Another idea we should consider, which I believe would be added as a 
> {{FinalizationStrategy}}, is {{DropPathStep}}. This would say: "after this 
> point, the following labels are no longer needed so when the traverser comes 
> here, delete the history for labels x, y, z..." This will allow us to prune 
> path data and, again, increase the probability of bulking.
> There is one odd duck step that {{LABELED_PATH}} should help, but doesn't:
>   * {{OtherVertexStep}}
> There is no label, but we only need the path history of the last vertex. I 
> suspect "hidden path labels" is what we need (and have proposed in another 
> ticket for optimizing {{match()}}).
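
The {{DropPathStep}} idea from the description above can be pictured with another toy sketch. Nothing here is TinkerPop API: {{DropPathLabelsSketch}}, its methods, and the labels x and y are hypothetical, a labeled path is modeled as a plain label-to-value map, and all traversers are assumed to sit at the same location. It only shows that deleting labels that later steps no longer need leaves fewer distinct traversers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: hypothetical names, not a TinkerPop strategy or step.
final class DropPathLabelsSketch {

    // Build a labeled path with two labels, x and y.
    static Map<String, String> path(String x, String y) {
        Map<String, String> labeled = new TreeMap<>();
        labeled.put("x", x);
        labeled.put("y", y);
        return labeled;
    }

    static List<Map<String, String>> demoPaths() {
        return Arrays.asList(path("v1", "v5"), path("v2", "v5"), path("v2", "v6"));
    }

    // Copy the labeled path without the labels that are no longer needed.
    static Map<String, String> dropLabels(Map<String, String> labeled, Set<String> unneeded) {
        Map<String, String> pruned = new TreeMap<>(labeled);
        pruned.keySet().removeAll(unneeded);
        return pruned;
    }

    // Count how many distinct traversers remain once the labels are dropped.
    static int distinctAfterDrop(List<Map<String, String>> paths, Set<String> unneeded) {
        Set<Map<String, String>> distinct = new HashSet<>();
        for (Map<String, String> labeled : paths) {
            distinct.add(dropLabels(labeled, unneeded));
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctAfterDrop(demoPaths(), Collections.<String>emptySet())); // 3
        System.out.println(distinctAfterDrop(demoPaths(), Collections.singleton("x")));     // 2
    }
}
```

Dropping label x makes the first two paths identical, so they could be bulked; the input maps are copied rather than mutated, so the original paths are left intact.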



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
