[ 
https://issues.apache.org/jira/browse/TINKERPOP-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197322#comment-15197322
 ] 

ASF GitHub Bot commented on TINKERPOP-1206:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/266

    TINKERPOP-1206 & TINKERPOP-1014: Bulk Traversers and Requirements 
Optimization

    https://issues.apache.org/jira/browse/TINKERPOP-1206
    https://issues.apache.org/jira/browse/TINKERPOP-1014
    
    Two tangentially related issues are solved in this PR. The core of Gremlin 
is `Step`. We have this notion of an `ExpandableStepIterator` which knows 
either to get local traversers or traversers from the previous step. We use to 
have it where if you `addStarts(Iterator<Traverser>)`, it would add them to 
their own `MultiIterator`. This was pointless cause we don't get the benefit of 
bulking. Not only do we simplify the logic of this fundamental class, but we 
also reduce the amount of objects it creates/maintains and we all increase the 
likelihood of bulking (which is the ultimate potential optimization). The 
consequence is that we no longer support `O_Traverser` as bulking is ALWAYS 
required. This is fine -- very few traversals benefited from `O_Traverser` and 
all they gained was no `long` bulk field (trivial). `O_Traverser` is now 
abstract.
    
    This got me into thinking about how we calculate requirements and I 
realized how insanely stupid we (me) were especially for `by()`-modulators 
where EVERY call to a local traversal (typically `by()`-traversals) were 
recalculating the requirements!!!! That is, a full walk through the traversal 
tree. I decided to make `TraverserGenerator` static to the root traversal and 
when it is needed (the first time), it calculates the requirements and locks in 
the generator. Likewise, for child traversals, they look to the root traversal 
for the generator and lock it in locally. Thus, all this notion of requirements 
being expensive just vanished. We should keep our "traverser species" as 
calculating requirements is now cheap.
    
    CHANGELOG
    
    ```
    * Optimized `ExpandableStepIterator` with simpler logic and increased the 
likelihood of bulking.
    * Optimized `TraverserRequirement` calculations.
    * `Step.processNextStart()` and `Step.next()` now return 
`Traverser.Admin<E>`. (*breaking*)
    * `Traversal.addTraverserRequirement()` method removed. (*breaking*)
    ```
    
    UPDATE
    
    ```
    Step API Update
    ^^^^^^^^^^^^^^^^
    
    The `Step` interface is fundamental to Gremlin. `Step.processNextStart()` 
and `Step.next()` both retuned `Traverser<E>`. We had so many 
`Traverser.asAdmin()` and direct typecast calls throughout (especially in 
`TraversalVertexProgram`) that it was deemed prudent to have 
`Step.processNextStart()` and `Step.next()` return `Traverser.Admin<E>`. 
Moreover it makes sense as this is internal logic where admins are always 
needed. Providers with their own step definitions will simply need to change 
their method signatures (no logic update is required -- save that `asAdmin()` 
can be safely removed if used). This is a trivial update to make.
    
    Traversal API Update
    ^^^^^^^^^^^^^^^^^^^^
    
    The way in which `TraverserRequirements` are calculated has been changed 
(for the better). The ramification is that post compilation requirement 
additions no longer make sense and should not be allowed. To enforce this, 
`Traversal.addTraverserRequirement()` method has been removed from the 
interface. Moreover, users should never be able to add requirements manually 
(this should all be inferred from the end compilation). However, if need be, 
there is always `RequirementStrategy` which will allow the user to add a 
requirement at strategy application time (though again, there should not be a 
reason to do so).
    ```
    
    VOTE +1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1206

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/266.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #266
    
----
commit 5b4009d6418e818066d72c474123df1fbc081478
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-16T02:49:55Z

    I've redesigned how TraverserRequirements are accessed. This GREATLY 
reduces the amount of logic required for all by()-modulators, StartSteps, 
GraphSteps, and barriers. I'm talking CRAZY more efficient when by()-modulators 
are used. I've made it so the Step API works with Traverser.Admin. We had too 
many .asAdmins() everywhere and typecasting everywhere -- no bueno. So much 
more clean now. For providers of Steps, breaking -- but its simply just 
changing your processNextStart() signature to return a Traverser.Admin. The 
amount of logic removed in this push is insane. This covers TINKERPOP-1206 and 
partially TINKERPOP-1014. The model I used for getTraverserGenerator() is 
really smart and we might want to use it for getSideEffects() and 
getStrategies(). In short, lazy computation of these data structures as they 
are rarely globally needed. This also means we can pretty signifanctly reduce 
the size of a Traversal serialization which is good for RemoteGraph. Finally 
found a major bug in GryoMapper -- registered XXXGeneartor, not XXX (for one 
species). Sucky.

commit 90873e3cd640f45fd8872a0c19406bdfd1f3dc2a
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-16T13:13:32Z

    make the requirements unmodifiable for safety.

----


> ExpandableIterator can take a full TraverserSet at once -- Barriers.
> --------------------------------------------------------------------
>
>                 Key: TINKERPOP-1206
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1206
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>
> I haven't looked at {{ExpandableIterator}} in over a year. Its one of the 
> most fundamental structures of a Gremlin traversal. I just realized it can 
> take an entire {{TraverserSet}}. As such, if the previous step is a 
> {{Barrier}}, don't iterate the barrier out, simply "dump it" into the current 
> steps {{ExpandableIterator}}. That should speed up things significantly -- 
> though there are not that many barrier steps... but still.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to