[jira] [Commented] (TINKERPOP-1642) Improve performance of mutating traversals

ASF GitHub Bot (JIRA) Tue, 28 Mar 2017 09:33:01 -0700

    [ 
https://issues.apache.org/jira/browse/TINKERPOP-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945483#comment-15945483
 ]


ASF GitHub Bot commented on TINKERPOP-1642:
-------------------------------------------

GitHub user spmallette opened a pull request:

    https://github.com/apache/tinkerpop/pull/587

    TINKERPOP-1642 Performance Enhancement on Mutating Traversal Execution

    https://issues.apache.org/jira/browse/TINKERPOP-1642
    
    This is a focused effort to improve the performance of traversals that 
mutate the graph.  All improvements were made on gremlin-core and should 
therefore be helpful to all `Graph` implementations. Here's the benchmarks at 
3.2.4:
    
    ```text
    Benchmark                                           Mode  Cnt        Score  
       Error  Units
    GraphMutateBenchmark.testAddE                      thrpt   20    20010.155 
±     621.783  ops/s
    GraphMutateBenchmark.testAddEdge                   thrpt   20  3609684.997 
±  195545.641  ops/s
    GraphMutateBenchmark.testAddV                      thrpt   20   288621.370 
±   12576.655  ops/s
    GraphMutateBenchmark.testAddVAddEWithPropsChained  thrpt   20        5.402 
±       0.084  ops/s
    GraphMutateBenchmark.testAddVWithProps             thrpt   20    21924.945 
±     230.668  ops/s
    GraphMutateBenchmark.testAddVWithPropsChained      thrpt   20      184.896 
±       4.875  ops/s
    GraphMutateBenchmark.testAddVertex                 thrpt   20  5830447.734 
±  131727.984  ops/s
    GraphMutateBenchmark.testAddVertexWithProps        thrpt   20   163131.738 
±    2647.117  ops/s
    GraphMutateBenchmark.testEdgeProperty              thrpt   20  8971499.256 
± 1630382.353  ops/s
    GraphMutateBenchmark.testEdgePropertyStep          thrpt   20   125053.249 
±    4770.478  ops/s
    GraphMutateBenchmark.testVertexProperty            thrpt   20  3741129.376 
±  286173.817  ops/s
    GraphMutateBenchmark.testVertexPropertyStep        thrpt   20   121742.106 
±    4339.508  ops/s
    ```
    
    and here's the benchmarks now in this PR which will apply to 3.2.5 and 
3.3.0:
    
    ```text
    Benchmark                                           Mode  Cnt        Score  
      Error  Units
    GraphMutateBenchmark.testAddE                      thrpt   20    26921.570 
±    241.374  ops/s
    GraphMutateBenchmark.testAddEdge                   thrpt   20  3567550.566 
± 185748.638  ops/s
    GraphMutateBenchmark.testAddV                      thrpt   20   341523.214 
±  10678.074  ops/s
    GraphMutateBenchmark.testAddVAddEWithPropsChained  thrpt   20      462.173 
±     46.842  ops/s
    GraphMutateBenchmark.testAddVWithProps             thrpt   20    46946.475 
±    508.174  ops/s
    GraphMutateBenchmark.testAddVWithPropsChained      thrpt   20      430.650 
±      5.941  ops/s
    GraphMutateBenchmark.testAddVertex                 thrpt   20  6362284.245 
± 294203.426  ops/s
    GraphMutateBenchmark.testAddVertexWithProps        thrpt   20   165223.769 
±   4788.481  ops/s
    GraphMutateBenchmark.testEdgeProperty              thrpt   20  8490043.427 
± 575562.362  ops/s
    GraphMutateBenchmark.testEdgePropertyStep          thrpt   20   142430.605 
±   3452.101  ops/s
    GraphMutateBenchmark.testVertexProperty            thrpt   20  4016719.775 
± 587856.207  ops/s
    GraphMutateBenchmark.testVertexPropertyStep        thrpt   20   131740.946 
±   1525.710  ops/s
    ```
    
    The key things to note:
    
    * A simple `g.addV()` got about 20% faster 
    * A simple `g.addE()` got about 34% faster
    * A long chained set of `addV()` where multiple vertices were added in a 
single traversal became a bit more than twice as fast.
    * A long chained set of `addV()` and `addE()` where multiple vertices/edges 
were added in a single traversal became 92 times faster
    * These changes did not hurt performance of the "non-mutating" traversals - 
those benchmarks remained about the same.
    
    Obviously, these numbers still lag where we want to be, but the changes in 
place represent a good body of work at this point and have opened up other 
ideas for improvements which can be dealt with as other specific issues/pull 
requests.
    
    VOTE +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tinkerpop TINKERPOP-1642

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/587.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #587
    
----
commit 4ffd19e0bba9408aa648521b4b94d1997055a796
Author: Stephen Mallette <[email protected]>
Date:   2017-03-02T15:44:58Z

    TINKERPOP-1642 Improved performance of addV() and chained mutating 
statements
    
    Added more tests for Parameters. Performance improved by about 2.5x given 
the benchmarks. Also long chained mutating traversals are now inserting roughly 
on par with single insert traversals (prior to this change they were about 25% 
slower).

commit c8cc470e38b6f2ef3f87fa6cfebb59a92e6301d2
Author: Stephen Mallette <[email protected]>
Date:   2017-03-03T11:47:56Z

    TINKERPOP-1642 Cached the Traversal list in Parameters
    
    This leads to a pretty good boost in performance from the prior commit 
especially for long chained mutating traversals that add vertices and edges as 
there is a 3X improvement in those cases.

commit 632fd219cd4452486dd45fb3a748f897fd751904
Author: Stephen Mallette <[email protected]>
Date:   2017-03-03T13:45:24Z

    TINKERPOP-1642 Minor fixes

commit 0406a92d2530b2486bc81ca6d6ab4eacd0094a9a
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-03T16:27:58Z

    Added ScopingStrategy which does a single TraversalHelper.getLabels() call 
and recurssively tells where() and select() steps the path labels accessed by 
the Traversal.

commit 9f7b7591a3ea0782b9389061d74db2586ff90430
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-06T15:07:39Z

    Add a slight optimization to ScopingStrategy that will not compute step 
labels if no Scoping steps are defined. Also added JavaDoc and comments 
accordingly.

commit 2549650f62148c21f1217af629b1ef04fdd9bbe7
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-08T14:37:07Z

    Wow. Can't believe we didn't do this from the start. LABELED_PATH is set if 
a Step is labeled. Dar. So much simpler than all the recurssion in 
TraversalHelper.getLabels() and having a ScopingStrategy.... @spmallette -- can 
you verify that your Mutating traversal profiling is faster now.

commit 9185f45273cefca3a0b622e87e1eab825dea0ffe
Author: Stephen Mallette <[email protected]>
Date:   2017-03-10T17:31:38Z

    TINKERPOP-1642 Removed some extra iteration in Parameters.getTraversals()

commit 4838f7b0eb0f2f97b3dcad18a72295dc655dd463
Author: Stephen Mallette <[email protected]>
Date:   2017-03-10T18:27:01Z

    TINKERPOP-1642 Minor refactoring to get rid of duplicate code

commit 9e4942cf83e5add39838605641112438b093dbd7
Author: Stephen Mallette <[email protected]>
Date:   2017-03-10T20:11:52Z

    TINKERPOP-1642 Made Mutating steps implement Scoping.
    
    This simplified PathUtil.getReferencedLabels() and reduced iterations over 
Parameter traversals.

commit d4467512345b39dd0a8f89309d26039b0d5e96f8
Author: Daniel Kuppitz <[email protected]>
Date:   2017-03-13T15:01:12Z

    Fixed VertexProgram test that verifies proper TraversalRequirements.

commit 9893e49ae03cc1a474a6b8085d0a4e3d5eff3b36
Author: Stephen Mallette <[email protected]>
Date:   2017-03-13T18:50:09Z

    TINKERPOP-1642 Pushed integrateChild() operations into Parameters.set()
    
    There was no need to continually iterate the traversal list and doing 
integrateChild() over and over again for each newly added traversal. Removing 
those wasteful cycles and only doing that once, when the Traversal was added 
helped improve performance.

commit d39ef5caf0e1fc51767202b9bf28a12117a51a40
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-17T17:57:52Z

    I made it so ProfileStrategy uses the MARKER model developed by @dkuppitz 
and myself to reduce recurssive lookups in strategies. Moving forward, we 
really need to move to a Traversal.metadata model as using labeled steps is a 
bit hackish.

commit 3929a6748e3b54692c0d103319965dd0d47ba3e1
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-17T19:04:58Z

    added MARKER model to PathRetractionStrategy to reduce recurssive lookups 
of invalidating steps. Again, we really need Traversal.metdata to make this 
more 'pure'.

commit 0824625f4db8e0fdc483183faf8d505f3300b6b4
Author: Stephen Mallette <[email protected]>
Date:   2017-03-27T18:03:26Z

    TINKERPOP-1642 Fixed up some execution problems after rebase
    
    The logic still doesn't seem quite right but at least the tests pass and a 
complete build is achieved. The issue is related to new stuff added in 
PathRetractionStrategy on TINKERPOP-1652.

commit fc3fcb38255aa49bd4a311172d0d8b84e31dfdb9
Author: Marko A. Rodriguez <[email protected]>
Date:   2017-03-28T13:46:31Z

    removed a repeated recurssion in PathRetractionStrategy using the MARKER 
model of strategies.

----


> Improve performance of mutating traversals
> ------------------------------------------
>
>                 Key: TINKERPOP-1642
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1642
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.2.4
>            Reporter: stephen mallette
>            Assignee: stephen mallette
>            Priority: Critical
>
> Make an attempt improve performance of mutating traversals. Some general 
> goals for this ticket should be to:
> 1. Improve performance parity of structure and process methods for modifying 
> the graph - structure is pretty far ahead.
> 2. Improve the speed of large chained mutating traversals - see 
> http://stackoverflow.com/q/41926409/1831717
> 3. Determine ways to make these performance improvements in {{gremlin-core}} 
> so that they may benefit all {{Graph}} implementations
> 4. Improve the speed of TinkerGraph specifically if possible as it is used 
> for subgraphing functions in many cases and ends up being useful regardless 
> of the graph implementation ultimately used.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (TINKERPOP-1642) Improve performance of mutating traversals

Reply via email to