[
https://issues.apache.org/jira/browse/TINKERPOP-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945483#comment-15945483
]
ASF GitHub Bot commented on TINKERPOP-1642:
-------------------------------------------
GitHub user spmallette opened a pull request:
https://github.com/apache/tinkerpop/pull/587
TINKERPOP-1642 Performance Enhancement on Mutating Traversal Execution
https://issues.apache.org/jira/browse/TINKERPOP-1642
This is a focused effort to improve the performance of traversals that
mutate the graph. All improvements were made on gremlin-core and should
therefore be helpful to all `Graph` implementations. Here's the benchmarks at
3.2.4:
```text
Benchmark Mode Cnt Score
Error Units
GraphMutateBenchmark.testAddE thrpt 20 20010.155
± 621.783 ops/s
GraphMutateBenchmark.testAddEdge thrpt 20 3609684.997
± 195545.641 ops/s
GraphMutateBenchmark.testAddV thrpt 20 288621.370
± 12576.655 ops/s
GraphMutateBenchmark.testAddVAddEWithPropsChained thrpt 20 5.402
± 0.084 ops/s
GraphMutateBenchmark.testAddVWithProps thrpt 20 21924.945
± 230.668 ops/s
GraphMutateBenchmark.testAddVWithPropsChained thrpt 20 184.896
± 4.875 ops/s
GraphMutateBenchmark.testAddVertex thrpt 20 5830447.734
± 131727.984 ops/s
GraphMutateBenchmark.testAddVertexWithProps thrpt 20 163131.738
± 2647.117 ops/s
GraphMutateBenchmark.testEdgeProperty thrpt 20 8971499.256
± 1630382.353 ops/s
GraphMutateBenchmark.testEdgePropertyStep thrpt 20 125053.249
± 4770.478 ops/s
GraphMutateBenchmark.testVertexProperty thrpt 20 3741129.376
± 286173.817 ops/s
GraphMutateBenchmark.testVertexPropertyStep thrpt 20 121742.106
± 4339.508 ops/s
```
and here's the benchmarks now in this PR which will apply to 3.2.5 and
3.3.0:
```text
Benchmark Mode Cnt Score
Error Units
GraphMutateBenchmark.testAddE thrpt 20 26921.570
± 241.374 ops/s
GraphMutateBenchmark.testAddEdge thrpt 20 3567550.566
± 185748.638 ops/s
GraphMutateBenchmark.testAddV thrpt 20 341523.214
± 10678.074 ops/s
GraphMutateBenchmark.testAddVAddEWithPropsChained thrpt 20 462.173
± 46.842 ops/s
GraphMutateBenchmark.testAddVWithProps thrpt 20 46946.475
± 508.174 ops/s
GraphMutateBenchmark.testAddVWithPropsChained thrpt 20 430.650
± 5.941 ops/s
GraphMutateBenchmark.testAddVertex thrpt 20 6362284.245
± 294203.426 ops/s
GraphMutateBenchmark.testAddVertexWithProps thrpt 20 165223.769
± 4788.481 ops/s
GraphMutateBenchmark.testEdgeProperty thrpt 20 8490043.427
± 575562.362 ops/s
GraphMutateBenchmark.testEdgePropertyStep thrpt 20 142430.605
± 3452.101 ops/s
GraphMutateBenchmark.testVertexProperty thrpt 20 4016719.775
± 587856.207 ops/s
GraphMutateBenchmark.testVertexPropertyStep thrpt 20 131740.946
± 1525.710 ops/s
```
The key things to note:
* A simple `g.addV()` got about 20% faster
* A simple `g.addE()` got about 34% faster
* A long chained set of `addV()` where multiple vertices were added in a
single traversal became a bit more than twice as fast.
* A long chained set of `addV()` and `addE()` where multiple vertices/edges
were added in a single traversal became 92 times faster
* These changes did not hurt performance of the "non-mutating" traversals -
those benchmarks remained about the same.
Obviously, these numbers still lag where we want to be, but the changes in
place represent a good body of work at this point and have opened up other
ideas for improvements which can be dealt with as other specific issues/pull
requests.
VOTE +1
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/tinkerpop TINKERPOP-1642
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/587.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #587
----
commit 4ffd19e0bba9408aa648521b4b94d1997055a796
Author: Stephen Mallette <[email protected]>
Date: 2017-03-02T15:44:58Z
TINKERPOP-1642 Improved performance of addV() and chained mutating
statements
Added more tests for Parameters. Performance improved by about 2.5x given
the benchmarks. Also long chained mutating traversals are now inserting roughly
on par with single insert traversals (prior to this change they were about 25%
slower).
commit c8cc470e38b6f2ef3f87fa6cfebb59a92e6301d2
Author: Stephen Mallette <[email protected]>
Date: 2017-03-03T11:47:56Z
TINKERPOP-1642 Cached the Traversal list in Parameters
This leads to a pretty good boost in performance from the prior commit
especially for long chained mutating traversals that add vertices and edges as
there is a 3X improvement in those cases.
commit 632fd219cd4452486dd45fb3a748f897fd751904
Author: Stephen Mallette <[email protected]>
Date: 2017-03-03T13:45:24Z
TINKERPOP-1642 Minor fixes
commit 0406a92d2530b2486bc81ca6d6ab4eacd0094a9a
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-03T16:27:58Z
Added ScopingStrategy which does a single TraversalHelper.getLabels() call
and recurssively tells where() and select() steps the path labels accessed by
the Traversal.
commit 9f7b7591a3ea0782b9389061d74db2586ff90430
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-06T15:07:39Z
Add a slight optimization to ScopingStrategy that will not compute step
labels if no Scoping steps are defined. Also added JavaDoc and comments
accordingly.
commit 2549650f62148c21f1217af629b1ef04fdd9bbe7
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-08T14:37:07Z
Wow. Can't believe we didn't do this from the start. LABELED_PATH is set if
a Step is labeled. Dar. So much simpler than all the recurssion in
TraversalHelper.getLabels() and having a ScopingStrategy.... @spmallette -- can
you verify that your Mutating traversal profiling is faster now.
commit 9185f45273cefca3a0b622e87e1eab825dea0ffe
Author: Stephen Mallette <[email protected]>
Date: 2017-03-10T17:31:38Z
TINKERPOP-1642 Removed some extra iteration in Parameters.getTraversals()
commit 4838f7b0eb0f2f97b3dcad18a72295dc655dd463
Author: Stephen Mallette <[email protected]>
Date: 2017-03-10T18:27:01Z
TINKERPOP-1642 Minor refactoring to get rid of duplicate code
commit 9e4942cf83e5add39838605641112438b093dbd7
Author: Stephen Mallette <[email protected]>
Date: 2017-03-10T20:11:52Z
TINKERPOP-1642 Made Mutating steps implement Scoping.
This simplified PathUtil.getReferencedLabels() and reduced iterations over
Parameter traversals.
commit d4467512345b39dd0a8f89309d26039b0d5e96f8
Author: Daniel Kuppitz <[email protected]>
Date: 2017-03-13T15:01:12Z
Fixed VertexProgram test that verifies proper TraversalRequirements.
commit 9893e49ae03cc1a474a6b8085d0a4e3d5eff3b36
Author: Stephen Mallette <[email protected]>
Date: 2017-03-13T18:50:09Z
TINKERPOP-1642 Pushed integrateChild() operations into Parameters.set()
There was no need to continually iterate the traversal list and doing
integrateChild() over and over again for each newly added traversal. Removing
those wasteful cycles and only doing that once, when the Traversal was added
helped improve performance.
commit d39ef5caf0e1fc51767202b9bf28a12117a51a40
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-17T17:57:52Z
I made it so ProfileStrategy uses the MARKER model developed by @dkuppitz
and myself to reduce recurssive lookups in strategies. Moving forward, we
really need to move to a Traversal.metadata model as using labeled steps is a
bit hackish.
commit 3929a6748e3b54692c0d103319965dd0d47ba3e1
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-17T19:04:58Z
added MARKER model to PathRetractionStrategy to reduce recurssive lookups
of invalidating steps. Again, we really need Traversal.metdata to make this
more 'pure'.
commit 0824625f4db8e0fdc483183faf8d505f3300b6b4
Author: Stephen Mallette <[email protected]>
Date: 2017-03-27T18:03:26Z
TINKERPOP-1642 Fixed up some execution problems after rebase
The logic still doesn't seem quite right but at least the tests pass and a
complete build is achieved. The issue is related to new stuff added in
PathRetractionStrategy on TINKERPOP-1652.
commit fc3fcb38255aa49bd4a311172d0d8b84e31dfdb9
Author: Marko A. Rodriguez <[email protected]>
Date: 2017-03-28T13:46:31Z
removed a repeated recurssion in PathRetractionStrategy using the MARKER
model of strategies.
----
> Improve performance of mutating traversals
> ------------------------------------------
>
> Key: TINKERPOP-1642
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1642
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.2.4
> Reporter: stephen mallette
> Assignee: stephen mallette
> Priority: Critical
>
> Make an attempt improve performance of mutating traversals. Some general
> goals for this ticket should be to:
> 1. Improve performance parity of structure and process methods for modifying
> the graph - structure is pretty far ahead.
> 2. Improve the speed of large chained mutating traversals - see
> http://stackoverflow.com/q/41926409/1831717
> 3. Determine ways to make these performance improvements in {{gremlin-core}}
> so that they may benefit all {{Graph}} implementations
> 4. Improve the speed of TinkerGraph specifically if possible as it is used
> for subgraphing functions in many cases and ends up being useful regardless
> of the graph implementation ultimately used.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)