[
https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208050#comment-16208050
]
ASF GitHub Bot commented on TINKERPOP-1801:
-------------------------------------------
GitHub user artem-aliev opened a pull request:
https://github.com/apache/tinkerpop/pull/733
TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
This is a simple fix that does not change any API.
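The idea of charging iteration time to step metrics can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class `IterationTimer`, its methods, and the step labels used as map keys are hypothetical and are not TinkerPop API. It assumes that exactly one step is active per OLAP iteration and that the caller supplies the timestamps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not TinkerPop API): charges the wall-clock time that
 * elapses between two OLAP iteration boundaries to the step that was active
 * in that iteration, so that message-passing cost shows up in the metrics.
 */
final class IterationTimer {

    /** Accumulated duration per step label, in nanoseconds. */
    private final Map<String, Long> nanosByStep = new LinkedHashMap<>();

    /** Timestamp of the previous iteration boundary, in nanoseconds. */
    private long lastBoundary;

    IterationTimer(long startNanos) {
        this.lastBoundary = startNanos;
    }

    /** Adds the time since the previous boundary to stepLabel, then moves the boundary. */
    void endIteration(String stepLabel, long nowNanos) {
        nanosByStep.merge(stepLabel, nowNanos - lastBoundary, Long::sum);
        lastBoundary = nowNanos;
    }

    /** Returns the accumulated time for stepLabel in milliseconds (0 if unknown). */
    double millis(String stepLabel) {
        return nanosByStep.getOrDefault(stepLabel, 0L) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // In real use the timestamps would come from System.nanoTime();
        // fixed values are used here to keep the example deterministic.
        IterationTimer timer = new IterationTimer(0L);
        timer.endIteration("VertexStep(OUT,vertex)", 1_500_000_000L);
        timer.endIteration("VertexStep(OUT,edge)", 3_400_000_000L);
        System.out.println(timer.millis("VertexStep(OUT,vertex)"));
        System.out.println(timer.millis("VertexStep(OUT,edge)"));
    }
}
```

Because the time is taken between iteration boundaries rather than inside next()/hasNext(), whatever happens in between (for example an RDD join) is attributed to a step. Setup time before the first boundary is not captured in this sketch.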
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/733.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #733
----
commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev <artem.aliev@gmail,com>
Date: 2017-10-17T18:00:31Z
TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration
timings to step metrics
This is a simple fix that does not change any API.
----
> OLAP profile() step returns incorrect timing
> --------------------------------------------
>
> Key: TINKERPOP-1801
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
> Project: TinkerPop
> Issue Type: Bug
> Affects Versions: 3.3.0, 3.2.6
> Reporter: Artem Aliev
>
> ProfileStep measures the time spent in next()/hasNext() calls, assuming
> that these calls recurse through the steps of the traversal.
> GraphComputer, however, uses message passing (RDD joins on Spark).
> There, next() does not recursively call the next steps; it generates a
> message instead, and most of the time is spent in message passing (the RDD join).
> On a graph computer, therefore, the time between ProfileSteps should be
> measured, not the time inside them.
> An alternative approach is to collect Spark statistics with a SparkListener
> and add the Spark stage timings to the profiler metrics. That would work only
> for Spark, but would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to
> the profiler step.
> This does not take computer setup time into account, but is precise
> enough for long-running queries.
> To reproduce, in the TinkerPop 3.2.6 Gremlin console:
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                                                               Count  Traversers       Time (ms)    % Dur
> =============================================================================================================
> GraphStep(vertex,[])                                                 808         808           2.025    18.35
> VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
> VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
> CountGlobalStep                                                        1           1           0.001     0.01
>                                             >TOTAL                     -           -          11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)