Github user artem-aliev commented on the issue:
https://github.com/apache/tinkerpop/pull/734
The fix adds the iteration time to the appropriate step. The iteration time is the
time between the workerIterationStart() and workerIterationEnd() callbacks.
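As a rough illustration of the idea (a hypothetical sketch, not the actual patch; the class and method names below are invented), the elapsed time between the two callbacks is accumulated against the step that ran during that iteration:
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the timing idea: record a timestamp when a worker
// iteration starts and charge the elapsed time to the step that was active
// during that iteration. All names here are invented for illustration.
public class IterationTimer {

    private long iterationStartNanos;                            // set on workerIterationStart()
    private final Map<String, Long> stepNanos = new HashMap<>(); // step id -> accumulated nanos

    // Invoked from the workerIterationStart() callback.
    public void onIterationStart() {
        this.iterationStartNanos = System.nanoTime();
    }

    // Invoked from the workerIterationEnd() callback; 'stepId' identifies
    // the traversal step executed during this iteration.
    public void onIterationEnd(final String stepId) {
        final long elapsed = System.nanoTime() - this.iterationStartNanos;
        this.stepNanos.merge(stepId, elapsed, Long::sum);
    }

    // Accumulated time for one step, in milliseconds (as shown by profile()).
    public double timeMillis(final String stepId) {
        return this.stepNanos.getOrDefault(stepId, 0L) / 1_000_000.0;
    }
}
```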
So before the fix the timing looked like this:
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
GraphStep(vertex,[])                                                9962        9962          70.873    48.95
VertexStep(OUT,vertex)                                           1012657        3745          37.132    25.65
VertexStep(OUT,edge)                                             2101815        6192          36.751    25.39
CountGlobalStep                                                        1           1           0.018     0.01
                                            >TOTAL                     -           -         144.775        -
```
while the query actually runs for tens of seconds.
After the fix:
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
GraphStep(vertex,[])                                                9962        9962       14186.809    97.43
VertexStep(OUT,vertex)                                           1012657        3745         340.051     2.34
VertexStep(OUT,edge)                                             2101815        6192          33.684     0.23
CountGlobalStep                                                        1           1           0.004     0.00
                                            >TOTAL                     -           -       14560.549        -
```
That shows that most of the time for this OLAP query was spent in the
initial iteration (actually, star graph creation).
There could still be some inconsistencies because:
1. No computer-specific setup time is measured.
2. Spark does a lot of lazy evaluation, so most of the Spark RDD setup is
counted toward the first step (see the sketch after this list).
3. The algorithm could fail to assign timing to the right step in the case of
sophisticated queries.
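To illustrate point 2, here is a minimal, hypothetical sketch (not from the patch; the `local[*]` master and the toy pipeline are assumptions) of how Spark's lazy evaluation shifts setup cost onto the first action:
```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Sketch of why lazy evaluation can skew per-step timing: the map() below
// does no work when it is declared; everything runs inside the first action,
// count(), so a profiler that times the count() call also sees the setup
// cost of every earlier transformation.
public class LazyRddExample {
    public static void main(String[] args) {
        try (JavaSparkContext sc = new JavaSparkContext("local[*]", "lazy-demo")) {
            JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3, 4));
            // Declared here, but not executed yet: transformations are lazy.
            JavaRDD<Integer> doubled = numbers.map(x -> x * 2);
            // The first action triggers the whole pipeline, so its wall-clock
            // time includes the map() above plus the RDD setup.
            long n = doubled.count();
            System.out.println(n);
        }
    }
}
```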
By the way, the new timing is pretty close to the wall-clock time.
---