[jira] [Commented] (TINKERPOP-1801) OLAP profile() step return incorrect timing

ASF GitHub Bot (JIRA) Tue, 17 Oct 2017 11:08:14 -0700

    [ 
https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208050#comment-16208050
 ]


ASF GitHub Bot commented on TINKERPOP-1801:
-------------------------------------------

GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/733

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics

    This is a simple fix that does not change any API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #733
    
----
commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev <artem.aliev@gmail,com>
Date:   2017-10-17T18:00:31Z

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    this is a simple fix that do not change any API

----


>  OLAP profile() step return incorrect timing
> --------------------------------------------
>
>                 Key: TINKERPOP-1801
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
>             Project: TinkerPop
>          Issue Type: Bug
>    Affects Versions: 3.3.0, 3.2.6
>            Reporter: Artem Aliev
>
> ProfileStep measures the time spent in next()/hasNext() calls and expects
> those calls to recurse into the other steps of the traversal.
> GraphComputer, however, uses message passing (RDD joins), so next() does not
> recursively call the following steps; it only generates a message. Most of
> the time is spent in message passing (the RDD join).
> On a graph computer, therefore, the time between ProfileSteps should be
> measured, not the time inside them.
> An alternative approach is to collect Spark statistics with a SparkListener
> and add the Spark stage timings to the profiler metrics. That would work only
> for Spark, but would give a better representation of step costs.
> The simple fix is to measure the time between OLAP iterations and add it to
> the profiler step.
> This does not take computer setup time into account, but is precise enough
> for long-running queries.
> To reproduce (TinkerPop 3.2.6, Gremlin Console):
> {code}
> plugin activated: tinkerpop.server
> plugin activated: tinkerpop.utilities
> plugin activated: tinkerpop.spark
> plugin activated: tinkerpop.tinkergraph
> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
> gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
> ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
> gremlin> g.V().out().out().count().profile()
> ==>Traversal Metrics
> Step                                                               Count  Traversers       Time (ms)    % Dur
> =============================================================================================================
> GraphStep(vertex,[])                                                 808         808           2.025    18.35
> VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
> VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
> CountGlobalStep                                                        1           1           0.001     0.01
>                                             >TOTAL                     -           -          11.038        -
> gremlin> clock(1){g.V().out().out().count().next() }
> ==>3421.92758
> gremlin>
> {code}
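
The difference between timing inside a step and timing whole OLAP iterations,
as described in the issue above, can be illustrated with a small standalone
sketch. This is not the code from the pull request and it uses no TinkerPop
API: the class IterationTimingSketch, the StepTimings and FakeClock helpers,
and the simulate() method are all hypothetical names chosen for illustration,
and a fake clock stands in for real message passing so the numbers are
deterministic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical illustration only; none of these names are TinkerPop API.
final class IterationTimingSketch {

    // Accumulated nanoseconds per step id (a stand-in for profiler step metrics).
    static final class StepTimings {
        private final Map<String, Long> nanos = new LinkedHashMap<>();

        void add(String stepId, long durationNanos) {
            nanos.merge(stepId, durationNanos, Long::sum);
        }

        long get(String stepId) {
            return nanos.getOrDefault(stepId, 0L);
        }
    }

    // Manually advanced clock, so the example does not depend on real timing.
    static final class FakeClock implements LongSupplier {
        private long now;

        void advance(long nanos) {
            now += nanos;
        }

        @Override
        public long getAsLong() {
            return now;
        }
    }

    // Simulates OLAP iterations in which the step's own next() work costs
    // stepNanos and message passing between iterations costs messageNanos.
    // Returns { time measured inside the step only, time measured per whole iteration }.
    static long[] simulate(int iterations, long stepNanos, long messageNanos) {
        FakeClock clock = new FakeClock();
        StepTimings insideOnly = new StepTimings();
        StepTimings wholeIteration = new StepTimings();

        for (int i = 0; i < iterations; i++) {
            long iterationStart = clock.getAsLong();

            // Work done inside next()/hasNext(): this is all that step-local timing sees.
            long stepStart = clock.getAsLong();
            clock.advance(stepNanos);
            insideOnly.add("step", clock.getAsLong() - stepStart);

            // Message passing (for example an RDD join) happens outside the step.
            clock.advance(messageNanos);

            // Timing the whole iteration also captures the message-passing cost.
            wholeIteration.add("step", clock.getAsLong() - iterationStart);
        }
        return new long[] { insideOnly.get("step"), wholeIteration.get("step") };
    }

    public static void main(String[] args) {
        long[] t = simulate(3, 2_000L, 1_000_000L);
        System.out.println("inside next() only: " + t[0] + " ns"); // 6000 ns
        System.out.println("whole iterations:   " + t[1] + " ns"); // 3006000 ns
    }
}
```

With these made-up costs, step-local timing reports only a small fraction of
the elapsed time, which mirrors the gap between the profile() total and the
clock() result in the reproduction above.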



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
