[ 
https://issues.apache.org/jira/browse/TINKERPOP-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326941#comment-16326941
 ] 

Artem Aliev commented on TINKERPOP-1870:
----------------------------------------

I wrapped the block in a findVertexTraverser() method to see its timing in the 
profiler. See the attached profiler screenshots.

It takes 20-30% of the execution time in a single 6-core executor. Performance 
improved noticeably on my 10k-vertex graph (mean runtime over the three runs 
below dropped by roughly 27%):

Before fix:
{code}

gremlin> g.V().count()
==>10000
gremlin> g.E().count()
==>160000

gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>52349.640981
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>53800.898754999995
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>50643.744645

{code}

After fix:
{code}

gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>42062.945477
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>38419.463171999996
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>34336.707208

{code}

{code}
>mvn clean install
[INFO] BUILD SUCCESS
{code}
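
For illustration, here is a rough sketch of the access pattern described in 
the issue below and of one common way to avoid it. It is not the actual patch: 
the class TraverserLookupSketch, the RemoteTraverser stand-in and the 
takeByScan / indexByVertex / takeByIndex helpers are all hypothetical names, 
and grouping by vertex id is only an assumption about how such a scan could be 
replaced, not a description of what WorkerExecutor does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class TraverserLookupSketch {

    /** Hypothetical stand-in for a remote traverser; only the target vertex id matters here. */
    static final class RemoteTraverser {
        final int vertexId;

        RemoteTraverser(int vertexId) {
            this.vertexId = vertexId;
        }
    }

    /**
     * Scan variant: walks the whole shared list under a lock and removes the
     * traversers that belong to one vertex. Calling this once per vertex gives
     * roughly n * n work, all of it serialized by the lock.
     */
    static List<RemoteTraverser> takeByScan(List<RemoteTraverser> shared, int vertexId) {
        List<RemoteTraverser> found = new ArrayList<>();
        synchronized (shared) {
            Iterator<RemoteTraverser> it = shared.iterator();
            while (it.hasNext()) {
                RemoteTraverser t = it.next();
                if (t.vertexId == vertexId) {
                    found.add(t);
                    it.remove();
                }
            }
        }
        return found;
    }

    /** Index variant, step 1: group the traversers by vertex id in a single pass. */
    static Map<Integer, List<RemoteTraverser>> indexByVertex(List<RemoteTraverser> shared) {
        Map<Integer, List<RemoteTraverser>> index = new HashMap<>();
        for (RemoteTraverser t : shared) {
            index.computeIfAbsent(t.vertexId, k -> new ArrayList<>()).add(t);
        }
        return index;
    }

    /** Index variant, step 2: fetch and remove the traversers of one vertex with one map operation. */
    static List<RemoteTraverser> takeByIndex(Map<Integer, List<RemoteTraverser>> index, int vertexId) {
        List<RemoteTraverser> found = index.remove(vertexId);
        return found == null ? new ArrayList<>() : found;
    }

    public static void main(String[] args) {
        // Two traversers for each of three vertices (ids 0, 1, 2).
        List<RemoteTraverser> shared = new ArrayList<>();
        for (int v = 0; v < 3; v++) {
            shared.add(new RemoteTraverser(v));
            shared.add(new RemoteTraverser(v));
        }
        Map<Integer, List<RemoteTraverser>> index = indexByVertex(shared);
        System.out.println("by index: " + takeByIndex(index, 1).size());
        System.out.println("by scan:  " + takeByScan(shared, 1).size());
    }
}
```

With the grouping built once, each per-vertex lookup becomes a single hash 
removal, so the total work would be roughly linear in the number of traversers 
and far less time would be spent holding the lock.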

> n^2 synchronous operation in OLAP WorkerExecutor.execute() method
> -----------------------------------------------------------------
>
>                 Key: TINKERPOP-1870
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
>             Project: TinkerPop
>          Issue Type: Improvement
>            Reporter: Artem Aliev
>            Priority: Major
>         Attachments: findTraverser1.png, findTraverser2.png, 
> findTraverserFixed.png
>
>
> [https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/traversal/WorkerExecutor.java#L80-L93]
> This block of code iterates over all remote traversers to select one related 
> to the current vertex and remove it. The operation is repeated for the next 
> vertex, and so on. For the following example query this means n^2 operations 
> (where n is the number of vertices), all of them inside a sync block, so a 
> multi-core Spark executor performs them serially. 
> {code}
> g.V().emit().repeat(both().dedup()).count().next()
> {code}
> See jvisualvm screenshot. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
