Tobias Bertelsen created SPARK-9937:
---------------------------------------

             Summary: GraphX Performance: Partition overhead scales 
quadratically
                 Key: SPARK-9937
                 URL: https://issues.apache.org/jira/browse/SPARK-9937
             Project: Spark
          Issue Type: Bug
          Components: GraphX
            Reporter: Tobias Bertelsen


Hello everybody, or particularly Graph X developers.

I working on an algorithm that combines normal RDD operations and graph 
operations. When I tested the parallelizability I discovered that when I added 
more worker nodes most stages would run faster, but my graph operations would 
run slower.

More specifically with twice the number of servers the graph operations would 
take twice as long, indicating that the amount of work increased fourfold. I 
created a plot of the runtime for different number of servers, which  I have 
attached.
The graph operations are called clustering in the plot.

I tried to look into the code and I think I found something that might be the 
problem.
The operations shipVertexAttributes and shipVertexIds in VertexRDDImpl seems to 
be generating RDD's that contains an element for every combination of vertex 
partition and edge partition, even if there are no connection between the two.
The result is that the overhead time ends up dominating the computation time.

I am not familiar with the design and code base for Graph X. Perhaps there are 
more of problems of this kind which causes parallelization problems.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to