Tobias Bertelsen created SPARK-9937:
---------------------------------------
Summary: GraphX Performance: Partition overhead scales
quadratically
Key: SPARK-9937
URL: https://issues.apache.org/jira/browse/SPARK-9937
Project: Spark
Issue Type: Bug
Components: GraphX
Reporter: Tobias Bertelsen
Hello everybody, and GraphX developers in particular.
I am working on an algorithm that combines normal RDD operations and graph
operations. When I tested how well it parallelizes, I discovered that adding
more worker nodes made most stages run faster, but made my graph operations
run slower.
More specifically, with twice the number of servers the graph operations
take twice as long, indicating that the total amount of work increased
fourfold. I created a plot of the runtime for different numbers of servers,
which I have attached.
The graph operations are called clustering in the plot.
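The fourfold figure follows from treating total work as servers multiplied by
wall-clock time. A minimal sketch of that arithmetic, assuming every server is
busy for the whole stage; the server counts and times are hypothetical and not
taken from the attached plot:

```python
def total_work(servers: int, wall_time: float) -> float:
    """Aggregate server-time, assuming all servers are busy for the whole stage."""
    return servers * wall_time


# Hypothetical baseline: 4 servers, 10 time units for the graph operations.
base = total_work(servers=4, wall_time=10.0)

# Twice the servers and twice the wall-clock time, as observed in the report.
doubled = total_work(servers=8, wall_time=20.0)

ratio = doubled / base
print(ratio)
```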
I looked into the code and I think I found something that might be the
problem.
The operations shipVertexAttributes and shipVertexIds in VertexRDDImpl seem to
generate RDDs that contain an element for every combination of vertex
partition and edge partition, even if there is no connection between the two.
As a result, the overhead ends up dominating the computation time.
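To illustrate why this would scale quadratically, here is a toy model in plain
Python. It is not the GraphX implementation and calls no Spark API. It only
compares the number of (vertex partition, edge partition) combinations with the
number of combinations that actually share a vertex. The chain graph and the
modulo placement of vertices and edges are assumptions made for the example.

```python
def all_pairs(num_vertex_parts: int, num_edge_parts: int) -> int:
    """Blocks produced if one is emitted per (vertex partition, edge partition) pair."""
    return num_vertex_parts * num_edge_parts


def connected_pairs(edges, num_vertex_parts: int, num_edge_parts: int) -> int:
    """Pairs where some edge in the edge partition touches a vertex in the vertex partition."""
    pairs = set()
    for i, (src, dst) in enumerate(edges):
        edge_part = i % num_edge_parts  # hypothetical edge placement
        for vertex in (src, dst):
            vertex_part = vertex % num_vertex_parts  # hypothetical vertex placement
            pairs.add((vertex_part, edge_part))
    return len(pairs)


# Hypothetical chain graph 0-1-2-...-7 with 7 edges.
edges = [(i, i + 1) for i in range(7)]

for parts in (2, 4, 8):
    print(parts, all_pairs(parts, parts), connected_pairs(edges, parts, parts))
```

In this model the count of all combinations quadruples each time the number of
partitions doubles, while the count of connected combinations grows far more
slowly. If the number of partitions follows the number of servers, that would
match the doubling in wall-clock time described above.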
I am not familiar with the design and code base of GraphX. Perhaps there are
more problems of this kind that hurt parallelization.