Hi All,

I am trying to understand why the GraphFrames connected components algorithm
runs much slower than the GraphX equivalent.

The GraphX-backed code below creates 16 stages:

GraphFrame graphFrame = GraphFrame.fromEdges(edges);
Dataset<Row> connectedComponents =
    graphFrame.connectedComponents().setAlgorithm("graphx").run();

The default GraphFrames code below creates 55 stages:

GraphFrame graphFrame = GraphFrame.fromEdges(edges);
Dataset<Row> connectedComponents =
    graphFrame.connectedComponents().run();
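
Stage counts alone do not say how much slower one variant is, so a wall-clock comparison may be more useful. Below is a minimal sketch of a timing helper, assuming each variant is wrapped in a Runnable. The class name RunTimer and the method timeMillis are hypothetical and not part of Spark or GraphFrames, and the workload in main is only a placeholder for the real job.

```java
// Hypothetical helper for illustration; not part of Spark or GraphFrames.
final class RunTimer {
    private RunTimer() {}

    /** Runs the job once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable job) {
        long start = System.nanoTime();
        job.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Placeholder workload. In the real job this would be something like:
        //   () -> graphFrame.connectedComponents().setAlgorithm("graphx").run().count()
        // (an assumption about how the call would be wrapped, not tested here).
        Runnable placeholder = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        System.out.println("elapsed ms: " + timeMillis(placeholder));
    }
}
```

Ending each wrapped call with an action such as count() makes sure the result is fully computed inside the timed region, since Spark evaluates Datasets lazily.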

Any ideas on how to make GraphFrames faster? Also, what is the current
graph processing library/framework I should be using? It feels like
there isn't a lot of work going on in either GraphFrames or GraphX, so I
am curious what I should use for the long term.

Thanks!
