Hi all, I'm trying to understand why the GraphFrames connected components algorithm runs much slower than the GraphX equivalent.
The GraphX code creates 16 stages:

    GraphFrame graphFrame = GraphFrame.fromEdges(edges);
    Dataset<Row> connectedComponents = graphFrame.connectedComponents().setAlgorithm("graphx").run();

The GraphFrames code below creates 55 stages:

    GraphFrame graphFrame = GraphFrame.fromEdges(edges);
    Dataset<Row> connectedComponents = graphFrame.connectedComponents().run();

Any ideas on how to make GraphFrames faster?

Also, what is the latest graph processing library/framework I should be using? It feels like there isn't a lot of work going on in either GraphFrames or GraphX, so I'm curious what I should use for the long term.

Thanks!