[
https://issues.apache.org/jira/browse/SPARK-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182791#comment-15182791
]
Apache Spark commented on SPARK-13714:
--------------------------------------
User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/11556
> Another ConnectedComponents based on Max-Degree Propagation
> -----------------------------------------------------------
>
> Key: SPARK-13714
> URL: https://issues.apache.org/jira/browse/SPARK-13714
> Project: Spark
> Issue Type: New Feature
> Components: GraphX
> Reporter: zhengruifeng
> Priority: Minor
>
> Current ConnectedComponents algorithm was based on Min-VertexId Propagation,
> which is sensitive to the place of Min-VertexId.
> While this implementation is based on Max-Degree Propagation.
> First, the degree graph is computed. And in the pregel progress, the vertex
> with the max degree in a CC is the start point of propagation.
> This new method has advantages over the old one:
> 1, The convergence is only determined by the structs of CC, and is robust to
> the place of vertex with Min-ID.
> 2, For spherical CCs in which there may be a concept like 'center', it can
> accelerate the convergence. For example, GraphGenerators.gridGraph(sc, 3, 3),
> the old CC need 4 supersteps, while the new one only need 2 supersteps.
> 3, If we limit the number of iteration, the new method tend to generate more
> acceptable results.
> 4, The output for each CC is the vertex with max degree in it, which may be
> more meaningful. And because the vertex-ID is nominal in most cases, the
> vertex with min-ID in a CC is somewhat meanless.
> But there are still two disadvantages:
> 1,The message boy grows, from (VID) to (VID, Degree). that is (Long) ->
> (Long, Int)
> 2,For graph with simple CCs, it may be slower than old one. Because it need a
> extra degree computation.
> The api is the same like ConnectedComponents:
> val graph = ...
> val cc = graph.ConnectedComponentsWithDegree(100)
> or
> val cc = ConnectedComponentsWithDegree.run(graph, 100)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]