[ 
https://issues.apache.org/jira/browse/SPARK-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182791#comment-15182791
 ] 

Apache Spark commented on SPARK-13714:
--------------------------------------

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/11556

> Another ConnectedComponents based on Max-Degree Propagation
> -----------------------------------------------------------
>
>                 Key: SPARK-13714
>                 URL: https://issues.apache.org/jira/browse/SPARK-13714
>             Project: Spark
>          Issue Type: New Feature
>          Components: GraphX
>            Reporter: zhengruifeng
>            Priority: Minor
>
> Current ConnectedComponents algorithm was based on Min-VertexId Propagation, 
> which is sensitive to the place of Min-VertexId.
> While this implementation is based on Max-Degree Propagation.
> First, the degree graph is computed. And in the pregel progress, the vertex 
> with the max degree in a CC is the start point of propagation.
> This new method has advantages over the old one:
> 1, The convergence is only determined by the structs of CC, and is robust to 
> the place of vertex with Min-ID.
> 2, For spherical CCs in which there may be a concept like 'center', it can 
> accelerate the convergence. For example, GraphGenerators.gridGraph(sc, 3, 3), 
> the old CC need 4 supersteps, while the new one only need 2 supersteps.
> 3, If we limit the number of iteration, the new method tend to generate more 
> acceptable results.
> 4, The output for each CC is the vertex with max degree in it, which may be 
> more meaningful. And because the vertex-ID is nominal in most cases, the 
> vertex with min-ID in a CC is somewhat meanless.
> But there are still two disadvantages:
> 1,The message boy grows, from (VID) to (VID, Degree). that is (Long) -> 
> (Long, Int)
> 2,For graph with simple CCs, it may be slower than old one. Because it need a 
> extra degree computation.
> The api is the same like ConnectedComponents:
> val graph = ...
> val cc = graph.ConnectedComponentsWithDegree(100)
> or
> val cc = ConnectedComponentsWithDegree.run(graph, 100)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to