[
https://issues.apache.org/jira/browse/SPARK-10994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-10994.
-------------------------------
Resolution: Won't Fix
> Clustering coefficient computation in GraphX
> --------------------------------------------
>
> Key: SPARK-10994
> URL: https://issues.apache.org/jira/browse/SPARK-10994
> Project: Spark
> Issue Type: New Feature
> Components: GraphX
> Reporter: Yang Yang
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> The clustering coefficient (CC) is a fundamental measure in the analysis of
> social and other networks; it assesses the degree to which nodes tend to
> cluster together [1][2]. Along with density, node degree, path length,
> diameter, connectedness, and node centrality, it is one of the seven most
> important properties used to characterise a network [3].
> We found that GraphX already implements connectedness, node centrality, and
> path length, but has no component for computing the clustering coefficient.
> This gap was our original motivation for implementing an algorithm that
> computes the clustering coefficient of each vertex of a given graph.
> The clustering coefficient is useful in many real applications, such as
> user behaviour prediction and structure prediction (for example, link
> prediction). We have used it in several papers (e.g., [4-5]), and many other
> published papers also use this metric [6-8]. We are confident that this
> feature will benefit GraphX and attract a large number of users.
> References
> [1] https://en.wikipedia.org/wiki/Clustering_coefficient
> [2] Watts, Duncan J., and Steven H. Strogatz. "Collective dynamics of
> ‘small-world’ networks." Nature 393.6684 (1998): 440-442. (with 27266
> citations)
> [3] https://en.wikipedia.org/wiki/Network_science
> [4] Jing Zhang, Zhanpeng Fang, Wei Chen, and Jie Tang. Diffusion of
> "Following" Links in Microblogging Networks. IEEE Transactions on Knowledge
> and Data Engineering (TKDE), Volume 27, Issue 8, 2015, Pages 2093-2106.
> [5] Yang Yang, Jie Tang, Jacklyne Keomany, Yanting Zhao, Ying Ding, Juanzi
> Li, and Liangwei Wang. Mining Competitive Relationships by Learning across
> Heterogeneous Networks. In Proceedings of the Twenty-First Conference on
> Information and Knowledge Management (CIKM'12). pp. 1432-1441.
> [6] Clauset, Aaron, Cristopher Moore, and Mark EJ Newman. Hierarchical
> structure and the prediction of missing links in networks. Nature 453.7191
> (2008): 98-101. (with 973 citations)
> [7] Adamic, Lada A., and Eytan Adar. Friends and neighbors on the web. Social
> Networks 25.3 (2003): 211-230. (with 1238 citations)
> [8] Lichtenwalter, Ryan N., Jake T. Lussier, and Nitesh V. Chawla. New
> perspectives and methods in link prediction. In KDD'10.