[
https://issues.apache.org/jira/browse/SPARK-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiangrui Meng resolved SPARK-4259.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Issue resolved by pull request 4254
[https://github.com/apache/spark/pull/4254]
> Add Power Iteration Clustering Algorithm with Gaussian Similarity Function
> --------------------------------------------------------------------------
>
> Key: SPARK-4259
> URL: https://issues.apache.org/jira/browse/SPARK-4259
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Fan Jiang
> Assignee: Fan Jiang
> Labels: features
> Fix For: 1.3.0
>
>
> In recent years, power iteration clustering has become one of the most
> popular modern clustering algorithms. It is simple to implement, can be
> solved efficiently by standard linear algebra software, and very often
> outperforms traditional clustering algorithms such as the k-means algorithm.
> Power iteration clustering is a scalable and efficient algorithm for
> clustering points given their pairwise affinity values. Internally, the
> algorithm:
> 1. computes the Gaussian similarity between all pairs of points and stores
>    these values in an affinity matrix
> 2. calculates a normalized affinity matrix
> 3. calculates the principal eigenvalue and eigenvector by power iteration
> 4. clusters the input points according to their principal eigenvector
>    component values
> Details of this algorithm are found in "Power Iteration Clustering" by Lin
> and Cohen: www.icml2010.org/papers/387.pdf
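
The steps listed in the description can be sketched in a few lines of plain Python. This is only an illustrative, single-machine sketch, not the code from the pull request and not the MLlib API. All function names and the parameters `sigma`, `eps` and `max_iter` are hypothetical. It assumes every point has nonzero similarity to at least one other point. Following the Lin and Cohen paper as I understand it, the power iteration starts from the normalized degree vector and stops early, once the change between iterates levels off, rather than running to full convergence. The final grouping uses a simple one-dimensional k-means.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Pairwise Gaussian similarity exp(-||x - y||^2 / (2 sigma^2)), zero diagonal."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            A[i][j] = A[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return A


def power_iteration_embedding(W, v0, max_iter=100, eps=1e-5):
    """Truncated power iteration: stop when the step size stops changing."""
    n = len(W)
    v = list(v0)
    prev_delta = float("inf")
    for _ in range(max_iter):
        u = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(x) for x in u)
        u = [x / norm for x in u]  # L1-normalize the new iterate
        delta = max(abs(a - b) for a, b in zip(u, v))
        v = u
        if abs(prev_delta - delta) < eps / n:
            break
        prev_delta = delta
    return v


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars, with centers seeded evenly over the range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * c / max(k - 1, 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = sum(members) / len(members)
    return labels


def power_iteration_clustering(points, k, sigma=1.0):
    A = gaussian_affinity(points, sigma)
    degrees = [sum(row) for row in A]  # assumed nonzero for every point
    volume = sum(degrees)
    # Row-normalize the affinity matrix so that each row sums to 1.
    W = [[a / d for a in row] for row, d in zip(A, degrees)]
    v0 = [d / volume for d in degrees]
    embedding = power_iteration_embedding(W, v0)
    return kmeans_1d(embedding, k)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    print(power_iteration_clustering(pts, k=2))
```

Building the dense n-by-n affinity matrix costs O(n^2) time and memory, so the sketch only suits small inputs. Note also that with the degree-based start vector, two groups of identical shape and size can end up with the same embedding value, which is why the toy data above uses groups of different sizes.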