RJ Nowling created SPARK-2429:
---------------------------------
Summary: Hierarchical Implementation of KMeans
Key: SPARK-2429
URL: https://issues.apache.org/jira/browse/SPARK-2429
Project: Spark
Issue Type: New Feature
Components: MLlib
Reporter: RJ Nowling
Priority: Minor
Hierarchical clustering algorithms are widely used and would make a nice
addition to MLlib. They are useful for determining relationships between
clusters and can also offer faster assignment of points to clusters.
Discussion on the dev list suggested the following possible approaches:
* Top down, recursive application of KMeans
* Reuse DecisionTree implementation with different objective function
* Hierarchical SVD
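As a rough illustration of the first approach only, top-down recursive KMeans
can be sketched in plain Python as below. This is not MLlib code and not a
proposed design: the names kmeans, build_tree and assign, the dict-based tree
nodes, and the max_depth and min_size parameters are all assumptions made up
for this sketch, and it runs on a local list of tuples rather than an RDD.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations. Returns (centers, groups), where groups[i]
    holds the points assigned to center i in the last assignment step."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # hypothetical init: k random points
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[idx].append(p)
        # An empty group keeps its previous center.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        converged = new_centers == centers
        centers = new_centers
        if converged:
            break
    return centers, groups


def build_tree(points, k=2, max_depth=3, min_size=2, depth=0):
    """Recursively split the points with KMeans. Each node is a dict with
    the node's center, its point count, and a list of child nodes."""
    node = {"center": mean(points), "size": len(points), "children": []}
    if depth >= max_depth or len(points) < max(k, min_size):
        return node
    _, groups = kmeans(points, k)
    groups = [g for g in groups if g]
    if len(groups) < 2:
        return node  # KMeans could not split this node; keep it as a leaf
    node["children"] = [
        build_tree(g, k, max_depth, min_size, depth + 1) for g in groups
    ]
    return node


def assign(tree, point):
    """Walk from the root to a leaf, following the nearest child center."""
    node = tree
    while node["children"]:
        node = min(node["children"], key=lambda c: sq_dist(point, c["center"]))
    return node
```

In this sketch, assigning a point compares it against k centers per level
instead of against every leaf center, which is where the faster assignment
would come from.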
It was also suggested that support for distance metrics other than Euclidean,
such as negative dot product or cosine, is necessary.
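To make the distance-metric point concrete, here is a small plain-Python
sketch of a nearest-center lookup that takes the distance as a parameter. The
function names and the nearest helper are hypothetical and do not correspond
to any existing MLlib API. Note that negative dot product is a dissimilarity
score rather than a true metric (it can be negative).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot(a, b):
    """Negative dot product: a larger dot product means 'closer'."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus cosine similarity; undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(point, centers, distance=euclidean):
    """Index of the center with the smallest distance to the point."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))


# Example data, chosen so that the measures disagree.
centers = [(1.0, 0.0), (0.0, 5.0)]
point = (3.0, 1.0)
by_euclidean = nearest(point, centers, euclidean)
by_negative_dot = nearest(point, centers, negative_dot)
by_cosine = nearest(point, centers, cosine_distance)
```

For this example point, Euclidean and cosine distance both pick the first
center, while negative dot product picks the second because it rewards the
larger magnitude of that center.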
--
This message was sent by Atlassian JIRA
(v6.2#6252)