Yu Ishikawa created SPARK-3439:
----------------------------------

             Summary: Add Canopy Clustering Algorithm
                 Key: SPARK-3439
                 URL: https://issues.apache.org/jira/browse/SPARK-3439
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
            Reporter: Yu Ishikawa
            Priority: Minor

Canopy clustering is an unsupervised pre-clustering algorithm. It is often used as a preprocessing step for k-means or hierarchical clustering. It is intended to speed up clustering on large data sets, where applying another algorithm directly may be impractical because of the size of the data set.
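
For illustration only, here is a minimal single-machine sketch of the canopy idea as it is commonly described. It is not a proposed MLlib API. The function name `canopy_clustering`, the loose and tight thresholds `t1` and `t2`, and the use of Euclidean distance are all assumptions made for this example; in practice a cheaper approximate distance is typically used.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance; stands in for a cheap distance measure (assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_clustering(points, t1, t2, distance=euclidean):
    """Group points into possibly overlapping canopies.

    t1 is the loose threshold and t2 the tight threshold, with t1 > t2 > 0.
    Returns a list of (center_index, member_indices) tuples.
    """
    if not (t1 > t2 > 0):
        raise ValueError("thresholds must satisfy t1 > t2 > 0")

    remaining = list(range(len(points)))  # indices still eligible to be centers
    canopies = []
    while remaining:
        center = remaining[0]  # hypothetical choice: take the first remaining point
        dists = {i: distance(points[center], points[i]) for i in remaining}
        # Every remaining point within the loose threshold joins this canopy.
        members = [i for i in remaining if dists[i] < t1]
        canopies.append((center, members))
        # Points within the tight threshold are removed from further consideration;
        # the center itself (distance 0) is always removed, so the loop terminates.
        remaining = [i for i in remaining if dists[i] >= t2]
    return canopies


points = [(0.0, 0.0), (0.5, 0.0), (1.5, 0.0), (10.0, 10.0), (10.5, 10.0)]
canopies = canopy_clustering(points, t1=2.0, t2=1.0)
```

The canopy centers found this way could then serve as initial centers for k-means, or the canopies could restrict which pairs of points a more expensive algorithm has to compare.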