Derrick Burns created SPARK-6001:
------------------------------------
Summary: K-Means clusterer should return the assignments of input
points to clusters
Key: SPARK-6001
URL: https://issues.apache.org/jira/browse/SPARK-6001
Project: Spark
Issue Type: Improvement
Components: MLlib
Affects Versions: 1.2.1
Reporter: Derrick Burns
Priority: Minor
The K-Means clusterer returns a KMeansModel that contains the cluster centers.
However, I suggest that the K-Means clusterer also return an RDD of the
assignments of the input data to the clusters whenever those assignments are
available. The assignments can be computed from the KMeansModel, but returning
them when they already exist would save the cost of re-computing them.
The K-means implementation at
https://github.com/derrickburns/generalized-kmeans-clustering returns the
assignments when available.
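
To illustrate the re-computation cost, here is a minimal sketch of Lloyd's
algorithm in plain Python. It is not Spark's implementation or API; the names
kmeans_with_assignments, closest, and squared_distance are hypothetical. It
only shows that the final assignment pass already yields the point-to-cluster
mapping, so returning it alongside the centers costs nothing extra.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest(point, centers):
    """Index of the center nearest to point (ties go to the lowest index)."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def kmeans_with_assignments(points, k, max_iterations=20, seed=0):
    """Hypothetical helper: run Lloyd's algorithm and return
    (centers, assignments), where assignments[i] is the cluster
    index of points[i] with respect to the returned centers."""
    rng = random.Random(seed)
    # Initialize centers from k distinct input points.
    centers = [list(p) for p in rng.sample(points, k)]
    assignments = [closest(p, centers) for p in points]

    for _ in range(max_iterations):
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centers.append(
                    [sum(coord) / len(members) for coord in zip(*members)])
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        centers = new_centers

        # Assignment step: this pass is what a caller would otherwise
        # have to repeat after training if only centers were returned.
        new_assignments = [closest(p, centers) for p in points]
        if new_assignments == assignments:
            break
        assignments = new_assignments

    return centers, assignments


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, assignments = kmeans_with_assignments(data, k=2)
    print(centers)
    print(assignments)
```

A caller that only received the centers would need one more full pass over
the data, equivalent to [closest(p, centers) for p in points], to recover the
same list.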