Lei Wang created SPARK-17836:
--------------------------------
Summary: Use cross validation to determine the number of clusters
for EM or KMeans algorithms
Key: SPARK-17836
URL: https://issues.apache.org/jira/browse/SPARK-17836
Project: Spark
Issue Type: Bug
Components: ML
Reporter: Lei Wang
Sometimes it is not easy for users to determine the number of clusters.
It would be very useful if Spark ML could support this.
There are several methods for doing this, according to Wikipedia:
https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
Weka uses cross validation.
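
To make the request concrete, here is a rough sketch of the idea in plain Python. It is not Spark ML or Weka code, and every name in it (fit_kmeans, cv_cost, pick_k, make_demo_points, min_gain) is hypothetical. For each candidate k it fits K-means on the training folds and measures the mean squared distance from each held-out point to its nearest centroid. Because held-out K-means cost usually keeps shrinking as k grows, the sketch assumes a simple stopping rule: keep the smallest k for which adding one more cluster fails to cut the held-out cost by at least min_gain. An EM-based version would presumably compare held-out log-likelihood instead.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_cost(p, centroids):
    """Squared distance from p to its closest centroid."""
    return min(sq_dist(p, c) for c in centroids)


def fit_kmeans(points, k, max_iter=50):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: nearest_cost(p, centroids)))
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[idx].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if the group is empty.
        updated = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def cv_cost(points, k, n_folds=5, seed=0):
    """Mean held-out squared distance to the nearest centroid over all folds."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    total = 0.0
    for f in range(n_folds):
        held_out = shuffled[f::n_folds]
        train = [p for i, p in enumerate(shuffled) if i % n_folds != f]
        centroids = fit_kmeans(train, k)
        total += sum(nearest_cost(p, centroids) for p in held_out)
    return total / len(points)


def pick_k(costs, min_gain=0.5):
    """Smallest k where one more cluster cuts the cost by less than min_gain."""
    ks = sorted(costs)
    for k, k_next in zip(ks, ks[1:]):
        if costs[k_next] > (1.0 - min_gain) * costs[k]:
            return k
    return ks[-1]


def make_demo_points(seed=0, per_cluster=40):
    """Synthetic data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centers
        for _ in range(per_cluster)
    ]


points = make_demo_points()
costs = {k: cv_cost(points, k) for k in range(1, 6)}
best_k = pick_k(costs)
print("held-out cost per k:", costs)
print("selected k:", best_k)
```

On the synthetic data above, the held-out cost drops sharply up to three clusters and only marginally afterwards, which is the signal the stopping rule looks for. The 0.5 threshold is an arbitrary choice for this demo and would need tuning on real data.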
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)