RJ Nowling created SPARK-2308:
---------------------------------

             Summary: Add KMeans MiniBatch clustering algorithm to MLlib
                 Key: SPARK-2308
                 URL: https://issues.apache.org/jira/browse/SPARK-2308
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
            Reporter: RJ Nowling
            Priority: Minor


Mini-batch KMeans is a variant of KMeans that updates the cluster centers 
using a randomly sampled subset of the data points in each iteration instead 
of the full data set. This reduces the cost of each iteration and, in some 
cases, also improves accuracy. The mini-batch version is compatible with the 
KMeans|| initialization algorithm currently implemented in MLlib.
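
As a rough illustration of the idea, and not the proposed MLlib 
implementation, the mini-batch update described above can be sketched in plain 
Python. The function names (mini_batch_kmeans, nearest_center, 
squared_distance) and the parameters (batch_size, iterations, seed) are 
hypothetical. The per-center step size of 1/count is one common choice and is 
an assumption here. The initial centers are passed in, so they could come from 
any initialization, such as KMeans||.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def mini_batch_kmeans(points, initial_centers, batch_size, iterations, seed=0):
    """Illustrative mini-batch k-means; all names here are hypothetical.

    Each iteration samples a random subset of the points, assigns the
    sampled points to their nearest centers, and moves each center
    toward its assigned points with a step size of 1 / (points seen so
    far by that center). That step size is an assumed, common choice.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in initial_centers]  # copy; inputs stay untouched
    counts = [0] * len(centers)

    for _ in range(iterations):
        # Use only a random subset of the data in this iteration.
        batch = rng.sample(points, min(batch_size, len(points)))

        # Assign every sampled point before moving any center.
        assignments = [nearest_center(p, centers) for p in batch]

        # Move each center toward the points assigned to it.
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j] = [(1.0 - eta) * c + eta * x
                          for c, x in zip(centers[j], p)]

    return centers
```

With this step size, each center is the running mean of the sampled points 
assigned to it so far, so no pass over the full data set is needed in any 
iteration.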

I suggest adding mini-batch KMeans as an alternative to the existing KMeans 
implementation.

I'd like this to be assigned to me.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
