KMeansPlusPlusClusterer should run multiple trials
--------------------------------------------------
Key: MATH-548
URL: https://issues.apache.org/jira/browse/MATH-548
Project: Commons Math
Issue Type: Improvement
Reporter: Nate Paymer
Priority: Minor
The interface and documentation for KMeansPlusPlusClusterer imply that a single
call to cluster() is sufficient to get the optimal set of clusters. This is not
true: the initial centers are chosen at random, so each call may converge to a
different local optimum. Practically every client should therefore call
cluster() multiple times and select the best resulting set of clusters. It
seems to me that rather than forcing every client to implement this
functionality, it should be placed directly in the KMeansPlusPlusClusterer
class.
I propose adding a new method to KMeansPlusPlusClusterer:
    List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial)
which calls the existing cluster() method numTrials times, returning the best
result.
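
The proposal above does not say how the "best" result is chosen. The following is a minimal sketch of the trial loop, assuming that "best" means the lowest within-cluster sum of squared distances to the centroid. The class and method names (MultiTrialSketch, bestOf, withinClusterSumOfSquares) and the representation of a clustering as List<List<double[]>> are hypothetical and are not part of Commons Math. The single trial is passed in as a Supplier so that the sketch stays self-contained.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper; not part of Commons Math. */
final class MultiTrialSketch {

    /**
     * Assumed quality criterion: sum over all clusters of the squared
     * Euclidean distance from each point to its cluster's centroid.
     * Lower is better. Empty clusters contribute nothing.
     */
    static double withinClusterSumOfSquares(List<List<double[]>> clusters) {
        double total = 0.0;
        for (List<double[]> cluster : clusters) {
            if (cluster.isEmpty()) {
                continue;
            }
            int dim = cluster.get(0).length;
            // Centroid = coordinate-wise mean of the cluster's points.
            double[] centroid = new double[dim];
            for (double[] p : cluster) {
                for (int d = 0; d < dim; d++) {
                    centroid[d] += p[d] / cluster.size();
                }
            }
            for (double[] p : cluster) {
                for (int d = 0; d < dim; d++) {
                    double diff = p[d] - centroid[d];
                    total += diff * diff;
                }
            }
        }
        return total;
    }

    /**
     * Runs singleTrial numTrials times and returns the clustering with the
     * lowest cost. On ties, the earliest trial wins.
     */
    static List<List<double[]>> bestOf(int numTrials,
                                       Supplier<List<List<double[]>>> singleTrial) {
        if (numTrials < 1) {
            throw new IllegalArgumentException("numTrials must be >= 1");
        }
        List<List<double[]>> best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int i = 0; i < numTrials; i++) {
            List<List<double[]>> candidate = singleTrial.get();
            double cost = withinClusterSumOfSquares(candidate);
            if (best == null || cost < bestCost) {
                best = candidate;
                bestCost = cost;
            }
        }
        return best;
    }
}
```

In the proposed method, the supplier would wrap one call to the existing cluster() method with maxIterationsPerTrial as its iteration limit. Converting its result into the list form used here is left out of the sketch.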