[ https://issues.apache.org/jira/browse/MATH-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044562#comment-13044562 ]
Luc Maisonobe commented on MATH-548:
------------------------------------
If I understand correctly, you suggest adding something to the clusterer
similar to the multi-start feature from the optimization package.
What I don't get is how we can define "best result".
> KMeansPlusPlusClusterer should run multiple trials
> --------------------------------------------------
>
> Key: MATH-548
> URL: https://issues.apache.org/jira/browse/MATH-548
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Nate Paymer
> Priority: Minor
>
> The interface and documentation for KMeansPlusPlusClusterer imply that a
> single call to cluster() is sufficient to get the optimal set of clusters.
> But this isn't true -- practically every client should be calling cluster()
> multiple times, selecting the best resulting set of clusters. It seems to me
> that rather than forcing every client to implement this functionality, it
> should be placed directly in the KMeansPlusPlusClusterer class.
> I propose adding a new method to KMeansPlusPlusClusterer:
> List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial)
> which calls the existing cluster() method numTrials times, returning the best
> result.
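
To make the "best result" question concrete, here is a minimal sketch of one
possible reading of the proposal. It is not the Commons Math API and not
necessarily what the reporter has in mind. Assumptions: points are plain
double[] arrays rather than Clusterable objects, the class name
MultiTrialKMeans and all of its helpers are hypothetical, "best" is taken to
mean the lowest total within-cluster sum of squared distances, and each trial
uses plain random seeding as a stand-in for the real k-means++ seeding. The
method returns the cluster centers rather than a List<Cluster<T>>.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch, NOT the Commons Math API: a plain k-means stand-in run
// numTrials times, keeping the trial with the lowest within-cluster cost.
class MultiTrialKMeans {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center closest to p. */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (distSq(p, centers[c]) < distSq(p, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    /** One trial: random initial centers (assumed stand-in for k-means++), then Lloyd iterations. */
    static double[][] singleTrial(List<double[]> points, int k, int maxIterations, Random rng) {
        int dim = points.get(0).length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points.get(rng.nextInt(points.size())).clone();
        }
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int i = 0; i < dim; i++) {
                    sums[c][i] += p[i];
                }
            }
            boolean changed = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int i = 0; i < dim; i++) {
                    double updated = sums[c][i] / counts[c];
                    if (updated != centers[c][i]) {
                        changed = true;
                    }
                    centers[c][i] = updated;
                }
            }
            if (!changed) {
                break; // converged before reaching maxIterations
            }
        }
        return centers;
    }

    /** Assumed score: sum of squared distances to the nearest center (lower is better). */
    static double cost(List<double[]> points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            total += distSq(p, centers[nearest(p, centers)]);
        }
        return total;
    }

    /** Runs numTrials independent trials and returns the centers with the lowest cost. */
    static double[][] cluster(List<double[]> points, int k, int numTrials,
                              int maxIterationsPerTrial, Random rng) {
        if (numTrials < 1) {
            throw new IllegalArgumentException("numTrials must be at least 1");
        }
        double[][] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int t = 0; t < numTrials; t++) {
            double[][] centers = singleTrial(points, k, maxIterationsPerTrial, rng);
            double c = cost(points, centers);
            if (c < bestCost) {
                bestCost = c;
                best = centers;
            }
        }
        return best;
    }
}
```

With this scoring, the multi-trial result can never be worse than the first
trial on its own, since the first trial is one of the candidates. Whether
within-cluster sum of squares is the right criterion for Commons Math is
exactly the open question in the comment above; other scores could be
swapped in by replacing cost().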