[ https://issues.apache.org/jira/browse/MATH-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rob Tompkins updated MATH-1374:
-------------------------------
Fix Version/s: 4.0
> KMeansPlusPlusClusterer unable to converge having repeatable points in input
> dataset
> ------------------------------------------------------------------------------------
>
> Key: MATH-1374
> URL: https://issues.apache.org/jira/browse/MATH-1374
> Project: Commons Math
> Issue Type: Bug
> Reporter: Artem Barger
> Assignee: Artem Barger
> Fix For: 4.0
>
> Attachments: MATH-1374.patch
>
>
> If the input list of {{Clusterable}} points has more elements than the
> parameter {{k}} but fewer unique points than {{k}}, the algorithm fails to
> converge. This was tested with different EmptyClusterStrategy options; here
> is an example using the default one:
> {code}
> @Test
> public void testNumberOfRequestedClustersSameAsInputSize() {
>     final RandomVectorGenerator rng = new UncorrelatedRandomVectorGenerator(10,
>         new GaussianRandomGenerator(RandomSource.create(RandomSource.MT)));
>     // 10 random points, each added 3 times: 30 entries, only 10 unique.
>     List<DoublePoint> points = new ArrayList<>();
>     for (int i = 0; i < 10; i++) {
>         final DoublePoint point = new DoublePoint(rng.nextVector());
>         for (int j = 0; j < 3; j++) {
>             points.add(point);
>         }
>     }
>     // k = 12 exceeds the number of unique points (10).
>     final KMeansPlusPlusClusterer<DoublePoint> clusterer =
>         new KMeansPlusPlusClusterer<>(12);
>     clusterer.cluster(points);
> }
> {code}
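
The condition described above (more input points than k, but fewer unique points than k) can be detected before clustering. The following is a minimal standalone sketch, not part of MATH-1374.patch and not Commons Math API: the class DistinctPointCheck and its methods are hypothetical names, points are represented as plain double[] arrays rather than {{Clusterable}}, and "unique" is assumed to mean exact coordinate equality.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (illustration only): checks whether a data set has
// at least k distinct points before it is handed to a k-means clusterer.
final class DistinctPointCheck {

    /** Counts distinct points, comparing coordinates by exact equality. */
    public static int countDistinct(List<double[]> points) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] p : points) {
            // Box the coordinates so the set compares by value, not by array identity.
            List<Double> key = new ArrayList<>(p.length);
            for (double v : p) {
                key.add(v);
            }
            seen.add(key);
        }
        return seen.size();
    }

    /** Returns true if the data set contains at least k distinct points. */
    public static boolean hasEnoughDistinctPoints(List<double[]> points, int k) {
        return countDistinct(points) >= k;
    }

    public static void main(String[] args) {
        // Same shape as the test in the report: 10 points, each added 3 times.
        List<double[]> points = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            double[] p = {i, 2.0 * i};
            for (int j = 0; j < 3; j++) {
                points.add(p);
            }
        }
        System.out.println(points.size());                        // prints 30
        System.out.println(countDistinct(points));                // prints 10
        System.out.println(hasEnoughDistinctPoints(points, 12));  // prints false
    }
}
```

With k = 12 the check returns false for this data set, which matches the failing case in the report: the list size (30) exceeds k, but the number of unique points (10) does not.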