[ https://issues.apache.org/jira/browse/MATH-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Artem Barger updated MATH-1374:
-------------------------------
    Attachment: MATH-1374.patch

Proposed fix that addresses the problem.

> KMeansPlusPlusClusterer unable to converge having repeatable points in input dataset
> ------------------------------------------------------------------------------------
>
>                 Key: MATH-1374
>                 URL: https://issues.apache.org/jira/browse/MATH-1374
>             Project: Commons Math
>          Issue Type: Bug
>            Reporter: Artem Barger
>         Attachments: MATH-1374.patch
>
>
> If the input list of {{Clusterable}} points is larger than the parameter {{k}}
> but contains fewer unique points than {{k}}, the algorithm fails to converge.
> This was tested with different EmptyClusterStrategy options; here is an example
> using the default one:
> {code}
> @Test
> public void testNumberOfRequestedClustersSameAsInputSize() {
>     final RandomVectorGenerator rng = new UncorrelatedRandomVectorGenerator(10,
>             new GaussianRandomGenerator(RandomSource.create(RandomSource.MT)));
>     List<DoublePoint> points = new ArrayList<>();
>     for (int i = 0; i < 10; i++) {
>         final DoublePoint point = new DoublePoint(rng.nextVector());
>         for (int j = 0; j < 3; j++) {
>             points.add(point);
>         }
>     }
>     final KMeansPlusPlusClusterer<DoublePoint> clusterer = new KMeansPlusPlusClusterer<>(12);
>     clusterer.cluster(points);
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
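
For illustration only: the failing condition described in the report (more input points than k, but fewer distinct points than k) can be detected before clustering. The sketch below is a standalone example under stated assumptions. DistinctPointCheck, countDistinct, and hasEnoughDistinctPoints are hypothetical names; they are not part of Commons Math and do not reflect the contents of MATH-1374.patch. Points are represented as plain double[] arrays rather than DoublePoint so that the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Commons Math. */
final class DistinctPointCheck {
    private DistinctPointCheck() {}

    /**
     * Counts distinct coordinate vectors. A double[] only has identity
     * equality, so each point is boxed into a List<Double>, which
     * compares element by element.
     */
    static int countDistinct(List<double[]> points) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] p : points) {
            List<Double> key = new ArrayList<>(p.length);
            for (double v : p) {
                key.add(v);
            }
            seen.add(key);
        }
        return seen.size();
    }

    /** Returns true when at least k distinct points are available. */
    static boolean hasEnoughDistinctPoints(List<double[]> points, int k) {
        return countDistinct(points) >= k;
    }
}
```

With a dataset shaped like the one in the report (10 distinct points, each added 3 times), the list holds 30 entries but only 10 distinct points, so a request for k = 12 fails this check while k = 10 passes.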