Truncation issue in KMeansPlusPlusClusterer
-------------------------------------------

                 Key: MATH-546
                 URL: https://issues.apache.org/jira/browse/MATH-546
             Project: Commons Math
          Issue Type: Bug
    Affects Versions: 3.0
            Reporter: Nate Paymer
            Priority: Minor


The for loop inside KMeansPlusPlusClusterer.chooseInitialClusters defines a 
variable
  int sum = 0;
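This variable should have type double rather than int.  Using an int causes 
the method to truncate the squared distances between points to integers, so 
the distances themselves are effectively rounded down to square roots of 
integers.  This is especially bad when the distances between points are 
typically less than 1, because their squares then truncate to 0.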
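
To illustrate the effect, here is a minimal sketch. It is not the actual 
Commons Math code: the class TruncationDemo and both helper methods are 
hypothetical, and it assumes the accumulator is updated with a compound 
assignment such as sum += d * d, which Java silently narrows back to int.

```java
import java.util.Arrays;

public class TruncationDemo {

    // Hypothetical helper: running sum of squared distances kept in an int,
    // mirroring the reported declaration "int sum = 0;".
    public static double[] cumulativeWithInt(double[] distances) {
        double[] dx2 = new double[distances.length];
        int sum = 0;
        for (int i = 0; i < distances.length; i++) {
            // Compound assignment casts (sum + d * d) back to int,
            // discarding the fractional part on every iteration.
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    // Same loop with the proposed fix: the accumulator is a double.
    public static double[] cumulativeWithDouble(double[] distances) {
        double[] dx2 = new double[distances.length];
        double sum = 0;
        for (int i = 0; i < distances.length; i++) {
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    public static void main(String[] args) {
        // Illustrative distances, all less than 1.
        double[] distances = {0.5, 0.25, 0.75};
        System.out.println(Arrays.toString(cumulativeWithInt(distances)));
        // prints [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(cumulativeWithDouble(distances)));
        // prints [0.25, 0.3125, 0.875]
    }
}
```

With the int accumulator every entry stays at 0, so any weighting based on 
these values can no longer tell the points apart.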

As an aside, in version 2.2, this bug manifested itself by making the clusterer 
return empty clusters.  I wonder if the EmptyClusterStrategy would still be 
necessary if this bug were fixed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
