Dan Filimon created MAHOUT-1223:
-----------------------------------

             Summary: Point skipped in StreamingKMeans when iterating through 
centroids from a reducer
                 Key: MAHOUT-1223
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1223
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 0.8
            Reporter: Dan Filimon
            Priority: Minor


When StreamingKMeans is called in the reducer (to collapse the number of 
clusters so they can fit into memory), the clustering is done on the Hadoop 
reducer iterable.

Currently, the first Centroid is added directly as a special case and is then 
skipped when iterating through the main loop.
However, Hadoop reducer iterables cannot be rewound, so the main loop resumes 
after the first Centroid and its skip discards the following point instead. 
As a result, SKM skips one point.
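
To make the failure mode concrete, here is a minimal sketch in plain Java. It 
is not the Mahout code: SinglePassIterable is a hypothetical stand-in that 
assumes a reducer iterable hands back the same underlying iterator on every 
call to iterator(), and the method names and Integer "points" are only for 
illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a reducer's value iterable: every call to
// iterator() returns the same underlying iterator, so it cannot be rewound.
final class SinglePassIterable<T> implements Iterable<T> {
  private final Iterator<T> source;

  SinglePassIterable(List<T> items) {
    this.source = items.iterator();
  }

  @Override
  public Iterator<T> iterator() {
    return source;
  }
}

final class SkippedPointSketch {
  // Pattern described in the report: special-case the first point, then
  // start what is assumed to be a fresh pass and skip the first element.
  static List<Integer> seedThenSkip(Iterable<Integer> points) {
    List<Integer> processed = new ArrayList<>();
    Iterator<Integer> first = points.iterator();
    if (!first.hasNext()) {
      return processed;
    }
    processed.add(first.next());               // first point added directly

    Iterator<Integer> it = points.iterator();  // not rewound on a single-pass iterable
    if (it.hasNext()) {
      it.next();                               // intended to skip the first point
    }
    while (it.hasNext()) {
      processed.add(it.next());
    }
    return processed;
  }

  // One possible fix (an assumption, not necessarily the committed patch):
  // obtain a single iterator and use it for both the seed and the main loop.
  static List<Integer> singleIterator(Iterable<Integer> points) {
    List<Integer> processed = new ArrayList<>();
    Iterator<Integer> it = points.iterator();
    if (it.hasNext()) {
      processed.add(it.next());                // first point added directly
    }
    while (it.hasNext()) {
      processed.add(it.next());                // remaining points, none skipped
    }
    return processed;
  }
}
```

With the points 1, 2, 3, 4 wrapped in SinglePassIterable, seedThenSkip 
processes 1, 3, 4 (the point 2 is lost), while singleIterator processes all 
four. On an ordinary List, both methods process all four points, which is why 
the problem only shows up in the reducer.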

