Dan Filimon created MAHOUT-1223:
-----------------------------------
Summary: Point skipped in StreamingKMeans when iterating through
centroids from a reducer
Key: MAHOUT-1223
URL: https://issues.apache.org/jira/browse/MAHOUT-1223
Project: Mahout
Issue Type: Bug
Components: Clustering
Affects Versions: 0.8
Reporter: Dan Filimon
Priority: Minor
When calling StreamingKMeans in the reducer (to collapse the number of clusters
so they can fit into memory), the clustering is done directly on the Hadoop
reducer iterable.
Currently, the first Centroid is added directly as a special case and is then
skipped when iterating through the main loop.
However, Hadoop reducer iterables cannot be rewound, so the main loop resumes
at the second point and skips that one instead of the first. As a result, SKM
silently drops one point.
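The effect can be reproduced outside Hadoop with a minimal sketch. This is not
the Mahout code: SinglePassIterable and SkippedPointDemo are hypothetical names,
and the sketch assumes the reducer iterable behaves like an Iterable that hands
out the same underlying iterator on every call to iterator(). The fix shown
(obtaining the iterator once and reusing it) is one possible approach, not
necessarily the one that will be committed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Hadoop reducer's values: every call to
// iterator() returns the same underlying iterator, so it cannot be rewound.
class SinglePassIterable<T> implements Iterable<T> {
    private final Iterator<T> shared;

    SinglePassIterable(Iterable<T> source) {
        this.shared = source.iterator();
    }

    @Override
    public Iterator<T> iterator() {
        return shared;
    }
}

class SkippedPointDemo {
    // Buggy pattern: handle the first element as a special case, then start
    // a second loop and skip its first element, assuming it is the same one.
    static List<Integer> buggy(Iterable<Integer> points) {
        List<Integer> seen = new ArrayList<>();
        Iterator<Integer> first = points.iterator();
        if (!first.hasNext()) {
            return seen;
        }
        seen.add(first.next());
        boolean skippedFirst = false;
        for (Integer p : points) {
            if (!skippedFirst) {
                // On a single-pass iterable this discards the SECOND element.
                skippedFirst = true;
                continue;
            }
            seen.add(p);
        }
        return seen;
    }

    // One possible fix: obtain the iterator once and keep consuming from it.
    static List<Integer> fixed(Iterable<Integer> points) {
        List<Integer> seen = new ArrayList<>();
        Iterator<Integer> it = points.iterator();
        if (!it.hasNext()) {
            return seen;
        }
        seen.add(it.next());
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(buggy(new SinglePassIterable<>(data)));
        System.out.println(fixed(new SinglePassIterable<>(data)));
        System.out.println(buggy(data));
    }
}
```

On a rewindable collection such as a List the buggy pattern happens to see
every element, which may be why the problem only shows up when the input comes
from a reducer.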