Wrong implementation of the mapper in the canopy clusterer?
-----------------------------------------------------------
Key: MAHOUT-169
URL: https://issues.apache.org/jira/browse/MAHOUT-169
Project: Mahout
Issue Type: Question
Components: Clustering
Reporter: Peter Wippermann
The class
org.apache.mahout.clustering.canopy.CanopyMapper
makes use of the method "Canopy.addPointToCanopies(point, canopies)".
The documentation of the Canopy class says that this method is used by the
reducer; it does not mention the mapper.
It further says that the mapper uses "emitPointToNewCanopies" or
"emitPointToExistingCanopies", which it does not.
I'm just trying to figure out how canopy clustering works in general, and this
is confusing me. If it is a bug, please fix it. If it is not, I'd be very happy
if you could explain why :-)
Furthermore, I'm wondering why the syntheticcontrol clustering example finds
only ONE SINGLE CLUSTER. Am I mixing things up here?
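For reference, here is a minimal sketch of the generic canopy assignment step,
assuming two distance thresholds T1 > T2 and Euclidean distance. It is not
Mahout's code: the class CanopySketch and the methods assignPoint and cluster
are hypothetical names made up for illustration. It may also help with the
single-cluster question, since thresholds that are large relative to the data
put every point into the first canopy.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and structure are assumptions, not Mahout's API.
public class CanopySketch {

    /** A canopy: a fixed center plus the points assigned to it. */
    static final class Canopy {
        final double[] center;
        final List<double[]> points = new ArrayList<>();

        Canopy(double[] center) {
            this.center = center;
            this.points.add(center); // the center belongs to its own canopy
        }
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Adds the point to every canopy whose center is closer than t1.
     * If no center is closer than t2, the point starts a new canopy.
     */
    static void assignPoint(double[] point, List<Canopy> canopies, double t1, double t2) {
        boolean stronglyBound = false;
        for (Canopy canopy : canopies) {
            double dist = euclidean(canopy.center, point);
            if (dist < t1) {
                canopy.points.add(point);
            }
            if (dist < t2) {
                stronglyBound = true;
            }
        }
        if (!stronglyBound) {
            canopies.add(new Canopy(point));
        }
    }

    /** Runs the assignment step over all points in order. */
    static List<Canopy> cluster(List<double[]> points, double t1, double t2) {
        List<Canopy> canopies = new ArrayList<>();
        for (double[] p : points) {
            assignPoint(p, canopies, t1, t2);
        }
        return canopies;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[] {0.0, 0.0});
        points.add(new double[] {0.5, 0.0});
        points.add(new double[] {10.0, 10.0});
        points.add(new double[] {10.5, 10.0});

        System.out.println("tight thresholds: " + cluster(points, 3.0, 1.0).size());
        System.out.println("loose thresholds: " + cluster(points, 100.0, 50.0).size());
    }
}
```

With these four sample points, the tight thresholds (T1 = 3, T2 = 1) give two
canopies, while the loose ones (T1 = 100, T2 = 50) collapse everything into one.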