Just +1 <grin>
On 2/22/12 10:35 PM, Paritosh Ranjan (Commented) (JIRA) wrote:
[
https://issues.apache.org/jira/browse/MAHOUT-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214329#comment-13214329
]
Paritosh Ranjan commented on MAHOUT-929:
----------------------------------------
Assigned to myself.
I think the cluster classification driver is now developed. I will wait a
while for the ClusterClassificationMapper test case (patch) that we asked for
on dev.
Otherwise I will write it and commit it myself. I might need help committing
for the first time.
Given that ClusterClassificationDriver development is done, we need to
refactor the KMeans, FuzzyK, Dirichlet, and Canopy drivers.
I will create separate child issues for refactoring these algorithms, so that
different people can pick them up in parallel if they want. This will help
avoid duplicated effort.
Jeff, any comments/suggestions?
Refactor Clustering (Vector Classification) into a Separate Postprocess with
Outlier Pruning
--------------------------------------------------------------------------------------------
Key: MAHOUT-929
URL: https://issues.apache.org/jira/browse/MAHOUT-929
Project: Mahout
Issue Type: Improvement
Components: Classification, Clustering
Affects Versions: 0.6
Reporter: Jeff Eastman
Assignee: Paritosh Ranjan
Fix For: 0.7
Attachments: Mahout-929, Mahout-929, Mahout-929, Mahout-929
The current clustering drivers have a -cp option to produce a clusteredPoints
directory containing the input vectors classified by the final clusters
produced by the algorithm. This option is implemented redundantly in each of
those drivers.
- Factor out & implement an independent post processor to perform the
classification step independently of the various clustering implementations.
- Implement a pluggable outlier removal capability for this classifier.
- Consider building off of the ClusterClassifier & ClusterIterator ideas.
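To make the proposal concrete, here is a minimal, self-contained sketch of the
idea: a post-processing step that assigns each input vector to its nearest
final cluster, with a pluggable rule for pruning outliers. It does not use any
Mahout classes. The names ClusterPostProcessorSketch, OutlierPolicy and
classify are hypothetical, and Euclidean distance to the cluster center is an
assumed stand-in for whatever measure a real driver would be configured with.

```java
import java.util.Arrays;

public class ClusterPostProcessorSketch {

    /** Hypothetical pluggable outlier rule: true means the point is pruned. */
    interface OutlierPolicy {
        boolean isOutlier(double distanceToNearestCenter);
    }

    /**
     * Assigns each point to the index of its nearest center,
     * or -1 if the policy marks it as an outlier.
     */
    static int[] classify(double[][] points, double[][] centers, OutlierPolicy policy) {
        int[] assignment = new int[points.length];
        for (int i = 0; i < points.length; i++) {
            int best = -1;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int c = 0; c < centers.length; c++) {
                double d = euclidean(points[i], centers[c]);
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            assignment[i] = policy.isOutlier(bestDist) ? -1 : best;
        }
        return assignment;
    }

    /** Plain Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 10}};
        double[][] points = {{1, 1}, {9, 9}, {100, 100}};
        // Assumed threshold rule: prune anything farther than 5.0 from its center.
        int[] result = classify(points, centers, d -> d > 5.0);
        System.out.println(Arrays.toString(result)); // prints [0, 1, -1]
    }
}
```

Because the clustering algorithm only has to supply the final centers, the
same step could in principle sit behind the KMeans, FuzzyK, Dirichlet and
Canopy drivers, and the outlier rule can be swapped without touching the
assignment loop.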