[ https://issues.apache.org/jira/browse/MAHOUT-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175989#comment-13175989 ]
Paritosh Ranjan commented on MAHOUT-931:
----------------------------------------

I think the Clustering Policy is all that is needed for extensibility. The design changes I made are:

a) Passing the vector, rather than the probability, to the clustering policy. I think this might be needed for clustering and outlier removal. It might also help in transforming the vector or adding weight before classification (thinking of some future functionality).

b) Added ClusterConfig objects to the policies. The clustering policy will now know all the clustering parameters used, so it will be able to classify accordingly.

c) ClusterConfig objects will emerge as generic cluster configuration objects that can be used anywhere in the clustering algorithms. Right now, a number of clustering parameters are scattered through method calls.

I am in the habit of renaming and cleaning things up while coding, so it just happened.

> Implement a pluggable outlier removal capability for cluster classifiers
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-931
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-931
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering
>    Affects Versions: 0.6
>            Reporter: Paritosh Ranjan
>             Fix For: 0.7
>
>         Attachments: MAHOUT-931
>
>
> A pluggable outlier removal capability while classifying the clusters is
> needed. The classification and outlier removal implementations should both
> be completely separate entities for better abstraction.
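
To make design changes (a) and (b) from the comment concrete, here is a minimal Java sketch of a policy that receives the raw vector and holds a configuration object. It is not the code from the attached patch: the `ClusterConfig` fields, the `ClusteringPolicy` method signature, the `ThresholdOutlierPolicy` class, and the distance-threshold rule are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the ClusterConfig object mentioned in the comment:
// it gathers clustering parameters that would otherwise be scattered across method calls.
final class ClusterConfig {
    final double outlierThreshold; // assumed parameter: maximum allowed distance to a center

    ClusterConfig(double outlierThreshold) {
        this.outlierThreshold = outlierThreshold;
    }
}

// Hypothetical policy interface: it is handed the vector itself (change a),
// not a precomputed probability, so it can apply its own outlier rule.
interface ClusteringPolicy {
    // Returns the index of the chosen cluster, or -1 if the vector is treated as an outlier.
    int classify(double[] vector, List<double[]> centers);
}

// Illustrative policy that knows the clustering parameters through ClusterConfig (change b).
final class ThresholdOutlierPolicy implements ClusteringPolicy {
    private final ClusterConfig config;

    ThresholdOutlierPolicy(ClusterConfig config) {
        this.config = config;
    }

    @Override
    public int classify(double[] vector, List<double[]> centers) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        // Find the nearest center by Euclidean distance.
        for (int i = 0; i < centers.size(); i++) {
            double d = distance(vector, centers.get(i));
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        // Reject the vector as an outlier if even the nearest center is too far away.
        return bestDistance <= config.outlierThreshold ? best : -1;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}

final class OutlierPolicySketch {
    public static void main(String[] args) {
        List<double[]> centers = new ArrayList<>();
        centers.add(new double[] {0.0, 0.0});
        centers.add(new double[] {10.0, 10.0});

        ClusteringPolicy policy = new ThresholdOutlierPolicy(new ClusterConfig(2.0));
        System.out.println(policy.classify(new double[] {9.0, 10.0}, centers)); // near the second center
        System.out.println(policy.classify(new double[] {5.0, 5.0}, centers));  // far from both centers
    }
}
```

Because the outlier rule lives entirely inside the policy, a different rule could be plugged in by supplying another `ClusteringPolicy` implementation, which is the separation between classification and outlier removal that the issue asks for.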