[ 
https://issues.apache.org/jira/browse/MAHOUT-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176266#comment-13176266
 ] 

Paritosh Ranjan commented on MAHOUT-931:
----------------------------------------

Ok. 

Should I proceed like this:

Step 1) Encapsulate cluster-specific CLI arguments (ClusterConfig and its 
cluster-specific implementations)

Step 2) Implement all Clustering policies

Step 3) Implement outlier removal in policies.
Step 3a) First cut: use probability-threshold-based outlier removal (as 
described in the previous comment)
Step 3b) Final cut: use cluster-specific arguments for outlier removal.

Step 4) Replace the clustering algorithms with Classifier/Iterator (for those 
algorithms that can be implemented this way)
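
To make Step 3a concrete, here is a minimal sketch of how a policy could 
delegate to a separate, pluggable outlier filter. All type and method names 
below are hypothetical and are not part of Mahout's API. It also assumes that 
"probability threshold" means comparing a point's highest per-cluster 
probability against a fixed cutoff, which may differ from what the previous 
comment describes.

```java
// Hypothetical names throughout; none of these types are Mahout's API.

/** Decides whether a point, given its per-cluster probabilities, is an outlier. */
interface OutlierFilter {
    boolean isOutlier(double[] clusterProbabilities);
}

/** Assumed first cut: a point is an outlier if no cluster reaches the threshold. */
final class ProbabilityThresholdFilter implements OutlierFilter {
    private final double threshold;

    ProbabilityThresholdFilter(double threshold) {
        if (threshold < 0.0 || threshold > 1.0) {
            throw new IllegalArgumentException("threshold must be in [0, 1]: " + threshold);
        }
        this.threshold = threshold;
    }

    @Override
    public boolean isOutlier(double[] clusterProbabilities) {
        double max = 0.0;
        for (double p : clusterProbabilities) {
            max = Math.max(max, p);
        }
        return max < threshold;
    }
}

/** Classification kept separate from outlier removal: picks the most likely cluster. */
final class MostLikelyClusterPolicy {
    static final int OUTLIER = -1;

    private final OutlierFilter filter;

    MostLikelyClusterPolicy(OutlierFilter filter) {
        this.filter = filter;
    }

    /** Returns the index of the most likely cluster, or OUTLIER if the point is filtered out. */
    int select(double[] clusterProbabilities) {
        if (clusterProbabilities.length == 0 || filter.isOutlier(clusterProbabilities)) {
            return OUTLIER;
        }
        int best = 0;
        for (int i = 1; i < clusterProbabilities.length; i++) {
            if (clusterProbabilities[i] > clusterProbabilities[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

Because the policy only sees the OutlierFilter interface, the Step 3b version 
could be swapped in later as another implementation that reads cluster-specific 
arguments, without changing the selection logic.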

Regarding naming, I would say that readability should always be given 
importance. I consider naming an important part of software development, 
whether working alone or in a team. I prefer readable code to JavaDocs. The 
current code does not have ample JavaDocs, so at least the naming should be 
appropriate. I am not pushing for a name change, just expressing my thoughts.

If you agree with implementing things in the order of the steps I mentioned, I 
can start implementing them. If you have any suggestions to improve them, 
please share them.

                
> Implement a pluggable outlier removal capability for cluster classifiers
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-931
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-931
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering
>    Affects Versions: 0.6
>            Reporter: Paritosh Ranjan
>             Fix For: 0.7
>
>         Attachments: MAHOUT-931
>
>
> A pluggable outlier removal capability while classifying the clusters is 
> needed. The classification and outlier removal implementations, both should 
> be completely separate entities for better abstraction. 


        
