[ 
https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222169#comment-13222169
 ] 

[email protected] commented on MAHOUT-982:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4174/
-----------------------------------------------------------

Review request for mahout.


Summary
-------

Executing clustering using ClusterClassificationDriver in CanopyDriver.

This replaces the existing functionality. If this refactoring is approved, we 
can add a threshold as a method parameter/CLI argument to support outlier 
removal in CanopyClustering.
This patch is the first of its kind for the ClusteringDrivers. If it is 
approved, similar refactoring can easily be done for KMeans, FuzzyK and 
Dirichlet.
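
To illustrate what a threshold for outlier removal could look like, here is a 
minimal, self-contained sketch. It is not the code in this patch and does not 
use the Mahout API: the class ThresholdPruningSketch, the methods score() and 
classify(), and the distance-based score are all assumptions made for 
illustration only. A point is assigned to its best-scoring cluster center, 
and is treated as an outlier (index -1) when even the best score falls below 
the threshold.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: threshold-based outlier pruning at classification time.
// None of these names come from Mahout; they are illustrative assumptions.
public class ThresholdPruningSketch {

    // Assumed membership score in (0, 1]: 1.0 at the center, decreasing
    // as the Euclidean distance to the center grows.
    static double score(double[] point, double[] center) {
        double sumSq = 0.0;
        for (int i = 0; i < point.length; i++) {
            double d = point[i] - center[i];
            sumSq += d * d;
        }
        return 1.0 / (1.0 + Math.sqrt(sumSq));
    }

    // Returns the index of the best-scoring center, or -1 when the best
    // score is below the threshold (the point is pruned as an outlier).
    static int classify(double[] point, List<double[]> centers, double threshold) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < centers.size(); i++) {
            double s = score(point, centers.get(i));
            if (s > bestScore) {
                bestScore = s;
                best = i;
            }
        }
        return bestScore >= threshold ? best : -1;
    }

    public static void main(String[] args) {
        List<double[]> centers =
            Arrays.asList(new double[] {0, 0}, new double[] {10, 10});
        double threshold = 0.2;
        // Close to the first center: assigned to cluster 0.
        System.out.println(classify(new double[] {1, 0}, centers, threshold));
        // Close to the second center: assigned to cluster 1.
        System.out.println(classify(new double[] {9, 10}, centers, threshold));
        // Far from both centers: pruned, prints -1.
        System.out.println(classify(new double[] {5, 5}, centers, threshold));
    }
}
```

With a threshold of 0.0 nothing is pruned, so the existing behaviour would be 
the default and pruning would be opt-in.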


This addresses bug MAHOUT-982.
    https://issues.apache.org/jira/browse/MAHOUT-982


Diffs
-----

  trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java 1294137 
  trunk/core/src/main/java/org/apache/mahout/clustering/canopy/ClusterMapper.java 1294137 
  trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyClusterer.java 1294137 
  trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java 1294137 

Diff: https://reviews.apache.org/r/4174/diff


Testing
-------


Thanks,

Paritosh


                
> Refactor Canopy Clustering into a separate post process with outlier pruning
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-982
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-982
>             Project: Mahout
>          Issue Type: Sub-task
>          Components: Clustering
>    Affects Versions: 0.6
>            Reporter: Paritosh Ranjan
>            Assignee: Paritosh Ranjan
>              Labels: clustering
>             Fix For: 0.7
>
>         Attachments: MAHOUT-982.txt
>
>
> Use ClusterClassificationDriver to refactor clustering out of CanopyDriver 
> with outlier pruning support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
