[ 
https://issues.apache.org/jira/browse/MAHOUT-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226397#comment-13226397
 ] 

Jeff Eastman commented on MAHOUT-933:
-------------------------------------

r1298625 made the following changes:

MAHOUT-933:
- refactored ClusteringPolicies into hierarchy under new 
AbstractClusteringPolicy
- added close() to ClusteringPolicy to allow policy-specific actions needed to 
compute convergence
- removed ClusteringPolicy from ClusterIterator constructor as 
ClusterClassifier already has one
- added convergence computations for kmeans and fuzzyk
- added final clustersOut renaming to add -final suffix
- updated Display examples and unit tests to reflect above
- all tests run
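
For illustration only, the policy hierarchy and the role of close() described 
above can be sketched roughly as follows. This is a simplified, self-contained 
sketch and not the Mahout code: SimpleCluster, SimpleClassifier, the plain 
double[] vectors and all method signatures are assumptions made for the 
example. Only the names ClusteringPolicy, AbstractClusteringPolicy and close() 
come from the change list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PolicySketch {

  /** Hypothetical stand-in for a cluster: a center plus running totals. */
  static final class SimpleCluster {
    double[] center;
    final double[] sum;
    int count;
    boolean converged;

    SimpleCluster(double... center) {
      this.center = center.clone();
      this.sum = new double[center.length];
    }

    void observe(double[] point) {
      for (int i = 0; i < point.length; i++) sum[i] += point[i];
      count++;
    }
  }

  static double distance(double[] a, double[] b) {
    double total = 0;
    for (int i = 0; i < a.length; i++) total += (a[i] - b[i]) * (a[i] - b[i]);
    return Math.sqrt(total);
  }

  /** Simplified policy contract (assumed shape, not Mahout's interface). */
  interface ClusteringPolicy {
    int select(double[] point, List<SimpleCluster> clusters);
    void close(List<SimpleCluster> clusters); // end-of-iteration hook
  }

  /** Behaviour shared by all policies lives in the abstract base. */
  abstract static class AbstractClusteringPolicy implements ClusteringPolicy {
    @Override
    public int select(double[] point, List<SimpleCluster> clusters) {
      int best = 0;
      for (int i = 1; i < clusters.size(); i++) {
        if (distance(point, clusters.get(i).center)
            < distance(point, clusters.get(best).center)) best = i;
      }
      return best;
    }
  }

  /** k-means flavour: close() recomputes centers and flags convergence. */
  static final class KMeansPolicy extends AbstractClusteringPolicy {
    private final double convergenceDelta;

    KMeansPolicy(double convergenceDelta) { this.convergenceDelta = convergenceDelta; }

    @Override
    public void close(List<SimpleCluster> clusters) {
      for (SimpleCluster c : clusters) {
        if (c.count == 0) { c.converged = true; continue; }
        double[] next = new double[c.center.length];
        for (int i = 0; i < next.length; i++) next[i] = c.sum[i] / c.count;
        c.converged = distance(c.center, next) <= convergenceDelta;
        c.center = next;
        Arrays.fill(c.sum, 0.0);
        c.count = 0;
      }
    }
  }

  /** The classifier owns the policy, so the iterator needs no policy argument. */
  static final class SimpleClassifier {
    final List<SimpleCluster> clusters;
    final ClusteringPolicy policy;

    SimpleClassifier(List<SimpleCluster> clusters, ClusteringPolicy policy) {
      this.clusters = clusters;
      this.policy = policy;
    }
  }

  /** Runs until every cluster has converged or maxIterations is reached. */
  static int iterate(List<double[]> points, SimpleClassifier classifier, int maxIterations) {
    int iteration = 0;
    boolean allConverged = false;
    while (iteration < maxIterations && !allConverged) {
      for (double[] p : points) {
        int index = classifier.policy.select(p, classifier.clusters);
        classifier.clusters.get(index).observe(p);
      }
      classifier.policy.close(classifier.clusters);
      iteration++;
      allConverged = classifier.clusters.stream().allMatch(c -> c.converged);
    }
    return iteration;
  }

  public static void main(String[] args) {
    List<double[]> points = Arrays.asList(
        new double[] {0, 0}, new double[] {0, 2}, new double[] {10, 0}, new double[] {10, 2});
    List<SimpleCluster> clusters = new ArrayList<>();
    clusters.add(new SimpleCluster(0, 0));
    clusters.add(new SimpleCluster(10, 0));
    SimpleClassifier classifier = new SimpleClassifier(clusters, new KMeansPolicy(0.001));
    System.out.println("iterations: " + iterate(points, classifier, 10));
  }
}
```

In this sketch the iteration loop itself has no convergence logic. It only asks 
each cluster whether the policy marked it as converged in close(), which is the 
kind of policy-specific step the close() hook is meant to allow.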

I think it is time to begin refactoring the buildClusters methods of the 
respective clustering drivers to use ClusterIterator, since it appears to 
produce results equivalent to the original implementations. This will involve 
removing a lot of existing driver, mapper and reducer code, along with many 
time-consuming unit tests. It will also affect other components, because the 
representation of clusters in the file system changes from Cluster to the 
self-describing ClusterWritable.
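
As a rough illustration of what "self-describing" can mean for a serialized 
record, the sketch below writes the concrete class name ahead of the payload so 
that a reader can instantiate the right type without knowing it in advance. 
This is an assumption-laden example, not the ClusterWritable implementation: 
Payload, DemoCluster and Envelope are hypothetical names, and plain java.io 
streams stand in for the Hadoop serialization classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SelfDescribingSketch {

  /** Hypothetical payload contract, loosely modelled on a write/readFields pair. */
  interface Payload {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  /** Hypothetical concrete cluster type with a single field. */
  static final class DemoCluster implements Payload {
    int id;

    public DemoCluster() {} // no-arg constructor needed for reflective creation

    DemoCluster(int id) { this.id = id; }

    @Override
    public void write(DataOutput out) throws IOException { out.writeInt(id); }

    @Override
    public void readFields(DataInput in) throws IOException { id = in.readInt(); }
  }

  /** Wrapper that records the payload's class name before the payload itself. */
  static final class Envelope {
    Payload value;

    void write(DataOutput out) throws IOException {
      out.writeUTF(value.getClass().getName());
      value.write(out);
    }

    void readFields(DataInput in) throws IOException {
      String className = in.readUTF();
      try {
        value = (Payload) Class.forName(className).getDeclaredConstructor().newInstance();
      } catch (ReflectiveOperationException e) {
        throw new IOException("Cannot instantiate " + className, e);
      }
      value.readFields(in);
    }
  }

  /** Serializes and deserializes a payload through an Envelope. */
  static Payload roundTrip(Payload original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      Envelope outgoing = new Envelope();
      outgoing.value = original;
      outgoing.write(new DataOutputStream(bytes));

      Envelope incoming = new Envelope();
      incoming.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return incoming.value;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Payload copy = roundTrip(new DemoCluster(7));
    System.out.println(copy.getClass().getSimpleName() + " id=" + ((DemoCluster) copy).id);
  }
}
```

The trade-off in a scheme like this is a few extra bytes per record in exchange 
for readers that do not have to be told which cluster type a file contains.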

I have created separate subtasks for these conversion issues so that they can 
be undertaken independently.

                
> Implement mapreduce version of ClusterIterator
> ----------------------------------------------
>
>                 Key: MAHOUT-933
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-933
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering
>    Affects Versions: 0.6
>            Reporter: Paritosh Ranjan
>            Assignee: Jeff Eastman
>             Fix For: 0.7
>
>
> Right now, ClusterIterator consumes vectors only from memory and from 
> sequential HDFS reads. A mapreduce version that consumes vectors needs to 
> be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
