[ 
https://issues.apache.org/jira/browse/MAHOUT-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paritosh Ranjan updated MAHOUT-991:
-----------------------------------

    Description: 
Adjust the Canopy, MeanShift, K-means, Dirichlet and Fuzzy KMeans 
implementations to emit ClusterWritables instead of Clusters. Adjust the other 
clustering tools (ClusterDumper and ClusterEvaluators) to accept 
ClusterWritables produced by these algorithms.

The new ClusterIterator and ClusterClassifier use an expanded sequence file 
representation that stores Clusters as self-describing ClusterWritable objects. 
Once all of these algorithms emit ClusterWritables, KMeans, Dirichlet and 
FuzzyK will be able to use ClusterIterator and ClusterClassifier for the 
buildClusters phase.

  was:The new ClusterIterator and ClusterClassifier uses an expanded sequence 
file representation that stores Clusters as self-describing ClusterWritable 
objects. Adjust the Canopy and MeanShift implementations which do not use this 
approach to emit ClusterWritables instead of Clusters. Adjust the other 
clustering tools (ClusterDumper and ClusterEvaluators) to accept 
ClusterWritables produced by these algorithms.

        Summary: Convert Canopy, MeanShift, K-means, Dirichlet, Fuzzy KMeans 
and Other Tools to emit ClusterWritable  (was: Convert Canopy, MeanShift and 
Other Tools to Use ClusterWritable)
    
> Convert Canopy, MeanShift, K-means, Dirichlet, Fuzzy KMeans and Other Tools 
> to emit ClusterWritable
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-991
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-991
>             Project: Mahout
>          Issue Type: Sub-task
>          Components: Clustering
>    Affects Versions: 0.6
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>             Fix For: 0.7
>
>
> Adjust the Canopy, MeanShift, K-means, Dirichlet and Fuzzy KMeans 
> implementations to emit ClusterWritables instead of Clusters. Adjust the 
> other clustering tools (ClusterDumper and ClusterEvaluators) to accept 
> ClusterWritables produced by these algorithms.
>
> The new ClusterIterator and ClusterClassifier use an expanded sequence file 
> representation that stores Clusters as self-describing ClusterWritable 
> objects. Once all of these algorithms emit ClusterWritables, KMeans, 
> Dirichlet and FuzzyK will be able to use ClusterIterator and 
> ClusterClassifier for the buildClusters phase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
