[jira] Updated: (MAHOUT-160) ClusterDumper utility to output all the clusters in all sequence files and points

Shashikant Kore (JIRA) Wed, 05 Aug 2009 23:59:39 -0700

     [ https://issues.apache.org/jira/browse/MAHOUT-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Shashikant Kore updated MAHOUT-160:
-----------------------------------

    Attachment: mahout-160.patch

The ClusterDumper utility has been modified to take the clusters and points 
directories as input, instead of a single sequence file and a single points file.

> ClusterDumper utility to output all the clusters in all sequence files and 
> points
> ---------------------------------------------------------------------------------
>
>                 Key: MAHOUT-160
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-160
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Shashikant Kore
>         Attachments: mahout-160.patch
>
>
> The current ClusterDumper utility takes a sequence file and a points file as 
> input and prints each cluster vector along with the points that belong to the 
> clusters in that sequence file. The utility does not produce correct results 
> when there are multiple sequence files and points files. 
> To avoid this problem, all the point-to-cluster mappings need to be read 
> first, and only then should the utility iterate over the sequence files.
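
The two-pass approach described above can be sketched roughly as follows. This
is only an illustration in Python, not the code in the attached patch and not
Mahout's Java API: plain in-memory lists stand in for the sequence files and
points files, and the names load_point_mappings, dump_clusters, cluster_parts
and points_parts are hypothetical.

```python
from collections import defaultdict


def load_point_mappings(points_parts):
    """Pass 1: merge every points part into one cluster_id -> [point_id] map."""
    cluster_to_points = defaultdict(list)
    for part in points_parts:
        for point_id, cluster_id in part:
            cluster_to_points[cluster_id].append(point_id)
    return cluster_to_points


def dump_clusters(cluster_parts, points_parts):
    """Pass 2: walk every cluster part and attach the points found in pass 1."""
    cluster_to_points = load_point_mappings(points_parts)
    lines = []
    for part in cluster_parts:
        for cluster_id, center in part:
            lines.append(f"{cluster_id} center={center}")
            # Points may come from any points part, not only a "matching" one.
            for point_id in cluster_to_points.get(cluster_id, []):
                lines.append(f"\t{point_id}")
    return lines


# Hypothetical data: two cluster parts and two points parts (assumed layout).
cluster_parts = [[("C0", [1.0, 1.0])], [("C1", [5.0, 5.0])]]
points_parts = [[("p1", "C0"), ("p2", "C1")], [("p3", "C0")]]

output = dump_clusters(cluster_parts, points_parts)
print("\n".join(output))
```

Because every mapping is loaded before any cluster is printed, a point such as
p3, which sits in a different part than the cluster it belongs to, still shows
up under C0. The trade-off is that the whole point-to-cluster map is held in
memory at once.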

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
