[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated MAHOUT-854:
---------------------------------

    Attachment: MAHOUT-854.patch

I am not sure about two things:

1. Is it just me, or does running the script with any of the clustering 
algorithms fail with this error?
{noformat}
./build-reuters.sh: line 165: 17319 Killed                  $MAHOUT seq2sparse -i ${WORK_DIR}/reuters-out-seqdir/ -o ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans
{noformat}


2. Regarding MinHash, is the clusterdump part required? If so, could someone 
tell me what needs to be done to implement it for MinHash? I am not sure how 
to go about it.
                
> Add MinHash to build-reuters.sh example
> ---------------------------------------
>
>                 Key: MAHOUT-854
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-854
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering, Examples
>            Reporter: Varun Thacker
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-854.patch
>
>
> We can use the Reuters data set for MinHash clustering. Thus adding the 
> MinHash algorithm to the build-reuters.sh would be nice.


        
