[ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Varun Thacker updated MAHOUT-854:
---------------------------------
    Attachment: MAHOUT-854.patch

I am unsure about two things:

1. Is it just me, or does anyone else get this error when running the script with any of the clustering algorithms?
{noformat}
./build-reuters.sh: line 165: 17319 Killed $MAHOUT seq2sparse -i ${WORK_DIR}/reuters-out-seqdir/ -o ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans
{noformat}

2. For MinHash, is the clusterdump step required? If so, can someone tell me what needs to be done to implement it for MinHash? I'm not too sure how to implement it if it is needed.

> Add MinHash to build-reuters.sh example
> ---------------------------------------
>
>                 Key: MAHOUT-854
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-854
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering, Examples
>            Reporter: Varun Thacker
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-854.patch
>
>
> We can use the Reuters data set for MinHash clustering. Thus adding the
> MinHash algorithm to the build-reuters.sh would be nice.