[ https://issues.apache.org/jira/browse/MAHOUT-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679076#comment-13679076 ]
Grant Ingersoll commented on MAHOUT-1247:
-----------------------------------------

After you run cluster-reuters.sh, you can run:
{code}
bin/mahout org.apache.mahout.vectorizer.DictionaryVectorizer -i /tmp/mahout-work-grantingersoll/reuters-out-seqdir-sparse-kmeans/tokenized-documents -o ./dicVec
{code}
Make sure you have HADOOP_HOME set and also substitute in the appropriate work directory.

> cluster-reuters doesn't work on Hadoop
> --------------------------------------
>
>                 Key: MAHOUT-1247
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1247
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>             Fix For: 0.8
>
>
> At least two issues:
> 1. MAHOUT-992 messed up the Distributed Cache stuff somehow
> 2. The ExtractReuters data is not being moved to HDFS.
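
The two caveats in the comment above (HADOOP_HOME must be set, and the work directory must be substituted) can be sketched as a small wrapper. This is only an illustration, not part of Mahout: the function names, the MAHOUT_BIN override, and the default work directory /tmp/mahout-work-${USER} are assumptions inferred from the example path, and the "-kmeans" suffix is assumed to match the clustering option chosen when cluster-reuters.sh was run.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and defaults below are hypothetical, not Mahout APIs.

# Print the tokenized-documents directory under a given work directory.
# The sub-path is copied from the command in the comment; adjust the
# "-kmeans" suffix if a different clustering option was used.
tokenized_dir() {
  local work_dir="$1"
  printf '%s/reuters-out-seqdir-sparse-kmeans/tokenized-documents\n' "$work_dir"
}

# Fail early if HADOOP_HOME is unset or empty.
require_hadoop_home() {
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
}

# Run DictionaryVectorizer on the tokenized documents.
#   $1: work directory (assumed default: /tmp/mahout-work-$USER)
#   $2: output directory (default: ./dicVec, as in the comment)
# MAHOUT_BIN is a hypothetical override for the launcher path.
run_dictionary_vectorizer() {
  local work_dir="${1:-/tmp/mahout-work-${USER}}"
  local out_dir="${2:-./dicVec}"
  require_hadoop_home || return 1
  "${MAHOUT_BIN:-bin/mahout}" org.apache.mahout.vectorizer.DictionaryVectorizer \
    -i "$(tokenized_dir "$work_dir")" \
    -o "$out_dir"
}
```

From the Mahout checkout, after sourcing the functions, the command from the comment would correspond to `run_dictionary_vectorizer /tmp/mahout-work-grantingersoll ./dicVec`.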