[ https://issues.apache.org/jira/browse/MAHOUT-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037896#comment-13037896 ]
Frank Scholten commented on MAHOUT-680:
---------------------------------------
I just ran the following sequence on the cluster from the mahout folder and
this works as well.
Hadoop setup:
Last login: Mon May 23 08:04:40 2011 from 82.161.41.42
frank@domU-12-31-39-00-1C-22:~$ cd mahout-0.5-680-bug/
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ export HADOOP_HOME=/usr/local/hadoop-0.20.2
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ export HADOOP_CONF_DIR=$HADOOP_HOME/conf
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ export PATH=$PATH:$HADOOP_HOME/bin
Recreating directories:
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ hadoop fs -rmr /user/root/input
Moved to trash: hdfs://ec2-50-17-63-252.compute-1.amazonaws.com/user/root/input
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ hadoop fs -rmr /user/root/output
Moved to trash: hdfs://ec2-50-17-63-252.compute-1.amazonaws.com/user/root/output
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ hadoop fs -mkdir /user/root/input
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ hadoop fs -put README.txt /user/root/input
Running seqdirectory + seq2sparse:
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ bin/mahout seqdirectory --input /user/root/input
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ bin/mahout seq2sparse --input output --output output-seq2sparse
View the tfidf-vectors:
frank@domU-12-31-39-00-1C-22:~/mahout-0.5-680-bug$ bin/mahout hadoop fs -text /user/root/output-seq2sparse/tfidf-vectors/part-r-00000
Running on hadoop, using HADOOP_HOME=/usr/local/hadoop-0.20.2
HADOOP_CONF_DIR=/usr/local/hadoop-0.20.2/conf
/README.txt org.apache.mahout.math.VectorWritable@4979935d
Let me know if I missed something or if your setup is different.
> Running the Hadoop script through bin/mahout to set up classpath
> ----------------------------------------------------------------
>
> Key: MAHOUT-680
> URL: https://issues.apache.org/jira/browse/MAHOUT-680
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Frank Scholten
> Priority: Minor
> Fix For: 0.5
>
> Attachments: MAHOUT-680.patch, MAHOUT-680.patch, jobtracker.jsp.html
>
>
> Added a patch which allows you to run the $HADOOP_HOME/bin/hadoop command
> script through the bin/mahout script.
> This way the Mahout script adds the Mahout classes to $HADOOP_CLASSPATH,
> so you can view sequence files generated by Mahout jobs with
> bin/mahout hadoop fs -text <sequencefile>
> without having to specify the Mahout classes manually and without getting
> ClassNotFoundExceptions.
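
The mechanism described in the issue can be illustrated with a minimal sketch. This is not the code from MAHOUT-680.patch: the function names build_classpath and run_hadoop_with_mahout are hypothetical, and the sketch assumes that the Mahout jars sit directly inside $MAHOUT_HOME and that $HADOOP_HOME/bin/hadoop honours the HADOOP_CLASSPATH environment variable.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the contents of MAHOUT-680.patch.

# Hypothetical helper: print a colon-separated list of the *.jar files
# found directly inside the directory given as $1.
build_classpath() {
  cp=""
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue      # glob matched nothing, skip the literal pattern
    cp="${cp:+$cp:}$jar"           # add a ':' separator only between entries
  done
  printf '%s\n' "$cp"
}

# Hypothetical wrapper: put the jars from $MAHOUT_HOME in front of any
# existing HADOOP_CLASSPATH, then pass all arguments to the real hadoop script.
# Assumes MAHOUT_HOME and HADOOP_HOME are set by the caller.
run_hadoop_with_mahout() {
  mahout_cp=$(build_classpath "$MAHOUT_HOME")
  HADOOP_CLASSPATH="${mahout_cp}${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}" \
    "$HADOOP_HOME/bin/hadoop" "$@"
}
```

With a wrapper along these lines, a call such as run_hadoop_with_mahout fs -text <sequencefile> would start hadoop with the Mahout jars already on its classpath, which is what lets it deserialize classes such as org.apache.mahout.math.VectorWritable.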