Some issues with the CLASSPATH were just fixed in MAHOUT-680. You could try those changes and see if they work for you.

On Sun, Jun 19, 2011 at 7:04 PM, Drew Farris <[email protected]> wrote:
> Jeff,
>
> The key bit from your output is this:
>
> + exec /Users/jeff/hadoop/hadoop-0.20.2/bin/hadoop jar
> /Users/jeff/Documents/workspace/mahout/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
> org.apache.mahout.driver.MahoutDriver seqdirectory -i
> mahout-work/reuters-out -o mahout-work/reuters-out-seqdir -c UTF-8
> -chunk 5
>
> This shows me that the mahout driver script is using hadoop to run the
> seqdirectory task, which will cause it to treat the directories
> specified by -i and -o as hdfs paths. The mystery to me is how there's
> no mention of MAHOUT_LOCAL getting set. Can you confirm that the
> following line appears in your copy of build-reuters.sh (that you're
> running on the mac?)
>
> MAHOUT_LOCAL=true $MAHOUT seqdirectory \
>   -i mahout-work/reuters-out \
>   -o mahout-work/reuters-out-seqdir \
>   -c UTF-8 -chunk 5
>
> I'm getting hadoop/mahout installed on my mac presently to determine
> if this is some sort of mac shell issue.
>
> Ian,
>
> Thanks for the exact syntax to get the jars from the hadoop
> installation on disk referenced in the script. Adding the jars to the
> classpath makes sense, and sort of confirms that it may be a class
> compatibility problem. One thing I noticed is that the diff you
> provided will add the jars from the hadoop install at the end of the
> classpath. Perhaps they should go at the beginning of the path instead
> of the end so that the hadoop jars from the installation are always
> used first?
>
> So, instead of:
>
> for f in "$HADOOP_HOME"/hadoop-*.jar; do
>   CLASSPATH=${CLASSPATH}:$f
> done
>
> The following might work better:
>
> for f in "$HADOOP_HOME"/hadoop-*.jar; do
>   CLASSPATH=$f:${CLASSPATH}
> done
>
> (Same goes for the second classpath loop/etc)
>
> Drew
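
For anyone following along, the append-versus-prepend point in Drew's note can be checked in isolation. Below is a minimal sketch, not the actual bin/mahout code: the jar names are made up for illustration (stand-ins for whatever "$HADOOP_HOME"/hadoop-*.jar matches on your machine), and the variables `appended` and `prepended` are hypothetical. The JVM searches classpath entries left to right, so whichever jar comes first wins when two jars carry the same class.

```shell
#!/bin/sh
# Hypothetical jar names standing in for the glob "$HADOOP_HOME"/hadoop-*.jar.
jars="hadoop-core.jar hadoop-tools.jar"

# Appending: the entry already on the classpath stays ahead of the Hadoop jars.
appended="mahout-examples-job.jar"
for f in $jars; do
  appended="${appended}:$f"
done

# Prepending: the Hadoop jars end up ahead of the existing entry.
# Note that the jars themselves come out in reverse glob order.
prepended="mahout-examples-job.jar"
for f in $jars; do
  prepended="$f:${prepended}"
done

echo "append:  $appended"
echo "prepend: $prepended"
```

With the prepend form, the installed Hadoop jars are searched before anything bundled in the Mahout job jar, which is the behaviour Drew is after. The reversed order among the Hadoop jars should only matter if they overlap with each other.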
