From the output, it looks like the local side of the put does not exist in this case. That potentially indicates that seqdirectory is writing its output to HDFS, and that output is then immediately deleted by the next step, which cleans the (presumably old) data off of HDFS prior to the put.
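One way to test that theory, as a rough sketch only: right after seqdirectory finishes, check whether the output directory shows up on the local filesystem, on HDFS, or both. The function name report_output_location is made up for illustration and is not part of Mahout; the path is the --output value from the log. The HDFS check is skipped when no hadoop binary is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): report where a path exists.
report_output_location() {
  path="$1"

  # Local filesystem check.
  if [ -e "$path" ]; then
    echo "local: $path exists"
  else
    echo "local: $path missing"
  fi

  # HDFS check: 'hadoop fs -test -e' exits 0 when the path exists
  # on the configured default filesystem.
  if command -v hadoop >/dev/null 2>&1; then
    if hadoop fs -test -e "$path"; then
      echo "hdfs: $path exists"
    else
      echo "hdfs: $path missing"
    fi
  else
    echo "hdfs: hadoop not on PATH, skipped"
  fi
}

# Path taken from the --output argument in the log.
report_output_location mahout-work/reuters-out-seqdir
```

If this reports the directory missing locally but present on HDFS right after the seqdirectory step, that would be consistent with the delete-then-put sequence failing the way the log shows.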
Jeff, is there any chance you can run the seqdirectory part of build-reuters.sh by hand in this environment, prefixed with sh -x to collect debug info? E.g.:

sh -x bin/mahout seqdirectory ...

I'm interested in determining whether it is running java (in local mode) or hadoop to execute the MahoutDriver class.

Also, could you recap what environment you are running on the Mac? IIRC, it's trunk and vanilla Hadoop 0.20.2? I can try to reproduce locally; up to this point I've been running on a Linux box. IIRC the OP who reported the problem had a Mac as well.

Thanks,
Drew

On Sat, Jun 11, 2011 at 11:25 PM, Jeff Eastman <[email protected]> wrote:
> On my Mac unicluster, latest trunk:
> - synthetic-control works
> - 20 newsgroups works
> - reuters does not work:
>
> jeff-eastmans-macbook-pro:mahout jeff$ ./examples/bin/build-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. lda clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> Sequencing ...
> MAHOUT_LOCAL is set, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/Users/jeff/Documents/workspace/mahout/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/Users/jeff/Documents/workspace/mahout/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> Jun 11, 2011 8:18:16 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Command line arguments: {--charset=UTF-8, --chunkSize=5,
> --endPhase=2147483647,
> --fileFilterClass=org.apache.mahout.text.PrefixAdditionFilter,
> --input=mahout-work/reuters-out, --keyPrefix=,
> --output=mahout-work/reuters-out-seqdir, --startPhase=0, --tempDir=temp}
> Jun 11, 2011 8:18:17 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Program took 1475 ms
> Deleted hdfs://localhost:9000/user/jeff/mahout-work/reuters-out-seqdir
> put: File mahout-work/reuters-out-seqdir does not exist.
>
> On 6/11/11 6:27 PM, Jeff Eastman wrote:
>>
>> On 6/11/11 4:32 AM, Grant Ingersoll wrote:
>>>
>>> What do you get when you run on good ol' Hadoop, i.e the one we actually
>>> support and build and test on?
>>>
>> You know, that is a really good question :). I have a vanilla hadoop
>> unicluster on my Mac but haven't used it in a while. Almost everything I've
>> tried from Mahout on the other 2 cluster flavors work just great.
>> Build-reuters is the only one which is giving me problems. I will try my
>> unicluster too.
