Hi,

It might help with the build part, but it probably won't fix the second issue. The root directory (/) is not writable on most systems, so I'd expect the creation of /tokenized-documents/_temporary to still fail.
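To illustrate why I suspect the leading slash, here is a minimal sketch using plain java.net.URI rather than the Hadoop classes. It assumes the job builds its output location by resolving the TOKENIZED_DOCUMENT_OUTPUT_FOLDER value against the output directory; I have not confirmed that this is what DocumentProcessor does. The class name, helper method, and base directory are made up for the example.

```java
import java.net.URI;

public class OutputPathDemo {
    // Value quoted in the thread, plus a hypothetical variant without the leading slash.
    static final String WITH_LEADING_SLASH = "/tokenized-documents";
    static final String WITHOUT_LEADING_SLASH = "tokenized-documents";

    // Resolves a child folder name against a base output directory URI.
    // Assumption: this mimics how the job output path is assembled.
    static URI resolve(String baseDir, String child) {
        return URI.create(baseDir).resolve(child);
    }

    public static void main(String[] args) {
        // Hypothetical output directory; the trailing slash marks it as a directory.
        String outputDir = "file:/home/user/work/reuters-out-seqdir-sparse/";

        // A child starting with "/" replaces the whole base path.
        System.out.println(resolve(outputDir, WITH_LEADING_SLASH));
        // prints file:/tokenized-documents

        // A relative child is appended to the base directory.
        System.out.println(resolve(outputDir, WITHOUT_LEADING_SLASH));
        // prints file:/home/user/work/reuters-out-seqdir-sparse/tokenized-documents
    }
}
```

If the real code behaves the same way, the first result would explain the file:/tokenized-documents/_temporary path in the error, and a rebuilt mahout-utils without the leading slash would be the thing to test.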
2010/5/10 Jeff Eastman <j...@windwardsolutions.com>

> Hi Florent,
>
> I successfully ran the new build-reuters.sh before I committed it this
> morning, so I suspect you must have some other problem in your system. Have
> you tried deleting your Maven repository (.m2) and doing a full mvn clean
> install?
>
> Jeff
>
> On 5/10/10 12:50 PM, Florent Empis wrote:
>
>> Hi,
>>
>> I've seen the commit from Robin this afternoon so I gave it another try.
>> Using the new shell I still run into a few problems
>> At first, in order to satisfy a dependency to slf4j I've had to add the
>> following to examples/pom.xml (once again I'm not a maven expert, so this
>> may not be the correct way to do it)
>>
>> <dependency>
>>   <groupId>org.slf4j</groupId>
>>   <artifactId>slf4j-nop</artifactId>
>>   <version>1.5.8</version>
>>   <classifier>sources</classifier>
>> </dependency>
>>
>> Then, after a succesful mvn -B
>> I've launched the shell:
>> flor...@florent-laptop:~/workspace/mahout$ ./examples/bin/build-reuters.sh
>>
>> It fails with the following error:
>> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
>> java.io.IOException: The temporary job-output directory file:/tokenized-documents/_temporary doesn't exist!
>>     at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
>>     at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
>>     at org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
>>     at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>> 10/05/10 21:28:07 INFO mapred.JobClient: map 0% reduce 0%
>> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
>> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
>> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with args: [-i, ./examples/bin/work/reuters-out-seqdir/, -o, ./examples/bin/work/reuters-out-seqdir-sparse, null]
>> Job failed!
>> Exception in thread "main" java.io.IOException: Job failed!
>>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>>     at org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
>>     at org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>>
>> A find makes me think that the issue is in
>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
>>
>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>> public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER = "/tokenized-documents";
>>
>> I tried changing this value, but it did not solve my problem, although I did
>> a mvn -B on utils afterwards.... it looks like the mahout-utils used by the
>> test comes from somewhere else: I guess there's something I'm missing....
>>
>> 2010/5/10 Jeff Eastman <j...@windwardsolutions.com>
>>
>>> I will commit once I verify it completes. It's running now...
>>> Jeff
>>>
>>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>>
>>>> +1. Should be using bin/mahout script for all these.
>>>>
>>>> Robin
>>>>
>>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman <j...@windwardsolutions.com> wrote:
>>>>
>>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>>> Leaving time bombs around like this is not good.
>>>>> Jeff
>>>>>
>>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>>
>>>>>> thats been broken for a long time, it was used by David while he
>>>>>> developed LDA, It didn't get updated to work post 0.2 . Use Sisir's
>>>>>> script to convert reuters to vectors, its up on the wiki
>>>>>>
>>>>>> Robin