Hi,
I'm trying to get the LDA example working. I'm working out of svn, revision
#1209631 (current as of right now). The k-means example works fine.
I have Hadoop running and the configs set up properly for running MR jobs
and using HDFS. I'm starting from ./cluster-reuters.sh and choosing LDA.
The first step in the example script, seq2sparse, completes successfully.
The second step, actually running LDA, does not. It errors out with the
following message:
Exception in thread "main" java.lang.IllegalStateException: hdfs://hostname:54310/tmp/mahout-work-hadoop/reuters-out-seqdir-sparse-lda/tf-vectors/_logs
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:82)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:73)
    at com.google.common.collect.Iterators$8.next(Iterators.java:765)
    at com.google.common.collect.Iterators$5.hasNext(Iterators.java:526)
    at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
    at org.apache.mahout.clustering.lda.LDADriver.determineNumberOfWordsFromFirstVector(LDADriver.java:204)
    at org.apache.mahout.clustering.lda.LDADriver.run(LDADriver.java:164)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.lda.LDADriver.main(LDADriver.java:90)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: Cannot open filename /tmp/mahout-work-hadoop/reuters-out-seqdir-sparse-lda/tf-vectors/_logs
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1517)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:384)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
    at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1444)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileValueIterator.<init>(SequenceFileValueIterator.java:51)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator$1.apply(SequenceFileDirValueIterator.java:77)
    ... 20 more
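
Reading the trace, it looks like the directory iterator is trying to open
the _logs entry under tf-vectors as a sequence file, and that open fails.
This is a guess from the trace, not something confirmed against the Mahout
source. A minimal sketch of the kind of name filter that would skip such
entries follows. The class, the method names, and the rule that bookkeeping
entries start with "_" or "." are all assumptions for illustration, not
Mahout or Hadoop API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Mahout or Hadoop API: drop bookkeeping entries
// such as _logs before treating directory entries as sequence files.
public class HiddenEntryFilterSketch {

    // Assumption: bookkeeping entries are those whose name starts with '_' or '.'.
    public static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Returns only the entries that should be opened as data files, in order.
    public static List<String> dataEntries(List<String> names) {
        List<String> kept = new ArrayList<String>();
        for (String name : names) {
            if (!isHidden(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative listing only; the real contents of tf-vectors may differ.
        List<String> listing = Arrays.asList("_logs", "part-r-00000");
        System.out.println(dataEntries(listing));
    }
}
```

If that reading is right, either filtering the listing this way or pointing
the LDA step at the data files only should avoid the failed open.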
-Chris