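Hi Florie,

You should give ldatopics the last state produced by lda as its input, rather than the tf-vectors. The tf-vectors files store VectorWritable values, while ldatopics appears to read plain numeric values (the "0.0" in the message), which is why you get the "wrong value class" exception. You can take a look at the script that clusters the Reuters data set (located at examples/bin/build-reuters.sh in the Mahout source trunk).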
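If it helps, here is a rough sketch of how you could pick up the last state before calling ldatopics. It is only an illustration under a few assumptions: that lda wrote its per-iteration output as directories named state-<N> under its output directory, and that this directory is on the local filesystem (on HDFS you would have to list it with the hadoop tools instead). The function name latest_lda_state and the directory sparsePosTokens/lda-out are made up for the example; use whatever you passed to lda as its output.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-numbered state-<N> directory
# found directly under the given lda output directory.
# Assumes the state-<N> naming and a local filesystem.
latest_lda_state() {
  best=""
  best_n=-1
  for d in "$1"/state-*; do
    # Skip the unexpanded glob and anything that is not a directory.
    [ -d "$d" ] || continue
    # Keep only the numeric suffix after "state-".
    n=${d##*state-}
    case "$n" in
      ''|*[!0-9]*) continue ;;
    esac
    if [ "$n" -gt "$best_n" ]; then
      best_n=$n
      best=$d
    fi
  done
  # Prints nothing and returns non-zero when no state directory exists.
  [ -n "$best" ] && printf '%s\n' "$best"
}

# Possible usage (sparsePosTokens/lda-out is an assumed path):
#   STATE=$(latest_lda_state sparsePosTokens/lda-out)
#   mahout-0.4/bin/mahout ldatopics --input "$STATE" \
#     --dict sparsePosTokens/dictionary.file-0 --words 30 \
#     --output sparsePosTokens/topics --dictionaryType sequencefile
```

The numeric comparison is there so that state-10 is chosen over state-2, which a plain alphabetical listing would get wrong.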

Regards,
Vasil

On Tue, May 10, 2011 at 10:23 PM, florie <[email protected]> wrote:

> Hi, after using lda, I am having problems with reading the output topics
> using ldatopics:
>
> The error is as follows:
>
> [ion@lovemachine Downloads]$ mahout-0.4/bin/mahout ldatopics --input sparsePosTokens/tf-vectors --dict sparsePosTokens/dictionary.file-0 --words 30 --output sparsePosTokens/topics --dictionaryType sequencefile
> Running on hadoop, using HADOOP_HOME=hadoop
> No HADOOP_CONF_DIR set, using hadoop/conf
> Exception in thread "main" java.io.IOException: wrong value class: 0.0 is not class org.apache.mahout.math.VectorWritable
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1874)
>     at org.apache.mahout.clustering.lda.LDAPrintTopics.topWordsForTopics(LDAPrintTopics.java:208)
>     at org.apache.mahout.clustering.lda.LDAPrintTopics.main(LDAPrintTopics.java:156)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> [ion@lovemachine Downloads]$ mahout-0.4/bin/mahout ldatopics --input vectorSeqTokens/tf-vectors --dict vectorSeqTokens/dictionary.file-0 --words 30 --output sparsePosTokens/topics --dictionaryType sequencefile
> Running on hadoop, using HADOOP_HOME=hadoop
> No HADOOP_CONF_DIR set, using hadoop/conf
> Exception in thread "main" java.io.IOException: wrong value class: 0.0 is not class org.apache.mahout.math.VectorWritable
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1874)
>     at org.apache.mahout.clustering.lda.LDAPrintTopics.topWordsForTopics(LDAPrintTopics.java:208)
>     at org.apache.mahout.clustering.lda.LDAPrintTopics.main(LDAPrintTopics.java:156)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> I can't seem to figure out what the problem is. Any help is appreciated.
> Thank you!
> --
> Ion Florie Ho
> [email protected]
> Dept. of Systems Science and Industrial Engineering
> Binghamton University
> 4400 Vestal Parkway
> Binghamton, NY 13902
> Phone: (516) 587-0807
