Hi,
I'm new to Mahout and tried to research this a bit before asking,
but I'm stuck on this problem.
After generating a SequenceFile from a directory of text files, I run this:
bin/mahout org.apache.mahout.vectorizer.collocations.llr.CollocDriver \
  -i out/chunk-0 -o colloc -a org.apache.mahout.vectorizer.DefaultAnalyzer \
  -ng 3
It produces a couple of exceptions:
...
WARNING: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.common.StringTuple
at org.apache.mahout.vectorizer.collocations.llr.CollocMapper.map(CollocMapper.java:41)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Jan 21, 2011 5:30:07 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
...
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:124)
How can I make this work?
Thanks for any tips,
Darren