Hi,
Im trying to create a classification model using some wiki sample articles
using seqwiki on the wiki xml dumps. It creates the sequence files and vectors
ok but when executing:
bin/mahout trainnb -i vectors/tfidf-vectors -el -o mod -li labidx -ow -c
I get the following:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at
org.apache.mahout.classifier.naivebayes.BayesUtils.writeLabelIndex(BayesUtils.java:122)
at
org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.createLabelIndex(TrainNaiveBayesJob.java:178)
at
org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.run(TrainNaiveBayesJob.java:93)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at
org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.main(TrainNaiveBayesJob.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
I've tried using the various output methods from seq2spare with the same
results, and also various wiki source files.
Any advice would be greatly appreciated.
Cheers
Sam