Hi,
I am running a Hadoop 1.0.2 cluster in pseudo-distributed mode with Mahout
0.7. I am trying to do text classification using the Mahout Naive Bayes
commands.
I created sequence files using a custom Java program and uploaded them to
HDFS.
I am running the following Mahout commands:
mahout seq2sparse -i my-seq -o my-vectors
mahout split -i my-vectors/tfidf-vectors --trainingOutput train-vectors \
  --testOutput test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles \
  -xm sequential
mahout trainnb -i train-vectors -el -li labelindex -o model -ow -c
mahout testnb -i train-vectors -m model -l labelindex -ow -o my-testing -c
mahout testnb -i test-vectors -m model -l labelindex -ow -o tweets-testing -c
When I run the final command, I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: Label not found: EMI
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:102)
    at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:122)
    at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:126)
    at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:94)
    at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:71)
    at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.analyzeResults(TestNaiveBayesDriver.java:158)
    at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.run(TestNaiveBayesDriver.java:124)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.main(TestNaiveBayesDriver.java:65)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
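My reading of the trace is that the confusion matrix only knows a fixed set
of labels, and that a test instance arrived with the label EMI, which is not
in that set. Below is a minimal plain-Java sketch of that kind of check. It
is only my assumption about the behaviour, not Mahout's actual code: the
class LabelCheckSketch and the labels LOAN and CARD are made up for
illustration, and only EMI comes from my real error message.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a confusion matrix keyed by label name.
// This is NOT Mahout's ConfusionMatrix, only an illustration of the check.
public class LabelCheckSketch {
    private final Map<String, Integer> labelIndex = new LinkedHashMap<>();
    private final int[][] counts;

    public LabelCheckSketch(List<String> knownLabels) {
        // Assign each known label a row/column position in insertion order.
        for (String label : knownLabels) {
            labelIndex.put(label, labelIndex.size());
        }
        counts = new int[knownLabels.size()][knownLabels.size()];
    }

    // Record one classified instance; both labels must be known.
    public void addInstance(String correctLabel, String classifiedLabel) {
        counts[indexOf(correctLabel)][indexOf(classifiedLabel)]++;
    }

    public int getCount(String correctLabel, String classifiedLabel) {
        return counts[indexOf(correctLabel)][indexOf(classifiedLabel)];
    }

    // Fails the same way as my trace when a label is missing from the index.
    private int indexOf(String label) {
        Integer idx = labelIndex.get(label);
        if (idx == null) {
            throw new IllegalArgumentException("Label not found: " + label);
        }
        return idx;
    }

    public static void main(String[] args) {
        // LOAN and CARD are made-up labels standing in for a label index.
        LabelCheckSketch matrix = new LabelCheckSketch(Arrays.asList("LOAN", "CARD"));
        matrix.addInstance("LOAN", "LOAN"); // known label: accepted
        try {
            matrix.addInstance("EMI", "LOAN"); // unknown label: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, the labels on my test vectors and the labels stored
in labelindex do not match, but I do not know why that would happen after the
split.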
Can someone help me fix this error?
Regards,
Anand.C