Hi,

Steps I followed are below:

$ bin/mahout wikipediaDataSetCreator -i D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/Traininput -o examples/bin/work/wikipedia/wikipediaClassification/train-subject -c $MAHOUT_HOME/examples/src/test/resources/subjects.txt

$ bin/mahout wikipediaDataSetCreator -i D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/Testinput -o examples/bin/work/wikipedia/wikipediaClassification/test-subject -c $MAHOUT_HOME/examples/src/test/resources/subjects.txt

$ bin/mahout trainclassifier -i examples/bin/work/wikipedia/wikipediaClassification/train-subject -o examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model

$ bin/mahout testclassifier -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model -d examples/bin/work/wikipedia/wikipediaClassification/test-subject

Regards,
Divya

-----Original Message-----
From: Grant Ingersoll [mailto:[email protected]]
Sent: Saturday, November 27, 2010 8:54 PM
To: [email protected]
Subject: Re: NPE in bayes wiki example

Can you provide all the steps you have done up to this point?

-Grant

On Nov 25, 2010, at 12:57 AM, Divya wrote:

> Hi,
>
> I am getting null pointer exception when I pass my test input data to testclassifier
>
> $ bin/mahout testclassifier -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model -d examples/bin/work/wikipedia/wikipediaClassification/test-subject
> Running on hadoop, using HADOOP_HOME=C:\cygwin\home\Divya\hadoop-0.20.2
> HADOOP_CONF_DIR=C:\cygwin\home\Divya\hadoop-0.20.2\conf
> 10/11/25 13:51:36 INFO bayes.TestClassifier: Loading model from: {basePath=examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/wikipedia/wikipediaClassification/test-subject}
> 10/11/25 13:51:36 INFO bayes.TestClassifier: Testing Bayes Classifier
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_j/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_k/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_kSigma_j/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: 8.048212844092422
> 10/11/25 13:51:39 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-thetaNormalizer/part-00000
> 10/11/25 13:51:39 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-tfIdf/trainer-tfIdf/part-00000
> 10/11/25 13:51:39 INFO datastore.InMemoryBayesDatastore: history -23722.080627413125 23722.080627413125 -1.0
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:102)
>     at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:118)
>     at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:122)
>     at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:90)
>     at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:68)
>     at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:266)
>     at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:186)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> My classifier is subjects.txt which has two entries History and Science.
>
> but when I pass train input data I get to see the results
>
> $ bin/mahout testclassifier -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model -d examples/bin/work/wikipedia/wikipediaClassification/train-subject
> Running on hadoop, using HADOOP_HOME=C:\cygwin\home\Divya\hadoop-0.20.2
> HADOOP_CONF_DIR=C:\cygwin\home\Divya\hadoop-0.20.2\conf
> 10/11/25 13:51:54 INFO bayes.TestClassifier: Loading model from: {basePath=examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/wikipedia/wikipediaClassification/train-subject}
> 10/11/25 13:51:54 INFO bayes.TestClassifier: Testing Bayes Classifier
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_j/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_k/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_kSigma_j/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: 8.048212844092422
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-thetaNormalizer/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-tfIdf/trainer-tfIdf/part-00000
> 10/11/25 13:51:55 INFO datastore.InMemoryBayesDatastore: history -23722.080627413125 23722.080627413125 -1.0
> 10/11/25 13:51:55 INFO bayes.TestClassifier: Classified instances from part-r-00000
> 10/11/25 13:51:55 INFO bayes.TestClassifier:
> =======================================================
> Summary
> -------------------------------------------------------
> Correctly Classified Instances          :          2        100%
> Incorrectly Classified Instances        :          0          0%
> Total Classified Instances              :          2
>
> =======================================================
> Confusion Matrix
> -------------------------------------------------------
> a     <--Classified as
> 2      |  2     a     = history
> Default Category: unknown: 1
>
>
> 10/11/25 13:51:55 INFO driver.MahoutDriver: Program took 953 ms
>
> Can someone please explain the reason behind it.
>
> Thanks
>
> Regards,
> Divya

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search
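
One possible explanation for the trace above, offered as an assumption and not a confirmed diagnosis: both runs log a single model label (history), and the confusion matrix for the training set lists only history, although subjects.txt names History and Science. If the test set carries a label the matrix was not built with, a map lookup for that label returns null, and unboxing that null to an int throws a NullPointerException. The class below is a hypothetical stand-in written for illustration; it is not Mahout's ConfusionMatrix source, and the names LabelLookupSketch, labelIndex and getCount are made up here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a label-indexed count matrix.
// This is NOT Mahout's ConfusionMatrix; it only illustrates the failure mode.
public class LabelLookupSketch {
    private final Map<String, Integer> labelIndex = new HashMap<>();
    private final int[][] counts;

    public LabelLookupSketch(String... labels) {
        // Assign each known label the next free row/column index.
        for (String label : labels) {
            labelIndex.put(label, labelIndex.size());
        }
        counts = new int[labels.length][labels.length];
    }

    public int getCount(String correctLabel, String classifiedLabel) {
        // Map.get returns null for an unknown label; assigning that null
        // to an int unboxes it and throws NullPointerException.
        int row = labelIndex.get(correctLabel);
        int col = labelIndex.get(classifiedLabel);
        return counts[row][col];
    }

    public static void main(String[] args) {
        // Matrix built with the only label the model log shows.
        LabelLookupSketch matrix = new LabelLookupSketch("history");
        System.out.println(matrix.getCount("history", "history"));
        try {
            // "science" stands in for any label missing from the matrix.
            matrix.getCount("science", "history");
        } catch (NullPointerException e) {
            System.out.println("NPE for a label the matrix was not built with");
        }
    }
}
```

If this is what is happening, comparing the labels present in train-subject and test-subject would be a reasonable first check.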
