Hi everyone, 

I've been trying to run TestClassifier on a directory of roughly
49,000 small text files. Running the command below throws a
NullPointerException in ConfusionMatrix.getCount(). I've attached the
full verbose output of the mahout run, including the stack trace.

This is 0.4-SNAPSHOT built from today's HEAD, plus the small patch to
BayesFileFormatter I submitted in MAHOUT-488.

Any pointers on how to go about resolving this problem?

Thanks, 

-- 
Mathieu Sauve-Frankel
$ mahout testclassifier -m /tmp/aint.model -d /tmp/aint/data/ -e UTF-8 -ng 1 -a 1.0 -source hdfs -method sequential -type bayes
no HADOOP_HOME set, running locally
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Loading model from: {basePath=/tmp/aint.model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=/tmp/aint/data/}
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Testing Bayes Classifier
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: file:/tmp/aint.model/trainer-weights/Sigma_j/part-00000
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: file:/tmp/aint.model/trainer-weights/Sigma_k/part-00000
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: file:/tmp/aint.model/trainer-weights/Sigma_kSigma_j/part-00000
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: 1680.850219331143
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: file:/tmp/aint.model/trainer-thetaNormalizer/part-00000
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: file:/tmp/aint.model/trainer-tfIdf/trainer-tfIdf/part-00000
Aug 27, 2010 1:31:58 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: aint -8192.62364383991 8192.62364383991 -1.0
Exception in thread "main" java.lang.NullPointerException
        at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:99)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:114)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:118)
        at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:88)
        at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:68)
        at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:256)
        at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:176)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
