Using the ClassifierDemo class provided at the link below, I tested the 20news test data with the trained model, and it worked fine.

https://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/src/org/bc/kl/ClassifierDemo.java

However, when I ran the same class (ClassifierDemo) against my own test dataset with my own trained model, I received the messages below. It returns NaN for every potential class, and the final label it assigns to the test dataset is always the same: "Health", which I think is the default category.

Why is it doing this? I am pretty sure I trained it and generated the model correctly.

Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 50000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 100000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 150000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: 0.0
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Health NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: SciTech NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: General NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Entertainment NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Politics NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Sports NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Business NaN NaN NaN
Label: Health Score: 988.1363714476455

--
View this message in context: http://lucene.472066.n3.nabble.com/NaN-classification-results-cbayes-tp3468910p3468910.html
Sent from the Mahout User List mailing list archive at Nabble.com.
