2014-09-04 15:45 GMT+02:00 Karimkhan Pathan <[email protected]>:
> Oh okay, well I tried with predict_proba. But if query is out of domain then
> classifier uniformly divide probability to all learned domains. Like in case
> of 4 domains (0.333123570669, 0.333073654046, 0.166936800591,
> 0.166865974694)

Naive Bayes returns highly distorted probabilities. It's a good
classifier, but a lousy probability model. predict_proba is really only
useful for ensemble algorithms.

What you could do is phrase the problem as multi-label classification
with sklearn.multiclass.OneVsRestClassifier, and then predict the class
with the highest probability under this model iff it exceeds 0.5. If none
of the k classifiers predicts positive, return the null class. This is
just an idea, no guarantee that it will work.

You'll need to convert the targets using
sklearn.preprocessing.MultiLabelBinarizer.

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
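
A rough sketch of the OneVsRestClassifier idea from the reply above, in case it helps. The toy corpus, the names NULL_CLASS, fit_domain_classifier and predict_domain, and the choice of CountVectorizer with MultinomialNB are all assumptions made for illustration; none of them come from the original thread. The point of passing an indicator matrix from MultiLabelBinarizer is that OneVsRestClassifier then treats the task as multi-label, so predict_proba returns each binary classifier's own probability instead of renormalizing the row to sum to one.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

NULL_CLASS = "unknown"  # hypothetical label for out-of-domain queries


def fit_domain_classifier(texts, domains):
    """Fit one binary Naive Bayes classifier per domain (one-vs-rest)."""
    mlb = MultiLabelBinarizer()
    # Wrap each single label in a list so the targets become an
    # indicator matrix with one column per domain.
    Y = mlb.fit_transform([[d] for d in domains])
    model = make_pipeline(CountVectorizer(), OneVsRestClassifier(MultinomialNB()))
    model.fit(texts, Y)
    return model, mlb


def predict_domain(model, mlb, texts, threshold=0.5):
    """Return the most probable domain, or NULL_CLASS if none exceeds threshold."""
    proba = model.predict_proba(texts)  # shape (n_samples, n_domains), rows not normalized
    best = proba.argmax(axis=1)
    best_p = proba[np.arange(len(best)), best]
    return [
        str(mlb.classes_[i]) if p > threshold else NULL_CLASS
        for i, p in zip(best, best_p)
    ]


# Toy training data (assumed, for illustration only).
texts = [
    "football match goal team",
    "tennis player wins match",
    "bake bread flour oven",
    "recipe soup salt pepper",
    "stock market shares price",
    "bank loan interest rate",
]
domains = ["sports", "sports", "cooking", "cooking", "finance", "finance"]

model, mlb = fit_domain_classifier(texts, domains)
print(predict_domain(model, mlb, ["football team goal", "quantum physics lecture"]))
```

A query with no known words falls back to each binary classifier's prior, which stays below 0.5 when the domains are roughly balanced, so it is mapped to the null class. Whether the 0.5 cut-off works on real data would need to be checked, for example on held-out out-of-domain queries.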
