2014-09-04 15:45 GMT+02:00 Karimkhan Pathan <[email protected]>:
> Oh okay, well I tried with predict_proba. But if query is out of domain then
> classifier uniformly divide probability to all learned domains. Like in case
> of 4 domains (0.333123570669, 0.333073654046, 0.166936800591,
> 0.166865974694)

Naive Bayes returns highly distorted probabilities. It's a good
classifier, but a lousy probability model. predict_proba is really
only useful for ensemble algorithms.

What you could do is phrase the problem as multi-label classification
with sklearn.multiclass.OneVsRestClassifier, and then predict the
class with the highest probability under this model iff it exceeds .5.
If none of the k classifiers predicts positive, return the null class.
This is just an idea, no guarantee that it will work. You'll need to
convert the targets using sklearn.preprocessing.MultiLabelBinarizer.
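
A rough sketch of what I mean, not tested on your data. The toy documents,
the TF-IDF features, the LogisticRegression base estimator and the
predict_domain helper are all assumptions made up for illustration; plug in
your own features and estimator.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training data, invented for this sketch.
docs = [
    "goal striker match league",
    "match referee penalty goal",
    "league team striker season",
    "stock market shares dividend",
    "bank interest loan market",
    "shares investor dividend bank",
    "recipe oven butter flour",
    "flour sugar bake recipe",
    "oven roast garlic butter",
]
domains = ["sports"] * 3 + ["finance"] * 3 + ["cooking"] * 3

# Wrap each target in a one-element label set so that the problem is
# treated as multi-label: Y becomes a binary indicator matrix.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform([[d] for d in domains])

# One binary classifier per domain. LogisticRegression is an assumption;
# any estimator with predict_proba should work here.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),
)
model.fit(docs, Y)


def predict_domain(texts, threshold=0.5, null_label=None):
    """Return the most probable domain per text, or null_label if no
    per-domain probability exceeds the threshold (hypothetical helper)."""
    # In the multi-label setting each column is an independent
    # per-domain probability; rows are not forced to sum to 1.
    proba = model.predict_proba(texts)
    best = proba.argmax(axis=1)
    return [
        str(mlb.classes_[j]) if row[j] > threshold else null_label
        for row, j in zip(proba, best)
    ]


print(predict_domain(["penalty goal in the league match", "quantum entanglement"]))
```

Because the k classifiers are trained independently, an out-of-domain query
can get a low probability from all of them, which is what lets you fall back
to the null class. With very little training data the probabilities may stay
below 0.5 even for in-domain queries, so the threshold probably needs tuning
on held-out data.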

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general