Hi Karimkhan,

If I understand your question correctly, you want test data to be assigned
to a class that does not appear in your training set.

For instance, if your training data has three classes of news article
(e.g. politics, sports, and food) and you try to classify an article that
'truly' belongs in a 'business' category, you are out of luck. Your
classification can only be as good as the training data, and your
classifier will put the article in the closest match it can find (an
article about McDonald's stock price might be classified as food, for
instance). The class probabilities are normalized over the classes seen in
training, so they always sum to 1 and cannot all be 0.
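
To make the point concrete, here is a minimal hand-rolled sketch of a
multinomial naive Bayes classifier with Laplace smoothing. It is not
scikit-learn's implementation; the function names (train_nb, predict_proba)
and the toy documents are made up for illustration, and I am assuming that
words outside the training vocabulary are simply ignored, as a count
vectorizer fitted on the training data would do.

```python
import math
from collections import Counter


def train_nb(docs, alpha=1.0):
    """Fit a toy multinomial NB model on (label, text) pairs.

    alpha is the Laplace smoothing constant and must be > 0 in this sketch.
    """
    vocab = sorted({w for _, text in docs for w in text.split()})
    labels = sorted({label for label, _ in docs})
    log_prior, log_like = {}, {}
    for label in labels:
        texts = [text for lab, text in docs if lab == label]
        counts = Counter(w for text in texts for w in text.split())
        total = sum(counts.values()) + alpha * len(vocab)
        log_prior[label] = math.log(len(texts) / len(docs))
        log_like[label] = {
            w: math.log((counts[w] + alpha) / total) for w in vocab
        }
    return log_prior, log_like


def predict_proba(model, text):
    """Return posterior probabilities over the trained classes only."""
    log_prior, log_like = model
    scores = {}
    for label in log_prior:
        # Assumption: words not seen in training are skipped.
        scores[label] = log_prior[label] + sum(
            log_like[label][w] for w in text.split() if w in log_like[label]
        )
    # Normalize over the known classes; there is no "none of the above".
    top = max(scores.values())
    unnorm = {label: math.exp(s - top) for label, s in scores.items()}
    z = sum(unnorm.values())
    return {label: v / z for label, v in unnorm.items()}


# Hypothetical toy training set with three classes.
docs = [
    ("politics", "election vote senate policy"),
    ("sports", "match score team goal"),
    ("food", "burger fries restaurant menu"),
]
model = train_nb(docs)

# A 'business' article: the only known word is "restaurant".
probs = predict_proba(model, "restaurant chain stock price rises")
best = max(probs, key=probs.get)
print(best, probs)
```

The business article ends up in 'food' because that is the closest match
among the classes the model knows about, and the probabilities still sum
to 1 across those three classes.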

Hope that helps!


On Wed, Sep 3, 2014 at 10:31 AM, Sebastian Raschka <[email protected]>
wrote:

> This is due to the Laplace smoothing. If I understand correctly, you
> want the classification to fail if there is a new feature value (e.g., a
> word that is not in the vocabulary when you are doing text classification)?
>
> You can set the alpha parameter to 0 (see
> http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html#sklearn.naive_bayes.MultinomialNB)
> which would disable the Laplace smoothing.
>
> Best,
> Sebastian Raschka
>
> > On Sep 3, 2014, at 6:20 AM, Karimkhan Pathan <[email protected]>
> wrote:
> >
> > I have trained my classifier on datasets from 20 domains using
> > MultinomialNB, and it works fine for these 20 domains.
> >
> > The issue is that if I submit a query whose text does not belong to
> > any of these 20 domains, it still gives a classification result.
> >
> > Is it possible for a query that does not belong to any of the 20
> > domains to get a probability value of 0?
> >
> > ------------------------------------------------------------------------------
> > Slashdot TV.
> > Video for Nerds.  Stuff that matters.
> > http://tv.slashdot.org/
> > _______________________________________________
> > Scikit-learn-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>



-- 
Patrick Short
------------------------------

University of North Carolina at Chapel Hill, 2014

Applied Mathematics and Quantitative Biology

[email protected] | 919-455-7045 C