Karimkhan,

Two naive methods that you can use directly with sklearn are:

(1) Use predict_proba and check whether the probability of the most
probable class (p1) is below a threshold. Alternatively, use the entropy
of the predicted probability distribution instead of p1, and flag an
instance when the entropy is high. Note, however, that an instance with a
low prediction probability is not necessarily an instance of an unknown
class; it may simply lie near the boundary between two known classes.
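
As a rough sketch of (1): the classifier, the toy data, the helper name
flag_unconfident and both threshold values below are only assumptions for
illustration; you would plug in your own model and tune the thresholds on
held-out data.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression


def flag_unconfident(clf, X, p_threshold=0.6, entropy_threshold=None):
    """Return a boolean mask, True where the prediction looks unconfident."""
    proba = clf.predict_proba(X)
    if entropy_threshold is None:
        # p1 test: the top class probability is below the threshold
        return proba.max(axis=1) < p_threshold
    # Entropy test: the predicted distribution is too spread out
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    return entropy > entropy_threshold


# Toy data standing in for the known classes (an assumption of this sketch)
X, y = make_blobs(n_samples=300, centers=3, random_state=0)
clf = LogisticRegression().fit(X, y)

X_new = np.array([[0.0, 0.0], [10.0, 10.0]])
print(flag_unconfident(clf, X_new))                         # p1 rule
print(flag_unconfident(clf, X_new, entropy_threshold=0.9))  # entropy rule
```

The flagged instances are only candidates for "unknown"; as said above,
some of them will just be ambiguous points between known classes.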

(2) Use the one-class SVM available in sklearn (see
http://scikit-learn.org/stable/modules/outlier_detection.html ) and build
one one-class SVM for each of your known classes (e.g. if you have 4
known classes, you will build 4 one-class SVM models). If a new test point
is classified as an outlier by all of those models, then it is possibly an
instance of a novel class. However, the parameters "nu" and "gamma" of the
one-class SVM are somewhat difficult to tune.

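A minimal sketch of (2), assuming an RBF kernel. The toy data, the helper
names fit_per_class_models and rejected_by_all, and the nu and gamma values
are placeholders for illustration, not recommended settings.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM


def fit_per_class_models(X, y, nu=0.05, gamma=0.1):
    """Fit one OneClassSVM per known class, using only that class's samples."""
    return {
        label: OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(X[y == label])
        for label in np.unique(y)
    }


def rejected_by_all(models, X):
    """True where every per-class model calls the point an outlier."""
    # OneClassSVM.predict returns +1 for inliers and -1 for outliers
    votes = np.column_stack([m.predict(X) for m in models.values()])
    return np.all(votes == -1, axis=1)


# Toy data with 4 known classes (an assumption of this sketch)
X, y = make_blobs(n_samples=300, centers=4, random_state=0)
models = fit_per_class_models(X, y)

X_new = np.array([[0.0, 0.0], [1000.0, 1000.0]])
print(rejected_by_all(models, X_new))
```
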
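Another, possibly more effective (but less straightforward) approach is to
extend (2): among the test instances flagged as outliers, look for those
that are close to each other. A group of mutually close outliers is a more
plausible candidate for a novel class than isolated outliers are.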
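
One possible way to sketch this extension is to run a density-based
clustering such as DBSCAN over the points that every one-class model
rejected. DBSCAN is my choice for illustration here, not the only option;
the array X_rejected, the helper name group_rejected_points, and the eps
and min_samples values are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def group_rejected_points(X_rejected, eps=1.0, min_samples=5):
    """Cluster rejected points; label -1 means isolated, >= 0 is a dense group."""
    if len(X_rejected) == 0:
        return np.empty(0, dtype=int)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_rejected)


# Toy stand-in for the rejected test points: one tight group plus 3 stragglers
rng = np.random.RandomState(0)
dense = rng.normal(loc=[20.0, 20.0], scale=0.2, size=(30, 2))
isolated = np.array([[-40.0, 0.0], [0.0, 60.0], [70.0, -30.0]])
X_rejected = np.vstack([dense, isolated])

labels = group_rejected_points(X_rejected)
print(labels)
```

Points with a label >= 0 form a dense group and are the candidates for a
novel class; points labelled -1 are isolated and more likely to be plain
outliers.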


2014-09-04 15:45 GMT+02:00 Karimkhan Pathan <[email protected]>:

> Oh okay, well, I tried predict_proba. But if the query is out of domain,
> then the classifier uniformly divides the probability among all learned
> domains. For example, in the case of 4 domains:
> (0.333123570669, 0.333073654046, 0.166936800591, 0.166865974694)
>
>
> On Thu, Sep 4, 2014 at 7:00 PM, Gael Varoquaux <
> [email protected]> wrote:
>
>> On Thu, Sep 04, 2014 at 05:22:02PM +0530, Karimkhan Pathan wrote:
>> > Well could you please throw light on my classification issue? I guess
>> > you might be knowing well whether something helpful class/method exists
>> > in scikit which can solve this issue.
>>
>> I don't know. I would naively try to do a predict_proba and conclude that
>> it's none of the classes known if none of the probas are somewhat
>> confident. But I have no prior experience doing that, so I cannot give
>> good advice.
>>
>> G
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Slashdot TV.
>> Video for Nerds.  Stuff that matters.
>> http://tv.slashdot.org/
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>


-- 
Mohamed-Rafik BOUGUELIA
PhD Student
INRIA Nancy Grand Est - LORIA - READ Team
Nancy University - France.