Thanks Joel. That makes sense.

Josh


On Sat, Aug 24, 2013 at 5:57 PM, Joel Nothman
<jnoth...@student.usyd.edu.au> wrote:

> On Sun, Aug 25, 2013 at 3:28 AM, Josh Wasserstein 
> <ribonucle...@gmail.com> wrote:
>
>> I am working on a multi-class classification problem with admittedly
>> very little data. My total dataset has 29 examples with the following
>> label distribution:
>>
>> Label A: 15 examples
>> Label B: 8 examples
>> Label C: 6 examples
>>
>> For cross-validation I am using repeated stratified shuffle splits, with
>> K = 3 and 20 repetitions:
>>   sfs = StratifiedShuffleSplit(y, n_iter=n_iter, test_size=1.0/K)
>>
>> The problem comes when I do an SVM grid search, e.g.:
>>
>>     clf = GridSearchCV(SVC(C=1, cache_size=5000, probability=True),
>>                        tuned_parameters,
>>                        scoring=score_func,
>>                        verbose=1, n_jobs=1, cv=sfs)
>>     clf.fit(X, y)
>>
>> where score_func is usually one of:
>> f1_micro
>> f1_macro
>> f1_weighted
>>
>> I get warning messages like the following:
>>
>> > /path/to/python2.7/site-packages/sklearn/metrics/metrics.py:1249:
>> > UserWarning: The sum of true positives and false positives are equal
>> > to zero for some labels. Precision is ill defined for those labels
>> > [0].  The precision and recall are equal to zero for some labels.
>> > fbeta_score is ill defined for those labels [0 2].
>> > average=average)
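>>
>> For reference, a minimal self-contained sketch of this setup (using the
>> 0.14-era modules shown above; toy data standing in for the real 29
>> examples, and a hypothetical parameter grid, since tuned_parameters is
>> not shown here):
>>
>>     import numpy as np
>>     from sklearn.cross_validation import StratifiedShuffleSplit
>>     from sklearn.grid_search import GridSearchCV
>>     from sklearn.metrics import f1_score, make_scorer
>>     from sklearn.svm import SVC
>>
>>     # Toy stand-in for the 29-example, 3-class dataset (15/8/6 split).
>>     rng = np.random.RandomState(0)
>>     X = rng.randn(29, 5)
>>     y = np.array([0] * 15 + [1] * 8 + [2] * 6)
>>
>>     K, n_iter = 3, 20
>>     sfs = StratifiedShuffleSplit(y, n_iter=n_iter, test_size=1.0 / K)
>>
>>     # Hypothetical grid and scorer; macro-F1 shown as one of the options.
>>     tuned_parameters = {'C': [1, 10, 100], 'gamma': [0.01, 0.1]}
>>     score_func = make_scorer(f1_score, average='macro')
>>
>>     clf = GridSearchCV(SVC(cache_size=5000, probability=True),
>>                        tuned_parameters,
>>                        scoring=score_func,
>>                        verbose=1, n_jobs=1, cv=sfs)
>>     clf.fit(X, y)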
>>
>> My questions are:
>>
>> 1. Why does this happen? I thought that F1 scoring would choose an
>> operating point (i.e. a score threshold) where we get at least *some*
>> positives (regardless of whether they are FP or TP).
>>
>
> The threshold is chosen by the classifier, not the metric. But this is
> also often impossible: your classifier might return A and B ahead of C,
> for instance, or it might validly predict a label that isn't present in
> your evaluation data.
>
> The reason for the warning is that you might argue that predicting no
> instances of a label should give a precision of 1 for that label; you can
> also argue that it should be 0. The same applies to the recall of a
> predicted label that does not appear in your test data. This decision
> makes a big difference to macro-F1.
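>
> To make that concrete, here is a small sketch with toy labels (not from
> this thread): label 2 is never predicted, so its precision has a zero
> denominator, scikit-learn scores it as 0, and macro-F1 absorbs that 0:
>
>     from sklearn.metrics import f1_score, precision_score
>
>     y_true = [0, 0, 1, 1, 2, 2]
>     y_pred = [0, 0, 1, 1, 1, 0]   # label 2 is never predicted
>
>     print(precision_score(y_true, y_pred, average=None))
>     # [0.667, 0.667, 0.0]  -> the warning fires for label 2
>     print(f1_score(y_true, y_pred, average='macro'))
>     # ~0.53, dragged down by the hard 0 for label 2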
>
>> 2. Can I reliably trust the scores that I get when I get this warning?
>>
>
> Scikit-learn opts for 0 in these cases, so the result is a lower bound on
> the metric. But a micro-average may be more suitable/stable.
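>
> On the same toy labels as above, the two averages diverge noticeably,
> since micro-averaging pools TP/FP/FN across all labels instead of
> averaging the per-label scores:
>
>     from sklearn.metrics import f1_score
>
>     y_true = [0, 0, 1, 1, 2, 2]   # same toy labels as above
>     y_pred = [0, 0, 1, 1, 1, 0]
>
>     print(f1_score(y_true, y_pred, average='macro'))  # ~0.53
>     print(f1_score(y_true, y_pred, average='micro'))  # ~0.67
>
> (For single-label multiclass predictions like these, micro-F1 coincides
> with plain accuracy: 4 of 6 correct.)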