2011/10/2 mathieu lacage <[email protected]>:
>
> On Sat, Oct 1, 2011 at 2:48 PM, Alexandre Gramfort
> <[email protected]> wrote:
>>
>> averaging the ROC curves across folds (train / test splits) is one way:
>>
>> http://scikit-learn.sourceforge.net/auto_examples/plot_roc_crossval.html
>>
>> then you can compare the mean ROC curves for the different algorithms.
>>
>> Just be careful not to estimate the model parameters using the test set.
>
> I actually already tried something similar by averaging the AUC of the ROC,
> but what prevented me from trying what you suggest is that I see a very
> high variance in my AUC (and, in general, in all the quality metrics from
> my classifiers). Furthermore, I am unable to get the variance to decrease
> by increasing the number of train/test pairs used for cross validation.
>
> Namely:
> 100 pairs:   avg=0.425,   std=0.349106001094
> 1000 pairs:  avg=0.4725,  std=0.354250970359
> 10000 pairs: avg=0.48235, std=0.352155473477
>
> So, it is pretty clear to me that what I have here is either not the right
> features, or just really noisy target data, or both. As is, it seems
> foolish and useless to pick a classifier and its parameters based on what
> I have.
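[Editor's note: the averaged-AUC experiment described in the quoted message can be sketched as follows. This uses the modern scikit-learn API (module paths have changed since this thread was written); the dataset and classifier are illustrative stand-ins, not the poster's actual data.]

```python
# Average the ROC AUC over many random train/test splits and report
# mean and standard deviation, as in the quoted experiment.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ShuffleSplit

# Stand-in dataset; replace with your own features/targets.
X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

aucs = []
splitter = ShuffleSplit(n_splits=100, test_size=0.25, random_state=0)
for train, test in splitter.split(X):
    clf.fit(X[train], y[train])
    # Use probability scores, not hard labels, for a meaningful ROC.
    scores = clf.predict_proba(X[test])[:, 1]
    aucs.append(roc_auc_score(y[test], scores))

print("avg=%.3f std=%.3f" % (np.mean(aucs), np.std(aucs)))
```

On a learnable dataset the standard deviation across splits is small; a mean near 0.5 with a large spread, as reported above, points at the features or the targets rather than at the number of splits.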
An AUC of 0.50 is a random classifier. Either your data are pure noise
or your classifier has an issue.

> By the way, I ended up coding the following metric function which might
> be useful for sklearn.metrics:
>
> def statistics(reference, prediction):
>     import numpy
>     good_true = numpy.logical_and(prediction, reference).sum()
>     good_false = numpy.logical_and(numpy.logical_not(prediction),
>                                    numpy.logical_not(reference)).sum()
>     bad_true = numpy.logical_and(prediction,
>                                  numpy.logical_not(reference)).sum()
>     bad_false = numpy.logical_and(numpy.logical_not(prediction),
>                                   reference).sum()
>     n = len(prediction)
>     tp = float(good_true) / n
>     tn = float(good_false) / n
>     fp = float(bad_true) / n
>     fn = float(bad_false) / n
>     tpr = tp / (tp + fn + 0.0001)
>     tnr = tn / (tn + fp + 0.0001)
>     return (tp, tn, fp, fn, tpr, tnr)

This will only work for binary classifiers with outcome 0 for the
negative class and a non-zero label for the positive class, if I am not
mistaken. Sounds a bit restrictive to me.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
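[Editor's note: a more general version of the quoted metric function, addressing the label restriction pointed out above, can be sketched with `sklearn.metrics.confusion_matrix`, which accepts arbitrary binary label values rather than only 0 vs. non-zero.]

```python
# Same statistics as the quoted function, but label-agnostic.
import numpy as np
from sklearn.metrics import confusion_matrix

def statistics(reference, prediction):
    # For binary input, ravel() yields counts in the order
    # tn, fp, fn, tp (classes in sorted label order, so the
    # larger/later-sorting label is treated as the positive class).
    tn, fp, fn, tp = confusion_matrix(reference, prediction).ravel()
    n = float(len(prediction))
    # Guard against empty classes instead of adding an epsilon.
    tpr = tp / float(tp + fn) if (tp + fn) else 0.0
    tnr = tn / float(tn + fp) if (tn + fp) else 0.0
    return (tp / n, tn / n, fp / n, fn / n, tpr, tnr)
```

Because `confusion_matrix` works on any label values, the same function handles e.g. string labels: `statistics(['neg', 'neg', 'pos', 'pos'], ['neg', 'pos', 'pos', 'pos'])` gives the same result as the 0/1 encoding of those labels.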
