On Sat, Oct 1, 2011 at 2:48 PM, Alexandre Gramfort <
[email protected]> wrote:

>
> averaging the ROC curves across folds (train/test splits) is one way:
>
> http://scikit-learn.sourceforge.net/auto_examples/plot_roc_crossval.html
>
> then you can compare the mean ROC curves for the different algorithms.
>
> Just be careful not to estimate the model parameters using the test set.
>

I actually already tried something similar by averaging the AUC of the ROC
curves, but what kept me from following your suggestion is that I see very
high variance in my AUC (and, in general, in all the quality metrics of my
classifiers). Moreover, increasing the number of train/test pairs used for
cross-validation does not reduce that variance.

Namely:
  100 pairs: avg=0.425,   std=0.349106001094
 1000 pairs: avg=0.4725,  std=0.354250970359
10000 pairs: avg=0.48235, std=0.352155473477
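For what it's worth, the per-split AUC can also be computed directly from
ranks (the Mann-Whitney U formulation), which makes it easy to look at its
spread across many random splits. A minimal numpy-only sketch on synthetic
data (the names and the noise model are made up for illustration):

```python
import numpy as np

def auc_rank(y_true, scores):
    """AUC via the Mann-Whitney U statistic (rank-sum formulation)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

rng = np.random.RandomState(0)
y = rng.randint(0, 2, 200)
scores = y + rng.normal(scale=2.0, size=200)  # weak, very noisy signal

# AUC on 100 random "test splits" of 100 samples each.
aucs = [auc_rank(y[idx], scores[idx])
        for idx in (rng.permutation(200)[:100] for _ in range(100))]
print("mean=%.3f std=%.3f" % (np.mean(aucs), np.std(aucs)))
```

Note that the spread across random splits reflects the per-split variability
of the metric itself; drawing more splits only tightens the estimate of the
mean AUC, it does not shrink that spread.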

So, it is pretty clear to me that I either do not have the right features, or
the target data is just really noisy, or both. As things stand, it seems
foolish and useless to pick a classifier and its parameters based on these
results.

By the way, I ended up writing the following metric function, which might be
useful for sklearn.metrics:

import numpy as np

def statistics(reference, prediction):
    """Return normalized confusion-matrix cells plus TPR/TNR."""
    reference = np.asarray(reference, dtype=bool)
    prediction = np.asarray(prediction, dtype=bool)
    n = float(len(prediction))
    # Confusion-matrix cells, normalized by the total number of samples.
    tp = np.logical_and(prediction, reference).sum() / n
    tn = np.logical_and(~prediction, ~reference).sum() / n
    fp = np.logical_and(prediction, ~reference).sum() / n
    fn = np.logical_and(~prediction, reference).sum() / n
    # The small epsilon guards against division by zero when a class is absent.
    tpr = tp / (tp + fn + 1e-4)
    tnr = tn / (tn + fp + 1e-4)
    return (tp, tn, fp, fn, tpr, tnr)
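As an aside, the four raw counts can also be obtained in a single pass by
encoding each (reference, prediction) pair as an integer and counting with
numpy.bincount; I believe sklearn.metrics also ships a confusion_matrix
helper that returns the same counts. A small sketch with made-up labels:

```python
import numpy as np

reference  = np.array([1, 0, 1, 1, 0, 0])
prediction = np.array([1, 0, 0, 1, 1, 0])

# Encode each pair: 2*ref + pred maps to 0=TN, 1=FP, 2=FN, 3=TP.
tn, fp, fn, tp = np.bincount(2 * reference + prediction, minlength=4)
print(tn, fp, fn, tp)  # -> 2 1 1 2
```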


Mathieu
-- 
Mathieu Lacage <[email protected]>
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general