2011/10/2 mathieu lacage <[email protected]>:
>
> On Sat, Oct 1, 2011 at 2:48 PM, Alexandre Gramfort
> <[email protected]> wrote:
>>
>> averaging the ROC curves across folds (train / test splits) is one way:
>>
>> http://scikit-learn.sourceforge.net/auto_examples/plot_roc_crossval.html
>>
>> then you can compare the mean ROC curves for the different algorithms.
>>
>> Just be careful not to estimate the model parameters using the test set.
>
> I actually already tried something similar by averaging the AUC of the ROC,
> but what prevented me from trying what you suggest is that I see a very
> high variance in my AUC (and, in general, in all the quality metrics from
> my classifiers). Furthermore, I am unable to get the variance to decrease
> by increasing the number of train/test pairs used for cross validation.
>
> Namely:
> 100 pairs:   avg=0.425,   std=0.349106001094
> 1000 pairs:  avg=0.4725,  std=0.354250970359
> 10000 pairs: avg=0.48235, std=0.352155473477
>
> So, it is pretty clear to me that what I have here is either not the right
> features, or just really noisy target data, or both. As is, it seems
> foolish and useless to pick a classifier and its parameters based on what
> I have.
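[Editor's note: the averaged-AUC experiment described in the quoted message can be sketched as follows. This uses the modern scikit-learn API (module paths have changed since this thread was written); the dataset and classifier are illustrative stand-ins, not the poster's actual data.]

```python
# Average the ROC AUC over many random train/test splits and report
# mean and standard deviation, as in the quoted experiment.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ShuffleSplit

# Stand-in dataset; replace with your own features/targets.
X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

aucs = []
splitter = ShuffleSplit(n_splits=100, test_size=0.25, random_state=0)
for train, test in splitter.split(X):
    clf.fit(X[train], y[train])
    # Use probability scores, not hard labels, for a meaningful ROC.
    scores = clf.predict_proba(X[test])[:, 1]
    aucs.append(roc_auc_score(y[test], scores))

print("avg=%.3f std=%.3f" % (np.mean(aucs), np.std(aucs)))
```

On a learnable dataset the standard deviation across splits is small; a mean near 0.5 with a large spread, as reported above, points at the features or the targets rather than at the number of splits.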
An AUC of 0.50 is a random classifier. Either your data are pure noise
or your classifier has an issue.

> By the way, I ended up coding the following metric function which might
> be useful for sklearn.metrics:
>
> def statistics(reference, prediction):
>     import numpy
>     good_true = numpy.logical_and(prediction, reference).sum()
>     good_false = numpy.logical_and(numpy.logical_not(prediction),
>                                    numpy.logical_not(reference)).sum()
>     bad_true = numpy.logical_and(prediction,
>                                  numpy.logical_not(reference)).sum()
>     bad_false = numpy.logical_and(numpy.logical_not(prediction),
>                                   reference).sum()
>     n = len(prediction)
>     tp = float(good_true) / n
>     tn = float(good_false) / n
>     fp = float(bad_true) / n
>     fn = float(bad_false) / n
>     tpr = tp / (tp + fn + 0.0001)
>     tnr = tn / (tn + fp + 0.0001)
>     return (tp, tn, fp, fn, tpr, tnr)

This will only work for binary classifiers with outcome 0 for the
negative class and a non-zero label for the positive class, if I am not
mistaken. Sounds a bit restrictive to me.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
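[Editor's note: a more general version of the quoted metric function, addressing the label restriction pointed out above, can be sketched with `sklearn.metrics.confusion_matrix`, which accepts arbitrary binary label values rather than only 0 vs. non-zero.]

```python
# Same statistics as the quoted function, but label-agnostic.
import numpy as np
from sklearn.metrics import confusion_matrix

def statistics(reference, prediction):
    # For binary input, ravel() yields counts in the order
    # tn, fp, fn, tp (classes in sorted label order, so the
    # larger/later-sorting label is treated as the positive class).
    tn, fp, fn, tp = confusion_matrix(reference, prediction).ravel()
    n = float(len(prediction))
    # Guard against empty classes instead of adding an epsilon.
    tpr = tp / float(tp + fn) if (tp + fn) else 0.0
    tnr = tn / float(tn + fp) if (tn + fp) else 0.0
    return (tp / n, tn / n, fp / n, fn / n, tpr, tnr)
```

Because `confusion_matrix` works on any label values, the same function handles e.g. string labels: `statistics(['neg', 'neg', 'pos', 'pos'], ['neg', 'pos', 'pos', 'pos'])` gives the same result as the 0/1 encoding of those labels.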
