Re: [Scikit-learn-general] computing cv scores

Gael Varoquaux Sun, 06 Nov 2011 13:22:12 -0800

On Sun, Nov 06, 2011 at 03:35:02PM -0500, Satrajit Ghosh wrote:
>    thanks very much gael. unfortunately, even using 5-fold cross-validation
>    will still result in a pretty small test set. the N is pretty small. i'm
>    actually using a stratifiedkfold with as large a test set as i can get
>    without blowing the variance of the model through the roof.


If you are using a StratifiedKFold, it seems to me that you are in a
classification setting. If the error metric that you are using is the
default 0-1 loss, i.e. the mean of the prediction errors, then the two
options that you are referring to are very similar. If the folds all have
the same size, then they are mathematically identical.
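
A minimal sketch of that point, assuming the two options are (a) averaging
the per-fold 0-1 loss and (b) pooling the test predictions of all folds and
computing a single 0-1 loss. The function name and the toy error vectors are
made up for illustration; they are not part of scikit-learn.

```python
import math


def zero_one_loss_two_ways(errors_per_fold):
    """Return (mean of per-fold losses, pooled loss).

    errors_per_fold is a list of lists, one per test fold, where each
    entry is 1 for a misclassified sample and 0 for a correct one.
    """
    # Option (a): compute the 0-1 loss on each fold, then average the folds.
    per_fold = [sum(errors) / len(errors) for errors in errors_per_fold]
    mean_of_folds = sum(per_fold) / len(per_fold)

    # Option (b): pool every test prediction, then compute one 0-1 loss.
    pooled_errors = [e for errors in errors_per_fold for e in errors]
    pooled = sum(pooled_errors) / len(pooled_errors)

    return mean_of_folds, pooled


# Hypothetical folds of equal size (4 samples each): both options agree.
equal_folds = [[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
mean_eq, pooled_eq = zero_one_loss_two_ways(equal_folds)
print(math.isclose(mean_eq, pooled_eq))

# Hypothetical folds of unequal size (2 and 6 samples): the options differ,
# because the small fold gets the same weight as the large one in (a).
unequal_folds = [[0, 1], [1, 1, 0, 0, 0, 0]]
mean_uneq, pooled_uneq = zero_one_loss_two_ways(unequal_folds)
print(math.isclose(mean_uneq, pooled_uneq))
```

With equal fold sizes, the mean of the fold means is just the pooled mean
written differently, so the choice between the two only matters when the
fold sizes differ.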

I don't really see how a different averaging strategy across folds would
improve the variance in these settings, but maybe I am missing something?

Gael

