> This statement doesn't sound true. Generally hyper-parameters
> (especially ones to do with regularization) *do* depend on training
> set size, and not in such straightforward ways. Data is never
> perfectly I.I.D. and sometimes it can be far from it. My impression
> was that standard practice for SVMs is to optimize C on held-out data.
> When would the scale_C heuristic actually save anyone from having to
> do this optimization?
I think there is a misunderstanding. With scale_C=False, GridSearchCV is not consistent: if you use two folds (cv=2), each fold is fit on only half the data, so the optimal C it returns will actually be about 2*C, where C is the best value when fitting on the full training data. Makes sense? Did you see the notes in the devel doc: http://scikit-learn.org/dev/modules/svm.html#svc

> Even if the scale_C heuristic (is it fair to call it that?) is a good
> idea, My 2c is that it does not justify redefining the meaning of the
> "C" parameter which has a very standard interpretation in papers,
> textbooks, and other SVM solvers. If you really must redefine the C
> parameter (but why?) then it would make sense to me to rename it as
> well.

Let me know if you still think it's nonsense.

Alex

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
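To illustrate the consistency point above: with an unscaled C, the hinge-loss term in the SVM primal sums over samples, so it grows with the training-set size, and the C that balances it against the regularizer shrinks as the set grows (hence cv=2 folds, with half the data, favor roughly twice the C). A minimal numpy sketch of the objective; the `scale_C` flag and `svm_primal_objective` helper here are illustrative stand-ins for the parameter under discussion, not the library API:

```python
import numpy as np

def svm_primal_objective(w, X, y, C, scale_C=False):
    """Primal objective of a linear SVM (no bias term), hinge loss.

    With scale_C=True, C is divided by n_samples, turning the data
    term into an average loss, so the trade-off between loss and
    regularizer no longer depends on how many samples were seen.
    (Illustrative sketch, mirroring the scale_C parameter debated
    in this thread.)
    """
    n = X.shape[0]
    eff_C = C / n if scale_C else C
    hinge = np.maximum(0.0, 1.0 - y * (X @ w)).sum()
    return 0.5 * w @ w + eff_C * hinge

rng = np.random.RandomState(0)
X = rng.randn(20, 3)
y = np.sign(rng.randn(20))
w = rng.randn(3)

# Duplicating the data set doubles the unscaled data term (so the
# best C shrinks by ~2), while the scaled objective is unchanged.
X2, y2 = np.vstack([X, X]), np.concatenate([y, y])
unscaled_small = svm_primal_objective(w, X, y, C=1.0)
unscaled_big = svm_primal_objective(w, X2, y2, C=1.0)
scaled_small = svm_primal_objective(w, X, y, C=1.0, scale_C=True)
scaled_big = svm_primal_objective(w, X2, y2, C=1.0, scale_C=True)
assert np.isclose(scaled_small, scaled_big)  # size-invariant
```

This is exactly the sense in which the unscaled formulation makes the per-fold optimum drift with fold size during cross-validation.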
