> I agree that it's a good idea to correct C for sample size when moving
> from a sub-problem to the full thing. I just wouldn't use the word
> "optimal" to describe the new value of C that you get this way - it's
> an extrapolation, a good guess... possibly provably better than the
> un-corrected value of C, but I would balk at claiming that it's
> optimal.
I fully agree. With scaling it's a more reasonable guess. Can you let me
know where I should rephrase to avoid the word "optimal"?

> I can also appreciate why you'd want a parametrization (via alpha)
> that makes this correction heuristic automatic, in that you actually
> don't have to change the number that comes out of cross-validation
> when re-learning on the full set. That's really convenient!
>
> How about parametrizing the wrapper like this:
>
>     SVC(C=None, alpha=None, ...)
>
> ... and deleting the scale_C parameter.
>
> This way old code still works, new code can use alpha, and if anyone
> specifies both C and alpha you raise an error.
>
> The alpha specified this way could (should?) have the same name and
> interpretation as the l2_regularization coefficient in the
> SGDClassifier.

I learned to live with scale_C, and as Mathieu said, both choices provide
the same flexibility. It feels odd to me to have two parameters which are
mutually exclusive, but that's only my personal opinion. Does anybody else
feel the same as James?

Alex
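
To make the proposal concrete, here is a rough sketch of what such a
wrapper could look like. The name ScaledSVC and the mapping
C = 1 / (alpha * n_samples) are only illustrative assumptions on my part,
not existing scikit-learn API:

    from sklearn.svm import SVC


    class ScaledSVC:
        """Hypothetical wrapper: accept either C or alpha, never both."""

        def __init__(self, C=None, alpha=None, **svc_params):
            if C is not None and alpha is not None:
                raise ValueError("Specify either C or alpha, not both.")
            self.C = C
            self.alpha = alpha
            self.svc_params = svc_params

        def fit(self, X, y):
            if self.alpha is not None:
                # Assumed mapping C = 1 / (alpha * n_samples): alpha then acts
                # like an SGDClassifier-style regularization strength, so a
                # value tuned by cross-validation on a subset carries over
                # unchanged to the full set (the effective C shrinks as n grows).
                C = 1.0 / (self.alpha * X.shape[0])
            else:
                # Fall back to the historical default when neither is given.
                C = 1.0 if self.C is None else self.C
            self._svc = SVC(C=C, **self.svc_params)
            self._svc.fit(X, y)
            return self

        def predict(self, X):
            return self._svc.predict(X)

Under that assumed mapping, an alpha tuned on a 1,000-sample subset would
automatically translate into a 10x smaller C when refitting on 10,000
samples, which is exactly the correction heuristic discussed above.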
