> I agree that it's a good idea to correct C for sample size when moving
> from a sub-problem to the full thing.  I just wouldn't use the word
> "optimal" to describe the new value of C that you get this way - it's
> an extrapolation, a good guess... possibly provably better than the
> un-corrected value of C, but I would balk at claiming that it's
> optimal.

I fully agree. With scaling it's a more reasonable guess.
Can you let me know where I should rephrase to avoid the word "optimal"?
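Just to make the scaling heuristic explicit (a sketch only, the helper
name is made up): keeping alpha = 1 / (C * n_samples) fixed means C has
to shrink as the sample size grows.

    # Sketch of the sample-size correction under discussion.
    # With alpha = 1 / (C * n_samples), keeping alpha fixed when moving
    # from a subset to the full set implies C_full = C_sub * n_sub / n_full.
    # The function name is hypothetical, not part of the actual API.
    def rescale_C(C_sub, n_sub, n_full):
        """Extrapolate a C tuned on n_sub samples to n_full samples."""
        alpha = 1.0 / (C_sub * n_sub)   # regularization strength, kept fixed
        return 1.0 / (alpha * n_full)   # equivalent C on the full set

    # e.g. C=10 found by cross-validation on 1000 samples
    # becomes C=1 when re-learning on 10000 samples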

> I can also appreciate why you'd want a parametrization (via alpha)
> that makes this correction heuristic automatic, in that you actually
> don't have to change the number that comes out of cross-validation
> when re-learning on the full set. That's really convenient!
>
> How about parametrizing the wrapper like this:
>
> SVC(C=None, alpha=None, ...)
>
> ... and deleting the scale_C parameter.
>
> This way old code still works, new code can use alpha, and if anyone
> specifies both C and alpha you raise an error.
>
> The alpha specified this way could (should?) have the same name and
> interpretation as the l2_regularization coefficient in the
> SGDClassifier.

I have learned to live with scale_C, and as Mathieu said, both choices provide
the same flexibility. It feels odd to me to have two parameters that are
mutually exclusive, but that's only my personal opinion.
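If I understand the proposal correctly, it would look roughly like this
(just a sketch with a made-up class name; the real SVC has many more
parameters):

    # Hypothetical sketch of the C / alpha parametrization James suggests.
    class SVCWrapper:
        def __init__(self, C=None, alpha=None):
            if C is not None and alpha is not None:
                raise ValueError("C and alpha are mutually exclusive; "
                                 "pass only one of them.")
            self.C = C
            self.alpha = alpha

        def _effective_C(self, n_samples):
            # alpha would have the same interpretation as the l2
            # regularization coefficient in SGDClassifier:
            # alpha = 1 / (C * n_samples)
            if self.alpha is not None:
                return 1.0 / (self.alpha * n_samples)
            # old code keeps working: plain C, default 1.0 as before
            return self.C if self.C is not None else 1.0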

Does anybody else feel the same as James?

Alex
