On 18 March 2012 09:22, Andreas <[email protected]> wrote:
> On 03/18/2012 05:07 PM, James Bergstra wrote:
>> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel<[email protected]>  
>> wrote:
>>
>>>> The alpha specified this way could (should?) have the same name and
>>>> interpretation as the l2_regularization coefficient in the
>>>> SGDClassifier.
>>>>
>>> Would you convert alpha into a C internal value or would you patch
>>> libsvm / liblinear to use alpha? I don't understand how the former
>>> would be different from the scale_C option, in practice.
>>>
>> In the implementation, I would convert it to C and call libsvm
>> similarly to how scale_C is working now. The reason that I piped up on
>> the list was purely for code readability. If I read
>>
>> svm = SVC(C=10)
>>
>> it really looks like svm is an SVM model with C=10.  If there's an
>> implicit scale_C=True in the arguments, it's confusing.  This caused
>> my code to have a bug, and I got annoyed.
>>
>> On the other hand if I had read
>>
>> svm = SVC(alpha=1e-3)
>>
>> then I would have wondered "what's alpha?" and gone to look up the
>> docs and learn how alpha is converted to C.
>>
> I see two possible remedies:
> - Having two different possible parameters, as James and I
>   proposed earlier (and which Lars didn't seem to like much),
>   where the user can specify either "C" or "alpha"

I would probably side with the unscaled C + alpha camp as well. Having
2 mutually exclusive hyper-params feels weird, but it at least respects
the principle of least surprise for libsvm / liblinear users, while
people who care about consistency (both for grid search and for naming
conventions shared with other linear models) can use alpha.
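
To make the alpha option concrete, here is a rough sketch of the
conversion. It assumes that alpha weights the L2 penalty against the
*mean* loss over the samples (the SGDClassifier convention), while C
weights the *summed* loss against an unweighted penalty (the libsvm
convention). Under that assumption C = 1 / (alpha * n_samples). The
name alpha_to_C is hypothetical, not an existing scikit-learn function:

```python
def alpha_to_C(alpha, n_samples):
    """Hypothetical helper: map an SGD-style alpha to a libsvm-style C.

    Assumed objectives (not taken from the scikit-learn source):
      SGD-style:    mean(loss) + alpha * 0.5 * ||w||^2
      libsvm-style: C * sum(loss) + 0.5 * ||w||^2
    Dividing the libsvm-style objective by (C * n_samples) gives
    alpha = 1 / (C * n_samples), hence the formula below.
    """
    if alpha <= 0:
        raise ValueError("alpha must be strictly positive")
    if n_samples <= 0:
        raise ValueError("n_samples must be strictly positive")
    return 1.0 / (alpha * n_samples)


# With alpha=1e-3 and 100 training samples, the unscaled C is about 10.
C = alpha_to_C(alpha=1e-3, n_samples=100)
```

The resulting C depends on n_samples, which is why a fixed alpha stays
comparable across dataset sizes whereas a fixed unscaled C does not.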

> - Changing the "scale_C" option back to "False" by default.
>   That means having different parameter names than other
>   linear models and "inconsistent" cross-validation.
>
> Are there any other options?

We could also keep the current behavior and make the documentation
more explicit about it, explaining the motivation for scaling C by
default (consistency across dataset sizes when doing grid search).

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
