Re: [scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

2019-06-10 Thread Alexandre Gramfort
see https://github.com/scikit-learn/scikit-learn/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+scale_C+ for a historical perspective on this issue.

Alex

On Wed, May 29, 2019 at 11:32 PM Stuart Reynolds wrote:
> I looked into this a while ago. There were differences in which algorithms
> regularized the intercept, and which ones do not.

Re: [scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

2019-05-29 Thread Stuart Reynolds
I looked into this a while ago. There were differences in which algorithms regularized the intercept, and which ones do not (I believe liblinear does, lbfgs does not). All of the algorithms disagreed with logistic regression in scipy.

- Stuart

On Wed, May 29, 2019 at 10:50 AM Andreas Mueller wrote:
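[A sketch, not from the thread: one way to observe the intercept-penalization difference Stuart describes. The solver behavior (liblinear penalizes the intercept via intercept_scaling, lbfgs does not) is per the scikit-learn docs; the data here is made up for illustration.]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(500, 3)
# Build labels with a strong positive offset, so the true intercept is far from 0.
p = 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, -1.0, 0.5]) + 2.0)))
y = (rng.rand(500) < p).astype(int)

# Heavy regularization (small C) makes the difference visible:
# liblinear also shrinks the intercept, lbfgs leaves it unpenalized.
lib = LogisticRegression(C=1e-3, solver="liblinear").fit(X, y)
lbf = LogisticRegression(C=1e-3, solver="lbfgs").fit(X, y)

print(lib.intercept_, lbf.intercept_)  # intercepts disagree noticeably
```

With a small C, lbfgs still recovers an intercept near the log-odds of the class prior, while liblinear shrinks it toward zero along with the weights.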

Re: [scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

2019-05-29 Thread Andreas Mueller
That is not ideal indeed. I think we just went with what liblinear did, and when saga was introduced we kept that behavior. It should probably be scaled as in Lasso, I would imagine?

On 5/29/19 1:42 PM, Michael Eickenberg wrote:
> Hi Jesse, I think there was an effort to compare normalization methods
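[For reference, a sketch of the two documented objectives (notation assumed), which makes the scaling mismatch explicit:]

```latex
% Lasso (scikit-learn docs): squared error is averaged over the samples
\min_w \; \frac{1}{2 n_{\text{samples}}} \lVert y - Xw \rVert_2^2
  + \alpha \lVert w \rVert_1

% LogisticRegression with penalty='l1': log-loss is summed, C is the
% inverse of the regularization strength
\min_{w, b} \; \lVert w \rVert_1
  + C \sum_{i=1}^{n_{\text{samples}}} \log\!\bigl(1 + \exp(-y_i (x_i^\top w + b))\bigr)

% Dividing the second objective by C \, n_{\text{samples}} suggests the
% correspondence  \alpha \approx 1 / (C \, n_{\text{samples}}).
```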

Re: [scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

2019-05-29 Thread Michael Eickenberg
Hi Jesse,

I think there was an effort to compare normalization methods on the data attachment term between Lasso and Ridge regression back in 2012/13, but this might not have been finished or extended to Logistic Regression. If it is not documented well, it could definitely benefit from a documentation update.

[scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

2019-05-29 Thread Jesse Livezey
Hi everyone,

I noticed recently that in the Lasso implementation (and docs), the MSE term is normalized by the number of samples (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html), but for LogisticRegression + L1, the log-loss does not seem to be normalized by the number of samples.
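[A sketch of the consequence Jesse points out, on made-up data: because Lasso averages the squared error over n_samples, duplicating every row leaves the fit unchanged, whereas LogisticRegression sums the log-loss, so the same duplication acts like doubling C.]

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
w_true = np.array([1.5, -2.0, 0.0, 0.0, 1.0])
y_reg = X @ w_true + 0.1 * rng.randn(100)
y_clf = (X @ w_true > 0).astype(int)

X2 = np.vstack([X, X])  # every sample appears twice

# Lasso: (1/(2n))||y - Xw||^2 + alpha*||w||_1 -- invariant to duplication.
a = Lasso(alpha=0.1).fit(X, y_reg).coef_
b = Lasso(alpha=0.1).fit(X2, np.hstack([y_reg, y_reg])).coef_
print(np.allclose(a, b))  # True: the n_samples scaling absorbs the duplication

# LogisticRegression: ||w||_1 + C*sum(logloss) -- duplication acts like 2C.
def fit_lr(C, X_, y_):
    return LogisticRegression(penalty="l1", C=C, solver="liblinear",
                              tol=1e-8, max_iter=10000,
                              random_state=0).fit(X_, y_).coef_

c = fit_lr(0.10, X, y_clf)
d = fit_lr(0.10, X2, np.hstack([y_clf, y_clf]))
e = fit_lr(0.05, X2, np.hstack([y_clf, y_clf]))
print(np.allclose(c, d, atol=1e-3))  # False: effective regularization weakened
print(np.allclose(c, e, atol=1e-2))  # True: halving C compensates
```

So a user who moves between the two estimators (or changes dataset size) currently has to rescale the penalty by hand, roughly C ≈ 1/(alpha * n_samples).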