Thanks for the reply. The covariates ("X") are all dummy/categorical variables. So I guess no, nothing is normalized.
On Dec 15, 2016, at 1:54 PM, Alexey Dral <aad...@gmail.com> wrote:

Hi Rachel,

Do you have your data normalized?

2016-12-15 20:21 GMT+03:00 Rachel Melamed <mela...@uchicago.edu>:

Hi all,

Does anyone have any suggestions for this problem:
http://stackoverflow.com/questions/41125342/sklearn-logistic-regression-gives-biased-results

I am running around 1000 similar logistic regressions, with the same covariates but slightly different data and response variables. All of my response variables have sparse successes (p(success) < 0.05, usually). I noticed that with the regularized regression, the results are consistently biased to predict more "successes" than are observed in the training data. When I relax the regularization, this bias goes away. The bias observed is unacceptable for my use case, but the more-regularized model does seem a bit better. Below, I plot the results for the 1000 different regressions for 2 different values of C:

[results for the different regressions for 2 different values of C]
https://i.stack.imgur.com/1cbrC.png

I looked at the parameter estimates for one of these regressions; below, each point is one parameter. It seems like the intercept (the point on the bottom left) is too high for the C=1 model.

[parameter estimates for one of the regressions]
https://i.stack.imgur.com/NTFOY.png

--
Yours sincerely,
Alexey A. Dral
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
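A minimal, self-contained sketch of one plausible mechanism for the bias described in the quoted question, assuming the liblinear solver (the scikit-learn default in versions of that era): liblinear penalizes the intercept along with the coefficients, so with rare successes the strongly negative intercept is shrunk toward zero and the average predicted probability overshoots the observed rate; with a very large C the penalty becomes negligible and the gap closes. The data below are simulated, and every size and parameter value is made up for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)

# Simulated stand-in for one of the ~1000 regressions: dummy covariates and
# rare successes (base rate around 2%). All sizes/values here are made up.
n, k = 2000, 10
X = rng.binomial(1, 0.3, size=(n, k))       # 0/1 dummy covariates
true_coef = rng.normal(0.0, 0.3, size=k)
true_intercept = -4.0                       # rare events => strongly negative intercept
p_true = 1.0 / (1.0 + np.exp(-(true_intercept + X.dot(true_coef))))
y = rng.binomial(1, p_true)

print("observed success rate: %.4f" % y.mean())

# liblinear penalizes the intercept together with the coefficients, so at C=1
# the intercept is pulled toward 0 and the mean predicted probability exceeds
# the observed success rate; at a huge C the penalty is effectively off.
for C in (1.0, 1e6):
    clf = LogisticRegression(C=C, solver="liblinear")
    clf.fit(X, y)
    mean_pred = clf.predict_proba(X)[:, 1].mean()
    print("C=%g: intercept=%.2f, mean predicted probability=%.4f"
          % (C, clf.intercept_[0], mean_pred))

Solvers that do not penalize the intercept (e.g. lbfgs or newton-cg) should not show this particular bias, which would be consistent with the intercept looking "too high" in the C=1 plot above.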