Hi Rachel,

Have you normalized your data?
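One rough way to check this is to standardize the covariates in a pipeline and compare the mean predicted probability with the observed success rate for a few values of C. The sketch below uses synthetic stand-in data (the sample size, scales and coefficients are my assumptions, not your data). It also tries two solvers, because scikit-learn documents that with solver="liblinear" the intercept is fitted as the weight of a synthetic constant feature and is therefore regularized like the other coefficients. With rare successes that could pull the intercept toward zero and inflate the predicted rate, though I cannot say that this is what happens in your case.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the data from the thread):
# five covariates on very different scales and a rare positive class.
n_samples = 2000
scales = np.array([1.0, 10.0, 100.0, 1.0, 0.1])
X = rng.normal(loc=0.0, scale=scales, size=(n_samples, scales.size))
z = X / scales  # unit-variance version, used only to generate the labels
logits = -3.5 + 0.8 * z[:, 0] + 0.5 * z[:, 1]
y = (rng.random(n_samples) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

observed_rate = y.mean()
results = {}
for solver in ("liblinear", "lbfgs"):
    for C in (0.01, 1.0, 100.0):
        # Standardize inside the pipeline so the penalty treats all covariates alike.
        model = make_pipeline(
            StandardScaler(),
            LogisticRegression(C=C, solver=solver, max_iter=1000),
        )
        model.fit(X, y)
        # Mean predicted probability of success on the training data.
        results[(solver, C)] = model.predict_proba(X)[:, 1].mean()

print(f"observed success rate: {observed_rate:.4f}")
for (solver, C), rate in results.items():
    print(f"{solver:>9}  C={C:<6}  mean predicted p = {rate:.4f}")
```

If the mean predicted probability tracks the observed rate with lbfgs but drifts upward with liblinear as C shrinks, the penalty on the intercept is a likely suspect. In that case a solver that leaves the intercept unpenalized, or a larger intercept_scaling with liblinear, may be worth trying.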
2016-12-15 20:21 GMT+03:00 Rachel Melamed <mela...@uchicago.edu>:

> Hi all,
> Does anyone have any suggestions for this problem:
> http://stackoverflow.com/questions/41125342/sklearn-logistic-regression-gives-biased-results
>
> I am running around 1000 similar logistic regressions, with the same
> covariates but slightly different data and response variables. All of my
> response variables have sparse successes (usually p(success) < .05).
>
> I noticed that with the regularized regression, the results are
> consistently biased to predict more "successes" than are observed in the
> training data. When I relax the regularization, this bias goes away. The
> bias observed is unacceptable for my use case, but the more-regularized
> model does seem a bit better.
>
> Below, I plot the results for the 1000 different regressions for 2
> different values of C:
> [image: results for the different regressions for 2 different values of C]
> <https://i.stack.imgur.com/1cbrC.png>
>
> I looked at the parameter estimates for one of these regressions: below,
> each point is one parameter. It seems like the intercept (the point on the
> bottom left) is too high for the C=1 model.
> [image: parameter estimates for one of the regressions]
> <https://i.stack.imgur.com/NTFOY.png>

--
Yours sincerely,
Alexey A. Dral
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn