Could you try normalizing the dataset after dummy-encoding the features and see whether the behavior is still reproducible?
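For concreteness, here is a minimal sketch of the comparison I have in mind. Everything in it is an assumption made for illustration: the data are synthetic (one categorical covariate, success rates below 0.05), the helper names `make_dummy_data` and `compare_calibration` are made up, and `solver="liblinear"` and `C=1.0` are only guesses at your setup. It fits the same regularized model on the raw dummy columns and on standardized ones, then reports the mean predicted probability next to the observed success rate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_dummy_data(n_samples=5000, n_levels=6, seed=0):
    """Synthetic stand-in for the real data (an assumption, not Rachel's data)."""
    rng = np.random.default_rng(seed)
    codes = rng.integers(0, n_levels, size=n_samples)
    X = np.eye(n_levels)[codes]  # dummy (one-hot) encoding of one categorical
    # Hypothetical per-level success probabilities, all below 0.05.
    p = np.linspace(0.01, 0.04, n_levels)[codes]
    y = (rng.random(n_samples) < p).astype(int)
    return X, y


def compare_calibration(C=1.0, seed=0):
    """Return the observed success rate and the mean predicted rate per model."""
    X, y = make_dummy_data(seed=seed)
    models = {
        # Regularized fit on the raw 0/1 dummy columns.
        "raw": LogisticRegression(C=C, solver="liblinear"),
        # Same model, but each dummy column is centered and scaled first.
        "standardized": make_pipeline(
            StandardScaler(),
            LogisticRegression(C=C, solver="liblinear"),
        ),
    }
    rates = {"observed": float(y.mean())}
    for name, model in models.items():
        model.fit(X, y)
        rates[name] = float(model.predict_proba(X)[:, 1].mean())
    return rates


if __name__ == "__main__":
    for name, rate in compare_calibration().items():
        print(f"{name:>12}: {rate:.4f}")
```

If the gap between the "observed" and "raw" rates shrinks in the "standardized" row on your real data, that would suggest the feature scaling interacts with the penalty; if it does not, the cause is probably elsewhere.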
2016-12-15 22:03 GMT+03:00 Rachel Melamed <mela...@uchicago.edu>:

> Thanks for the reply. The covariates ("X") are all dummy/categorical
> variables. So I guess no, nothing is normalized.
>
> On Dec 15, 2016, at 1:54 PM, Alexey Dral <aad...@gmail.com> wrote:
>
> Hi Rachel,
>
> Do you have your data normalized?
>
> 2016-12-15 20:21 GMT+03:00 Rachel Melamed <mela...@uchicago.edu>:
>
>> Hi all,
>> Does anyone have any suggestions for this problem:
>> http://stackoverflow.com/questions/41125342/sklearn-logistic-regression-gives-biased-results
>>
>> I am running around 1000 similar logistic regressions, with the same
>> covariates but slightly different data and response variables. All of my
>> response variables have sparse successes (p(success) < .05 usually).
>>
>> I noticed that with the regularized regression, the results are
>> consistently biased to predict more "successes" than is observed in the
>> training data. When I relax the regularization, this bias goes away. The
>> bias observed is unacceptable for my use case, but the more-regularized
>> model does seem a bit better.
>>
>> Below, I plot the results for the 1000 different regressions for 2
>> different values of C: [image: results for the different regressions for
>> 2 different values of C] <https://i.stack.imgur.com/1cbrC.png>
>>
>> I looked at the parameter estimates for one of these regressions: below
>> each point is one parameter. It seems like the intercept (the point on the
>> bottom left) is too high for the C=1 model. [image: enter image
>> description here] <https://i.stack.imgur.com/NTFOY.png>
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>
> --
> Yours sincerely,
> Alexey A. Dral

--
Yours sincerely,
Alexey A. Dral