More important than Sturla's statement, which I may or may not agree with depending on the modeling assumptions (and every p-value rests on a modeling assumption), is that the logistic regression in scikit-learn is a penalized logistic model. The closed-form formulas for p-values are therefore not valid.
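To illustrate the penalization point, here is a minimal sketch on synthetic data. The data, the variable names, and the choice of `C=1e6` as a stand-in for an (almost) unpenalized fit are assumptions made for the illustration only. It compares the coefficients from the default `LogisticRegression()` with those from a very weakly penalized fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption for illustration): only column 0 drives the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))
prob = 1.0 / (1.0 + np.exp(-1.5 * X[:, 0]))
y = (rng.random(80) < prob).astype(int)

# Default estimator: regularized, with inverse regularization strength C=1.0.
default_model = LogisticRegression().fit(X, y)

# A very large C makes the penalty negligible, which approximates the
# unpenalized maximum-likelihood fit that classical p-value formulas assume.
weak_model = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)

default_norm = np.linalg.norm(default_model.coef_)
weak_norm = np.linalg.norm(weak_model.coef_)

print("default C:", default_model.get_params()["C"])
print("coefficient norm, default penalty:", default_norm)
print("coefficient norm, weak penalty:   ", weak_norm)
```

The default fit returns coefficients shrunk toward zero relative to the weakly penalized one, so they are not the maximum-likelihood estimates on which the usual closed-form standard errors are based.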
G

On Sat, Apr 18, 2015 at 10:31:27PM +0000, Sturla Molden wrote:
> Phillip Feldman <phillip.m.feld...@gmail.com> wrote:
> > When using logistic regression, I'm often trying to establish whether a
> > given feature has any effect.
>
> Compare models with and without the feature: Cross-validation, BIC, AIC,
> PRESS, Bayes factor, etc. By the rules of inductive reasoning (cf. lex
> parsimoniae, Occam's razor), the model that better predicts future data is
> the more likely. If the model without the feature included gives equally
> good or better predictions, Occam's razor instructs us that we ought to
> assume that the feature has no substantial effect.
>
> > R and Matlab give me p-values, but
> > Scikit-learn does not.
>
> p-values are not useful for model building (model selection). Actually,
> p-values are not useful for anything and should be banned: It is
> unfortunate that we use the word "significant" if p < 0.05, because it does
> not mean "significant" in the linguistic sense. A feature has a
> "significant effect" if p < 0.05, but it does not mean that the feature is
> likely to have an effect. That is an inductive statement which we should
> infer by model selection. Because of the way the p-value behaves, it is not
> an Occam's razor. A feature can have a "significant effect" on past data,
> but still deteriorate future predictions if included. This is particularly
> the case if you have a large data set. Using the p-value to evaluate a
> feature means we can draw a conclusion not supported by the data. We should
> therefore never compute p-values.
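The with/without comparison described in the quoted text can be sketched with cross-validation roughly as follows. This is only an illustration under stated assumptions: the synthetic data, the helper name `cv_score_without`, the 5-fold split, and the log-loss scoring are choices made here, not something prescribed in the thread.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (assumption): columns 0 and 1 are informative, column 2 is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
prob = 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))
y = (rng.random(300) < prob).astype(int)

model = LogisticRegression()


def cv_score_without(X, y, feature):
    """Mean cross-validated negative log-loss with one column removed (hypothetical helper)."""
    X_reduced = np.delete(X, feature, axis=1)
    return cross_val_score(model, X_reduced, y, cv=5, scoring="neg_log_loss").mean()


# Score with every feature included (higher, i.e. closer to zero, is better).
score_with = cross_val_score(model, X, y, cv=5, scoring="neg_log_loss").mean()
print("all features:", score_with)

# Score after dropping each feature in turn.
scores_without = [cv_score_without(X, y, j) for j in range(X.shape[1])]
for j, score in enumerate(scores_without):
    print("without feature", j, ":", score)
```

If dropping a feature leaves the cross-validated score unchanged or improves it, the parsimony argument above says to prefer the smaller model; a clear drop in the score suggests the feature carries predictive information.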
> Sturla

-- 
Gael Varoquaux
Researcher, INRIA Parietal
Laboratoire de Neuro-Imagerie Assistee par Ordinateur
NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info
http://twitter.com/GaelVaroquaux
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general