On Sat, Apr 18, 2015 at 6:40 PM, Phillip Feldman <phillip.m.feld...@gmail.com> wrote:
> This is a very nice explanation. Thanks!!
>
> Re. "We should therefore never compute p-values": I assume that you meant
> that within the narrow context of regression, and not, e.g., in the context
> of tests of distribution.

Sturla means: no null hypothesis testing at all, and the editors of one journal agree with this:

https://groups.google.com/d/msg/pystatsmodels/e8aTj2ydyFI/odkShG2K3wwJ
http://www.scientificamerican.com/article/scientists-perturbed-by-loss-of-stat-tool-to-sift-research-fudge-from-fact/

Fortunately for statsmodels, there is a large part of the world that also wants to know which variables affect an event or characteristic, instead of just doing best prediction with anonymous variables.
(I just went through some articles to see how we can produce p-values after feature selection with penalized least squares or maximum penalized likelihood. :)

Josef
What's the effect of extended pacifier use?

>
> On Sat, Apr 18, 2015 at 3:31 PM, Sturla Molden <sturla.mol...@gmail.com>
> wrote:
>>
>> Phillip Feldman <phillip.m.feld...@gmail.com>
>> wrote:
>>
>> > When using logistic regression, I'm often trying to establish whether a
>> > given feature has any effect.
>>
>> Compare models with and without the feature: Cross-validation, BIC, AIC,
>> PRESS, Bayes factor, etc. By the rules of inductive reasoning (cf. lex
>> parsimoniae, Occam's razor), the model that better predicts future data is
>> the more likely. If the model without the feature included gives equally
>> good or better predictions, Occam's razor instructs us that we ought to
>> assume that the feature has no substantial effect.
>>
>> > R and Matlab give me p-values, but
>> > Scikit-learn does not.
>>
>> p-values are not useful for model building (model selection). Actually,
>> p-values are not useful for anything and should be banned: It is
>> unfortunate that we use the word "significant" if p < 0.05, because it does
>> not mean "significant" in the linguistic sense. A feature has a
>> "significant effect" if p < 0.05, but it does not mean that the feature is
>> likely to have an effect. That is an inductive statement which we should
>> infer by model selection.
>> Because of the way the p-value behaves, it is not
>> an Occam's razor. A feature can have a "significant effect" on past data,
>> but still deteriorate future predictions if included. This is particularly
>> the case if you have a large data set. Using the p-value to evaluate a
>> feature means we can draw a conclusion not supported by the data. We should
>> therefore never compute p-values.
>>
>> Sturla
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
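
Sturla's suggestion above ("compare models with and without the feature") can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from anyone in the thread: the data are simulated, the feature names x1 (true effect) and x2 (pure noise) are made up, and the helper functions (fit_logistic, log_likelihood, compare_models) are hypothetical names written with the standard library only so that the example has no dependencies. It reports AIC = 2k - 2 log L, BIC = k ln(n) - 2 log L, and the mean log loss on held-out data for each candidate model.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(w, xi):
    """P(y = 1 | xi) for weights w = [intercept, w1, ...]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def fit_logistic(X, y, n_iter=300, lr=1.0):
    """Unpenalized maximum-likelihood fit by batch gradient ascent."""
    n = len(X)
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def log_likelihood(w, X, y):
    """Bernoulli log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, xi), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def compare_models(seed=0, n_train=300, n_test=100):
    """Fit models with and without each feature; return AIC, BIC, held-out loss."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_train + n_test):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([x1, x2])
        # Assumption for the simulation: only x1 influences the outcome.
        y.append(1 if rng.random() < sigmoid(0.5 + 2.0 * x1) else 0)

    results = {}
    for name, cols in [("x1+x2", [0, 1]), ("x1 only", [0]), ("x2 only", [1])]:
        Xs = [[row[c] for c in cols] for row in X]
        X_tr, y_tr = Xs[:n_train], y[:n_train]
        X_te, y_te = Xs[n_train:], y[n_train:]
        w = fit_logistic(X_tr, y_tr)
        ll = log_likelihood(w, X_tr, y_tr)
        k = len(w)  # number of fitted parameters, intercept included
        results[name] = {
            "aic": 2 * k - 2 * ll,
            "bic": k * math.log(n_train) - 2 * ll,
            "heldout_log_loss": -log_likelihood(w, X_te, y_te) / n_test,
        }
    return results


results = compare_models()
for name, r in results.items():
    print(name, {key: round(val, 3) for key, val in r.items()})
```

Lower is better for all three numbers. Dropping x1 ("x2 only") should make every criterion clearly worse, which is the model-comparison way of saying that x1 has an effect. For the noise feature x2, compare the "x1+x2" and "x1 only" rows: how close they are will vary with the simulated sample. In practice one would probably use library routines instead of hand-written fitting, for example cross-validated scoring in scikit-learn, or the information criteria (and p-values, if wanted) that statsmodels reports for a fitted Logit model.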