This is a very nice explanation. Thanks!!
Re. "We should therefore never compute p-values": I assume that you meant
that within the narrow context of regression, and not, e.g., in the context
of tests of distribution.
On Sat, Apr 18, 2015 at 3:31 PM, Sturla Molden <sturla.mol...@gmail.com>
wrote:
> Phillip Feldman <phillip.m.feld...@gmail.com>
> wrote:
>
> > When using logistic regression, I'm often trying to establish whether a
> > given feature has any effect.
>
> Compare models with and without the feature: Cross-validation, BIC, AIC,
> PRESS, Bayes factor, etc. By the rules of inductive reasoning (cf. lex
> parsimoniae, Occam's razor), the model that better predicts future data is
> the more likely. If the model without the feature included gives equally
> good or better predictions, Occam's razor instructs us that we ought to
> assume that the feature has no substantial effect.
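
A minimal sketch of the with/without comparison described above, using cross-validated log loss in scikit-learn. The synthetic data, the names `x_signal` and `x_noise`, and the helper `cv_log_loss` are assumptions made for illustration; they are not from the thread.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption for illustration): x_signal drives the
# outcome, x_noise is unrelated to it.
rng = np.random.default_rng(0)
n = 2000
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-2.0 * x_signal))
y = (rng.random(n) < prob).astype(int)


def cv_log_loss(X, y):
    """Mean 5-fold cross-validated log loss (lower is better)."""
    scores = cross_val_score(
        LogisticRegression(), X, y, cv=5, scoring="neg_log_loss"
    )
    return -scores.mean()


# Fit the same model on different feature sets and compare held-out loss.
loss_signal_only = cv_log_loss(x_signal.reshape(-1, 1), y)
loss_noise_only = cv_log_loss(x_noise.reshape(-1, 1), y)
loss_both = cv_log_loss(np.column_stack([x_signal, x_noise]), y)

print(f"signal only : {loss_signal_only:.4f}")
print(f"noise only  : {loss_noise_only:.4f}")
print(f"both        : {loss_both:.4f}")
```

If dropping a feature leaves the held-out loss essentially unchanged, as is expected here for `x_noise`, the simpler model is the one to prefer under the argument above.
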
>
> > R and Matlab give me p-values, but
> > Scikit-learn does not.
>
> p-values are not useful for model building (model selection). Actually,
> p-values are not useful for anything and should be banned: It is
> unfortunate that we use the word "significant" if p < 0.05, because it does
> not mean "significant" in the linguistic sense. A feature has a
> "significant effect" if p < 0.05, but it does not mean that the feature is
> likely to have an effect. That is an inductive statement which we should
> infer by model selection. Because of the way the p-value behaves, it is not
> an Occam's razor. A feature can have a "significant effect" on past data,
> but still deteriorate future predictions if included. This is particularly
> the case if you have a large data set. Using the p-value to evaluate a
> feature means we can draw a conclusion not supported by the data. We should
> therefore never compute p-values.
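
One way to see the large-data-set point is to put a p-value next to BIC, one of the criteria listed earlier. This is a sketch under stated assumptions: the p-value comes from a likelihood-ratio test on nested models with one extra parameter (approximately chi-square with 1 degree of freedom), and the numbers `deviance_drop` and `n_samples` are hypothetical, not results from any real fit.

```python
import math


def lr_test_p_value(deviance_drop):
    """Approximate p-value of a likelihood-ratio test for ONE added parameter.

    The upper tail of a chi-square distribution with 1 degree of freedom
    at x equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(deviance_drop / 2.0))


def bic_change(deviance_drop, n_samples, extra_params=1):
    """BIC(larger model) - BIC(smaller model); positive favors the smaller."""
    return extra_params * math.log(n_samples) - deviance_drop


deviance_drop = 6.0    # hypothetical drop in -2 * log-likelihood
n_samples = 100_000    # hypothetical sample size

p_value = lr_test_p_value(deviance_drop)
delta_bic = bic_change(deviance_drop, n_samples)

print(f"p-value   : {p_value:.4f}")
print(f"delta BIC : {delta_bic:.2f}")
```

With these made-up numbers the feature is "significant" at p < 0.05, yet BIC increases when it is added, so BIC favors the model without it. The BIC penalty grows with log(n), while the 0.05 threshold does not depend on the sample size.
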
>
> Sturla
>
>
>
> ------------------------------------------------------------------------------
> BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
> Develop your own process in accordance with the BPMN 2 standard
> Learn Process modeling best practices with Bonita BPM through live
> exercises
> http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual-
> event?utm_
> source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>