Phillip Feldman <phillip.m.feld...@gmail.com>
wrote:

> When using logistic regression, I'm often trying to establish whether a
> given feature has any effect.  

Compare models with and without the feature, using cross-validation, BIC,
AIC, PRESS, Bayes factors, etc. By the rules of inductive reasoning (cf. lex
parsimoniae, Occam's razor), the model that better predicts future data is
the more likely one. If the model without the feature gives equally good or
better predictions, Occam's razor tells us to assume that the feature has no
substantial effect.
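
As a rough sketch of the cross-validation variant of this comparison, the
snippet below scores a logistic regression with and without a given column
by out-of-sample log loss. The synthetic data, the coefficients, the helper
name cv_log_loss and the choice of 5 folds are all assumptions made for
illustration, not anything prescribed above; BIC, AIC or a Bayes factor
could be substituted for the scoring step.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic data (assumed for illustration): columns 0 and 1 drive the
# outcome, column 2 is pure noise with no effect on y.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
p = 1.0 / (1.0 + np.exp(-logits))
y = rng.binomial(1, p)


def cv_log_loss(X, y, columns):
    """Mean 5-fold cross-validated log loss using only the given columns."""
    # A very large C makes the default L2 penalty negligible, so this is
    # close to an unpenalized maximum-likelihood fit.
    model = LogisticRegression(C=1e6, max_iter=1000)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(model, X[:, columns], y,
                             cv=cv, scoring="neg_log_loss")
    return -scores.mean()  # scikit-learn reports the negated loss


loss_full = cv_log_loss(X, y, [0, 1, 2])
loss_without_noise = cv_log_loss(X, y, [0, 1])    # drop the noise column
loss_without_signal = cv_log_loss(X, y, [1, 2])   # drop a real feature

print("all features:       ", loss_full)
print("without column 2:   ", loss_without_noise)
print("without column 0:   ", loss_without_signal)
```

Lower log loss means better predictions on held-out data. If dropping a
column leaves the loss essentially unchanged (or lowers it), the simpler
model is preferred; if the loss rises clearly, the feature is earning its
place.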

> R and Matlab give me p-values, but
> Scikit-learn does not.

p-values are not useful for model building (model selection). Actually,
p-values are not useful for anything and should be banned. It is
unfortunate that we use the word "significant" if p < 0.05, because it does
not mean "significant" in the everyday sense of the word. A feature has a
"significant effect" if p < 0.05, but that does not mean that the feature is
likely to have an effect. That is an inductive statement, which we should
infer by model selection. Because of the way the p-value behaves, it is not
an Occam's razor. A feature can have a "significant effect" on past data,
but still degrade future predictions if included. This is particularly
the case if you have a large data set. Using the p-value to evaluate a
feature means we can draw a conclusion not supported by the data. We should
therefore never compute p-values.

Sturla


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
