It wouldn't hurt to have p-values returned, but personally, I don't miss them in scikit-learn. I think that's a classic "ML vs. statistics" discussion -- what I mean is the inference vs. prediction stuff. To me, scikit-learn is primarily a machine learning library.
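For what it's worth, the BIC argument in Sturla's message quoted below can be tried out in a few lines. This is only a rough sketch of mine, not something from the thread: I am reading x ~ N(sigma,0) as a zero-mean normal with unknown spread, plugging in the maximum-likelihood estimates, and using BIC = k*ln(n) - 2*ln(L). The function names and the toy data are made up for illustration.

```python
import math


def gaussian_loglik(xs, mu, sigma2):
    """Log-likelihood of the sample xs under a normal with mean mu, variance sigma2."""
    n = len(xs)
    rss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)


def bic(loglik, k, n):
    """Bayesian information criterion, k*ln(n) - 2*ln(L); lower is better."""
    return k * math.log(n) - 2.0 * loglik


def bic_null_vs_alt(xs):
    """Return (BIC of the zero-mean model, BIC of the free-mean model)."""
    n = len(xs)
    # Null model: mean fixed at 0, only the variance is estimated (k = 1).
    s2_null = sum(x * x for x in xs) / n
    # Alternative model: mean and variance are both estimated (k = 2).
    mu_hat = sum(xs) / n
    s2_alt = sum((x - mu_hat) ** 2 for x in xs) / n
    return (
        bic(gaussian_loglik(xs, 0.0, s2_null), 1, n),
        bic(gaussian_loglik(xs, mu_hat, s2_alt), 2, n),
    )


# Toy data (hypothetical): centred exactly on 0, then the same values shifted by 3.
centred = [-2.0, -1.0, 0.0, 1.0, 2.0] * 4
shifted = [x + 3.0 for x in centred]

for name, xs in (("centred", centred), ("shifted", shifted)):
    b0, b1 = bic_null_vs_alt(xs)
    winner = "zero-mean" if b0 < b1 else "free-mean"
    print(f"{name}: BIC(zero-mean)={b0:.2f}  BIC(free-mean)={b1:.2f}  -> {winner}")
```

On the centred data both models fit equally well, so the extra parameter only costs the ln(n) penalty and the zero-mean model comes out ahead. On the shifted data the free-mean model wins despite the penalty.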
> On Apr 19, 2015, at 12:53 AM, Sturla Molden <sturla.mol...@gmail.com> wrote:
>
> <josef.p...@gmail.com> wrote:
>
>> Good, I was reading your previous comments on the topic as being
>> against all frequentist null hypothesis testing.
>
> In the frequentist paradigm I prefer to use model selection instead of
> classical hypothesis testing with p-values. My focus is on building useful
> models which are able to predict future outcomes.
>
> In Bayesian statistics hypothesis testing and model selection are
> identical.
>
> Sturla
>
>>
>> Note. The editors of Basic and Applied Social Psychology are also
>> banning confidence intervals.
>>
>>>
>>> A null hypothesis test is also just a matter of model selection: In the
>>> case of the classical t-test, the null hypothesis is a model selection
>>> between one model with a single parameter x ~ N(sigma,0) and the
>>> alternative hypothesis is a model with two parameters, x ~ N(sigma,mu). If
>>> the mean is actually 0, adding an additional parameter mu should overfit
>>> the data. You can e.g. see this on the BIC value.
>>>
>>>> and the editors of one journal agree with this
>>>>
>>>> https://groups.google.com/d/msg/pystatsmodels/e8aTj2ydyFI/odkShG2K3wwJ
>>>> http://www.scientificamerican.com/article/scientists-perturbed-by-loss-of-stat-tool-to-sift-research-fudge-from-fact/
>>>
>>> Epidemiology also had a ban on p-values for more than 10 years, due to its
>>> founding editor. The ban was lifted when they changed editor in 2001, but the
>>> quality of the publications dropped when p-values were reintroduced.
>>>
>>> http://journals.lww.com/epidem/fulltext/2001/05000/the_value_of_p.2.aspx
>>
>> "
>> Does all this mean a change in Epidemiology’s policy on P-values? It
>> may be no more than a change in perception. We will not ban P-values.
>> But neither did Rothman. He called for caution, and we do the same.
>> The question is not whether the P-value is intrinsically bad, but
>> whether it too easily substitutes for the thoughtful integration of
>> evidence and reasoning. Given the P-value’s blighted history,
>> researchers who would employ the P-value take on a particularly heavy
>> burden to do so wisely.
>> "
>> I have no disagreement with that.
>> p-values are only one of our five columns in the results parameter table.
>>
>> I refrain from any other comments that might overlap quite a bit with
>> previous discussions that we had.
>>
>> Josef
>>
>>>
>>> The editors of Journal of Physiology have (beginning from last year)
>>> started to request confidence intervals instead of p-values. I know this
>>> because colleagues in Oslo have gotten papers returned and been instructed
>>> to change all their analysis away from using p-values. This was not in the
>>> journal's instructions to authors, so it came as a surprise.
>>>
>>> I agree with the editors of Basic and Applied Social Psychology on their
>>> ban on p-values and classical hypothesis testing. Inferential statistics is
>>> seldom used correctly. Most scientists do not have the competence to know
>>> when to use descriptive statistics and when to use inferential statistics,
>>> it seems. The common practice is to always use inferential statistics, even
>>> when inappropriate. Thus we see papers littered with p-values. It is for
>>> the common good to just ban inferential statistics altogether. Instead
>>> the editors of BASP request descriptive statistics and good graphs. The
>>> inference can then be done qualitatively. If an effect is not visible by
>>> eyeballing, then it is likely not there (or at least not important). The
>>> scale and resolution used on a graph should reflect the relevant effect
>>> sizes. If the scale makes a tiny effect invisible on a graph, then it is
>>> not relevant even if present. This is not a new and unproven method in
>>> science; Isaac Newton and Albert Einstein did this too.
>>> Descriptive statistics combined with qualitative inference is an old and
>>> proven method that everyone can use correctly. Of course it would be better
>>> if scientists actually had the competence to use inferential statistics
>>> correctly. Unfortunately everything suggests that few scientists do, at
>>> least outside the fields of statistics and machine learning.
>>>
>>>> Fortunately for statsmodels, there is a large part of the world that
>>>> also wants to know which variables affect an event or
>>>> characteristic, instead of just doing best prediction with anonymous
>>>> variables
>>>
>>> Model selection can be blind or driven by domain-specific knowledge. In the
>>> latter case, we are better off using Bayesian statistics, because when
>>> using knowledge of a subject as a guide we are including prior information in
>>> our analysis. Then it is better to be specific about that.
>>>
>>> Sturla

------------------------------------------------------------------------------
BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
Develop your own process in accordance with the BPMN 2 standard
Learn Process modeling best practices with Bonita BPM through live exercises
http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual- event?utm_
source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general