On Wed, 12 Jan 2005, Berton Gunter wrote:

>R-Listers.
>
>The following is a rant originally sent privately to Frank Harrell in
>response to remarks he made on this list. The ideas are not new or original,
>but he suggested I share it with the list, as he felt that it might be of
>wider interest, nonetheless. I have real doubts about this, and I apologize
>in advance to those who agree that I should have kept my remarks private.
>In view of this, if you wish to criticize my remarks on list, that's fine,
>but I won't respond (I've said enough already!). I would be happy to discuss
>issues (a little) further off list with anyone who wishes to bother, but not
>on list.
>
>Also, Frank sent me a relevant reference for those who might wish to read a
>more thoughtful consideration of the issues:
>
>@ARTICLE{far92cos,
>  author = {Faraway, J. J.},
>  year = 1992,
>  title = {The cost of data analysis},
>  journal = {J Comp Graphical Stat},
>  volume = 1,
>  pages = {213-229},
>  annote = {bootstrap; validation; predictive accuracy; modeling strategy;
>            regression diagnostics; model uncertainty}
>}
>
>I welcome further relevant references, pro or con!
>
>Finally, I need to emphasize that these are clearly my very personal views
>and do not reflect those of my company or colleagues.
>
>Cheers to all ...
>-----------
>
>The relevant portion of Frank's original comment was in a thread about K-S
>tests for the goodness of fit of a parametric distribution:
>
>...
>> If you use the empirical CDF to select a parametric
>> distribution, the final estimate of the distribution will inherit the
>> variance of the ECDF.
>> The main reason statisticians think that parametric curve fits are far
>> more efficient than nonparametric ones is that they don't account for
>> model uncertainty in their final confidence intervals.
>>
>> -- Frank Harrell
>
>My reply:
>
>That's a perceptive remark, but I would go further... You mentioned
>**model** uncertainty.
>In fact, in any data analysis in which we explore the
>data first to choose a model, fit the model (parametric or non..), and then
>use whatever (pivots from parametric analysis; bootstrapping;...) to say
>something about "model uncertainty," we're always kidding ourselves and our
>colleagues because we fail to take into account the considerable variability
>introduced by our initial subjective exploration and subsequent choice of
>modeling strategy. One can only say (at best) that the stated model
>uncertainty is an underestimate of the true uncertainty. And very likely a
>considerable underestimate because of the model choice subjectivity.
>
>Now I in no way wish to discourage or abridge data exploration; only to
>point out that we statisticians have promulgated a self-serving and
>unrealistic view of the value of formal inference in quantifying true
>scientific uncertainty when we do such exploration -- and that there is
>therefore something fundamentally contradictory in our own rhetoric and
>methods. Taking a larger view, I think this remark is part of the deeper
>epistemological issue of characterizing what can be scientifically "known"
>or, indeed, defining the difference between science and art, say. My own
>view is that scientific certainty is a fruitless concept: we build models
>that we benchmark against our subjective measurements (as the measurements
>themselves depend on earlier scientific models) of "reality." Insofar as
>data can limit or support our flights of modeling fancy, they do; but in the
>end, it is neither an objective process nor one whose "uncertainty" can be
>strictly quantified.
I totally agree with the above and I am totally unqualified to comment on
the below.

You (and others) might find these papers interesting...

http://www.santafe.edu/~chaos/chaos/pubs.htm

Specifically papers like...

Synchronizing to the Environment: Information Theoretic Constraints on
Agent Learning.
http://www.santafe.edu/~cmg/papers/stte.pdf

Is Anything Ever New? Considering Emergence.
http://www.santafe.edu/~cmg/papers/EverNew.pdf

Observing Complexity and The Complexity of Observation
http://www.santafe.edu/~cmg/papers/OCACO.pdf

What Lies Between Order and Chaos?
http://www.santafe.edu/~cmg/papers/wlboac.pdf

And probably many more.

>In creating the illusion that "statistical methods" can
>overcome these limitations, I think we have both done science a disservice
>and relegated ourselves to an isolated, fringe role in scientific inquiry.
>
>Needless to say, opposing viewpoints to such iconoclastic remarks are
>cheerfully welcomed.

Does it make any difference to the mass of Saturn?

Dan.

>Best regards,
>
>Bert Gunter
>
>______________________________________________
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
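
P.S. The quoted point about exploring the data first and quoting a
confidence interval afterwards can be checked with a small simulation. What
follows is only a rough sketch in Python, not anything taken from the
thread. The setup is an assumption chosen for illustration: 50 observations,
a response that is pure noise, and 10 candidate predictors that are also
pure noise, so the true slope is 0 for every one of them. "Exploration" is
crudely stood in for by picking the predictor with the largest absolute
sample correlation. The names slope_ci, abs_corr and coverage are
hypothetical helpers, and T_CRIT is a hard-coded approximate t quantile that
is only valid for n = 50.

```python
import math
import random

# Approximate 97.5% quantile of Student's t with n - 2 = 48 degrees of
# freedom. Hard-coded assumption: it must be changed if n is changed.
T_CRIT = 2.0106


def slope_ci(x, y):
    """Naive 95% CI for the slope b in the least-squares fit y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b - T_CRIT * se, b + T_CRIT * se


def abs_corr(x, y):
    """Absolute sample correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return abs(sxy) / math.sqrt(sxx * syy)


def coverage(n_reps=1000, n=50, k=10, seed=1):
    """Fraction of naive CIs that contain the true slope (0) when the
    predictor is fixed in advance versus chosen after looking at the data."""
    rng = random.Random(seed)
    hits_fixed = 0
    hits_selected = 0
    for _ in range(n_reps):
        y = [rng.gauss(0, 1) for _ in range(n)]
        xs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]

        # Prespecified analysis: always use the first predictor.
        lo, hi = slope_ci(xs[0], y)
        hits_fixed += int(lo <= 0.0 <= hi)

        # "Exploratory" analysis: pick the best-looking predictor, then
        # report the same naive interval as if no selection had happened.
        best = max(xs, key=lambda x: abs_corr(x, y))
        lo, hi = slope_ci(best, y)
        hits_selected += int(lo <= 0.0 <= hi)
    return hits_fixed / n_reps, hits_selected / n_reps


fixed, selected = coverage()
print(f"coverage, predictor fixed in advance: {fixed:.3f}")
print(f"coverage, predictor chosen from data: {selected:.3f}")
```

Under these assumptions the prespecified interval should cover 0 about 95%
of the time, while the interval for the selected predictor should cover it
only about 0.95^10, or roughly 60%, of the time, since it misses whenever any
one of the 10 naive intervals misses. Real exploration is far less tidy than
"take the largest correlation", so this only illustrates the direction of
the effect, not its size in practice.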