> Date:    Mon, 8 Feb 2010 13:57:54 -0500
> From:    Bruce Robertson <[email protected]>
> Subject: AIC, data-dredging, and inappropriate stats
>
> Dear Ecologists,
>
> I've been using an information-theoretic model-selection approach as a
> part of my research and have found that the ecological literature
> appears to be very hypocritical and inconsistent in how these stats are
> used and interpreted.

The inconsistency problem, at least, is not limited to the ecological
literature, even broadly defined.

> 1. Data dredging. Starting with 6 independent variables and a single
> dependent variable.

The "which variables (Xs) are important in predicting/explaining Y"
question looms large often, but is a very thorny one to answer
conclusively. Nonetheless, this question and "data dredging" are not
always one in the same, as you note below.

> Historically, the recommended approach is to choose
> a set of a-priori models containing combinations of these variables that
> make ecological sense, then rank them using AIC scores and weights.

This is only step 1 of the process. A simple table of rankings is not
usually sufficient to make inferences. And, for completeness, AICc
would be the way to go, not AIC.
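
For what it's worth, the ranking step itself is mechanical once the
models are fit. Here is a minimal sketch in Python, assuming you
already have each model's maximized log-likelihood, parameter count,
and the sample size; the model names and numbers below are invented
for illustration only:

import math

def aicc(logL, k, n):
    # Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1)
    return -2.0 * logL + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical candidate set: (name, max log-likelihood, no. of parameters)
candidates = [("habitat", -52.1, 3),
              ("habitat+rainfall", -50.8, 4),
              ("rainfall", -55.3, 3)]
n = 40  # sample size (made up)

scores = {name: aicc(logL, k, n) for name, logL, k in candidates}
best = min(scores.values())
deltas = {name: s - best for name, s in scores.items()}

# Akaike weights: normalized relative likelihoods exp(-delta/2)
rel = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
total = sum(rel.values())
weights = {name: r / total for name, r in rel.items()}

for name in sorted(scores, key=scores.get):
    print(f"{name:18s} AICc={scores[name]:7.2f}  "
          f"dAICc={deltas[name]:5.2f}  w={weights[name]:.3f}")

The table this produces is the "step 1" I mean; the inference still has
to come from what you do with those weights afterward.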

> Running all possible combinations of these 6 variables has historically
> been looked down upon because type II error goes through the roof.
> ...SNIP...
> There are not yet any methods to account for the type II
> error associated with running a bunch of spurious models in the AIC
> ranking approach.

If you know the models are spurious, why run them? (Though I am not
sure what that term means when applied to models.)

> Why do I see soooo many papers in soooo many highly
> ranked ecological journals (e.g. Ecology, Ecology Letters, Ecological
> Applications) that do this (run all possible combinations of variables)
> anyway?

Well, one reason might be that Burnham and Anderson suggest the
approach when *no better alternative* is available, meaning that you
have no solid basis for developing a relatively small candidate model
set, among other things. It's a bad way to go for sure, but in an
exploratory stage it may be OK, and it is certainly preferable to any
of the classical stepwise approaches based on p-value cutoffs.

Here is Anderson's summary from his recent Primer (Springer, 2008,
Section 5.3.1 on Rationale for Ranking the Relative Importance of
Predictor Variables):

"In general, I do not recommend running all the models; this is a
special case where every variable must be put on equal footing with
the rest [other models] for the ranking to be interpretable."

This gets messy quick and cuts into issue #2 (Summing of AIC Scores).
A long-ish thread on this and related subjects is here:
http://www.phidot.org/forum/viewtopic.php?f=1&t=1228

> 2. Summing of AIC scores.
> ...SNIP...
> The approach seems problematic because it is based upon data
> dredging (above), but seems common in journals like Ecology, Ecology
> Letters, etc.

I would say that just about all journals are struggling with these
changes in inference methods. Short of a top-down "don't do this, do
this instead" decision from the Editorial Board, they have to struggle
(and I am not recommending the top-down approach!).

> I actually saw one paper in the Journal of Biogeography in
> which the author chose to select a set of a-priori models to run, then
> took this summing approach. Wouldn't this just show that the most
> important variables were the variables that the analyst thought were
> important a-priori?

See the above thread on the Phidot forum. If this is all the author did
with the model set to make inference (?), it seems insufficient.
(However, if the model set was simple and one model had all the
support, any method would get the "right" answer.)
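
For concreteness, here is a small sketch of the sum-of-weights
relative-importance calculation being discussed, assuming an
all-subsets model set over three hypothetical predictors; the Akaike
weights are invented for illustration. It also shows why Anderson's
"equal footing" condition matters: each predictor has to appear in the
same number of models for the sums to be comparable.

# Akaike weights for an all-subsets set of 3 predictors (8 models, made up)
model_weights = {
    (): 0.02,
    ("habitat",): 0.10,
    ("rainfall",): 0.05,
    ("elevation",): 0.04,
    ("habitat", "rainfall"): 0.55,
    ("habitat", "elevation"): 0.04,
    ("rainfall", "elevation"): 0.02,
    ("habitat", "rainfall", "elevation"): 0.18,
}

predictors = ["habitat", "rainfall", "elevation"]

# "Importance" of a predictor = sum of weights of all models containing it.
# Here each predictor appears in exactly 4 of the 8 models (equal footing).
importance = {p: sum(w for terms, w in model_weights.items() if p in terms)
              for p in predictors}

for p, w in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{p:10s} sum of weights = {w:.2f}")

With an arbitrary a-priori model set, that equal-footing condition
generally fails, which is exactly the concern with the Journal of
Biogeography example above.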

> 3. Use of other 'fit' statistics along with the model-selection
> approach. I often see people reporting other statistics (e.g. p-vals,
> r-squared) in combination with the AIC scores. My statistician friend
> says that this is totally inappropriate, and uninformative.

Mixing inferential paradigms is poor practice, and I don't think we
need shrines, scripture, or dogma in this discussion. For one, a direct
comparison of correct analyses from the two paradigms can lead to
different conclusions (I've seen it). How then can any theory support
putting them together? Often an author does this because both tell the
same story, which is "analysis dredging." If you show both and they
disagree, and you don't choose one, there is no theory to guide you to
a proper conclusion (of course, the data may not be sufficient to
provide one either way).

Perhaps until folks get more familiar with this "new" model-selection
framework, we will unfortunately be stuck with some overlap. All this
said, for simple linear models there is no reason not to report the
R^2 (the proportion of variance explained) as long as it is not used
for model selection and inference. As Eric R notes, this gets at the
related and very important question of model goodness-of-fit.
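
As a small illustration of that separation, here is a sketch that
ranks two simple linear models by AICc and reports R^2 only as a
descriptive summary of fit. The data are simulated stand-ins and the
model names are hypothetical; AICc here uses the usual Gaussian
log-likelihood for least squares.

import numpy as np

rng = np.random.default_rng(1)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.5 * x1 + rng.normal(size=n)   # simulated response

def fit_ols(X, y):
    # Return (AICc, R^2) for an ordinary least-squares fit with Gaussian errors
    n_obs = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    k = X.shape[1] + 1                    # coefficients + error variance
    logL = -0.5 * n_obs * (np.log(2 * np.pi * rss / n_obs) + 1)
    aicc = -2 * logL + 2 * k + 2 * k * (k + 1) / (n_obs - k - 1)
    r2 = 1.0 - rss / float(((y - y.mean()) ** 2).sum())
    return aicc, r2

ones = np.ones(n)
models = {"x1": np.column_stack([ones, x1]),
          "x1+x2": np.column_stack([ones, x1, x2])}

results = {name: fit_ols(X, y) for name, X in models.items()}

# Rank by AICc; R^2 is reported only as a descriptive measure of fit
for name, (a, r2) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:6s} AICc={a:7.2f}  R^2={r2:.2f}")

The point is simply that R^2 appears in the output as a fit summary,
while the ranking and any multimodel inference rest on AICc alone.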

> My general impression is that, while the statistical world has yet to
> develop more robust techniques (e.g. accounting for type II error in
> model selection), there are clear recommendations that make some
> approaches (e.g. data dredging) clearly improper. Please comment on
> whether ecologists are simply not 'following the rules' (perhaps out of
> ignorance) or whether there really are different and statistically valid
> opinions on this topic.

I'd put my hat in the ring for not following the "rules" (or, more
accurately, not taking time to completely learn and understand them).
A great article to read on information-theoretic model selection,
which does not get the attention it deserves because of where it was
published, is:

Burnham and Anderson. 2004. Multimodel inference: understanding AIC
and BIC in model selection. Sociological Methods & Research
33(2):261-304. [I don't have a PDF!]

But, model selection theory is not restricted to AICc and its friends,
and progress has not halted on this front. There are a number of
simulation-based strategies for model selection and the recent work on
data cloning looks promising. E.g. (see also other work by Lele),

Ponciano, Taper, Dennis, Lele. 2009. Hierarchical models in ecology:
confidence intervals, hypothesis testing, and model selection using
data cloning. Ecology 90(2):356–362.

I agree with Ned D that this is a potentially fun (really?) and
contentious topic, although I see that often the contentiousness
develops because of incomplete understanding. Lastly, Ned, I'd say
there are plenty of folks ready to speak to how Bayesian and
classical frequentist methods are "different and independent
statistical paradigms", even from a practical standpoint!

--
Dave Hewitt
