Hello ecolog,

I disagree with the suggestion that maximizing R2 is a good way to predict future changes to a system. Maximizing R2 may produce a perfect fit to your current data set, but you are fitting the noise as well as the signal, and such a model will likely perform poorly with new data. I think that if you want predictive power, you should probably still use a parsimonious approach like AIC, since this will tend to reject covariates that have only a small impact on the model's predictive power.
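
To make the point about fitting noise concrete, here is a minimal sketch in Python (NumPy only). Everything in it is an assumption for illustration: the simulated data, the sample size, the number of noise covariates, and the helper name r_squared are all made up. It fits nested least-squares models, adding one pure-noise covariate at a time, and records R2 on the data used for fitting and on a fresh data set drawn from the same process.

```python
import numpy as np

def r_squared(X, y, X_new=None, y_new=None):
    """Fit OLS on (X, y); return R^2 on (X_new, y_new), or on the fitting data if omitted."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    if X_new is None:
        X_new, y_new = X, y
    resid = y_new - X_new @ beta
    return 1.0 - (resid @ resid) / np.sum((y_new - y_new.mean()) ** 2)

rng = np.random.default_rng(0)
n, n_noise = 40, 25  # hypothetical sample size and number of junk covariates

def simulate():
    """One real covariate x drives y; the other columns are unrelated noise."""
    x = rng.normal(size=n)
    junk = rng.normal(size=(n, n_noise))
    y = 1.0 + 2.0 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x, junk])  # intercept, signal, noise
    return X, y

X_train, y_train = simulate()
X_test, y_test = simulate()

train_r2, test_r2 = [], []
for k in range(n_noise + 1):
    cols = slice(0, 2 + k)  # intercept + x + first k noise covariates
    train_r2.append(r_squared(X_train[:, cols], y_train))
    test_r2.append(r_squared(X_train[:, cols], y_train,
                             X_test[:, cols], y_test))

for k in (0, 5, 15, 25):
    print(f"noise covariates={k:2d}  fit R2={train_r2[k]:.3f}  new-data R2={test_r2[k]:.3f}")
```

Because the models are nested, R2 on the fitting data can never go down as covariates are added, whether or not they carry any signal. The R2 on new data has no such guarantee, and that is the number that matters for prediction.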

Brian Mitchell
Date:    Wed, 10 Feb 2010 16:36:18 -0500
From:    "Fann, Sarah Lynn" <[email protected]>
Subject: Re: AIC, data-dredging, and inappropriate stats

Dear ecology,

AIC = model deviance + 2*(# of parameters). In essence, AIC is constructed so that the "best" model balances reducing the deviance of the model from the data (we want this) against keeping the model simple and/or relevant. The deviance will be small if the covariates (explanatory variables) are "good", but also if we simply have a ton of lousy covariates. Thus AIC penalizes excessive covariates by adding 2*(# of parameters), i.e. the Betas that are estimated for each covariate and covariate interaction.
Whether to use AIC, R2, or both comes down to the model design and the results you are after. Do you want to explain the current state of a system and show which covariates are important? Minimize AIC. Do you want to predict future changes in the system? Maximize R2.
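
As a rough illustration of the AIC formula above, here is a minimal Python sketch for an ordinary least-squares model with Gaussian errors. Several things are my assumptions rather than anything fixed by the formula: "deviance" is taken to be -2 times the maximized log-likelihood, the error variance is counted as one extra parameter alongside the Betas (a common convention, but software differs), and the function name gaussian_aic and the simulated data are made up.

```python
import numpy as np

def gaussian_aic(X, y):
    """Return (deviance, k, AIC) for an OLS fit with Gaussian errors.

    Assumption: deviance = -2 * maximized log-likelihood, and k counts
    every Beta plus the error variance.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n                                    # ML estimate of the variance
    deviance = n * (np.log(2.0 * np.pi * sigma2) + 1.0) # -2 log L at the ML estimates
    k = p + 1                                           # Betas + variance
    return deviance, k, deviance + 2 * k

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)      # covariate that really drives y
junk = rng.normal(size=n)   # covariate unrelated to y
y = 0.5 + 1.5 * x + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([X_small, junk])

dev_small, k_small, aic_small = gaussian_aic(X_small, y)
dev_big, k_big, aic_big = gaussian_aic(X_big, y)

print(f"small model: deviance={dev_small:.2f}  k={k_small}  AIC={aic_small:.2f}")
print(f"big model:   deviance={dev_big:.2f}  k={k_big}  AIC={aic_big:.2f}")
```

The bigger model's deviance can only be equal or lower, but it pays 2 for the extra parameter, so its AIC is lower only if the junk covariate reduces the deviance by more than 2.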

This is my view from a Statistics perspective since I haven't studied model selection in a biological setting.
Thank you very much,

Sarah Fann
