----- Forwarded message from "Collyer, Michael" <michael.coll...@wku.edu> -----
Date: Sat, 15 Mar 2014 11:20:37 -0400
From: "Collyer, Michael" <michael.coll...@wku.edu>
Reply-To: "Collyer, Michael" <michael.coll...@wku.edu>
Subject: Re: Multivariate Model Selection
To: "<morphmet@morphometrics.org>" <morphmet@morphometrics.org>
Dear Fabio,
I would not say there is an inherent problem with AIC estimation for multivariate data - unless one is using the wrong formula (see below) - but there is an inherent problem with how we have been trained to view delta AIC values.
Most people who use AIC have probably been introduced to the rule of thumb that a delta AIC value of 2-4 suggests a model is worth considering, even if it is not the "best" model. Let's look at the logic of this "rule". The AIC formula most often given is AIC = -2log(L) + 2K, where L is the maximized value of the likelihood function of the model (the model likelihood) and K is the number of model parameters (usually expressed as k + 1, where k is the number of model coefficients and the 1 accounts for the error variance). For univariate data, one might imagine what would happen if two models had the same likelihood. Delta AIC would be 2K1 - 2K2 = 2(K1 - K2). Two models that produce similar likelihoods but differ by only 1-2 parameters would therefore produce a delta AIC of only 2-4; hence, a model with a delta AIC of 2-4 is a viable model.
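The arithmetic behind that rule of thumb can be sketched in a few lines. This is an illustration only; the log-likelihood value and the parameter counts are made-up numbers chosen to match the scenario above (equal likelihoods, a difference of 2 parameters):

```python
# Sketch of the univariate delta-AIC rule of thumb: two models with the
# same maximized log-likelihood that differ by 2 parameters.
# logL = -100.0 is an arbitrary illustrative value, not real data.

def aic(log_lik, K):
    """AIC = -2*log(L) + 2*K, where K is the total parameter count."""
    return -2.0 * log_lik + 2.0 * K

log_lik = -100.0      # identical for both models, by assumption
K1, K2 = 5, 3         # model 1 has 2 more parameters than model 2

delta = aic(log_lik, K1) - aic(log_lik, K2)
print(delta)          # 2*(K1 - K2) = 4.0
```

With equal likelihoods the -2log(L) terms cancel, so the entire delta AIC is 2(K1 - K2).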
The problem is that AIC = -2log(L) + 2K with K = k + 1 is a simplification that holds only for univariate data. In general, K is not equal to k + 1; it is equal to pk + p(p + 1)/2, where p is the number of variables. The pk part represents the p x k dimensions of the B matrix of model coefficients, and the p(p + 1)/2 part represents the number of unique values of the p x p error covariance matrix. (The error covariance matrix is square and symmetric, so values above the diagonal are the same as those below it.) For univariate data, p = 1, so pk + p(p + 1)/2
simplifies to 1*k + 1(1 + 1)/2 = k + 1. If one applies the same logic as the delta AIC = 2-4 rule of thumb to multivariate data, then 2(K1 - K2) = 2p(delta k). This means that if p = 100, delta AIC values of 200-400 should indicate viable models (i.e., models whose likelihoods are similar and that differ by only 1-2 coefficients per variable).
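The parameter count and the rescaled threshold can be checked directly. A minimal sketch, where the function name is mine and the p = 100 example follows the text:

```python
# Multivariate parameter count K = p*k + p*(p+1)/2, as described above,
# and the corresponding rescaling of the "delta AIC of 2-4" rule.

def param_count(k, p):
    """k coefficients per trait, p traits: p*k entries in the B matrix
    plus p*(p+1)/2 unique values of the error covariance matrix."""
    return p * k + p * (p + 1) // 2

print(param_count(3, 1))    # univariate: k + 1 = 4
print(param_count(3, 100))  # multivariate: 300 + 5050 = 5350

# Two equal-likelihood models differing by delta_k = 2 coefficients per
# trait, with p = 100 traits:
print(2 * (param_count(4, 100) - param_count(2, 100)))  # delta AIC = 400
```

Note that the covariance term p(p + 1)/2 cancels when both models have the same p, leaving delta AIC = 2p(delta k), exactly as in the text.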
In summary, large delta AIC values should be expected with multivariate data. My explanation above is also superficial. In general, log-likelihoods for multivariate data are estimated from determinants of error covariance matrices; for univariate data, this determinant is simply the estimated variance of the model error. Although this would require some proof, I believe the reduction in error from additional model parameters would have a multiplicative - not additive - effect on model likelihood for multivariate data. So multiplying 2-4 by p might fall short as a good rule of thumb; the value might need to be much larger.
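The determinant's role can be illustrated with a small, hedged sketch. The covariance matrices below are invented numbers, not real data, and the p = 2 case is hard-coded so the determinant can be computed by hand:

```python
import math

# Maximized multivariate-normal log-likelihood as a function of the
# determinant of the estimated error covariance matrix Sigma = E'E/n.
# Hard-coded for p = 2 responses; matrices are illustrative only.

def loglik(n, sigma):
    """logL = -(n/2) * (p*log(2*pi) + log|Sigma| + p), for p = 2."""
    p = 2
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    return -0.5 * n * (p * math.log(2 * math.pi) + math.log(det) + p)

n = 50
sigma_full = [[1.0, 0.2], [0.2, 1.0]]      # fuller model: smaller error
sigma_reduced = [[2.0, 0.4], [0.4, 2.0]]   # reduced model: larger error

# Halving every (co)variance scales |Sigma| by 1/4 when p = 2, so the
# log-likelihood gain is (n/2)*log(4) no matter the starting matrix -
# error reduction acts multiplicatively through the determinant.
gain = loglik(n, sigma_full) - loglik(n, sigma_reduced)
print(round(gain, 3))
```

Because the gain scales with both n and the log of the determinant ratio, delta AIC values in the hundreds or thousands can arise from modest per-variable improvements in fit.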
One must also be careful when using AIC functions in statistics programs. The AIC() function in R used to assume univariate data: it would inadvertently convert the n x p residual matrix produced by a linear model fit into an np x 1 vector of residuals and calculate the residual sum of squares from that, which produced incorrect AIC calculations. This particular problem appears to no longer be an issue, because the logLik() function used by AIC() does not work with multivariate data. However, the extractAIC() function still makes this mistake. The lesson here is that one must use caution with canned AIC calculations.
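To make the pitfall concrete, here is a sketch in Python (not R's actual internals) of how the two calculations diverge. The residual matrix and coefficient count are invented for illustration:

```python
import math

# Contrast a "stacked" univariate-style AIC on an n x p residual matrix
# with the proper determinant-based multivariate AIC. Residuals and
# k = 2 coefficients per trait are made-up illustrative values.

resid = [[0.5, -1.0], [-0.3, 0.4], [0.1, 0.8], [-0.3, -0.2]]  # n=4, p=2
n, p, k = 4, 2, 2

# Wrong: stack the matrix into an np x 1 vector and treat it as univariate.
rss = sum(e * e for row in resid for e in row)
sigma2 = rss / (n * p)
loglik_wrong = -0.5 * n * p * (math.log(2 * math.pi) + math.log(sigma2) + 1)
aic_wrong = -2 * loglik_wrong + 2 * (k + 1)

# Right: use the determinant of Sigma = E'E/n (2 x 2 here) and the full
# multivariate parameter count K = p*k + p*(p+1)/2.
s11 = sum(r[0] * r[0] for r in resid) / n
s22 = sum(r[1] * r[1] for r in resid) / n
s12 = sum(r[0] * r[1] for r in resid) / n
det = s11 * s22 - s12 * s12
loglik_right = -0.5 * n * (p * math.log(2 * math.pi) + math.log(det) + p)
K = p * k + p * (p + 1) // 2
aic_right = -2 * loglik_right + 2 * K

print(aic_wrong != aic_right)   # the two calculations disagree
```

The stacked version both ignores the error covariances (the s12 term) and undercounts the parameters, so the two AIC values differ; this is the kind of silent mismatch to watch for in canned functions.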
I hope this helps.
Michael Collyer
Assistant Professor
Department of Biology
Western Kentucky University
1906 College Heights Blvd. #11080
Bowling Green, KY 42101-1080
Phone: 270-745-8765; Fax: 270-745-6856
Email: michael.coll...@wku.edu
----- Forwarded message from "Fabio de A. Machado" <macfa...@gmail.com> -----
Date: Sun, 23 Feb 2014 09:12:51 -0300
From: "Fabio de A. Machado" <macfa...@gmail.com>
Reply-To: "Fabio de A. Machado" <macfa...@gmail.com>
Subject: Multivariate Model Selection
To: morphmet morphmet <morphmet_modera...@morphometrics.org>
Dear all,
I'm trying to implement a model selection protocol for multivariate morphometrics and I'm having some trouble with the selection criteria. I intended to use AIC to select the best model, but in every real dataset I have tried, the best model (lowest AIC) is always the one with the most independent variables.
For nested models, I've tried to check the results using MANOVA procedures (selecting only the significant independent variables) and Canonical Correlation Analysis, and the two procedures give very similar results (the significant variables have the highest scores in the CCoA). Also, when I use the chi-square approximation to test the difference between linear models, I get fairly similar results to the MANOVA procedure. But when I inspect the AIC of those reduced models, it is far higher than that of the most complex model, sometimes delta AIC > 1000, which seems very far from the delta AIC < 2 expected for similar models.
Is this some inherent problem of AIC estimation for multivariate data?
Best,
----- End forwarded message -----
----- End forwarded message -----