Actually Wirt's reply (see below) was not incorrect.
The reason for the confusion is that Bonnie presented the log likelihood
values with the wrong sign and thus the AIC values were INCORRECT and were
negative when they should be POSITIVE.
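
To make the sign issue concrete, here is a minimal Python sketch (my own illustration, not something Bonnie or Wirt ran). It assumes the usual formula AIC = (-2 log likelihood) + (2k); the helper names `aic` and `choices` are made up for this example. It applies both selection rules, "smallest value" and "smallest absolute value", to the log likelihoods as posted and with the sign corrected.

```python
def aic(log_likelihood, k):
    """AIC = (-2 log likelihood) + (2k)."""
    return -2.0 * log_likelihood + 2 * k

def choices(log_likelihoods, ks):
    """Return the AIC values and the k picked by each selection rule."""
    aics = [aic(ll, k) for ll, k in zip(log_likelihoods, ks)]
    by_value = ks[aics.index(min(aics))]          # smallest AIC
    by_abs = ks[aics.index(min(aics, key=abs))]   # smallest |AIC|
    return aics, by_value, by_abs

ks = [16, 58]
posted = [532.5052, 509.8392]        # log likelihoods as posted (wrong sign)
corrected = [-ll for ll in posted]   # same magnitudes, corrected sign

for label, lls in (("as posted", posted), ("corrected", corrected)):
    aics, by_value, by_abs = choices(lls, ks)
    print(f"{label}: AIC = {[round(a, 2) for a in aics]}, "
          f"smallest -> k={by_value}, smallest absolute -> k={by_abs}")
```

With the posted signs the two rules disagree (k=16 versus k=58); with the corrected signs both AIC values are positive and the two rules pick the same model.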

The correct table should be:

log likelihood    k    AIC

  -532.5052      16    1097.01
  -509.8392      58    1135.68


As everyone agrees:
AIC = (-2 log likelihood) + (2k)

The first term is often referred to as Deviance.  
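
As a quick check of the arithmetic, a small Python sketch (illustrative only; the function names are my own) that recomputes the Deviance and AIC from the log likelihood and k values in the corrected table above:

```python
def deviance(log_likelihood):
    """Deviance as used in this message: -2 times the log likelihood."""
    return -2.0 * log_likelihood

def aic(log_likelihood, k):
    """AIC = (-2 log likelihood) + (2k), with k the number of parameters."""
    return deviance(log_likelihood) + 2 * k

# (log likelihood, k) pairs taken from the corrected table
models = {
    "16-parameter": (-532.5052, 16),
    "58-parameter": (-509.8392, 58),
}

for name, (ll, k) in models.items():
    print(f"{name}: deviance = {deviance(ll):.2f}, AIC = {aic(ll, k):.2f}")
```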

The likelihood of a model given the data is always between 0 and 1, and thus
log likelihood is always a negative number (it could be zero; but it's
always non-positive).  The BETTER the fit, the greater the likelihood, the
less negative will be log likelihood, and thus the SMALLER will be the
Deviance.  If these 2 models were nested, then we see that adding 42
parameters reduced Deviance (from 1065.01 to 1019.68), as it must.  Adding
parameters to a nested model can never worsen the fit, and will improve it
unless the added parameters are redundant (i.e., the model is over-specified).

But when one adds the Deviance (i.e., -2 x log likelihood) to 2k, one gets
the correct AIC values shown above, and we see that the LESS complex model
has the smaller AIC.  Thus, AIC does NOT justify the more complex model,
even though it yields a better fit.
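
The trade-off can be split into its two parts. The following Python sketch (an illustration under the formula above, with variable names of my own choosing) compares how much the Deviance drops against how much the 2k penalty grows when going from the 16-parameter to the 58-parameter model:

```python
def aic_terms(log_likelihood, k):
    """Return the two AIC terms: (deviance, complexity penalty 2k)."""
    return -2.0 * log_likelihood, 2 * k

# (log likelihood, k) from the corrected table
dev_small, pen_small = aic_terms(-532.5052, 16)
dev_large, pen_large = aic_terms(-509.8392, 58)

deviance_reduction = dev_small - dev_large   # improvement in fit
extra_penalty = pen_large - pen_small        # cost of 42 more parameters
delta_aic = (dev_large + pen_large) - (dev_small + pen_small)

print(f"deviance reduction: {deviance_reduction:.2f}")
print(f"extra penalty:      {extra_penalty}")
print(f"change in AIC:      {delta_aic:+.2f}")
```

The Deviance falls by about 45.33 but the penalty rises by 84, so AIC goes up by about 38.67 for the larger model.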


----
Nadav Nur 
PRBO Conservation Science
4990 Shoreline Highway 1
Stinson Beach, CA 94970 USA
 

e-mail:  [EMAIL PROTECTED]
 

-----Original Message-----
From: Gareth Russell [mailto:[EMAIL PROTECTED] 
Sent: Sunday, March 05, 2006 5:35 AM
To: [email protected]
Subject: Re: [ECOLOG-L] AIC

Bonnie,

Wirt Atmar's erudite reply unfortunately ends with the wrong answer to your
question: it is the lowest number you want, not the lowest absolute number.
So in your case, the value of -1033 indicates the most parsimonious model.
That is the one with 16 parameters, which is good, as there are many
difficulties with having 58 parameters! Having said that, 16 parameters is a
lot as well, so, without knowing anything about your model, you might want
to look for redundancy, and/or try to reduce your number of parameters with
PCA or some other technique.

Gareth Russell
NJIT/Rutgers

---
Bonnie asks:

> I'm getting conflicting answers to a question and am hoping you could help.
>  
>  When using the Akaike Information Criterion (AIC), am I looking for 
> the  smallest number or the smallest absolute number?
>  
>  For example, I have two models.
>  
>  (-2 log likelihood) + (2k)
>  
>  log likelihood       k   AIC
>  
>  532.5052 16  -1033.0104
>  509.8392 58  -903.6784
>  
>  the difference between the two is great, but which is better?
>  -1033.0104 is the smallest, but -903.6784 is the smallest absolute value.
>  
>  I won't say which model I LIKE better.  =)

If the values are close, then it's more a matter of preference and of your
goals in creating the model than of absolute values. It's important to
remember that the AIC was developed as an engineering metric for black-box
models, a mathematical expression of Ockham's Razor ("Pluralitas non est
ponenda sine necessitate," -- don't add in things unnecessarily).

[a "black box" model is just that, a box with an input socket and a matching
output on the other side, but you have no idea of what's inside it or how
it's constructed internally. All you care about is its stimulus/response
behavior.]

The two terms of the AIC

     (-2 log likelihood) + (2k)

measure the goodness of fit of the model and its complexity, respectively. 

In engineering, more complexity means more components and thus generally
more unreliability over the long term, but if goodness of fit in the model
is of paramount importance, you may well accept a higher level of complexity
in order to obtain a very high quality prediction. On the other hand, if
reliability, and thus low parts count, is maximally important to your
application, especially if prediction to only "good enough for government
work" standards is all that you need, you will work to primarily minimize
the model's complexity.

These choices of course presume that you have multiple minima in your AIC
values. It is entirely possible, however, that the optimization surface is a
simple bowl with a single point of global optimality. If that's so, and the
true costs of additional complexity are accurately represented in the second
term, then you would always choose the lowest absolute value.

Wirt Atmar
