Actually Wirt's reply (see below) was not incorrect. The reason for the confusion is that Bonnie presented the log likelihood values with the wrong sign and thus the AIC values were INCORRECT and were negative when they should be POSITIVE.
The correct table should be:

log likelihood    k     AIC
-532.5052        16    1097.01
-509.8392        58    1135.68

As everyone agrees:

AIC = (-2 log likelihood) + (2k)

The first term is often referred to as Deviance. The likelihood of a model given the data is always between 0 and 1, and thus log likelihood is always a negative number (it could be zero; but it's always non-positive). The BETTER the fit, the greater the likelihood, the less negative will be log likelihood, and thus the SMALLER will be the Deviance.

If these 2 models were nested, then we see that adding 42 parameters reduced Deviance (from 1065.01 to 1019.68) as it must. If you add additional parameters you will always improve model fit (assuming that the model is not over-specified, with some parameters redundant).

But when one adds the Deviance (i.e., -2 x log likelihood) to 2k, the result is the correct AIC values shown in the table above, and we see that the LESS complex model has the smaller AIC. Thus, AIC does NOT justify the more complex model, even though it yields a better fit.

----
Nadav Nur
PRBO Conservation Science
4990 Shoreline Highway 1
Stinson Beach, CA 94970 USA
e-mail: [EMAIL PROTECTED]

-----Original Message-----
From: Gareth Russell [mailto:[EMAIL PROTECTED]
Sent: Sunday, March 05, 2006 5:35 AM
To: [email protected]
Subject: Re: [ECOLOG-L] AIC

Bonnie,

Wirt Atmar's erudite reply unfortunately ends with the wrong answer to your question: it is the lowest number you want, not the lowest absolute number. So in your case, the value of -1033 indicates the most parsimonious model. That is the one with 16 parameters, which is good, as there are many difficulties with having 58 parameters!

Having said that, 16 parameters is a lot as well, so, without knowing anything about your model, you might want to look for redundancy, and/or try to reduce your number of parameters with PCA or some other technique.

Gareth Russell
NJIT/Rutgers

---

Bonnie asks:

> I'm getting conflicting answers to a question and am hoping you could help.
>
> When using the Akaike Information Criterion (AIC), am I looking for
> the smallest number or the smallest absolute number?
>
> For example, I have two models.
>
> (-2 log likelihood) + (2k)
>
> log likelihood    k     AIC
>
> 532.5052         16    -1033.0104
> 509.8392         58    -903.6784
>
> the difference between the two is great, but which is better?
> -1033.0104 is the smallest, but -903.6784 is the smallest absolute value.
>
> I won't say which model I LIKE better. =)

If the values are close, then it's more a matter of preference and what your goals in creating the model are than absolute values.

It's important to remember that the AIC was developed as an engineering metric for black-box models, a mathematical expression of Ockham's Razor ("Pluralitas non est ponenda sine necessitate," -- don't add in things unnecessarily).

[a "black box" model is just that, a box with an input socket and a matching output on the other side, but you have no idea of what's inside it or how it's constructed internally. All you care about is its stimulus/response behavior.]

The two terms of the AIC, (-2 log likelihood) and (2k), measure the goodness of fit of the model and its complexity, respectively. In engineering, more complexity means more components and thus generally more unreliability over the long term, but if goodness of fit in the model is of paramount importance, you may well accept a higher level of complexity in order to obtain a very high quality prediction.

On the other hand, if reliability, and thus low parts count, is maximally important to your application, especially if prediction to only "good enough for government work" standards is all that you need, you will work primarily to minimize the model's complexity.

These choices of course presume that you have multiple minima in your AIC values. It is entirely possible, however, that the optimization surface is a simple bowl with a single point of global optimality.
If that's so, and the true costs of additional complexity are accurately represented in the second term, then you would always choose the lowest absolute value.

Wirt Atmar
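
For anyone who wants to check the arithmetic in this thread, here is a minimal Python sketch. It applies AIC = (-2 log likelihood) + (2k) to the two models, using the log likelihood values with the negative sign given in the corrected table. The function name `aic` and the dictionary layout are illustrative assumptions, not anything taken from Bonnie's analysis or from a particular statistics package.

```python
# Sketch only: the names below are illustrative, not from any package.

def aic(log_likelihood, k):
    """Return AIC = (-2 * log likelihood) + (2 * k)."""
    return -2.0 * log_likelihood + 2.0 * k

# (log likelihood, number of parameters k), with the corrected negative sign
models = {
    "16 parameters": (-532.5052, 16),
    "58 parameters": (-509.8392, 58),
}

results = {}
for name, (loglik, k) in models.items():
    deviance = -2.0 * loglik          # first term of the AIC
    results[name] = (deviance, aic(loglik, k))
    print(f"{name}: deviance = {deviance:.2f}, AIC = {results[name][1]:.2f}")

# The preferred model is the one with the smallest AIC (not smallest |AIC|).
best = min(results, key=lambda name: results[name][1])
print("Smallest AIC:", best)
```

The comparison uses the plain minimum rather than the minimum absolute value, which is the rule Gareth and Nadav both describe.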
