On Tue, 21 Feb 2006, Thomas Lumley wrote:
>
> This might be a more suitable message for eg the stats-discuss mailing
> list or one of the sci.stat.* newsgroups.
>
> It is more complicated than it looks, partly because of the Anna Karenina
> problem: all nested models are the same, but non-nested models can be
> non-nested in different ways.
And I am sure Akaike appreciated that, which may be why he only (AFAIK)
derived a theoretical basis for AIC under strictly limited conditions
including nesting.
> Some notes:
>
> 1) Sometimes the AIC is clearly inappropriate: eg comparing the fit of a
> Poisson regression to a least-squares linear regression for count data.
> Here the likelihoods are not densities with respect to the same
> measure, so the likelihood ratio is meaningless. You could also argue
> that the linear model isn't really being fitted by maximum likelihood.
>
> 2) You need to be careful when fitting models with different R functions,
> since they may omit different constants in the likelihood.
>
> 3) Transformations of the outcome are a problem. You can frame this as a
> mathematical problem or just note the difficulty of saying what you mean
> when you decide that the multiplicative error in one model is smaller than
> the additive error in another model.
>
> 4) If you have two least-squares linear regression models with the same
> outcome variable and different predictors then the AIC is choosing based
> on a consistent estimate of the mean squared prediction error, and in that
> sense it is a valid way to choose the model that predicts best. This may
> or may not be the criterion you want, but if it isn't what you want then
> AIC isn't going to help.
>
> 5) If you have a large number of models then (nested or not) there is no
> guarantee that the estimate of prediction error is *uniformly* consistent,
> so the arguments behind AIC do not necessarily work.
(That only makes sense if the model class changes with 'n', suitably
defined. You do get uniform consistency over a finite class of models,
one of Akaike (1973)'s conditions. However, to use AIC you don't just
need a consistent estimator, but to worry about the consistency of
the O(1/n) term in the mean since AIC/n is effectively s^2 + 2p/n.)
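
A minimal sketch of that decomposition, assuming a least-squares fit
treated as a Gaussian maximum-likelihood fit. On the log scale, AIC/n is
log(s^2) + 2k/n plus a constant shared by every model. The function
names and the numbers (n, rss, k) are hypothetical and only illustrate
the arithmetic; k here counts the error variance as a parameter.

```python
import math

def gaussian_aic(rss, n, k):
    """AIC of a least-squares fit treated as a Gaussian ML fit.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters (assumed to include the variance).
    """
    sigma2_hat = rss / n  # ML estimate of the error variance
    loglik = -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2_hat) + 1.0)
    return -2.0 * loglik + 2.0 * k

def aic_per_obs_decomposed(rss, n, k):
    """AIC/n written as fit term + penalty + model-independent constant."""
    fit = math.log(rss / n)              # log(s^2), the O(1) term
    penalty = 2.0 * k / n                # the O(1/n) term
    const = math.log(2 * math.pi) + 1.0  # identical for every model
    return fit + penalty + const

# Hypothetical values, chosen only for illustration.
n, rss, k = 100, 250.0, 4
lhs = gaussian_aic(rss, n, k) / n
rhs = aic_per_obs_decomposed(rss, n, k)
print(abs(lhs - rhs) < 1e-12)
```

The penalty shrinks like 1/n, so a consistent estimate of the leading
term alone is not enough to rank models whose fits differ by O(1/n).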
One other note.
AIC/n is a consistent estimator, but only if the model is true, and it is
one with a lot of sampling error. Differences in AIC are much more precisely
estimated for a pair of nested models than for some non-nested pairs. So
sampling error can make comparisons of AIC meaningless unless the
differences are large (and 'large' grows with 'n' for some appropriate
'n').
A recent talk of mine
http://www.stats.ox.ac.uk/~ripley/Nelder80.pdf
may be illuminating. There is a published paper version.
>
> On Tue, 21 Feb 2006, Ruben Roa wrote:
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Aaron MacNeil
>> Sent: 20 February 2006 15:17
>> To: [email protected]
>> Subject: [R] Nested AIC
>>
>> Greetings,
>> I have recently come into some confusion over whether or not comparing
>> models by AIC requires that they be nested.
>> Reading Burnham & Anderson (2002) they are explicit that nested models are
>> not required, but other respected statisticians have suggested that nesting
>> is a pre-requisite for comparison. Could anyone who feels strongly
>> regarding either position post their arguments for or against nested models
>> and AIC? This would assist me greatly in some analysis I am currently
>> conducting.
>> Many thanks,
>>
>> Aaron
>>
>> ----
>> Hi, Aaron, Burnham & Anderson are explicit but they do not go into any depth
>> regarding this issue. Akaike's colleagues Sakamoto, Ishiguro, and Kitagawa
>> (Akaike Information Criterion Statistics, 1986, KTK Scientific Publishers)
>> do not deal with it directly either, and the examples they present that I
>> have examined (not even half of the total in the book), are all of nested
>> models. However, by reading some of Akaike's papers and the book quoted
>> above it does not appear to me that there is any restriction on the use of
>> the AIC related to nestedness. In fact, the theory does not preclude the
>> comparison of models with different *probability densities (or mass)* as
>> long as you keep all constants (like 1/sqrt(2pi) in the normal) in the
>> calculation.
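
A small sketch of why the constants matter, assuming a normal and a
Laplace model fitted by maximum likelihood to the same data. The data
values and function names are hypothetical. Dropping the -n/2 log(2 pi)
term from the normal log-likelihood shifts its AIC by n log(2 pi), which
is harmless among normal models but changes the comparison with the
Laplace model.

```python
import math
import statistics

def normal_loglik(y, mu, sigma, keep_constant=True):
    """Normal log-likelihood, optionally omitting the -n/2 log(2 pi) term."""
    n = len(y)
    ll = -n * math.log(sigma) - sum((v - mu) ** 2 for v in y) / (2 * sigma ** 2)
    if keep_constant:
        ll -= 0.5 * n * math.log(2 * math.pi)
    return ll

def laplace_loglik(y, mu, b):
    """Laplace (double exponential) log-likelihood with all constants kept."""
    return -len(y) * math.log(2 * b) - sum(abs(v - mu) for v in y) / b

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k

# Hypothetical data, for illustration only.
y = [1.2, 0.7, 3.1, 2.4, 1.9, 0.3, 2.8, 1.5]
n = len(y)

# ML estimates: mean and root mean squared deviation for the normal,
# median and mean absolute deviation for the Laplace.
mu_n = statistics.mean(y)
sigma = math.sqrt(sum((v - mu_n) ** 2 for v in y) / n)
mu_l = statistics.median(y)
b = sum(abs(v - mu_l) for v in y) / n

aic_normal_full = aic(normal_loglik(y, mu_n, sigma), k=2)
aic_normal_trunc = aic(normal_loglik(y, mu_n, sigma, keep_constant=False), k=2)
aic_laplace = aic(laplace_loglik(y, mu_l, b), k=2)

# The omitted constant moves the normal-vs-Laplace difference by n log(2 pi).
shift = (aic_normal_full - aic_laplace) - (aic_normal_trunc - aic_laplace)
print(abs(shift - n * math.log(2 * math.pi)) < 1e-9)
```
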
>> Akaike (1973) wrote in the first sentence of his paper his general
>> principle, which he called an extension of the maximum likelihood principle:
>> "Given a set of estimates theta_hat's of the vector of parameters theta of a
>> probability distribution with density f(x|theta) we adopt as our final
>> estimate the one which will give the maximum of the expected log-likelihood,
>> which is by definition
>> E(log f(X|theta_hat))=E(INTEGRAL f(x|theta)log f(x|theta_hat)dx)
>> Where X is a random variable following the distribution with the density
>> function f(x|theta) and is independent of theta_hat".
>> All subsequent derivations in the paper, like the choice of distance
>> measure, class of estimates, and elimination of the true parameter value,
>> revolve around this principle. Now, nestedness is a mathematical property of
>> what Burnham & Anderson call "the structural model", whereas Akaike's
>> principle only concerns the probabilistic model f(x|theta) where the
>> structural model is embedded.
>> I reply to you even though I do not feel strongly about this issue and you
>> asked for replies from people who feel strongly about this issue.
>> Ruben
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>
>
> Thomas Lumley Assoc. Professor, Biostatistics
> [EMAIL PROTECTED] University of Washington, Seattle
>
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595