Re: [R] Nested AIC

Prof Brian Ripley Tue, 21 Feb 2006 09:49:31 -0800

On Tue, 21 Feb 2006, Thomas Lumley wrote:

>
> This might be a more suitable message for eg the stats-discuss mailing
> list or one of the sci.stat.* newsgroups.
>
> It is more complicated than it looks, partly because of the Anna Karenina
> problem: all nested models are the same, but non-nested models can be
> non-nested in different ways.


And I am sure Akaike appreciated that, which may be why he only (AFAIK) 
derived a theoretical basis for AIC under strictly limited conditions 
including nesting.

> Some notes:
>
> 1) Sometimes the AIC is clearly inappropriate: eg comparing the fit of a
> Poisson regression to a least-squares linear regression for count data.
> Here the likelihoods are not densities with respect to the same
> measure, so the likelihood ratio is meaningless.  You could also argue
> that the linear model isn't really being fitted by maximum likelihood.
>
> 2) You need to be careful when fitting models with different R functions,
> since they may omit different constants in the likelihood.
>
> 3) Transformations of the outcome are a problem. You can frame this as a
> mathematical problem or just note the difficulty of saying what you mean
> when you decide that the multiplicative error in one model is smaller than
> the additive error in another model.
>
> 4) If you have two least-squares linear regression models with the same
> outcome variable and different predictors then the AIC is choosing based
> on a consistent estimate of the mean squared prediction error, and in that
> sense it is a valid way to choose the model that predicts best.  This may
> or may not be the criterion you want, but if it isn't what you want then
> AIC isn't going to help.
>
> 5) If you have a large number of models then (nested or not) there is no
> guarantee that the estimate of prediction error is *uniformly* consistent,
> so the arguments behind AIC do not necessarily work.

(That only makes sense if the model class changes with 'n', suitably 
defined.  You do get uniform consistency over a finite class of models, 
one of Akaike (1973)'s conditions.  However, to use AIC you don't just 
need a consistent estimator, but to worry about the consistency of 
the O(1/n) term in the mean since AIC/n is effectively s^2 + 2p/n.)
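
To make the O(1/n) remark concrete, here is a minimal sketch in Python, assuming a Gaussian least-squares fit with the variance estimated by maximum likelihood (RSS/n) and all constants kept. Under those assumptions AIC/n splits into a fit term, the 2p/n penalty, and a constant; the fit term appears here as the log of the variance estimate, which is one reading of the shorthand "s^2 + 2p/n" above. The function names are made up for illustration, and whether p counts the variance parameter is a convention that only matters if it differs between the models compared.

```python
import math


def gaussian_aic(rss, n, p):
    """AIC of a Gaussian least-squares fit with p estimated parameters.

    The variance is set to its maximum-likelihood estimate rss / n and
    every constant of the normal density is kept.
    """
    sigma2 = rss / n
    loglik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)
    return -2.0 * loglik + 2.0 * p


def aic_per_observation(rss, n, p):
    """Return the three pieces whose sum is gaussian_aic(rss, n, p) / n."""
    fit_term = math.log(rss / n)            # O(1) term: log of the variance estimate
    penalty = 2.0 * p / n                   # the O(1/n) term that does the selecting
    constant = 1.0 + math.log(2.0 * math.pi)  # the part some software omits
    return fit_term, penalty, constant


if __name__ == "__main__":
    rss, n, p = 123.4, 50, 3
    parts = aic_per_observation(rss, n, p)
    print(gaussian_aic(rss, n, p) / n, sum(parts))
```

The constant cancels when both models are computed the same way, which is why it only causes trouble when two fitting functions treat it differently. The penalty is small relative to the sampling error of the fit term unless the fit terms are estimated on comparable footing.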

One other note.

AIC/n is a consistent estimator, but only if the model is true, and it is 
one with a lot of sampling error.  Differences in AIC are much more precisely 
estimated for a pair of nested models than for some non-nested pairs.  So 
sampling error can make comparisons of AIC meaningless unless the 
differences are large (and 'large' grows with 'n' for some appropriate 
'n').
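
A small simulation can illustrate the sampling-error point. The sketch below is in Python and rests on assumptions of my choosing, not on anything in the thread: the data come from y = x1 + x2 + noise with independent standard normal predictors, x3 is an irrelevant predictor, the nested pair is y ~ x1 against y ~ x1 + x3, and the non-nested pair is y ~ x1 against y ~ x2. All function names are hypothetical. The two-predictor fit is obtained by regressing residuals on residuals (the Frisch-Waugh-Lovell device), so only simple regression is needed.

```python
import math
import random
import statistics


def residuals(y, x):
    """Residuals from the least-squares fit of y on an intercept and x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


def rss(res):
    """Residual sum of squares."""
    return sum(r * r for r in res)


def aic_differences(n, rng):
    """One simulated data set; returns (nested, non-nested) AIC differences."""
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x3 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # irrelevant predictor
    y = [a + b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

    r1 = residuals(y, x1)                    # fit y ~ x1
    r2 = residuals(y, x2)                    # fit y ~ x2
    r13 = residuals(r1, residuals(x3, x1))   # fit y ~ x1 + x3 via partialling out x1

    # Gaussian AIC differences; additive constants cancel within each pair.
    nested = n * math.log(rss(r13) / rss(r1)) + 2.0  # one extra parameter
    non_nested = n * math.log(rss(r2) / rss(r1))     # same number of parameters
    return nested, non_nested


def simulate(n=200, reps=300, seed=1):
    """Standard deviations of the two AIC differences over repeated samples."""
    rng = random.Random(seed)
    pairs = [aic_differences(n, rng) for _ in range(reps)]
    sd_nested = statistics.stdev([p[0] for p in pairs])
    sd_non_nested = statistics.stdev([p[1] for p in pairs])
    return sd_nested, sd_non_nested


if __name__ == "__main__":
    print(simulate())
```

In this setup the nested difference behaves like a shifted chi-squared variable on one degree of freedom whatever the sample size, whereas the spread of the non-nested difference grows with n, so a difference of a few units means much less in the second case.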

A recent talk of mine

        http://www.stats.ox.ac.uk/~ripley/Nelder80.pdf

may be illuminating.  There is a published paper version.

>
> On Tue, 21 Feb 2006, Ruben Roa wrote:
>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Aaron MacNeil
>> Sent: 20 February 2006 15:17
>> To: [email protected]
>> Subject: [R] Nested AIC
>>
>> Greetings,
>> I have recently come into some confusion over whether or not AIC
>> results for comparing among models require that they be nested.
>> Reading Burnham & Anderson (2002) they are explicit that nested models are 
>> not required, but other respected statisticians have suggested that nesting 
>> is a pre-requisite for comparison.  Could anyone who feels strongly 
>> regarding either position post their arguments for or against nested models 
>> and AIC? This would assist me greatly in some analysis I am currently 
>> conducting.
>> Many thanks,
>>
>> Aaron
>>
>> ----
>> Hi, Aaron, Burnham & Anderson are explicit but they do not go into any depth 
>> regarding this issue. Akaike's colleagues Sakamoto, Ishiguro, and Kitagawa 
>> (Akaike Information Criterion Statistics, 1986, KTK Scientific Publishers) 
>> do not deal with it directly either, and the examples they present that I 
>> have examined (not even half of the total in the book), are all of nested 
>> models. However, by reading some of Akaike's papers and the book quoted 
>> above it does not appear to me that there is any restriction on the use of 
>> the AIC related to nestedness. In fact, the theory does not preclude the 
>> comparison of models with different *probability densities (or mass)* as 
>> long as you keep all constants (like 1/sqrt(2pi) in the normal) in the 
>> calculation.
>> Akaike (1973) wrote in the first sentence of his paper his general 
>> principle, which he called an extension of the maximum likelihood principle:
>> "Given a set of estimates theta_hat's of the vector of parameters theta of a 
>> probability distribution with density f(x|theta) we adopt as our final 
>> estimate the one which will give the maximum of the expected log-likelihood, 
>> which is by definition
>> E(log f(X|theta_hat))=E(INTEGRAL f(x|theta)log f(x|theta_hat)dx)
>> Where X is a random variable following the distribution with the density 
>> function f(x|theta) and is independent of theta_hat".
>> All subsequent derivations in the paper, like the choice of distance 
>> measure, class of estimates, and elimination of the true parameter value, 
>> revolve around this principle. Now, nestedness is a mathematical property of 
>> what Burnham & Anderson call "the structural model", whereas Akaike's 
>> principle only concerns the probabilistic model f(x|theta) where the 
>> structural model is embedded.
>> I reply to you even though I do not feel strongly about this issue and you 
>> asked for replies from people who feel strongly about this issue.
>> Ruben
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>
>
> Thomas Lumley                 Assoc. Professor, Biostatistics
> [EMAIL PROTECTED]     University of Washington, Seattle
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
