"Spencer" == Spencer Graves <[EMAIL PROTECTED]> on Sat, 04 Dec 2004 17:09:26 -0800 writes:
Spencer> I don't know the "best" way, but the following looks like it will Spencer> work:
Spencer> tstDF <- data.frame(x=1:3, y=c(1,1,2)) Spencer> fit0 <- lm(y~1, tstDF) Spencer> fitDF <- lm(y~x, tstDF) Spencer> AIC(fitDF,fit0) Spencer> df AIC Spencer> fitDF 3 5.842516 Spencer> fit0 2 8.001399
Spencer> The function AIC with only 1 argument returns only
Spencer> a single number. However, given nested models, it
Spencer> returns a data.frame with colums df and AIC. At
Spencer> least in this example (and I would think in all
Spencer> other contexts as well), "df" is the K you want.
yes, but Benjamin would hardly want to invent a second model
just in order to call AIC() with both models and get the data
frame...
As Benjamin hoped, the "df" is part of the logLik(.) result, and if either of you had more carefully looked at help(logLik), you'd have seen
Value:
Returns an object, say 'r', of class 'logLik' which is a number with attributes, 'attr(r, "df")' (*d*egrees of *f*reedom) giving the number of parameters in the model. There's a simple 'print' method for 'logLik' objects.
More directly, you can use
> str(unclass(logLik(fit0))) atomic [1:1] -2 - attr(*, "nall")= int 3 - attr(*, "nobs")= int 3 - attr(*, "df")= num 2
or
> stats:::AIC.logLik
function (object, ..., k = 2) -2 * c(object) + k * attr(object, "df")
to see how these work.
Martin Maechler, ETH Zurich
[snip]
Yes, but see the problem recently outlined by Prof. Ripley about nobs:
On Sun, 31 Oct 2004, Prof Brian Ripley wrote:
The harder task is actually to get `n', not `npar'.
length(resid(x)) may or may not include missing values, depending on the na.action used, and will include observations with weight zero.
However, logLik's "lm" method returns an attribute "nobs" that is a better choice.
But only better, as it looks like it has fallen into the first trap. AFAICS, if 'fit' is an lm fit, then
n = df.residual(fit) + fit$rank
Quick check:
x <- rnorm(10); x[2] <- NA y <- rnorm(10); y[1] <- NA w <- c(rep(1,9), 0) fit <- lm(y ~x, weights = w, na.action=na.exclude) df.residual(fit) + fit$rank # 7, correct attributes(logLik(fit)) # nall 9 nobs 9 df 3 # but AIC fails length(resid(fit)) # 10
fit <- lm(y ~x, weights = w, na.action=na.omit) df.residual(fit) + fit$rank # 7, correct attributes(logLik(fit)) # nall 7 nobs 7 df 3 length(resid(fit)) # 8
The problem might also become a bit more complicated when dealing with other kind of models than lm (or weighted lm, etc).
For example, what is nobs when you are using the syntax:
cbind(success, failure) ~ covariates
in a binomial glm ? Is it the number of lines in the aggregated data frame, or the total number of observations ?
Best,
Renaud
-- Dr Renaud Lancelot, v�t�rinaire C/0 Ambassade de France - SCAC BP 834 Antananarivo 101 - Madagascar
e-mail: [EMAIL PROTECTED]
tel.: +261 32 40 165 53 (cell)
+261 20 22 665 36 ext. 225 (work)
+261 20 22 494 37 (home)______________________________________________ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
