Re: [R] lmer, p-values and all that

Nutter, Benjamin Wed, 03 Apr 2013 06:59:17 -0700

My apologies for coming late into this conversation, but I'm curious about 
something in your response


You use the following code to peform a likelihood ratio test between an lm 
object and a mer object

fm0 <- lm(distance~age,data=Orthodont)
fm2 <- lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)

ddiff <- -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)

It seems like this would be simple to roll up into a function, such as

lrtestGeneric <- function(fit1, fit2){
  chisq <- -2 * c(logLik(fit1) - logLik(fit2))
  df <- abs(attributes(logLik(fit1))$df - attributes(logLik(fit2))$df)
   p <- pchisq(chisq, df, lower.tail=FALSE)
   lrtest <- data.frame("L.R.Chisq"=chisq, "d.f."=df, "P"=p)
   return(lrtest)
}

lrtestGeneric(fm0, fm2)


My question is about whether it is appropriate to use the degrees of freedom 
returned by logLik or if I should just use 1 degree of freedom when comparing a 
model without the random effect to one with the random effect.  For instance, 
logLik returns a difference of 3 between degrees of freedom in the models.  
Should I be using the 3 degrees of freedom in the likelihood ratio test, or is 
it better to go with 1? 


fit0 <- lm(Reaction ~ Days, sleepstudy)
fit1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

lrtestGeneric(fit0, fit2)


Any education you can provide would be great appreciated.

Thanks

  Benjamin Nutter |  Biostatistician     |  Quantitative Health Sciences
  Cleveland Clinic    |  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216) 
445-1365


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Ben Bolker
Sent: Wednesday, March 27, 2013 10:34 PM
To: David Winsemius
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] lmer, p-values and all that

On 13-03-27 10:10 PM, David Winsemius wrote:
> 
> On Mar 27, 2013, at 7:00 PM, Ben Bolker wrote:
> 
>> Michael Grant <michael.grant <at> colorado.edu> writes:
>>> Dear Help:
>>
>>> I am trying to follow Professor Bates' recommendation, quoted by 
>>> Professor Crawley in The R Book, p629, to determine whether I should 
>>> model data using the 'plain old' lm function or the mixed model 
>>> function lmer by using the syntax anova(lmModel,lmerModel).
>>> Apparently I've not understood the recommendation or the proper 
>>> likelihood ratio test in question (or both) for I get this error
>>> message: Error: $ operator not defined for this S4 class.
>>
>>  I don't have the R Book handy (some more context would be extremely 
>> useful!  I would think it would count as "fair use" to quote the 
>> passage you're referring to ...)
> 
> This is the quoted Rhelp entry:
> 
> http://tolstoy.newcastle.edu.au/R/help/05/01/10006.html
> 
> (I'm unable to determine whether it applies to the question at hand.)

  OK, I misspoke -- sorry.  I think the lmer()/lme() likelihoods are actually 
comparable; it's GLMMs (glmer(), with no analogue in lme()-land except for 
MASS::glmmPQL, which doesn't give you log-likelihoods at all) where the problem 
arises.

  You can (1) use lme(), (2)  look at http://glmm.wikidot.com/faq for 
suggestions about testing random-effects terms (including "perhaps don't do it 
at all"), or (3) construct the likelihood ratio test yourself as follows:

library("nlme")
data(Orthodont)
fm1 <- lme(distance~age,random=~1|Subject,data=Orthodont)
fm0 <- lm(distance~age,data=Orthodont)
anova(fm1,fm0)[["p-value"]]
detach("package:nlme",unload=TRUE)
library("lme4.0") ## stable version of lme4
fm2 <- lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)
anova(fm2,fm0) ## fails
ddiff <- -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)
## not identical to above but close enough

> 
>>
>>> Would someone be kind enough to point out my blunder?
>>
>>  You should probably repost this to the 
>> r-sig-mixed-mod...@r-project.org mailing list.
>>
>>  My short answer would be: (1) I don't think you can actually use 
>> anova() to compare likelihoods between lm() and lme()/lmer() fits in 
>> the way that you want: *maybe* for lme() [don't recall], but almost 
>> certainly not for lmer().  See http://glmm.wikidot.com/faq for 
>> methods for testing significance/inclusion of random factors (short 
>> answer: you should *generally* try to make the decision whether to 
>> include random factors or not on _a priori_ grounds, not on the basis 
>> of statistical tests ...)
>>
>>  Ben Bolker
>>
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===================================


 Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & 
World Report (2012).  
Visit us online at http://www.clevelandclinic.org for a complete listing of our 
services, staff and locations.


Confidentiality Note:  This message is intended for use ...{{dropped:18}}

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lmer, p-values and all that

Reply via email to