Re: [R] lmer, p-values and all that

2013-04-03 Thread Nutter, Benjamin
My apologies for coming late into this conversation, but I'm curious about 
something in your response

You use the following code to peform a likelihood ratio test between an lm 
object and a mer object

fm0 - lm(distance~age,data=Orthodont)
fm2 - lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)

ddiff - -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)

It seems like this would be simple to roll up into a function, such as

lrtestGeneric - function(fit1, fit2){
  chisq - -2 * c(logLik(fit1) - logLik(fit2))
  df - abs(attributes(logLik(fit1))$df - attributes(logLik(fit2))$df)
   p - pchisq(chisq, df, lower.tail=FALSE)
   lrtest - data.frame(L.R.Chisq=chisq, d.f.=df, P=p)
   return(lrtest)
}

lrtestGeneric(fm0, fm2)


My question is about whether it is appropriate to use the degrees of freedom 
returned by logLik or if I should just use 1 degree of freedom when comparing a 
model without the random effect to one with the random effect.  For instance, 
logLik returns a difference of 3 between degrees of freedom in the models.  
Should I be using the 3 degrees of freedom in the likelihood ratio test, or is 
it better to go with 1? 


fit0 - lm(Reaction ~ Days, sleepstudy)
fit1 - lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

lrtestGeneric(fit0, fit2)


Any education you can provide would be great appreciated.

Thanks

  Benjamin Nutter |  Biostatistician     |  Quantitative Health Sciences
  Cleveland Clinic    |  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216) 
445-1365


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Ben Bolker
Sent: Wednesday, March 27, 2013 10:34 PM
To: David Winsemius
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] lmer, p-values and all that

On 13-03-27 10:10 PM, David Winsemius wrote:
 
 On Mar 27, 2013, at 7:00 PM, Ben Bolker wrote:
 
 Michael Grant michael.grant at colorado.edu writes:
 Dear Help:

 I am trying to follow Professor Bates' recommendation, quoted by 
 Professor Crawley in The R Book, p629, to determine whether I should 
 model data using the 'plain old' lm function or the mixed model 
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper 
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.

  I don't have the R Book handy (some more context would be extremely 
 useful!  I would think it would count as fair use to quote the 
 passage you're referring to ...)
 
 This is the quoted Rhelp entry:
 
 http://tolstoy.newcastle.edu.au/R/help/05/01/10006.html
 
 (I'm unable to determine whether it applies to the question at hand.)

  OK, I misspoke -- sorry.  I think the lmer()/lme() likelihoods are actually 
comparable; it's GLMMs (glmer(), with no analogue in lme()-land except for 
MASS::glmmPQL, which doesn't give you log-likelihoods at all) where the problem 
arises.

  You can (1) use lme(), (2)  look at http://glmm.wikidot.com/faq for 
suggestions about testing random-effects terms (including perhaps don't do it 
at all), or (3) construct the likelihood ratio test yourself as follows:

library(nlme)
data(Orthodont)
fm1 - lme(distance~age,random=~1|Subject,data=Orthodont)
fm0 - lm(distance~age,data=Orthodont)
anova(fm1,fm0)[[p-value]]
detach(package:nlme,unload=TRUE)
library(lme4.0) ## stable version of lme4
fm2 - lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)
anova(fm2,fm0) ## fails
ddiff - -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)
## not identical to above but close enough

 

 Would someone be kind enough to point out my blunder?

  You should probably repost this to the 
 r-sig-mixed-mod...@r-project.org mailing list.

  My short answer would be: (1) I don't think you can actually use 
 anova() to compare likelihoods between lm() and lme()/lmer() fits in 
 the way that you want: *maybe* for lme() [don't recall], but almost 
 certainly not for lmer().  See http://glmm.wikidot.com/faq for 
 methods for testing significance/inclusion of random factors (short 
 answer: you should *generally* try to make the decision whether to 
 include random factors or not on _a priori_ grounds, not on the basis 
 of statistical tests ...)

  Ben Bolker



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

===


 Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals in America by U.S.News  
World Report (2012).  
Visit us online at http://www.clevelandclinic.org for a complete listing of our 
services, staff and locations.


Confidentiality Note:  This message is intended for use ...{{dropped:18

Re: [R] lmer, p-values and all that

2013-03-27 Thread Ben Bolker
Michael Grant michael.grant at colorado.edu writes:

 
 
 Dear Help:
 
 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.

  I don't have the R Book handy (some more context would be extremely
useful!  I would think it would count as fair use to quote the
passage you're referring to ...)
 
 Would someone be kind enough to point out my blunder?

  You should probably repost this to the r-sig-mixed-mod...@r-project.org
mailing list.

  My short answer would be: (1) I don't think you can actually
use anova() to compare likelihoods between lm() and lme()/lmer()
fits in the way that you want: *maybe* for lme() [don't recall],
but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
for methods for testing significance/inclusion of random factors
(short answer: you should *generally* try to make the decision
whether to include random factors or not on _a priori_ grounds,
not on the basis of statistical tests ...)

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer, p-values and all that

2013-03-27 Thread David Winsemius

On Mar 27, 2013, at 7:00 PM, Ben Bolker wrote:

 Michael Grant michael.grant at colorado.edu writes:
 
 
 
 Dear Help:
 
 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.
 
  I don't have the R Book handy (some more context would be extremely
 useful!  I would think it would count as fair use to quote the
 passage you're referring to ...)

This is the quoted Rhelp entry:

http://tolstoy.newcastle.edu.au/R/help/05/01/10006.html

(I'm unable to determine whether it applies to the question at hand.)

 
 Would someone be kind enough to point out my blunder?
 
  You should probably repost this to the r-sig-mixed-mod...@r-project.org
 mailing list.
 
  My short answer would be: (1) I don't think you can actually
 use anova() to compare likelihoods between lm() and lme()/lmer()
 fits in the way that you want: *maybe* for lme() [don't recall],
 but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
 for methods for testing significance/inclusion of random factors
 (short answer: you should *generally* try to make the decision
 whether to include random factors or not on _a priori_ grounds,
 not on the basis of statistical tests ...)
 
  Ben Bolker
 

-- 
David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer, p-values and all that

2013-03-27 Thread Nicole Ford
i literally just ran one.

when i ran one of mine and the did summary(mod) i get the following:

 mod - lmer(dem ~ xbar + cpi + (1 | country), data=wvsAB)
 summary(mod)
Linear mixed model fit by REML 
Formula: dem ~ xbar + cpi + (1 | country) 
   Data: wvsAB 
   AIC   BIC logLik deviance REMLdev
 34383 34418 -1718734355   34373
Random effects:
 Groups   NameVariance Std.Dev.

with a bunch more stuff below.


On Mar 27, 2013, at 10:00 PM, Ben Bolker wrote:

 Michael Grant michael.grant at colorado.edu writes:
 
 
 
 Dear Help:
 
 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.
 
  I don't have the R Book handy (some more context would be extremely
 useful!  I would think it would count as fair use to quote the
 passage you're referring to ...)
 
 Would someone be kind enough to point out my blunder?
 
  You should probably repost this to the r-sig-mixed-mod...@r-project.org
 mailing list.
 
  My short answer would be: (1) I don't think you can actually
 use anova() to compare likelihoods between lm() and lme()/lmer()
 fits in the way that you want: *maybe* for lme() [don't recall],
 but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
 for methods for testing significance/inclusion of random factors
 (short answer: you should *generally* try to make the decision
 whether to include random factors or not on _a priori_ grounds,
 not on the basis of statistical tests ...)
 
  Ben Bolker
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer, p-values and all that

2013-03-27 Thread Nicole Ford
i meant to add, i am not sure if an example lmer model woulf be helpful, 
since i can't see the OP.


On Mar 27, 2013, at 10:13 PM, Nicole Ford wrote:

 i literally just ran one.
 
 when i ran one of mine and the did summary(mod) i get the following:
 
 mod - lmer(dem ~ xbar + cpi + (1 | country), data=wvsAB)
 summary(mod)
 Linear mixed model fit by REML 
 Formula: dem ~ xbar + cpi + (1 | country) 
   Data: wvsAB 
   AIC   BIC logLik deviance REMLdev
 34383 34418 -1718734355   34373
 Random effects:
 Groups   NameVariance Std.Dev.
 
 with a bunch more stuff below.
 
 
 On Mar 27, 2013, at 10:00 PM, Ben Bolker wrote:
 
 Michael Grant michael.grant at colorado.edu writes:
 
 
 
 Dear Help:
 
 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.
 
 I don't have the R Book handy (some more context would be extremely
 useful!  I would think it would count as fair use to quote the
 passage you're referring to ...)
 
 Would someone be kind enough to point out my blunder?
 
 You should probably repost this to the r-sig-mixed-mod...@r-project.org
 mailing list.
 
 My short answer would be: (1) I don't think you can actually
 use anova() to compare likelihoods between lm() and lme()/lmer()
 fits in the way that you want: *maybe* for lme() [don't recall],
 but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
 for methods for testing significance/inclusion of random factors
 (short answer: you should *generally* try to make the decision
 whether to include random factors or not on _a priori_ grounds,
 not on the basis of statistical tests ...)
 
 Ben Bolker
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer, p-values and all that

2013-03-27 Thread Nicole Ford
i did find this for you, down towards the end, they discuss the anova method.

i am on my way to a bayesian analysis/lmer is a step towards that- so i won't 
be doing anova.  i can't be of much specific help with that question, but here 
you go.

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015591.html
On Mar 27, 2013, at 10:13 PM, Nicole Ford wrote:

 i literally just ran one.
 
 when i ran one of mine and the did summary(mod) i get the following:
 
 mod - lmer(dem ~ xbar + cpi + (1 | country), data=wvsAB)
 summary(mod)
 Linear mixed model fit by REML 
 Formula: dem ~ xbar + cpi + (1 | country) 
   Data: wvsAB 
   AIC   BIC logLik deviance REMLdev
 34383 34418 -1718734355   34373
 Random effects:
 Groups   NameVariance Std.Dev.
 
 with a bunch more stuff below.
 
 
 On Mar 27, 2013, at 10:00 PM, Ben Bolker wrote:
 
 Michael Grant michael.grant at colorado.edu writes:
 
 
 
 Dear Help:
 
 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.
 
 I don't have the R Book handy (some more context would be extremely
 useful!  I would think it would count as fair use to quote the
 passage you're referring to ...)
 
 Would someone be kind enough to point out my blunder?
 
 You should probably repost this to the r-sig-mixed-mod...@r-project.org
 mailing list.
 
 My short answer would be: (1) I don't think you can actually
 use anova() to compare likelihoods between lm() and lme()/lmer()
 fits in the way that you want: *maybe* for lme() [don't recall],
 but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
 for methods for testing significance/inclusion of random factors
 (short answer: you should *generally* try to make the decision
 whether to include random factors or not on _a priori_ grounds,
 not on the basis of statistical tests ...)
 
 Ben Bolker
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer, p-values and all that

2013-03-27 Thread Ben Bolker
On 13-03-27 10:10 PM, David Winsemius wrote:
 
 On Mar 27, 2013, at 7:00 PM, Ben Bolker wrote:
 
 Michael Grant michael.grant at colorado.edu writes:
 Dear Help:

 I am trying to follow Professor Bates' recommendation, quoted by
 Professor Crawley in The R Book, p629, to determine whether I should
 model data using the 'plain old' lm function or the mixed model
 function lmer by using the syntax anova(lmModel,lmerModel).
 Apparently I've not understood the recommendation or the proper
 likelihood ratio test in question (or both) for I get this error
 message: Error: $ operator not defined for this S4 class.

  I don't have the R Book handy (some more context would be extremely
 useful!  I would think it would count as fair use to quote the
 passage you're referring to ...)
 
 This is the quoted Rhelp entry:
 
 http://tolstoy.newcastle.edu.au/R/help/05/01/10006.html
 
 (I'm unable to determine whether it applies to the question at hand.)

  OK, I misspoke -- sorry.  I think the lmer()/lme() likelihoods are
actually comparable; it's GLMMs (glmer(), with no analogue in lme()-land
except
for MASS::glmmPQL, which doesn't give you log-likelihoods at all)
where the problem arises.

  You can (1) use lme(), (2)  look at http://glmm.wikidot.com/faq for
suggestions about testing random-effects terms (including perhaps
don't do it at all), or (3) construct the likelihood ratio test
yourself as follows:

library(nlme)
data(Orthodont)
fm1 - lme(distance~age,random=~1|Subject,data=Orthodont)
fm0 - lm(distance~age,data=Orthodont)
anova(fm1,fm0)[[p-value]]
detach(package:nlme,unload=TRUE)
library(lme4.0) ## stable version of lme4
fm2 - lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)
anova(fm2,fm0) ## fails
ddiff - -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)
## not identical to above but close enough

 

 Would someone be kind enough to point out my blunder?

  You should probably repost this to the r-sig-mixed-mod...@r-project.org
 mailing list.

  My short answer would be: (1) I don't think you can actually
 use anova() to compare likelihoods between lm() and lme()/lmer()
 fits in the way that you want: *maybe* for lme() [don't recall],
 but almost certainly not for lmer().  See http://glmm.wikidot.com/faq
 for methods for testing significance/inclusion of random factors
 (short answer: you should *generally* try to make the decision
 whether to include random factors or not on _a priori_ grounds,
 not on the basis of statistical tests ...)

  Ben Bolker



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.