Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-21 Thread Douglas Bates
I haven't had the time to keep up with this discussion, or many of the
other discussions on the R-SIG-Mixed-Models email list.  I am swamped
with other duties at present.

It is important to remember that the nlme and lme4 packages take a
model specification and provide code to evaluate the deviance.  Other
code is used to optimize the deviance.  In the case of nlme it is the
nlminb function and in the case of lme4 it is the minqa function.
There are no guarantees for numerical optimization routines.  If the
problem is ill-conditioned then they can converge to optima that are
not the global optimum.

Whoever suggested using the verbose option to see the progress of the
iterations has the right idea.
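
A minimal sketch of that suggestion, using the 'dat' object from the
original post further down the thread (the argument names below are as
documented for the lme4 and nlme of that era; treat the exact spellings
as assumptions rather than a recipe):

library(lme4)
## print the optimizer's progress as the deviance is evaluated
fm1 <- lmer(score ~ trt + (1 | id/eye), data = dat, verbose = TRUE)

library(nlme)
## the nlme analogue: msVerbose is passed through to the optimizer
fm2 <- lme(score ~ trt, random = ~1 | id/eye, data = dat,
           control = lmeControl(msVerbose = TRUE))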


On Mon, Sep 20, 2010 at 2:50 PM, array chip arrayprof...@yahoo.com wrote:
 Thank you Peter and Ben for your comments.

 John


 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Mon, September 20, 2010 12:28:43 PM
 Subject: Re: [R] lmer() vs. lme() gave different variance component estimates

 On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and lme. It
 makes perfect sense.


 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

 It is only exactly true for a balanced design, although it can be a
 practical expedient in nearly-balanced cases, especially if there is a
 clearly dominant animal variation. In strongly unbalanced data, you get
 reduced efficiency because animals with less data should be downweighted
 (not proportionally if there is substantial between-animal variation,
 though). And of course the whole thing relies on the fact that you have
 individuals nested in treatment (no animals had multiple treatments)


 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

 Not too far off the mark. In more complex cases, there is the advantage
 that the mixed model helps figure out a sensible analysis for you.


 And my last question: I am glad to find that glht() from the multcomp
 package works well with an lmer() fit for multiple comparisons. Given
 Professor Bates's view that denominator degrees of freedom are not well
 defined in mixed models, are the results from glht() reasonable/meaningful?
 If not, will the suggested 1-way ANOVA used together with glht() give us
 correct post-hoc multiple comparison results?

 I think Doug's view is that DFs are not _reliably estimated_ with any of
 the current procedures. In the balanced cases, they are very well
 defined (well, give or take the issues with negative variances), and I
 would expect glht() to give meaningful results. Do check the residuals
 for at least approximate normality, though.



 Thank you very much!

 John





 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Sat, September 18, 2010 1:35:45 AM
 Subject: Re: [R] lmer() vs. lme() gave different variance component estimates


 For a nested design, the relation is quite straightforward: The residual
 MS are the variances of sample means scaled to be comparable with the
 residuals (so that in the absence of random components, all
 MS are equal to within the F-ratio variability). So to get the id:eye
 variance component, subtract the Within MS from the id:eye MS and divide
 by the number of replicates (4 in this case since you have 640
 observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
 id variance is the MS for id minus that for id:eye scaled by 8:
 (42.482-14.4)/8 = 3.51.

 I.e. it is reproducing the lmer results above, but of course not those
 from your original post.

 (Notice, by the way, that if you are only interested in the treatment
 effect, you might as well have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA).



 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-21 Thread Douglas Bates
On Tue, Sep 21, 2010 at 1:39 PM, Douglas Bates ba...@stat.wisc.edu wrote:
 I haven't had the time to keep up with this discussion, or many of the
 other discussions on the R-SIG-Mixed-Models email list.  I am swamped
 with other duties at present.

 It is important to remember that the nlme and lme4 packages take a
 model specification and provide code to evaluate the deviance.  Other
 code is used to optimize the deviance.  In the case of nlme it is the
 nlminb function and in the case of lme4 it is the minqa function.

Shouldn't send email when I'm tired.  The optimizer function is bobyqa
in the minqa package. (Sort of makes you nostalgic for the punched
cards and line printers when you see the Fortran names jammed into six
characters.)

 There are no guarantees for numerical optimization routines.  If the
 problem is ill-conditioned then they can converge to optima that are
 not the global optimum.

 Whoever suggested using the verbose option to see the progress of the
 iterations has the right idea.


 On Mon, Sep 20, 2010 at 2:50 PM, array chip arrayprof...@yahoo.com wrote:
 Thank you Peter and Ben for your comments.

 John


 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Mon, September 20, 2010 12:28:43 PM
 Subject: Re: [R] lmer() vs. lme() gave different variance component estimates

 On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and lme. It
 makes perfect sense.


 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

 It is only exactly true for a balanced design, although it can be a
 practical expedient in nearly-balanced cases, especially if there is a
 clearly dominant animal variation. In strongly unbalanced data, you get
 reduced efficiency because animals with less data should be downweighted
 (not proportionally if there is substantial between-animal variation,
 though). And of course the whole thing relies on the fact that you have
 individuals nested in treatment (no animals had multiple treatments)


 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

 Not too far off the mark. In more complex cases, there is the advantage
 that the mixed model helps figure out a sensible analysis for you.


 And my last question: I am glad to find that glht() from the multcomp
 package works well with an lmer() fit for multiple comparisons. Given
 Professor Bates's view that denominator degrees of freedom are not well
 defined in mixed models, are the results from glht() reasonable/meaningful?
 If not, will the suggested 1-way ANOVA used together with glht() give us
 correct post-hoc multiple comparison results?

 I think Doug's view is that DFs are not _reliably estimated_ with any of
 the current procedures. In the balanced cases, they are very well
 defined (well, give or take the issues with negative variances), and I
 would expect glht() to give meaningful results. Do check the residuals
 for at least approximate normality, though.



 Thank you very much!

 John





 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Sat, September 18, 2010 1:35:45 AM
 Subject: Re: [R] lmer() vs. lme() gave different variance component 
 estimates


 For a nested design, the relation is quite straightforward: The residual
 MS are the variances of sample means scaled to be comparable with the
 residuals (so that in the absence of random components, all
 MS are equal to within the F-ratio variability). So to get the id:eye
 variance component, subtract the Within MS from the id:eye MS and divide
 by the number of replicates (4 in this case since you have 640
 observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
 id variance is the MS for id minus that for id:eye scaled by 8:
 (42.482-14.4)/8 = 3.51.

 I.e. it is reproducing the lmer results above, but of course not those
 from your original post.

 (Notice, by the way, that if you are only interested in the treatment
 effect, you might as well have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA).



 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-21 Thread Peter Dalgaard
On 09/21/2010 09:02 PM, Douglas Bates wrote:
 On Tue, Sep 21, 2010 at 1:39 PM, Douglas Bates ba...@stat.wisc.edu wrote:
 I haven't had the time to keep up with this discussion, or many of the
 other discussions on the R-SIG-Mixed-Models email list.  I am swamped
 with other duties at present.

It's not like I don't know how you feel



 It is important to remember that the nlme and lme4 packages take a
 model specification and provide code to evaluate the deviance.  Other
 code is used to optimize the deviance.  In the case of nlme it is the
 nlminb function and in the case of lme4 it is the minqa function.
 
 Shouldn't send email when I'm tired.  The optimizer function is bobyqa
 in the minqa package. (Sort of makes you nostalgic for the punched
 cards and line printers when you see the Fortran names jammed into six
 characters.)
 
 There are no guarantees for numerical optimization routines.  If the
 problem is ill-conditioned then they can converge to optima that are
 not the global optimum.

Right. However, what we have here is a case where I'm pretty damn sure
that the likelihood function is unimodal (it's a linear
reparametrization of three independent chi-square terms) and has an
optimum in the interior of the feasible region.

In any case, I'm thoroughly confused about the lme4/lme4a/lme4b
subtrees. Which algorithm is used by the current CRAN lme4?? (I have

> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-redhat-linux-gnu

locale:
 [1] LC_CTYPE=en_US.utf8   LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8   LC_NAME=C
 [9] LC_ADDRESS=C  LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lme4_0.999375-35   Matrix_0.999375-39 lattice_0.18-8

loaded via a namespace (and not attached):
[1] grid_2.11.1   nlme_3.1-96   stats4_2.11.1 tcltk_2.11.1  tools_2.

)




 Whoever suggested using the verbose option to see the progress of the
 iterations has the right idea.


 On Mon, Sep 20, 2010 at 2:50 PM, array chip arrayprof...@yahoo.com wrote:
 Thank you Peter and Ben for your comments.

 John


 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Mon, September 20, 2010 12:28:43 PM
 Subject: Re: [R] lmer() vs. lme() gave different variance component 
 estimates

 On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and
 lme. It makes perfect sense.


 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

 It is only exactly true for a balanced design, although it can be a
 practical expedient in nearly-balanced cases, especially if there is a
 clearly dominant animal variation. In strongly unbalanced data, you get
 reduced efficiency because animals with less data should be downweighted
 (not proportionally if there is substantial between-animal variation,
 though). And of course the whole thing relies on the fact that you have
 individuals nested in treatment (no animals had multiple treatments)


 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

 Not too far off the mark. In more complex cases, there is the advantage
 that the mixed model helps figure out a sensible analysis for you.


 And my last question: I am glad to find that glht() from the multcomp
 package works well with an lmer() fit for multiple comparisons. Given
 Professor Bates's view that denominator degrees of freedom are not well
 defined in mixed models, are the results from glht() reasonable/meaningful?
 If not, will the suggested 1-way ANOVA used together with glht() give us
 correct post-hoc multiple comparison results?

 I think Doug's view is that DFs are not _reliably estimated_ with any of
 the current procedures. In the balanced cases, they are very well
 defined (well, give or take the issues with negative variances), and I
 would expect glht() to give meaningful results. Do check the residuals
 for at least approximate normality, though.



 Thank you very much!

 John





 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Sat, September 18, 2010 1:35:45 AM
 Subject: Re: [R] lmer() vs. lme() gave different variance component

Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-21 Thread Douglas Bates
On Tue, Sep 21, 2010 at 4:21 PM, Peter Dalgaard pda...@gmail.com wrote:
 On 09/21/2010 09:02 PM, Douglas Bates wrote:
 On Tue, Sep 21, 2010 at 1:39 PM, Douglas Bates ba...@stat.wisc.edu wrote:
 I haven't had the time to keep up with this discussion, or many of the
 other discussions on the R-SIG-Mixed-Models email list.  I am swamped
 with other duties at present.

 It's not like I don't know how you feel



 It is important to remember that the nlme and lme4 packages take a
 model specification and provide code to evaluate the deviance.  Other
 code is used to optimize the deviance.  In the case of nlme it is the
 nlminb function and in the case of lme4 it is the minqa function.

 Shouldn't send email when I'm tired.  The optimizer function is bobyqa
 in the minqa package. (Sort of makes you nostalgic for the punched
 cards and line printers when you see the Fortran names jammed into six
 characters.)

 There are no guarantees for numerical optimization routines.  If the
 problem is ill-conditioned then they can converge to optima that are
 not the global optimum.

 Right. However, what we have here is a case where I'm pretty damn sure
 that the likelihood function is unimodal (it's a linear
 reparametrization of three independent chi-square terms) and has an
 optimum in the interior of the feasible region.

 In any case, I'm thoroughly confused about the lme4/lme4a/lme4b
 subtrees. Which algorithm is used by the current CRAN lme4?? (I have

lme4b was a placeholder.  It should be ignored now.

lme4a is the development version that will become the released version
once Martin and I are convinced that it does all the things that lme4
does.  Before that can happen I have to get my head above water for a
week or two to catch up and it could be well into October before that
happens (I have a particularly heavy teaching load this semester).

lme4a is cleaner than lme4 but leans heavily on the Rcpp package to
achieve that.  So it uses classes and methods (although in different
senses) at both the R and the compiled code levels.  lme4a allows for
profiling the deviance in a LMM with respect to the parameters.
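
A speculative sketch of the profiling being described, assuming a
profile() method along the lines of what later lme4 releases shipped
(these calls are an assumption about the interface, not the lme4a API as
it stood at the time):

library(lme4)
fm <- lmer(score ~ trt + (1 | id/eye), data = dat)
pr <- profile(fm)   # profile the deviance over each parameter in turn
confint(pr)         # likelihood-profile confidence intervals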

I had thought that the released version of lme4 (the one shown below)
allowed for selection of the optimizer but, upon further review (a
phrase known to fans of what Americans call football) it turns out
that it doesn't.  So I misspoke.  The released version of lme4 uses
nlminb, as shown by the form of the verbose output.

The lme4a version of lmer uses bobyqa.  It would probably be
worthwhile running this test case in that version.

 > sessionInfo()
 R version 2.11.1 (2010-05-31)
 i386-redhat-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
  [7] LC_PAPER=en_US.utf8       LC_NAME=C
  [9] LC_ADDRESS=C              LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] lme4_0.999375-35   Matrix_0.999375-39 lattice_0.18-8

 loaded via a namespace (and not attached):
 [1] grid_2.11.1   nlme_3.1-96   stats4_2.11.1 tcltk_2.11.1  tools_2.

 )




 Whoever suggested using the verbose option to see the progress of the
 iterations has the right idea.


 On Mon, Sep 20, 2010 at 2:50 PM, array chip arrayprof...@yahoo.com wrote:
 Thank you Peter and Ben for your comments.

 John


 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Mon, September 20, 2010 12:28:43 PM
 Subject: Re: [R] lmer() vs. lme() gave different variance component 
 estimates

 On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and
 lme. It makes perfect sense.


 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

 It is only exactly true for a balanced design, although it can be a
 practical expedient in nearly-balanced cases, especially if there is a
 clearly dominant animal variation. In strongly unbalanced data, you get
 reduced efficiency because animals with less data should be downweighted
 (not proportionally if there is substantial between-animal variation,
 though). And of course the whole thing relies on the fact that you have
 individuals nested in treatment (no animals had multiple treatments)


 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-20 Thread array chip
Thank you Peter for your explanation of the relationship between aov and lme. It
makes perfect sense. 


When you said you might have computed the average of all 8
measurements on each animal and computed a 1-way ANOVA for treatment effect,
would this be the case only for a balanced design, or is it also true for
unbalanced data?

Another question: if the 1-way ANOVA is equivalent to the mixed model for
testing the treatment effect, what would be the reason to use a mixed model?
Just to estimate the variance components? If the interest is not in
estimating the variance components, then there is no need to run mixed
models to test treatment effects?

And my last question: I am glad to find that glht() from the multcomp
package works well with an lmer() fit for multiple comparisons. Given
Professor Bates's view that denominator degrees of freedom are not well
defined in mixed models, are the results from glht() reasonable/meaningful?
If not, will the suggested 1-way ANOVA used together with glht() give us
correct post-hoc multiple comparison results?

Thank you very much!

John





- Original Message 
From: Peter Dalgaard pda...@gmail.com
To: array chip arrayprof...@yahoo.com
Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
Sent: Sat, September 18, 2010 1:35:45 AM
Subject: Re: [R] lmer() vs. lme() gave different variance component estimates


For a nested design, the relation is quite straightforward: The residual
MS are the variances of sample means scaled to be comparable with the
residuals (so that in the absence of random components, all
MS are equal to within the F-ratio variability). So to get the id:eye
variance component, subtract the Within MS from the id:eye MS and divide
by the number of replicates (4 in this case since you have 640
observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
id variance is the MS for id minus that for id:eye scaled by 8:
(42.482-14.4)/8 = 3.51.

I.e. it is reproducing the lmer results above, but of course not those
from your original post.

(Notice, by the way, that if you are only interested in the treatment
effect, you might as well have computed the average of all 8
measurements on each animal and computed a 1-way ANOVA).

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-20 Thread Peter Dalgaard
On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and lme. It
 makes perfect sense. 
 
 
 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

It is only exactly true for a balanced design, although it can be a
practical expedient in nearly-balanced cases, especially if there is a
clearly dominant animal variation. In strongly unbalanced data, you get
reduced efficiency because animals with less data should be downweighted
(not proportionally if there is substantial between-animal variation,
though). And of course the whole thing relies on the fact that you have
individuals nested in treatment (no animals had multiple treatments)

 
 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

Not too far off the mark. In more complex cases, there is the advantage
that the mixed model helps figure out a sensible analysis for you.

 
 And my last question: I am glad to find that glht() from the multcomp
 package works well with an lmer() fit for multiple comparisons. Given
 Professor Bates's view that denominator degrees of freedom are not well
 defined in mixed models, are the results from glht() reasonable/meaningful?
 If not, will the suggested 1-way ANOVA used together with glht() give us
 correct post-hoc multiple comparison results?

I think Doug's view is that DFs are not _reliably estimated_ with any of
the current procedures. In the balanced cases, they are very well
defined (well, give or take the issues with negative variances), and I
would expect glht() to give meaningful results. Do check the residuals
for at least approximate normality, though.
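
For concreteness, a minimal sketch of the glht() usage under discussion,
assuming the 'dat' object from the original post (the all-pairwise
"Tukey" contrasts are an assumption about what John ran; mcp() itself is
standard multcomp):

library(lme4)
library(multcomp)
fit <- lmer(score ~ trt + (1 | id/eye), data = dat)
## all pairwise comparisons among the 8 treatment levels
summary(glht(fit, linfct = mcp(trt = "Tukey")))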


 
 Thank you very much!
 
 John
 
 
 
 
 
 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Sat, September 18, 2010 1:35:45 AM
 Subject: Re: [R] lmer() vs. lme() gave different variance component estimates
 
 
 For a nested design, the relation is quite straightforward: The residual
 MS are the variances of sample means scaled to be comparable with the
 residuals (so that in the absence of random components, all
 MS are equal to within the F-ratio variability). So to get the id:eye
 variance component, subtract the Within MS from the id:eye MS and divide
 by the number of replicates (4 in this case since you have 640
 observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
 id variance is the MS for id minus that for id:eye scaled by 8:
 (42.482-14.4)/8 = 3.51.
 
 I.e. it is reproducing the lmer results above, but of course not those
 from your original post.
 
 (Notice, by the way, that if you are only interested in the treatment
 effect, you might as well have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA).
 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-20 Thread array chip
Thank you Peter and Ben for your comments.

John


- Original Message 
From: Peter Dalgaard pda...@gmail.com
To: array chip arrayprof...@yahoo.com
Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
Sent: Mon, September 20, 2010 12:28:43 PM
Subject: Re: [R] lmer() vs. lme() gave different variance component estimates

On 09/20/2010 08:09 PM, array chip wrote:
 Thank you Peter for your explanation of the relationship between aov and lme. It
 makes perfect sense. 
 
 
 When you said you might have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA for treatment effect,
 would this be the case only for a balanced design, or is it also true for
 unbalanced data?

It is only exactly true for a balanced design, although it can be a
practical expedient in nearly-balanced cases, especially if there is a
clearly dominant animal variation. In strongly unbalanced data, you get
reduced efficiency because animals with less data should be downweighted
(not proportionally if there is substantial between-animal variation,
though). And of course the whole thing relies on the fact that you have
individuals nested in treatment (no animals had multiple treatments)

 
 Another question: if the 1-way ANOVA is equivalent to the mixed model for
 testing the treatment effect, what would be the reason to use a mixed model?
 Just to estimate the variance components? If the interest is not in
 estimating the variance components, then there is no need to run mixed
 models to test treatment effects?

Not too far off the mark. In more complex cases, there is the advantage
that the mixed model helps figure out a sensible analysis for you.


 And my last question: I am glad to find that glht() from the multcomp
 package works well with an lmer() fit for multiple comparisons. Given
 Professor Bates's view that denominator degrees of freedom are not well
 defined in mixed models, are the results from glht() reasonable/meaningful?
 If not, will the suggested 1-way ANOVA used together with glht() give us
 correct post-hoc multiple comparison results?

I think Doug's view is that DFs are not _reliably estimated_ with any of
the current procedures. In the balanced cases, they are very well
defined (well, give or take the issues with negative variances), and I
would expect glht() to give meaningful results. Do check the residuals
for at least approximate normality, though.


 
 Thank you very much!
 
 John
 
 
 
 
 
 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org; r-sig-mixed-mod...@r-project.org
 Sent: Sat, September 18, 2010 1:35:45 AM
 Subject: Re: [R] lmer() vs. lme() gave different variance component estimates
 
 
 For a nested design, the relation is quite straightforward: The residual
 MS are the variances of sample means scaled to be comparable with the
 residuals (so that in the absence of random components, all
 MS are equal to within the F-ratio variability). So to get the id:eye
 variance component, subtract the Within MS from the id:eye MS and divide
 by the number of replicates (4 in this case since you have 640
 observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
 id variance is the MS for id minus that for id:eye scaled by 8:
 (42.482-14.4)/8 = 3.51.
 
 I.e. it is reproducing the lmer results above, but of course not those
 from your original post.
 
 (Notice, by the way, that if you are only interested in the treatment
 effect, you might as well have computed the average of all 8
 measurements on each animal and computed a 1-way ANOVA).
 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-18 Thread Peter Dalgaard
On 09/17/2010 10:50 PM, array chip wrote:
 
 
 Thank you Peter. Actually 3 people from the mixed-model mailing list tried
 my code using lmer(). They got the same results as what I got from lme().
 So they couldn't replicate my lmer() results:
 
 Random effects:
 Groups   Name        Variance Std.Dev.
 eye:id   (Intercept) 3.59531  1.89613 
 id   (Intercept) 3.51025  1.87357 
 Residual 0.01875  0.13693 
 Number of obs: 640, groups: eye:id, 160; id, 80
 
 The only difference they can think of is that they are using Mac and
 FreeBSD while mine is a PC. I can't imagine this being the reason. I
 re-installed the lme4 package, but still got weird results with lmer().
 
 Per your suggestion, here are the results from aov():
 
 summary(aov(score~trt+Error(id/eye), data=dat))
 
 Error: id
            Df Sum Sq Mean Sq F value    Pr(>F)
 trt         7 1353.6 193.378   4.552 0.0002991 ***
 Residuals  72 3058.7  42.482
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Error: id:eye
            Df Sum Sq Mean Sq F value Pr(>F)
 Residuals  80   1152    14.4

 Error: Within
            Df Sum Sq Mean Sq F value Pr(>F)
 Residuals 480      9 0.01875
 
 As can be seen, the within-subject variance estimate (0.01875) is the same
 between aov, lmer and lme. But I am not sure how to relate results between 
 aov 
 and lmer/lme for the other 2 variance components (id and eye%in%id).
 

For a nested design, the relation is quite straightforward: The residual
MS are the variances of sample means scaled to be comparable with the
residuals (so that in the absence of random components, all
MS are equal to within the F-ratio variability). So to get the id:eye
variance component, subtract the Within MS from the id:eye MS and divide
by the number of replicates (4 in this case since you have 640
observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
id variance is the MS for id minus that for id:eye scaled by 8:
(42.482-14.4)/8 = 3.51.
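
In R, the same arithmetic on the mean squares from the aov() table above
(a minimal sketch; the numbers come from the thread, the variable names
are mine):

ms.id     <- 42.482    # Error: id stratum, Mean Sq
ms.eye    <- 14.4      # Error: id:eye stratum, Mean Sq
ms.within <- 0.01875   # Error: Within stratum, Mean Sq
(ms.eye - ms.within) / 4   # id:eye variance component, ~3.595
(ms.id  - ms.eye)    / 8   # id variance component, ~3.510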

I.e. it is reproducing the lmer results above, but of course not those
from your original post.

(Notice, by the way, that if you are only interested in the treatment
effect, you might as well have computed the average of all 8
measurements on each animal and computed a 1-way ANOVA).
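
A minimal sketch of that averaging approach, assuming the 'dat' object
from the original post (columns score, id, trt as shown by str(dat)
further down the thread):

## one mean score per animal, then an ordinary 1-way ANOVA on the means
means <- aggregate(score ~ id + trt, data = dat, FUN = mean)
summary(aov(score ~ trt, data = means))
## for this balanced design the F test for trt matches the Error: id
## stratum above (F = 4.552)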

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-17 Thread Peter Dalgaard
On 09/17/2010 09:14 PM, array chip wrote:
 Hi, I asked this on the mixed-model mailing list, but that list is not very
 active, so I'd like to try the general R mailing list. Sorry if anyone
 receives the double post.
 
 
 Hi, I have a dataset of animals receiving some eye treatments. There are 8
 treatments; each animal's right and left eyes were measured with some scores
 (ranging from 0 to 7) 4 times after treatment. So the grouping is eyes
 nested within animals. Dataset attached.
 
 dat <- read.table("dat.txt", sep='\t', header=TRUE, row.names=1)
 dat$id <- factor(dat$id)
 str(dat)
 'data.frame':   640 obs. of  5 variables:
  $ score: int  0 2 0 7 4 7 0 2 0 7 ...
  $ id   : Factor w/ 80 levels "1","3","6","10",..: 7 48 66 54 18 26 38 52 39 63 ...
  $ rep  : int  1 1 1 1 1 1 1 1 1 1 ...
  $ eye  : Factor w/ 2 levels "L","R": 2 2 2 2 2 2 2 2 2 2 ...
  $ trt  : Factor w/ 8 levels "A","B","C","Control",..: 1 1 1 1 1 1 1 1 1 1 ...
 
 I fit a mixed model using both lmer() from the lme4 package and lme() from
 the nlme package:
 
 lmer(score~trt+(1|id/eye),dat)
 
 Linear mixed model fit by REML
 Formula: score ~ trt + (1 | id/eye)
    Data: dat
    AIC   BIC logLik deviance REMLdev
  446.7 495.8 -212.4    430.9   424.7
 Random effects:
  Groups   Name        Variance   Std.Dev.
  eye:id   (Intercept) 6.9208e+00 2.630742315798
  id       (Intercept) 1.4471e-16 0.000000012030
  Residual             1.8750e-02 0.136930641909
 Number of obs: 640, groups: eye:id, 160; id, 80
 
 summary(lme(score~trt, random=(~1|id/eye), dat))
 
 Linear mixed-effects model fit by REML
  Data: dat
        AIC      BIC    logLik
   425.1569 474.0947 -201.5785

 Random effects:
  Formula: ~1 | id
         (Intercept)
 StdDev:    1.873576

  Formula: ~1 | eye %in% id
         (Intercept)  Residual
 StdDev:    1.896126 0.1369306
 
 As you can see, the variance component estimates for the random effects are
 quite different between the two model fits. From the data, I know that the
 variance component for id can't be near 0, which is what the lmer() fit
 produced, so I think the lme() fit is correct while the lmer() fit is off.
 This can also be seen from AIC, BIC etc.: the lme() fit has better values
 than the lmer() fit.

 I guess this might be because lmer() didn't converge very well; is there
 any way to adjust lmer() so it converges better and gives results similar
 to lme()?

That's your guess... I'd be more careful about jumping to conclusions.
If this is a balanced data set, perhaps you could supply the result of

summary(aov(score~trt+Error(id/eye), data=dat))

The correct estimates should be computable from the ANOVA table.


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-17 Thread array chip


Thank you Peter. Actually 3 people from the mixed-model mailing list tried my
code using lmer(). They got the same results as what I got from lme(). So they
couldn't replicate my lmer() results:

Random effects:
Groups   Name        Variance Std.Dev.
eye:id   (Intercept) 3.59531  1.89613 
id   (Intercept) 3.51025  1.87357 
Residual 0.01875  0.13693 
Number of obs: 640, groups: eye:id, 160; id, 80

The only difference they can think of is that they are using Mac and FreeBSD
while mine is a PC. I can't imagine this being the reason. I re-installed the
lme4 package, but still got weird results with lmer().

Per your suggestion, here are the results from aov():

summary(aov(score~trt+Error(id/eye), data=dat))

Error: id
           Df Sum Sq Mean Sq F value    Pr(>F)
trt         7 1353.6 193.378   4.552 0.0002991 ***
Residuals  72 3058.7  42.482
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: id:eye
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals  80   1152    14.4

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals 480      9 0.01875

As can be seen, the within-subject variance estimate (0.01875) is the same
between aov, lmer and lme. But I am not sure how to relate results between aov 
and lmer/lme for the other 2 variance components (id and eye%in%id).

Thanks

John






- Original Message 
From: Peter Dalgaard pda...@gmail.com
To: array chip arrayprof...@yahoo.com
Cc: r-help@r-project.org
Sent: Fri, September 17, 2010 1:05:27 PM
Subject: Re: [R] lmer() vs. lme() gave different variance component estimates

On 09/17/2010 09:14 PM, array chip wrote:
 Hi, I asked this on the mixed-model mailing list, but that list is not very
 active, so I'd like to try the general R mailing list. Sorry if anyone
 receives the double post.
 
 
 Hi, I have a dataset of animals receiving some eye treatments. There are 8
 treatments; each animal's right and left eyes were measured with some scores
 (ranging from 0 to 7) 4 times after treatment. So the grouping is eyes
 nested within animals. Dataset attached.
 
 dat <- read.table("dat.txt", sep='\t', header=TRUE, row.names=1)
 dat$id <- factor(dat$id)
 str(dat)
 'data.frame':   640 obs. of  5 variables:
  $ score: int  0 2 0 7 4 7 0 2 0 7 ...
  $ id   : Factor w/ 80 levels "1","3","6","10",..: 7 48 66 54 18 26 38 52 39 63 ...
  $ rep  : int  1 1 1 1 1 1 1 1 1 1 ...
  $ eye  : Factor w/ 2 levels "L","R": 2 2 2 2 2 2 2 2 2 2 ...
  $ trt  : Factor w/ 8 levels "A","B","C","Control",..: 1 1 1 1 1 1 1 1 1 1 ...
 
 I fit a mixed model using both lmer() from the lme4 package and lme() from
 the nlme package:
 
 lmer(score~trt+(1|id/eye),dat)
 
 Linear mixed model fit by REML
 Formula: score ~ trt + (1 | id/eye)
    Data: dat
    AIC   BIC logLik deviance REMLdev
  446.7 495.8 -212.4    430.9   424.7
 Random effects:
  Groups   Name        Variance   Std.Dev.
  eye:id   (Intercept) 6.9208e+00 2.630742315798
  id       (Intercept) 1.4471e-16 0.000000012030
  Residual             1.8750e-02 0.136930641909
 Number of obs: 640, groups: eye:id, 160; id, 80
 
 summary(lme(score~trt, random=(~1|id/eye), dat))
 
 Linear mixed-effects model fit by REML
  Data: dat
        AIC      BIC    logLik
   425.1569 474.0947 -201.5785

 Random effects:
  Formula: ~1 | id
         (Intercept)
 StdDev:    1.873576

  Formula: ~1 | eye %in% id
         (Intercept)  Residual
 StdDev:    1.896126 0.1369306
 
 As you can see, the variance component estimates for the random effects are
 quite different between the two model fits. From the data, I know that the
 variance component for id can't be near 0, which is what the lmer() fit
 produced, so I think the lme() fit is correct while the lmer() fit is off.
 This can also be seen from AIC, BIC etc.: the lme() fit has better values
 than the lmer() fit.

 I guess this might be because lmer() didn't converge very well; is there
 any way to adjust lmer() so it converges better and gives results similar
 to lme()?

That's your guess... I'd be more careful about jumping to conclusions.
If this is a balanced data set, perhaps you could supply the result of

summary(aov(score~trt+Error(id/eye), data=dat))

The correct estimates should be computable from the ANOVA table.


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com







Re: [R] lmer() vs. lme() gave different variance component estimates

2010-09-17 Thread Kevin Wright
I confirm John's problems with lmer.  I'm using R 2.11.1 on Windows XP.

R> m4 <- lmer(score~trt+(1|id/eye), dat)
R> m4
Linear mixed model fit by REML
Formula: score ~ trt + (1 | id/eye)
   Data: dat
   AIC   BIC logLik deviance REMLdev
 446.7 495.8 -212.4    430.9   424.7
Random effects:
 Groups   Name        Variance   Std.Dev.
 eye:id   (Intercept) 6.9208e+00 2.6307e+00
 id       (Intercept) 1.4471e-16 1.2030e-08
 Residual             1.8750e-02 1.3693e-01
Number of obs: 640, groups: eye:id, 160; id, 80

One other note--the variance components from asreml agree with *lme*:

ma <- asreml(score ~ trt, random = ~id + id:eye, data=dat)
R> sqrt(ma$gammas * ma$sigma2)
    id!id.var id:eye!id.var    R!variance
    1.8735643     1.8961309     0.1369306

Kevin


On Fri, Sep 17, 2010 at 3:50 PM, array chip arrayprof...@yahoo.com wrote:



 Thank you Peter. Actually 3 people from the mixed-model mailing list tried
 my code using lmer(). They got the same results as what I got from lme().
 So they couldn't replicate my lmer() results:

 Random effects:
 Groups   Name        Variance Std.Dev.
 eye:id   (Intercept) 3.59531  1.89613
 id   (Intercept) 3.51025  1.87357
 Residual 0.01875  0.13693
 Number of obs: 640, groups: eye:id, 160; id, 80

 The only difference they can think of is that they are using Mac and
 FreeBSD while mine is a PC. I can't imagine this being the reason. I
 re-installed the lme4 package, but still got weird results with lmer().

 Per your suggestion, here are the results from aov():

 summary(aov(score~trt+Error(id/eye), data=dat))

 Error: id
            Df Sum Sq Mean Sq F value    Pr(>F)
 trt         7 1353.6 193.378   4.552 0.0002991 ***
 Residuals  72 3058.7  42.482
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Error: id:eye
            Df Sum Sq Mean Sq F value Pr(>F)
 Residuals  80   1152    14.4

 Error: Within
            Df Sum Sq Mean Sq F value Pr(>F)
 Residuals 480      9 0.01875

 As can be seen, the within-subject variance estimate (0.01875) is the same
 between aov, lmer and lme. But I am not sure how to relate results between
 aov
 and lmer/lme for the other 2 variance components (id and eye%in%id).

 Thanks

 John






 - Original Message 
 From: Peter Dalgaard pda...@gmail.com
 To: array chip arrayprof...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Fri, September 17, 2010 1:05:27 PM
 Subject: Re: [R] lmer() vs. lme() gave different variance component
 estimates

 On 09/17/2010 09:14 PM, array chip wrote:
  Hi, I asked this on the mixed-model mailing list, but that list is not
  very active, so I'd like to try the general R mailing list. Sorry if
  anyone receives the double post.
 
 
  Hi, I have a dataset of animals receiving some eye treatments. There are 8
  treatments; each animal's right and left eyes were measured with some
  scores (ranging from 0 to 7) 4 times after treatment. So the grouping is
  eyes nested within animals. Dataset attached.
 
  dat <- read.table("dat.txt", sep='\t', header=TRUE, row.names=1)
  dat$id <- factor(dat$id)
  str(dat)
  'data.frame':   640 obs. of  5 variables:
   $ score: int  0 2 0 7 4 7 0 2 0 7 ...
   $ id   : Factor w/ 80 levels "1","3","6","10",..: 7 48 66 54 18 26 38 52 39 63 ...
   $ rep  : int  1 1 1 1 1 1 1 1 1 1 ...
   $ eye  : Factor w/ 2 levels "L","R": 2 2 2 2 2 2 2 2 2 2 ...
   $ trt  : Factor w/ 8 levels "A","B","C","Control",..: 1 1 1 1 1 1 1 1 1 1 ...
 
  I fit a mixed model using both lmer() from the lme4 package and lme() from
  the nlme package:
 
  lmer(score~trt+(1|id/eye),dat)
 
  Linear mixed model fit by REML
  Formula: score ~ trt + (1 | id/eye)
     Data: dat
     AIC   BIC logLik deviance REMLdev
   446.7 495.8 -212.4    430.9   424.7
  Random effects:
   Groups   Name        Variance   Std.Dev.
   eye:id   (Intercept) 6.9208e+00 2.630742315798
   id       (Intercept) 1.4471e-16 0.000000012030
   Residual             1.8750e-02 0.136930641909
  Number of obs: 640, groups: eye:id, 160; id, 80
 
  summary(lme(score~trt, random=(~1|id/eye), dat))
 
  Linear mixed-effects model fit by REML
   Data: dat
         AIC      BIC    logLik
    425.1569 474.0947 -201.5785

  Random effects:
   Formula: ~1 | id
          (Intercept)
  StdDev:    1.873576

   Formula: ~1 | eye %in% id
          (Intercept)  Residual
  StdDev:    1.896126 0.1369306
 
  As you can see, the variance component estimates for the random effects
  are quite different between the two model fits. From the data, I know that
  the variance component for id can't be near 0, which is what the lmer()
  fit produced, so I think the lme() fit is correct while the lmer() fit is
  off. This can also be seen from AIC, BIC etc.: the lme() fit has better
  values than the lmer() fit.

  I guess this might be because lmer() didn't converge very well; is there
  any way to adjust lmer() so it converges better and gives results similar
  to lme()?

 That's your guess... I'd be more careful about jumping to conclusions.
 If this is a balanced data set, perhaps you could supply the result of

 summary(aov(score~trt+Error(id/eye), data=dat))