Re: [R] nested mixed-effect model: variance components
Dear John I've put your mail back to the R list since I have no explanation for the lmer result. Maybe someone else has an idea. I adapted it to show some else that I do not understand. ## Creation of the data (habitat is nested in lagoon): set.seed(1) dat - data.frame(y = rnorm(100), lagoon = factor(rep(1:4,each = 25)), habitat = factor(rep(1:20, each = 5))) ## I do not understand how the random intercepts for lagoon and ## lagoon:habitat can both be estimated. It seems a little bit ## strange to me that they are identical (0.46565). library(lme4) summary(reg3 - lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat)) ## Furthermore I do not understand why the standard errors for ## the fixed effect of habitat are 1.131 for some habitats ## and 1.487 for the others??? ## If one removes (1|lagoon), the variance component ## (1|lagoon:habitat) does not change its value (still 0.46565)??? summary(reg3a - lmer(y~habitat+(1|lagoon:habitat), data = dat)) ## Now all standard errors for the fixed factor habitat are 1.131. ## Altogether it seems a little bit strange to me and with the ## warnings and errors of the lme and aov call, I'd be carefull ## by using the output of lmer in that case. In addition I do ## not understand the interpretation of the random effect lagoon ## in top of the nested FIXED factor habitat. summary(aov(y~habitat + Error(lagoon/habitat), data = dat)) detach(package:Matrix) detach(package:lme4) library(nlme) summary(reg2 - lme(y~habitat, random = ~1|lagoon/habitat, data = dat)) anova(reg2) Best regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH Zurich 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Christoph, I am sending this off list bacause I tried 'lmer'and it seems to work with your contrived data,but I don't know why. Can you explain it ? John detach(package:nlme) library(Matrix) summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat)) Linear mixed-effects model fit by REML Formula: y ~ habitat + (1 | lagoon) + (1 | lagoon:habitat) Data: dat AIC BIClogLik MLdeviance REMLdeviance 292.3627 349.6764 -124.1814 280.0245 248.3627 Random effects: Groups NameVariance Std.Dev. lagoon:habitat (Intercept) 0.46565 0.68239 lagoon (Intercept) 0.46565 0.68239 Residual 0.87310 0.93440 number of obs: 100, groups: lagoon:habitat, 20; lagoon, 4 Fixed effects: Estimate Std. Error t value (Intercept) 0.1292699 1.0516317 0.12292 habitat2 0.0058658 1.1316138 0.00518 habitat3-0.0911469 1.1316138 -0.08055 habitat4 0.3302971 1.1316138 0.29188 habitat5-0.0480394 1.1316138 -0.04245 habitat6-0.4778469 1.4872319 -0.32130 habitat7-0.0867301 1.4872319 -0.05832 habitat8 0.0696507 1.4872319 0.04683 habitat9-0.0998728 1.4872319 -0.06715 habitat100.1096064 1.4872319 0.07370 habitat11 -0.0430979 1.4872319 -0.02898 habitat120.0714719 1.4872319 0.04806 habitat130.3380993 1.4872319 0.22733 habitat140.3057808 1.4872319 0.20560 habitat15 -0.4915582 1.4872319 -0.33052 habitat16 -0.2624539 1.4872319 -0.17647 habitat17 -0.2203461 1.4872319 -0.14816 habitat180.2165269 1.4872319 0.14559 habitat190.6932896 1.4872319 0.46616 habitat20 -0.7271468 1.4872319 -0.48893 anova(summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat))) Analysis of Variance Table Df Sum Sq Mean Sq habitat 19 2.65706 0.13985 VarCorr(summary(summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat))) + ) $`lagoon:habitat` 1 x 1 Matrix of class dpoMatrix (Intercept) (Intercept) 0.4656545 $lagoon 1 x 1 Matrix of class dpoMatrix (Intercept) (Intercept) 0.4656545 attr(,sc) [1] 0.9343993 Christoph Buser wrote --- Dear Eric Do you really have habitats nested within lagoons or are they partially crossed (meaning that you have the same habitats in different lagoons)? If you have them perfectly nested, I think that you cannot calculate both a fixed effect for habitats and a random effect for lagoon (see the example below, lme and aov). You can compare e.g. two lagoons by defining a contrast of the habitats of one lagoon against the habitats of the other (if you think that this is a meaningful test to interpret), but you cannot estimate a random effect lagoon in presence of a nested FIXED effect habitat. aov() will not return you the test and warn you about the singular model. lme() will estimate a variance component for lagoon, but does not provide you a test for the fixed factor. Regards, Christoph Buser set.seed(1)
Re: [R] nested mixed-effect model: variance components
Dear Eric Do you really have habitats nested within lagoons or are they partially crossed (meaning that you have the same habitats in different lagoons)? If you have them perfectly nested, I think that you cannot calculate both a fixed effect for habitats and a random effect for lagoon (see the example below, lme and aov). You can compare e.g. two lagoons by defining a contrast of the habitats of one lagoon against the habitats of the other (if you think that this is a meaningful test to interpret), but you cannot estimate a random effect lagoon in presence of a nested FIXED effect habitat. aov() will not return you the test and warn you about the singular model. lme() will estimate a variance component for lagoon, but does not provide you a test for the fixed factor. Regards, Christoph Buser set.seed(1) dat - data.frame(y = rnorm(100), lagoon = factor(rep(1:4,each = 25)), habitat = factor(rep(1:20, each = 5))) summary(aov(y~habitat + Error(lagoon/habitat), data = dat)) library(nlme) summary(lme(y~habitat, random = ~1|lagoon/habitat, data = dat)) -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH Zurich 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Eric Pante writes: Dear listers, I am trying to assess variance components for a nested, mixed-effects model. I think I got an answer that make sense from R, but I have a warning message and I wanted to check that what I am looking at is actually what I need: my data are organized as transects within stations, stations within habitats, habitats within lagoons. lagoons: random, habitats: fixed the question is: how much variation is due to lagoons? habitats? lagoons*habitat? transects? Here is my code: res - aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov) summary(res) and I get Sum Sq for each to calculate variance components: Error: STRATE Df Sum Sq Mean Sq STRATE 5 4493.1 898.6 Error: ATOLL Df Sum Sq Mean Sq F value Pr(F) Residuals 5 3340.5 668.1 Error: STRATE:ATOLL Df Sum Sq Mean Sq F value Pr(F) Residuals 18 2442.71 135.71 Error: Within Df Sum Sq Mean Sq F value Pr(F) Residuals 145 6422.044.3 My error message seems to come from the LAGOON/HABITAT, the Error is computed. Warning message: Error() model is singular in: aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov), THANKS !!! eric Eric Pante College of Charleston, Grice Marine Laboratory 205 Fort Johnson Road, Charleston SC 29412 Phone: 843-953-9190 (lab) -9200 (main office) On ne force pas la curiosite, on l'eveille ... Daniel Pennac __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] nested mixed-effect model: variance components
I have seen no reply to this, so I will offer a couple of comments in spite of the fact that I know very little about aov other than it is old and has largely been superceded by lme in the nlme package. I've replied to many posts on random and mixed effects over the past few years, and I only remember one case where 'aov' returned an answer that could not have been obtained easily from 'lme'. That one application involved estimating a saturated model from perfectly balanced experimental data. 'lme' refused to return an answer, at least as I specified the model, and 'aov' returned numbers from which anyone familiar with the theory for that special case could compute the desired F ratios and p-values. In all other cases that I remember, 'aov' would either return the same answer, or much more commonly, a substantively inferior answer -- if it were feasible to use 'aov' at all. I mention this, because the simplest way I can think of to check your answer is to suggest you try to work it using 'lme'. To learn how to do this, I strongly encourage you to consult Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for many years one of the leading contributors in this area and is the primary author of the 'nlme', 'lme4' and 'Matrix packages to support working with these kinds of models. This book discusses both how to fit these kinds of models as well as how to produce plots of various kinds that can be very helpful for explaining the experimental results to others as well as diagnosing potential problem. Moreover, the R scripts discussed in the book are available in files called ch01.R, ch02.R, ..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the directory in which R is installed on your hard drive. This makes it much easier to work through the examples in the book one line at a time, experimenting at with modifications. In addition, there is one point I know where the R syntax differs from that in the book: S-Plus will accept x^2 in a formula as meaning the square of a numeric variable; R will not. To specify a square in R, you need something like I(x^2). When I copied the commands out of the book, I had serious trouble getting answers like those in the book until I identified and corrected this difference in syntax. If you use the script files, they provide the R syntax. I'm not certain, but I believe the following should estimate the model you wrote below: fit - lme(fixed=COVER ~ HABITAT, random = ~1|LAGOON/HABITAT, data=cov). For comparison, I refer you to Pinheiro and Bates, p. 47, where I find the following: fm1Oats - lme( yield ~ ordered(nitro) * Variety, data = Oats, random = ~ 1 | Block/Variety ) There are 3 Varieties and 6 Blocks in this Oats data.frame. The fixed effect of Variety has therefore 2 degrees of freedom. However, the random effect of Variety occurs in 18 levels, being all the 6*3 Block:Variety combinations. You can see this by examining 'str(fm1Oats)'. If you want to know how much variation is due to lagoons? habitats? lagoons*habitat? transects?, this model will NOT tell you that, and I don't know how to answer that question using 'lme'. I was able to answer a question like that using 'lmer' associated with the 'lme4' and 'Matrix' packages. Unfortunately, these packages have some names conflicts with 'nlme', and the simplest way I know to change from one to the other is to q() and restart R Before I did, however, I made a local copy of the Oats data.frame. After I did that, I seemed to get sensible numbers from the following: library(lme4) fm1Oats4 - lmer(yield~ ordered(nitro) * Variety +(1|Block)+(1|Variety)+(1|Block:Variety), data=Oats) For both lme and lmer, the default method is REML (restricted maximum likelihood). This is what you want for estimation. For testing random effects, you still want REML, but you should adjust the degrees of freedom of the reference F distribution as discussed in section 2.4 of Pinheiro and Bates; this also applies to confidence intervals for the random effects. For testing fixed effects, you should use method = 'ML'. lme4 is newer than nlme and does not currently have available the complete set of helper functions for plotting, etc. Thus, you will likely want to use both. For documentation on 'lmer', you should still start with Pinheiro and Bates for the general theory, then refer to the vignettes associated with the mlmRev and lme4 packages; if you don't know about vignettes RSiteSearch(graves vignette) will lead you to http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html; and other replies to r-help where I've described how to use them. You will also want the relevant article by Doug Bates in R News. To find that, go www.r-project.org -
Re: [R] nested mixed-effect model: variance components
see inline Eric Pante wrote: Hi Spencer, First, thank you very much for taking the time to write this detailed reply ! I did try exactly the formula you suggested: fit - lme(fixed=COVER ~ HABITAT, random = ~1|LAGOON/HABITAT, data=cov) before writing my post, and indeed, neither summary() or anova() returned the sums of squares to assess variance components. Do you want sums of squares or estimates of variance components? I believe that for over half a century, there is been a substantial consensus among leading experts in statistical methods that maximum likelihood (or Bayesian posterior where an adequate prior exists) is theoretically the best method for most statistical questions. With mixed effects, this is interpreted as requiring the maximization of the marginal likelihood (often after integrating out fixed effects to obtain the restricted likelihood maximized with method = REML). However, until the last couple of decades, ML or REML was not computationally feasible for many cases lacking balance. This led to an extensive literature on alternative methods like MINQUE. These methods are now substantially obsolete as far as I know except in cases with appropriate balance where they produce exactly the same answers as ML or REML. If you want estimates of the variance components, then use 'lme': With appropriately balanced data sets, these would be exactly what you would get by writing out the expected mean squares and solving for the variance components. Where the balance is lacking, the other methods will generally produce less efficient estimates of what you really want. If you want sums of squares for testing, do your testing as described in Pinheiro and Bates. Except in the saturated case I mentioned, you should get identical or superior estimates from lme and lmer than from aov. If you want sums of squares for something other than an intermediate step for estimation or testing, please explain. Be careful what you ask for, because you might get it -- and it might not be what you want. how do you specify in the formula that you want a nested approach (I will check the Pinheiro and Bates book) ? For these two reasons, I had the feeling that aov might be the way to go ... I understood you to say that you thought habitat should be nested within lagoon, but you also want to know how much variation is due to lagoons? habitats? lagoons*habitat?. That sounds to me like you want to know how to clean the spark plugs on your bicycle. Have you studied the Oats example in my post? (See also http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63788.html;.) Bon Chance! J’espère que ceci vous aide. Spencer Graves I will try your suggestions concerning the lme4 package. Thanks again ! eric pante Eric Pante Spencer Graves wrote: I have seen no reply to this, so I will offer a couple of comments in spite of the fact that I know very little about aov other than it is old and has largely been superceded by lme in the nlme package. I've replied to many posts on random and mixed effects over the past few years, and I only remember one case where 'aov' returned an answer that could not have been obtained easily from 'lme'. That one application involved estimating a saturated model from perfectly balanced experimental data. 'lme' refused to return an answer, at least as I specified the model, and 'aov' returned numbers from which anyone familiar with the theory for that special case could compute the desired F ratios and p-values. In all other cases that I remember, 'aov' would either return the same answer, or much more commonly, a substantively inferior answer -- if it were feasible to use 'aov' at all. I mention this, because the simplest way I can think of to check your answer is to suggest you try to work it using 'lme'. To learn how to do this, I strongly encourage you to consult Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for many years one of the leading contributors in this area and is the primary author of the 'nlme', 'lme4' and 'Matrix packages to support working with these kinds of models. This book discusses both how to fit these kinds of models as well as how to produce plots of various kinds that can be very helpful for explaining the experimental results to others as well as diagnosing potential problem. Moreover, the R scripts discussed in the book are available in files called ch01.R, ch02.R, ..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the directory in which R is installed on your hard drive. This makes it much easier to work through the examples in the book one line at a time, experimenting at with modifications. In addition, there is one point I know where the R syntax