Re: [R] nested mixed-effect model: variance components

2006-06-19 Thread Christoph Buser
Dear John

I've put your mail back to the R list since I have no
explanation for the lmer result. Maybe someone else has an
idea. I adapted it to show some else that I do not understand. 

## Creation of the data (habitat is nested in lagoon):
set.seed(1)
dat - data.frame(y = rnorm(100), lagoon = factor(rep(1:4,each = 25)),
  habitat = factor(rep(1:20, each = 5)))

## I do not understand how the random intercepts for lagoon and
## lagoon:habitat can both be estimated. It seems a little bit
## strange to me that they are identical (0.46565).
library(lme4)
summary(reg3 - lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat))

## Furthermore I do not understand why the standard errors for
## the fixed effect of habitat are 1.131 for some habitats
## and 1.487 for the others???


## If one removes (1|lagoon), the variance component
## (1|lagoon:habitat) does not change its value (still 0.46565)???
summary(reg3a - lmer(y~habitat+(1|lagoon:habitat), data = dat))

## Now all standard errors for the fixed factor habitat are 1.131.

## Altogether it seems a little bit strange to me and with the
## warnings and errors of the lme and aov call, I'd be carefull
## by using the output of lmer in that case. In addition I do
## not understand the interpretation of the random effect lagoon
## in top of the nested FIXED factor habitat.

summary(aov(y~habitat + Error(lagoon/habitat), data = dat))

detach(package:Matrix)
detach(package:lme4)
library(nlme)
summary(reg2 - lme(y~habitat, random = ~1|lagoon/habitat, data = dat))
anova(reg2)


Best regards,

Christoph Buser

--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C13
ETH Zurich  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--



 Christoph,
 
 I am sending this off list bacause I tried 'lmer'and
 it seems to work with your contrived data,but I don't know why.
 Can you explain it ?
 
 
 John
 
 
 
 
  detach(package:nlme)
  library(Matrix)
 
 summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat))
 
 Linear mixed-effects model fit by REML
 Formula: y ~ habitat + (1 | lagoon) + (1 | lagoon:habitat)
   Data: dat
   AIC  BIClogLik MLdeviance REMLdeviance
  292.3627 349.6764 -124.1814   280.0245 248.3627
 Random effects:
  Groups NameVariance Std.Dev.
  lagoon:habitat (Intercept) 0.46565  0.68239
  lagoon (Intercept) 0.46565  0.68239
  Residual   0.87310  0.93440
 number of obs: 100, groups: lagoon:habitat, 20; lagoon, 4
 
 Fixed effects:
   Estimate Std. Error  t value
 (Intercept)  0.1292699  1.0516317  0.12292
 habitat2 0.0058658  1.1316138  0.00518
 habitat3-0.0911469  1.1316138 -0.08055
 habitat4 0.3302971  1.1316138  0.29188
 habitat5-0.0480394  1.1316138 -0.04245
 habitat6-0.4778469  1.4872319 -0.32130
 habitat7-0.0867301  1.4872319 -0.05832
 habitat8 0.0696507  1.4872319  0.04683
 habitat9-0.0998728  1.4872319 -0.06715
 habitat100.1096064  1.4872319  0.07370
 habitat11   -0.0430979  1.4872319 -0.02898
 habitat120.0714719  1.4872319  0.04806
 habitat130.3380993  1.4872319  0.22733
 habitat140.3057808  1.4872319  0.20560
 habitat15   -0.4915582  1.4872319 -0.33052
 habitat16   -0.2624539  1.4872319 -0.17647
 habitat17   -0.2203461  1.4872319 -0.14816
 habitat180.2165269  1.4872319  0.14559
 habitat190.6932896  1.4872319  0.46616
 habitat20   -0.7271468  1.4872319 -0.48893
  anova(summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data = dat)))
 
 Analysis of Variance Table
 Df  Sum Sq Mean Sq
 habitat 19 2.65706 0.13985
 
  VarCorr(summary(summary(lmer(y~habitat+(1|lagoon)+(1|lagoon:habitat), data
 = dat)))
 + )
 $`lagoon:habitat`
 1 x 1 Matrix of class dpoMatrix
 (Intercept)
 (Intercept)   0.4656545
 
 $lagoon
 1 x 1 Matrix of class dpoMatrix
 (Intercept)
 (Intercept)   0.4656545
 
 attr(,sc)
 [1] 0.9343993
 
 Christoph Buser wrote ---
 
 Dear Eric
 
 Do you really have habitats nested within lagoons or are they
 partially crossed (meaning that you have the same habitats in
 different lagoons)?
 
 If you have them perfectly nested, I think that you cannot
 calculate both a fixed effect for habitats and a random effect
 for lagoon (see the example below, lme and aov).
 
 You can compare e.g. two lagoons by defining a contrast of the
 habitats of one lagoon against the habitats of the other (if you
 think that this is a meaningful test to interpret), but you
 cannot estimate a random effect lagoon in presence of a nested
 FIXED effect habitat.
 
 aov() will not return you the test and warn you about the
 singular model.
 
 lme() will estimate a variance component for lagoon, but does
 not provide you a test for the fixed factor.
 
 Regards,
 
 Christoph Buser
 
 
 
 set.seed(1)
 

Re: [R] nested mixed-effect model: variance components

2006-06-12 Thread Christoph Buser
Dear Eric

Do you really have habitats nested within lagoons or are they
partially crossed (meaning that you have the same habitats in
different lagoons)?

If you have them perfectly nested, I think that you cannot
calculate both a fixed effect for habitats and a random effect
for lagoon (see the example below, lme and aov).

You can compare e.g. two lagoons by defining a contrast of the
habitats of one lagoon against the habitats of the other (if you
think that this is a meaningful test to interpret), but you
cannot estimate a random effect lagoon in presence of a nested
FIXED effect habitat. 

aov() will not return you the test and warn you about the
singular model. 

lme() will estimate a variance component for lagoon, but does
not provide you a test for the fixed factor.

Regards,

Christoph Buser



set.seed(1)
dat - data.frame(y = rnorm(100), lagoon = factor(rep(1:4,each = 25)),
  habitat = factor(rep(1:20, each = 5)))

summary(aov(y~habitat + Error(lagoon/habitat), data = dat))

library(nlme)
summary(lme(y~habitat, random = ~1|lagoon/habitat, data = dat))


--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C13
ETH Zurich  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--



Eric Pante writes:
  Dear listers,
  
  I am trying to assess variance components for a nested, mixed-effects 
  model. I think I got an answer that make sense from R, but I have a 
  warning message and I wanted to check that what I am looking at is 
  actually what I need:
  
  my data are organized as transects within stations, stations within 
  habitats, habitats within lagoons.
  lagoons: random, habitats: fixed
  the question is: how much variation is due to lagoons? habitats? 
  lagoons*habitat? transects?
  
  Here is my code:
  
  res - aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT), 
  data=cov)
  summary(res)
  
  and I get Sum Sq for each to calculate variance components:
  
  Error: STRATE
  Df Sum Sq Mean Sq
  STRATE  5 4493.1   898.6
  
  Error: ATOLL
 Df Sum Sq Mean Sq F value Pr(F)
  Residuals  5 3340.5   668.1
  
  Error: STRATE:ATOLL
 Df  Sum Sq Mean Sq F value Pr(F)
  Residuals 18 2442.71  135.71
  
  Error: Within
  Df Sum Sq Mean Sq F value Pr(F)
  Residuals 145 6422.044.3
  
  My error message seems to come from the LAGOON/HABITAT, the Error is 
  computed.
  Warning message: Error() model is singular in: aov(COVER ~ HABITAT + 
  Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov),
  
  THANKS !!!
  eric
  
  
  
  Eric Pante
  
  College of Charleston, Grice Marine Laboratory
  205 Fort Johnson Road, Charleston SC 29412
  Phone: 843-953-9190 (lab)  -9200 (main office)
  
  
   On ne force pas la curiosite, on l'eveille ...
   Daniel Pennac
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] nested mixed-effect model: variance components

2006-06-10 Thread Spencer Graves
  I have seen no reply to this, so I will offer a couple of comments in 
spite of the fact that I know very little about aov other than it is 
old and has largely been superceded by lme in the nlme package. 
I've replied to many posts on random and mixed effects over the past few 
years, and I only remember one case where 'aov' returned an answer that 
could not have been obtained easily from 'lme'.  That one application 
involved estimating a saturated model from perfectly balanced 
experimental data.  'lme' refused to return an answer, at least as I 
specified the model, and 'aov' returned numbers from which anyone 
familiar with the theory for that special case could compute the desired 
F ratios and p-values.  In all other cases that I remember, 'aov' would 
either return the same answer, or much more commonly, a substantively 
inferior answer -- if it were feasible to use 'aov' at all.

  I mention this, because the simplest way I can think of to check your 
answer is to suggest you try to work it using 'lme'.  To learn how to do 
this, I strongly encourage you to consult Pinheiro and Bates (2000) 
Mixed-Effects Models in S and S-PLUS (Springer).  Bates has been for 
many years one of the leading contributors in this area and is the 
primary author of the 'nlme', 'lme4' and 'Matrix packages to support 
working with these kinds of models.  This book discusses both how to fit 
these kinds of models as well as how to produce plots of various kinds 
that can be very helpful for explaining the experimental results to 
others as well as diagnosing potential problem.  Moreover, the R scripts 
discussed in the book are available in files called ch01.R, ch02.R, 
..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the 
directory in which R is installed on your hard drive.  This makes it 
much easier to work through the examples in the book one line at a time, 
experimenting at with modifications.  In addition, there is one point I 
know where the R syntax differs from that in the book:  S-Plus will 
accept x^2 in a formula as meaning the square of a numeric variable;  R 
will not.  To specify a square in R, you need something like I(x^2). 
When I copied the commands out of the book, I had serious trouble 
getting answers like those in the book until I identified and corrected 
this difference in syntax.  If you use the script files, they provide 
the R syntax.

  I'm not certain, but I believe the following should estimate the 
model you wrote below:

  fit - lme(fixed=COVER ~ HABITAT,
random = ~1|LAGOON/HABITAT,
data=cov).

  For comparison, I refer you to Pinheiro and Bates, p. 47, where I 
find the following:

fm1Oats - lme( yield ~ ordered(nitro) * Variety, data = Oats,
   random = ~ 1 | Block/Variety )

  There are 3 Varieties and 6 Blocks in this Oats data.frame.  The 
fixed effect of Variety has therefore 2 degrees of freedom.  However, 
the random effect of Variety occurs in 18 levels, being all the 6*3 
Block:Variety combinations.  You can see this by examining 'str(fm1Oats)'.

  If you want to know how much variation is due to lagoons? habitats? 
lagoons*habitat? transects?, this model will NOT tell you that, and I 
don't know how to answer that question using 'lme'.  I was able to 
answer a question like that using 'lmer' associated with the 'lme4' and 
'Matrix' packages.  Unfortunately, these packages have some names 
conflicts with 'nlme', and the simplest way I know to change from one to 
the other is to q() and restart R  Before I did, however, I made a 
local copy of the Oats data.frame.  After I did that, I seemed to get 
sensible numbers from the following:

library(lme4)

fm1Oats4 - lmer(yield~ ordered(nitro) * Variety
 +(1|Block)+(1|Variety)+(1|Block:Variety),
  data=Oats)

  For both lme and lmer, the default method is REML (restricted 
maximum likelihood).  This is what you want for estimation.  For testing 
random effects, you still want REML, but you should adjust the degrees 
of freedom of the reference F distribution as discussed in section 2.4 
of Pinheiro and Bates;  this also applies to confidence intervals for 
the random effects.  For testing fixed effects, you should use method = 
'ML'.

  lme4 is newer than nlme and does not currently have available the 
complete set of helper functions for plotting, etc.  Thus, you will 
likely want to use both.  For documentation on 'lmer', you should still 
start with Pinheiro and Bates for the general theory, then refer to the 
vignettes associated with the mlmRev and lme4 packages;  if you 
don't know about vignettes RSiteSearch(graves vignette) will lead you 
to http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html; and 
other replies to r-help where I've described how to use them.  You will 
also want the relevant article by Doug Bates in R News.  To find that, 
go www.r-project.org - 

Re: [R] nested mixed-effect model: variance components

2006-06-10 Thread Spencer Graves
see inline

Eric Pante wrote:
  Hi Spencer,
 
  First, thank you very much for taking the time to write this detailed
  reply !
  I did try exactly the formula you suggested:
 
fit - lme(fixed=COVER ~ HABITAT, random = ~1|LAGOON/HABITAT,
  data=cov)
 
  before writing my post, and indeed, neither summary() or anova()
  returned the sums of squares to assess variance components.

  Do you want sums of squares or estimates of variance components?  I 
believe that for over half a century, there is been a substantial 
consensus among leading experts in statistical methods that maximum 
likelihood (or Bayesian posterior where an adequate prior exists) is 
theoretically the best method for most statistical questions.   With 
mixed effects, this is interpreted as requiring the maximization of the 
marginal likelihood (often after integrating out fixed effects to obtain 
the restricted likelihood maximized with method = REML).

  However, until the last couple of decades, ML or REML was not 
computationally feasible for many cases lacking balance.  This led to an 
extensive literature on alternative methods like MINQUE.  These methods 
are now substantially obsolete as far as I know except in cases with 
appropriate balance where they produce exactly the same answers as ML or 
REML.

  If you want estimates of the variance components, then use 'lme': 
With appropriately balanced data sets, these would be exactly what you 
would get by writing out the expected mean squares and solving for the 
variance components.  Where the balance is lacking, the other methods 
will generally produce less efficient estimates of what you really want. 
  If you want sums of squares for testing, do your testing as described 
in Pinheiro and Bates.  Except in the saturated case I mentioned, you 
should get identical or superior estimates from lme and lmer than from 
aov.

  If you want sums of squares for something other than an intermediate 
step for estimation or testing, please explain.

  Be careful what you ask for, because you might get it -- and it might 
not be what you want.

  how do you specify in the formula that you want a nested approach (I
  will check the Pinheiro and Bates book) ? For these two reasons, I had
  the feeling that aov might be the way to go ...

  I understood you to say that you thought habitat should be nested 
within lagoon, but you also want to know how much variation is due to 
lagoons? habitats? lagoons*habitat?.  That sounds to me like you want 
to know how to clean the spark plugs on your bicycle.

  Have you studied the Oats example in my post?  (See also 
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63788.html;.)

  Bon Chance!
  J’espère que ceci vous aide.
  Spencer Graves
 
  I will try your suggestions concerning the lme4 package.
 
  Thanks again !
  eric pante
 
  Eric Pante

Spencer Graves wrote:
   I have seen no reply to this, so I will offer a couple of comments 
 in spite of the fact that I know very little about aov other than it 
 is old and has largely been superceded by lme in the nlme package. 
 I've replied to many posts on random and mixed effects over the past few 
 years, and I only remember one case where 'aov' returned an answer that 
 could not have been obtained easily from 'lme'.  That one application 
 involved estimating a saturated model from perfectly balanced 
 experimental data.  'lme' refused to return an answer, at least as I 
 specified the model, and 'aov' returned numbers from which anyone 
 familiar with the theory for that special case could compute the desired 
 F ratios and p-values.  In all other cases that I remember, 'aov' would 
 either return the same answer, or much more commonly, a substantively 
 inferior answer -- if it were feasible to use 'aov' at all.
 
   I mention this, because the simplest way I can think of to check 
 your answer is to suggest you try to work it using 'lme'.  To learn how 
 to do this, I strongly encourage you to consult Pinheiro and Bates 
 (2000) Mixed-Effects Models in S and S-PLUS (Springer).  Bates has been 
 for many years one of the leading contributors in this area and is the 
 primary author of the 'nlme', 'lme4' and 'Matrix packages to support 
 working with these kinds of models.  This book discusses both how to fit 
 these kinds of models as well as how to produce plots of various kinds 
 that can be very helpful for explaining the experimental results to 
 others as well as diagnosing potential problem.  Moreover, the R scripts 
 discussed in the book are available in files called ch01.R, ch02.R, 
 ..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the 
 directory in which R is installed on your hard drive.  This makes it 
 much easier to work through the examples in the book one line at a time, 
 experimenting at with modifications.  In addition, there is one point I 
 know where the R syntax