Peter's example and Doug's "different test" reply sent me to Scheffé's discussion of the balanced, replicated, mixed-effects two-way layout. As I note below, the obvious F test for the fixed effect does not appear to be a likelihood ratio test for anything.
Douglas Bates wrote:
> On 9/7/06, Douglas Bates <[EMAIL PROTECTED]> wrote:
>> On 07 Sep 2006 17:20:29 +0200, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
>>> Martin Maechler <[EMAIL PROTECTED]> writes:
>>>
>>>>>>>>> "DB" == Douglas Bates <[EMAIL PROTECTED]>
>>>>>>>>>     on Thu, 7 Sep 2006 07:59:58 -0500 writes:
>>>>
>>>> DB> Thanks for your summary, Hank.
>>>> DB> On 9/7/06, Martin Henry H. Stevens <[EMAIL PROTECTED]> wrote:
>>>> >> Dear lmer-ers,
>>>> >> My thanks to all of you who are sharing your trials and
>>>> >> tribulations publicly.
>>>>
>>>> >> I was hoping to elicit some feedback on my thoughts on denominator
>>>> >> degrees of freedom for F ratios in mixed models. These thoughts and
>>>> >> practices result from my reading of previous postings by Doug Bates
>>>> >> and others.
>>>>
>>>> >> - I start by assuming that the appropriate denominator degrees of
>>>> >> freedom lie between n - p and n - q, where n = number of
>>>> >> observations, p = number of fixed effects (rank of model matrix X),
>>>> >> and q = rank of Z:X.
>>>>
>>>> DB> I agree with this but the opinion is by no means universal.
>>>> DB> Initially I misread the statement because I usually write the
>>>> DB> number of columns of Z as q.
>>>>
>>>> DB> It is not easy to assess the rank of Z:X numerically. In many
>>>> DB> cases one can reason what it should be from the form of the model,
>>>> DB> but a general procedure to assess the rank of a matrix, especially
>>>> DB> a sparse matrix, is difficult.
>>>>
>>>> DB> An alternative which can be easily calculated is n - t, where t is
>>>> DB> the trace of the 'hat matrix'. The function 'hatTrace' applied to
>>>> DB> a fitted lmer model evaluates this trace (conditional on the
>>>> DB> estimates of the relative variances of the random effects).
>>>>
>>>> >> - I then conclude that good estimates of P values on the F ratios
>>>> >> lie between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF,
>>>> >> n-q).
>>>> >> -- I further surmise that the latter of these (1 - pf(F.ratio,
>>>> >> numDF, n-q)) is the more conservative estimate.
>>>>
>>>> This assumes that the true distribution (under H0) of that "F ratio"
>>>> *is* F_{n1,n2} for some (possibly non-integer) n1 and n2. But AFAIU,
>>>> this is only approximately true at best, and AFAIU, the quality of
>>>> this approximation has only been investigated empirically for some
>>>> situations. Hence, even your conservative estimate of the P value
>>>> could be wrong (I mean "wrong on the wrong side" instead of just
>>>> "conservatively wrong"). Consequently, such a P-value is only
>>>> ``approximately conservative'' ... I agree, however, that in some
>>>> situations it might be a very useful "descriptive statistic" about
>>>> the fitted model.
>>>
>>> I'm very wary of ANY attempt at guesswork in these matters.
>>>
>>> I may be understanding the post wrongly, but consider this case:
>>> Y_ij = mu + z_i + eps_ij, i = 1..3, j = 1..100
>>>
>>> I get rank(X) = 1, rank(X:Z) = 3, n = 300.
>>>
>>> It is well known that the test for mu = 0 in this case is obtained by
>>> reducing the data to group means, xbar_i, and then doing a one-sample
>>> t test, the square of which is F(1, 2), but it seems to be suggested
>>> that F(1, 297) is a conservative test???!
>>
>> It's a different test, isn't it? Your test is based upon the
>> between-group sum of squares with 2 df. I am proposing to use the
>> within-group sum of squares or its generalization.
>
> On closer examination I see that you are indeed correct.
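(Peter's case is easy to check by simulation. The sketch below is my
illustration, not part of the quoted thread; the values of mu and the
variances are arbitrary.)

## Y_ij = mu + z_i + eps_ij, i = 1..3 groups, j = 1..100 per group
set.seed(1)
mu  <- 1
z   <- rnorm(3, sd = 2)                    # random group effects
y   <- mu + rep(z, each = 100) + rnorm(300)
grp <- rep(1:3, each = 100)
## the classical test of mu = 0: a one-sample t test on the three
## group means, whose square is F(1, 2)
(tt <- t.test(tapply(y, grp, mean)))
## referring t^2 to F(1, 297), as df = n - rank[Z:X] would suggest,
## typically gives a much smaller tail area than F(1, 2) does
pf(tt$statistic^2, 1, 2,   lower.tail = FALSE)
pf(tt$statistic^2, 1, 297, lower.tail = FALSE)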
> I have heard that "well-known" result many times and finally sat down
> to prove it to myself. For a balanced design the standard error of the
> intercept using the REML estimates is the same as the standard error
> of the mean calculated from the group means.
>
>> data(Rail, package = 'nlme')
>> library(lme4)
>> summary(fm1 <- lmer(travel ~ 1 + (1|Rail), Rail))
> Linear mixed-effects model fit by REML
> Formula: travel ~ 1 + (1 | Rail)
>    Data: Rail
>    AIC   BIC logLik MLdeviance REMLdeviance
>  126.2 128.0 -61.09      128.6        122.2
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  Rail     (Intercept) 615.286  24.8050
>  Residual              16.167   4.0208
> number of obs: 18, groups: Rail, 6
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)    66.50      10.17   6.538
>
>> mns <- with(Rail, tapply(travel, Rail, mean)) # group means
>> sd(mns)/sqrt(length(mns)) # standard error matches that from lmer
> [1] 10.17104
>
>> t.test(mns)
>
>         One Sample t-test
>
> data:  mns
> t = 6.5382, df = 5, p-value = 0.001253
> alternative hypothesis: true mean is not equal to 0
> 95 percent confidence interval:
>  40.35452 92.64548
> sample estimates:
> mean of x
>      66.5
>
>> ctab <- summary(fm1)@coefs # coefficient table
>> ctab[,1] + c(-1,1) * qt(0.975, 15) * ctab[,2] # 95% conf. int.
> [1] 44.82139 88.17861
>> ## interval using df = # of obs - rank of [Z:X] is too narrow
>
> So my proposal of using either the trace of the hat matrix or the rank
> of the combined model matrices as the degrees of freedom for the model
> is not conservative.
>
> However, look at the following:
>
>> set.seed(123454321) # for reproducibility
>> sm1 <- mcmcsamp(fm1, 50000)
>> library(coda)
>> HPDinterval(sm1)
>                     lower      upper
> (Intercept)     40.470663  92.608514
> log(sigma^2)     2.060179   3.716326
> log(Rail.(In))   5.371858   8.056897
> deviance       128.567329 137.487455
> attr(,"Probability")
> [1] 0.95
>
> The HPD interval calculated from an MCMC sample reproduces the
> interval from the group means almost exactly. This makes sense in that
> the MCMC sample takes into account the variation in the estimates of
> the variance components, just as defining intervals based on the
> Student's t does.
>
> So for this case, where the distribution of the estimate of the mean
> has a known distribution, the correct degrees of freedom and the MCMC
> sample produce similar answers. This gives me more confidence in the
> results from the MCMC sample in general cases.
>
> The problem I have with trying to work out what the degrees of freedom
> "should be" is that the rules seem rather arbitrary. For example, the
> "between-within" rule used in SAS PROC MIXED is popular (many accept
> it as the "correct" answer), but it assumes that the degrees of
> freedom associated with a random effect grouped by a factor with k
> levels is always k - 1. This value is used even when there is a random
> intercept and a random slope for each group. In fact you could have an
> arbitrary number of random effects for each level of the grouping
> factor and it would still apparently only cost you k - 1 degrees of
> freedom. That doesn't make sense to me.
>
> Anyway, I thank you for pointing out the errors of my ways, Peter.
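(A quick check on Doug's Rail example, my addition, not part of his
message: plugging df = number of rails - 1 = 5 into the same interval
formula should reproduce the group-means interval exactly, since
t.test(mns) computes mean(mns) +/- qt(0.975, 5) times the same standard
error shown above.)

ctab <- summary(fm1)@coefs                    # as in Doug's code above
ctab[,1] + c(-1,1) * qt(0.975, 5) * ctab[,2]  # df = 6 - 1 = 5
## [1] 40.35452 92.64548, matching the t.test interval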
For the traditional, balanced, replicated, two-way mixed-effects
analysis, Scheffé (1959, Table 8.1.1, p. 269) gives the expected mean
squares for a two-way layout with I levels of a fixed effect A, J
levels of a random effect B, and K replicates, as follows:

  EMS(A; fixed)    = var(e) + K*var(A:B) + J*K*MS.A
  EMS(B; random)   = var(e) + I*K*var(B)
  EMS(A:B; random) = var(e) + K*var(A:B)
  EMS(Error)       = var(e)

where MS.A = sum(alpha_i^2)/(I-1) is the mean square of the fixed A
effects. In this case, the "obvious" test for A is
MS(A; fixed) / MS(A:B; random), because this gives us a standard F
statistic to test MS.A = 0. However, it doesn't make sense to me to
test A without simultaneously assuming var(A:B) = 0. The same argument
applies to Peter's "simpler" case discussed above: with
Y_ij = mu + z_i + eps_ij, it only rarely makes sense to test mu = 0
while assuming var(z) != 0.

In the balanced two-way mixed-effects analysis, the Neyman-Pearson
thing to do, I would think, would be to test MS.A = 0 simultaneously
with var(A:B) = 0. In lmer, I might write this as follows:

  anova(lmer(y ~ A + (A|B)), lmer(y ~ 1 + (1|B)))

However, this does NOT match the standard analysis associated with this
design, does it?

To check this, I considered problem 8.1 in Scheffé (p. 289), which
compares 3 different nozzles (fixed effect) tested by 5 different
operators (random effect). The data are as follows:

y <- c(6,6,-15, 26,12,5, 11,4,4, 21,14,7, 25,18,25,
       13,6,13, 4,4,11, 17,10,17, -5,2,-5, 15,8,1,
       10,10,-11, -35,0,-14, 11,-10,-17, 12,-2,-16, -4,10,24)
Nozzle <- data.frame(Nozzle=rep(LETTERS[1:3], e=15),
                     Operator=rep(letters[1:5], e=3), flowRate=y)

The traditional analysis can be obtained from
anova(lm(flowRate ~ Nozzle*Operator, ...)), but comparing
MeanSq(Nozzle) to MeanSq(Nozzle:Operator) rather than to
MeanSq(Residuals), as follows:

> fitAB0 <- lm(flowRate ~ Nozzle*Operator, data=Nozzle)
> (aov.AB0 <- anova(fitAB0))
Analysis of Variance Table

Response: flowRate
                Df  Sum Sq Mean Sq F value   Pr(>F)
Nozzle           2 1426.98  713.49  7.0456 0.003101 **
Operator         4  798.80  199.70  1.9720 0.124304
Nozzle:Operator  8 1821.47  227.68  2.2484 0.051640 .
Residuals       30 3038.00  101.27
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Scheffé must have computed the following F statistic, which here has
2 and 8 degrees of freedom (the Nozzle and Nozzle:Operator df above):

> (F.A.Scheffe <- aov.AB0[1, "Mean Sq"]/aov.AB0[3, "Mean Sq"])
[1] 3.133690
> pf(F.A.Scheffe, 2, 8, lower.tail=FALSE)
[1] 0.0988514

However, I think I prefer the likelihood ratio answer to this, because
I think it is better to have an approximate solution to the exact
problem than an exact solution to the approximate problem. [I got this
from someone, possibly Tukey, but I don't have a citation.]

I can get this likelihood ratio answer from either lme or lmer. When I
tried to fit this model with lme, it didn't want to converge:

library(nlme)
fitAB. <- lme(flowRate ~ Nozzle, random=~Nozzle|Operator,
              data=Nozzle, method="ML")
Error in lme.formula(flowRate ~ Nozzle, random = ~Nozzle | Operator,
  data = Nozzle, : nlminb problem, convergence error code = 1;
  message = iteration limit reached without convergence (9)

After several false starts, I got the following to work:

fitAB. <- lme(flowRate ~ Nozzle, random=~Nozzle|Operator, data=Nozzle,
              method="ML", control=lmeControl(opt="optim"))
fitB.  <- lme(flowRate ~ 1, random=~1|Operator, data=Nozzle,
              method="ML")

> anova(fitAB., fitB.)
       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
fitAB.     1 10 361.9022 379.9688 -170.9511
fitB.      2  3 361.3637 366.7837 -177.6819 1 vs 2 13.46153  0.0616
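(Another check, my addition: the p-value printed by anova() is just the
chi-square tail area for the 10 - 3 = 7 parameter difference, applied
to the printed likelihood ratio.)

pchisq(13.46153, df = 7, lower.tail = FALSE)  # approximately 0.0616,
                                              # as printed above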
I got essentially the same answer from lmer (without the convergence
problem, though I quit and restarted R in between):

> fitAB <- lmer(flowRate ~ Nozzle + (Nozzle|Operator),
+              data=Nozzle, method="ML")
> fitB <- lmer(flowRate ~ 1 + (1|Operator), data=Nozzle,
+             method="ML")
> anova(fitAB, fitB)
Data: Nozzle
Models:
fitB: flowRate ~ 1 + (1 | Operator)
fitAB: flowRate ~ Nozzle + (Nozzle | Operator)
      Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fitB   2 359.36 362.98 -177.68
fitAB  9 359.88 376.14 -170.94 13.479      7    0.06126 .

Comments?
Spencer Graves

p.s. For the lme fit:
> sessionInfo()
Version 2.3.1 Patched (2006-08-13 r38872)
i386-pc-mingw32

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

other attached packages:
   nlme
"3.1-75"