Spencer, Thanks for the reply. I concluded the same wrt between group variation soon after posting. However, the approach I ended up with was fully parametric as opposed to the resampling approach that you use in your reply. Interestingly, the two approaches yield different P-values, I think because your approach retains overdispersion in the data (?). In any case, my parametric stab at this is below.
iter <- 10 chisq.sim <- rep(NA, iter) Zt <- slot(model1,"Zt") # see ?lmer-class n.grps <- dim(ranef(model1)[[1]])[1] sd.ran.effs <- sqrt(VarCorr(model1)[[1]][1]) X <- slot(model1,"X") # see ?lmer-class fix.effs <- matrix(rep(fixef(model1),dim(X)[1]), nrow=dim(X)[1], byrow=T) model.parms <- X*fix.effs # This gives parameters for each case # Generate predicted values pred.vals <- as.vector(apply(model.parms, 1, sum)) for(i in 1:iter) { rand.new <- as.vector(rnorm(grps,0, sd.ran.effs)) rand.vals <- as.vector(rand.new%*%Zt) # Assign random effects mu <- pred.vals+rand.vals # Expected mean resp <- rpois(length(mu), exp(mu)) sim.data <- data.frame(slot(model2,"frame"), resp) # Make data frame sim.model1 <- lmer(resp~1+(1|subject), data=sim.data, family="poisson") sim.model2 <- lmer(resp~pred+(1|subject), data=sim.data, family="poisson") chisq.sim[i] <- anova(sim.model1,sim.model2)$Chisq[[2]] } Manuel On Sun, 2006-08-20 at 11:22 -0700, Spencer Graves wrote: > You've raised a very interesting question about testing a > fixed-effect factor with more than 2 levels using Monte Carlo. Like > you, I don't know how to use 'mcmcsamp' to refine the naive > approximation. If we are lucky, someone else might comment on this for us. > > Beyond this, you are to be commended for providing such a simple, > self-contained example for such a sophisticated question. I think you > simulation misses one important point: It assumes the between-subject > variance is zero. To overcome this, I think I might try either the > bootstrap or permutation distribution scrambling the assignment of > subjects to treatment groups but otherwise keeping the pairs of > observations together. > > To this end, consider the following: > > # Build a table to translate subject into 'pred' > o <- with(epil3, order(subject, y)) > epil3. <- epil3[o,] > norep <- with(epil3., subject[-1]!=subject[-dim(epil3)[1]]) > subj1 <- which(c(T, norep)) > subj.pred <- epil3.[subj1, c("subject", "pred")] > subj. <- as.character(subj.pred$subject) > pred. <- subj.pred$pred > names(pred.) <- subj. > > iter <- 10 > chisq.sim <- rep(NA, iter) > > set.seed(1) > for(i in 1:iter){ > ## Parameteric version s.i <- sample(subj.) > # Randomize subject assignments to 'pred' groups > epil3.$pred <- pred.[s.i][epil3.$subject] > fit1 <- lmer(y ~ pred+(1 | subject), > family = poisson, data = epil3.) > fit0 <- lmer(y ~ 1+(1 | subject), > family = poisson, data = epil3.) > chisq.sim[i] <- anova(fit0, fit1)[2, "Chisq"] > } > > Hope this helps. > Spencer Graves > > Manuel Morales wrote: > > Dear list, > > > > This is more of a stats question than an R question per se. First, I > > realize there has been a lot of discussion about the problems with > > estimating P-values from F-ratios for mixed-effects models in lme4. > > Using mcmcsamp() seems like a great alternative for evaluating the > > significance of individual coefficients, but not for groups of > > coefficients as might occur in an experimental design with 3 treatment > > levels. I'm wondering if the simulation approach I use below to estimate > > the P-value for a 3-level factor is appropriate, or if there are any > > suggestions on how else to approach this problem. The model and data in > > the example are from section 10.4 of MASS. > > > > Thanks! > > Manuel > > > > # Load req. package (see functions to generate data at end of script) > > library(lme4) > > library(MASS) > > > > # Full and reduced models - pred is a factor with 3 levels > > result.full <- lmer(y~pred+(1|subject), data=epil3, family="poisson") > > result.base <- lmer(y~1+(1|subject), data=epil3, family="poisson") > > > > # Naive P-value from LR for significance of "pred" factor > > anova(result.base,result.full)$"Pr(>Chisq)"[[2]] # P-value > > (test.stat <- anova(result.base,result.full)$Chisq[[2]]) # Chisq-stat <snip> Wrong approach here</snip> > > # Script to generate data, from section 10.4 of MASS > > epil2 <- epil[epil$period == 1, ] > > epil2["period"] <- rep(0, 59); epil2["y"] <- epil2["base"] > > epil["time"] <- 1; epil2["time"] <- 4 > > epil2 <- rbind(epil, epil2) > > epil2$pred <- unclass(epil2$trt) * (epil2$period > 0) > > epil2$subject <- factor(epil2$subject) > > epil3 <- aggregate(epil2, list(epil2$subject, epil2$period > 0), > > function(x) if(is.numeric(x)) sum(x) else x[1]) > > epil3$pred <- factor(epil3$pred, labels = c("base", "placebo", "drug")) > > > > ______________________________________________ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.