On 9/12/06, Douglas Bates <[EMAIL PROTECTED]> wrote: > On 9/11/06, Manuel Morales <[EMAIL PROTECTED]> wrote: [snip] > > Am I right that the MCMC sample can not be used, however, to evaluate > > the significance of parameter groups. For example, to assess the > > significance of a three-level factor? Are there better alternatives than > > simply adjusting the CI for the number of factor levels > > (1-alpha/levels). > > Hmm - I'm not sure what confidence interval and what number of levels > you mean there so I can't comment on that method. > > Suppose we go back to Spencer's example and consider if there is a > signficant effect for the Nozzle factor. That is equivalent to the > hypothesis H_0: beta_2 = beta_3 = 0 versus the general alternative. A > "p-value" could be formulated from an MCMC sample if we assume that > the marginal distribution of the parameter estimates for beta_2 and > beta_3 has roughly elliptical contours and you can evaluate that by, > say, examining a hexbin plot of the values in the MCMC sample. One > could take the ellipses as defined by the standard errors and > estimated correlation or, probably better, by the observed standard > deviations and correlations in the MCMC sample. Then determine the > proportion of (beta_2, beta_3) pairs in the sample that fall outside > the ellipse centered at the estimates and with that eccentricity and > scaling factors that passes through (0,0). That would be an empirical > p-value for the test. > > I would recommend calculating this for a couple of samples to check on > the reproducibility.
There was some follow-up on this thread, including some code and results that I find encouraging. I didn't notice that the R-help list had been dropped off the cc: in later exchanges. I enclose my contribution to the conversation so that others on the list will get to see it. --- exerpt from previous private message ---- As soon as I described the idea I knew that someone would ask for a function to perform it. Here's one mcmcpvalue <- function(samp) { ## elementary version that creates an empirical p-value for the ## hypothesis that the columns of samp have mean zero versus a ## general multivariate distribution with elliptical contours. ## differences from the mean standardized by the observed ## variance-covariance factor std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp), transpose = TRUE) sqdist <- colSums(std * std) sum(sqdist[-1] > sqdist[1])/nrow(samp) } At least I think I have the standardization by the Cholesky factor of the observed variance-covariance matrix correct. However I always manage to confuse myself on that calculation so please let me know if I have it wrong. As an example, consider a model fit to the AvgDailyGain data from the SASmixed package. > library(nlme) > data(AvgDailyGain, package = "SASmixed") > summary(fm1Adg <- lme(adg ~ InitWt*Treatment - 1, AvgDailyGain, random = > ~1|Block)) Linear mixed-effects model fit by REML Data: AvgDailyGain AIC BIC logLik 85.32685 97.10739 -32.66342 Random effects: Formula: ~1 | Block (Intercept) Residual StdDev: 0.5092266 0.2223268 Fixed effects: adg ~ InitWt * Treatment - 1 Value Std.Error DF t-value p-value InitWt 0.0022937 0.0017473 17 1.3126947 0.2067 Treatment0 0.4391370 0.7110881 17 0.6175564 0.5451 Treatment10 1.4261187 0.6375458 17 2.2368880 0.0390 Treatment20 0.4796285 0.5488867 17 0.8738206 0.3944 Treatment30 0.2001071 0.7751989 17 0.2581365 0.7994 InitWt:Treatment10 -0.0012108 0.0023326 17 -0.5190774 0.6104 InitWt:Treatment20 0.0010720 0.0021737 17 0.4931507 0.6282 InitWt:Treatment30 0.0021543 0.0027863 17 0.7731996 0.4500 Correlation: InitWt Trtmn0 Trtm10 Trtm20 Trtm30 IW:T10 IW:T20 Treatment0 -0.961 Treatment10 0.034 0.039 Treatment20 0.003 0.080 0.334 Treatment30 0.050 0.011 0.097 0.043 InitWt:Treatment10 -0.772 0.742 -0.631 -0.164 -0.059 InitWt:Treatment20 -0.806 0.775 -0.180 -0.555 -0.019 0.724 InitWt:Treatment30 -0.666 0.640 -0.046 0.024 -0.754 0.529 0.520 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.82903364 -0.44913967 -0.03023488 0.44738506 1.59877700 Number of Observations: 32 Number of Groups: 8 > anova(fm1Adg) numDF denDF F-value p-value InitWt 1 17 91.68230 <.0001 Treatment 4 17 8.81312 0.0005 InitWt:Treatment 3 17 0.93118 0.4471 Fitting the same model in lmer then generating an MCMC sample and testing for the three interaction coefficients being zero would look like > data(AvgDailyGain, package = "SASmixed") > (fm1Adg <- lmer(adg ~ InitWt*Treatment - 1 + (1|Block), AvgDailyGain)) Linear mixed-effects model fit by REML Formula: adg ~ InitWt * Treatment - 1 + (1 | Block) Data: AvgDailyGain AIC BIC logLik MLdeviance REMLdeviance 83.33 96.52 -32.66 10.10 65.33 Random effects: Groups Name Variance Std.Dev. Block (Intercept) 0.25930 0.50922 Residual 0.04943 0.22233 number of obs: 32, groups: Block, 8 Fixed effects: Estimate Std. Error t value InitWt 0.002294 0.001747 1.3127 Treatment0 0.439128 0.711092 0.6175 Treatment10 1.426113 0.637549 2.2369 Treatment20 0.479621 0.548889 0.8738 Treatment30 0.200115 0.775204 0.2581 InitWt:Treatment10 -0.001211 0.002333 -0.5191 InitWt:Treatment20 0.001072 0.002174 0.4931 InitWt:Treatment30 0.002154 0.002786 0.7732 Correlation of Fixed Effects: InitWt Trtmn0 Trtm10 Trtm20 Trtm30 IW:T10 IW:T20 Treatment0 -0.961 Treatment10 0.034 0.039 Treatment20 0.003 0.080 0.334 Treatment30 0.050 0.011 0.097 0.043 IntWt:Trt10 -0.772 0.742 -0.631 -0.164 -0.059 IntWt:Trt20 -0.806 0.775 -0.180 -0.555 -0.019 0.724 IntWt:Trt30 -0.666 0.640 -0.046 0.024 -0.754 0.529 0.520 > AdgS1 <- mcmcsamp(fm1Adg, 50000) > library(coda) > HPDinterval(AdgS1) lower upper InitWt -0.001402986 0.006164517 Treatment0 -1.144313821 1.925351624 Treatment10 0.047128188 2.776309715 Treatment20 -0.761106539 1.602588258 Treatment30 -1.440932884 1.887536852 InitWt:Treatment10 -0.006309270 0.003569772 InitWt:Treatment20 -0.003578127 0.005614077 InitWt:Treatment30 -0.004003873 0.008055591 log(sigma^2) -3.617259562 -2.198161379 log(Blck.(In)) -2.438121552 0.002976392 deviance 12.769059400 32.841699073 attr(,"Probability") [1] 0.95 > mcmcpvalue(as.matrix(AdgS1)[, 6:8]) [1] 0.46876 If we drop the interaction term and consider the treatment term the test becomes > (fm2Adg <- lmer(adg ~ InitWt + Treatment + (1|Block), AvgDailyGain)) Linear mixed-effects model fit by REML Formula: adg ~ InitWt + Treatment + (1 | Block) Data: AvgDailyGain AIC BIC logLik MLdeviance REMLdeviance 48.34 57.13 -18.17 13.62 36.34 Random effects: Groups Name Variance Std.Dev. Block (Intercept) 0.24084 0.49076 Residual 0.05008 0.22379 number of obs: 32, groups: Block, 8 Fixed effects: Estimate Std. Error t value (Intercept) 0.2490338 0.3776318 0.659 InitWt 0.0027797 0.0008334 3.336 Treatment10 0.4835075 0.1128530 4.284 Treatment20 0.4639445 0.1120505 4.140 Treatment30 0.5520737 0.1148132 4.808 Correlation of Fixed Effects: (Intr) InitWt Trtm10 Trtm20 InitWt -0.863 Treatment10 -0.035 -0.130 Treatment20 -0.102 -0.053 0.502 Treatment30 -0.338 0.224 0.454 0.475 > AdgS2 <- mcmcsamp(fm2Adg, 50000) > HPDinterval(AdgS2) lower upper (Intercept) -0.579312545 1.044946476 InitWt 0.001114375 0.004616652 Treatment10 0.246254015 0.714185069 Treatment20 0.220236717 0.686356392 Treatment30 0.316078702 0.797956531 log(sigma^2) -3.553700311 -2.280238850 log(Blck.(In)) -2.448486301 -0.072657242 deviance 14.687593543 29.699825900 attr(,"Probability") [1] 0.95 > mcmcpvalue(as.matrix(AdgS2[,3:5])) [1] 0.00026 so these p-values seem to be in line with the results from the analysis of variance p-values using one of the formula for calculation of a residual degrees of freedom. Related to the other question of the use of a likelihood ratio test for the fixed effects, the p-values for the likelihood ratio tests seem quite different > anova(fm2Adg, fm1Adg) Data: AvgDailyGain Models: fm2Adg: adg ~ InitWt + Treatment + (1 | Block) fm1Adg: adg ~ InitWt * Treatment - 1 + (1 | Block) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fm2Adg 6 25.623 34.417 -6.812 fm1Adg 9 28.098 41.290 -5.049 3.5249 3 0.3176 > fm3Adg <- lmer(adg ~ InitWt + (1|Block), AvgDailyGain) > anova(fm2Adg, fm3Adg) Data: AvgDailyGain Models: fm3Adg: adg ~ InitWt + (1 | Block) fm2Adg: adg ~ InitWt + Treatment + (1 | Block) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fm3Adg 3 41.913 46.310 -17.957 fm2Adg 6 25.623 34.417 -6.812 22.29 3 5.677e-05 *** and it is not just a matter of the evaluation of the log-likelihood > fm2AdgML <- update(fm2Adg, method = "ML") > fm3AdgML <- update(fm3Adg, method = "ML") > anova(fm3AdgML, fm2AdgML) Data: AvgDailyGain Models: fm3AdgML: adg ~ InitWt + (1 | Block) fm2AdgML: adg ~ InitWt + Treatment + (1 | Block) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) fm3AdgML 3 41.882 46.279 -17.941 fm2AdgML 6 25.617 34.412 -6.809 22.265 3 5.746e-05 *** ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.