On Tue, Jun 08, 2004 at 08:32:24AM -0700, Spencer Graves wrote: > Hi, Doug: > > Thanks. I'll try the things you suggests. The observed > proportions ranged from roughly 0.2 to 0.8 in 100 binomial random > samples where sigma is at most 0.05. Jim Lindsey's "glmm" does > Gauss-Hermite quadrature, but I don't know if it bothers with the > adaptive step. With it, I've seen estimates of the variance component > ranging from 0.4 to 0.7 or so. Since I simulated normal 0 standard > deviation of 1, the algorithm was clearly underestimating what was > simulated. My next step, I think, is to program adaptive Gauss-Hermite > quadrature for something closer to my real problem (as you just > suggested), and see what I get. [...]
> Douglas Bates wrote: > > >Spencer Graves <[EMAIL PROTECTED]> writes: > > [...] > >> Consider the following: > >> > >>> set.seed(1); N <- 10 > >>> z <- rnorm(N) > >>> pz <- inv.logit(z) > >>> DF <- data.frame(z=z, pz=pz, y=rbinom(N, 100, pz)/100, n=100, > >>> smpl=factor(1:N)) > >> > >>> GLMM(y~1, family=binomial, data=DF, random=~1|smpl, weights=n) I think the "weights=n" is the problem, i.e., you summarize Bernoulli data to Bin(100, p) data, and that gives a completely different estimate of the variance of the random effect. (This looks as an error in lme4 to me, or am I missing something? Doug?) Really, the two ways of representing data should give equivalent analyses, but it doesn't. The same phenomenon appears in 'glm', i.e. without random effects, but only for the residual sum of squares, df, and AIC. The following is a small function calling lme4 twice, first as before, and then with data 'wrapped up' into Bernoulli data. I also run 'glmmML' in the latter case (since glmmML only allows the Bernoulli representation): ------------------------------------------------------------------- sim <- function(N = 10, grpSize = 10, std = 1){ require(glmmML) require(lme4) set.seed(1) z <- rnorm(N, mean = 0, sd = std) # pz <- inv.logit(z); is identical to (in 'stats') pz <- plogis(z) Y <- rbinom(N, grpSize, pz) ## 'Summary' data frame: DF <- data.frame(z = z, pz = pz, y = Y / grpSize, n = grpSize, smpl = factor(1:N)) fit1 <- GLMM(y~1, family = binomial, data = DF, random = ~1|smpl, weights = n) ##fit1 <- glm(y~1, family = binomial, data = DF, weights = n) ## 'Individual' data frame: n <- N * grpSize z <- rep(z, rep(grpSize, N)) pz <- rep(pz, rep(grpSize, N)) y <- numeric(n) for (i in 1:N) y[((i - 1) * grpSize + 1):((i-1)*grpSize + Y[i])] <- 1 smpl <- as.factor(rep(1:N, rep(grpSize, N))) bigDF <- data.frame(z = z, pz = pz, y = y, smpl = smpl) fit2 <- GLMM(y~1, family=binomial, data=bigDF, random=~1|smpl) ##fit2 <- glm(y~1, family=binomial, data=bigDF) fitML <- glmmML(y ~ 1, family = binomial, data = bigDF, cluster = bigDF$smpl) return(list(fit1 = fit1, fit2 = fit2, fitML = fitML ) ) } ---------------------------------------------------------------- Output: $fit1 Generalized Linear Mixed Model Family: binomial family with logit link Fixed: y ~ 1 Data: DF log-likelihood: -11.59919 Random effects: Groups Name Variance Std.Dev. smpl (Intercept) 3.0384e-10 1.7431e-05 Estimated scale (compare to 1) 1.391217 Fixed effects: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.48955 0.20602 2.3762 0.01749 Number of Observations: 10 Number of Groups: 10 $fit2 Generalized Linear Mixed Model Family: binomial family with logit link Fixed: y ~ 1 Data: bigDF log-likelihood: -64.8084 Random effects: Groups Name Variance Std.Dev. smpl (Intercept) 0.58353 0.7639 Estimated scale (compare to 1) 0.9563351 Fixed effects: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.52811 0.32286 1.6357 0.1019 Number of Observations: 100 Number of Groups: 10 $fitML Call: glmmML(formula = y ~ 1, data = bigDF, cluster = bigDF$smpl, family = binomial) coef se(coef) z Pr(>|z|) (Intercept) 0.5587 0.3313 1.686 0.0917 Standard deviation in mixing distribution: 0.764 Std. Error: 0.3655 Residual deviance: 129.5 on 98 degrees of freedom AIC: 133.5 --------------- Note the big difference in estimated random effect 'sd' in the two lme4 runs! Note further how close to each other the corresponding estimates for the second run of lme4 and of glmmML are. [...] Göran -- Göran Broström tel: +46 90 786 5223 Department of Statistics fax: +46 90 786 6614 Umeå University http://www.stat.umu.se/egna/gb/ SE-90187 Umeå, Sweden e-mail: [EMAIL PROTECTED] ______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html