Thanks. I'll try the things you suggests. The observed proportions ranged from roughly 0.2 to 0.8 in 100 binomial random samples where sigma is at most 0.05. Jim Lindsey's "glmm" does Gauss-Hermite quadrature, but I don't know if it bothers with the adaptive step. With it, I've seen estimates of the variance component ranging from 0.4 to 0.7 or so. Since I simulated normal 0 standard deviation of 1, the algorithm was clearly underestimating what was simulated. My next step, I think, is to program adaptive Gauss-Hermite quadrature for something closer to my real problem (as you just suggested), and see what I get.
You mentioned the little vs. big approximations: My real application involves something close to a binomial response driven by Poisson defects, where the Poisson defect rate is not constant. I've shown that it can make a difference whether the defect rate is lognormal or gamma, so that is another complication and another reason to write my own log(likelihood). I've thought about writing my own function to do adaptive Gauss-Hermite quadrature, as you suggested, but decided to check more carefully the available tools before I jumped into my own software development effort.
Thanks again.
Spencer Graves
Douglas Bates wrote:
Spencer Graves <[EMAIL PROTECTED]> writes:
Another GLMM/glmm problem: I simulate rbinom(N, 100, pz), where logit(pz) = rnorm(N). I'd like to estimate the mean and standard deviation of logit(pz). I've tried GLMM{lme4}, glmmPQL{MASS}, and glmm{Jim Lindsey's repeated}. In several replicates of this for N = 10, 100, 500, etc., my glmm call produced estimates of the standard deviation of the random effect in the range of 0.6 to 0.8 (never as high as the 1 simulated). Meanwhile, my calls to GLMM produced estimates between 1e-12 and 1e-9, while the glmmPQL results tended to be closer to 0.001, though it gave one number as high as 0.7. (I'm running R 1.9.1 alpha, lme4 0.6-1 under Windows 2000)
Am I doing something wrong, or do these results suggest bugs in the software or deficiencies in the theory or ... ?
Consider the following:
> set.seed(1); N <- 10 > z <- rnorm(N) > pz <- inv.logit(z) > DF <- data.frame(z=z, pz=pz, y=rbinom(N, 100, pz)/100, n=100, > smpl=factor(1:N))
> GLMM(y~1, family=binomial, data=DF, random=~1|smpl, weights=n)
Generalized Linear Mixed Model
Check the observed proportions in the data and see if they apparently vary enough to be able to expect to estimate a random effect.
It is entirely possible to have the MLE of a variance component be zero.
Another thing to do it to check the convergence. Use
GLMM(y ~ 1, family = binomial, random = ~1|smpl, weigths = n, control = list(EMv=TRUE, msV=TRUE))
or
GLMM(y ~ 1, family = binomial, random = ~1|smpl, weigths = n, control = list(EMv=TRUE, msV=TRUE, opt = 'optim'))
You will see that both optimizers push the precision of the random effects to very large values (i.e. the variance going to zero) in the second of the penalized least squares steps.
I think that this is a legitimate optimum for the approximate problem. It may be an indication that the approximate problem is not the best one to use. As George Box would tell us,
You have a big approximation and a small approximation. The big approximation is that your approximation to the problem you want to solve. The small approximation is involved in getting the solution to the approximate problem.
For this case, even if I turn off the PQL iterations and go directly to the Laplacian approximation I still get a near-zero estimate of the variance component. You can see the gory details with
GLMM(y ~ 1, family = binomial, random = ~1|smpl, weigths = n, control = list(EMv=TRUE, msV=TRUE, glmmMaxIter = 1), method = 'Laplace')
I am beginning to suspect that for these data the MLE of the variance component is zero.
So far, the best tool I've got for this problem is a normal
probability plot of a transform of the binomial responses
with Monte Carlo confidence bands, as suggested by Venables
and Ripley, S Programming and Atkinson (1985). However, I
ultimately need to estimate these numbers.
I think the most reliable method of fitting this particular form of a GLMM is using adaptive Gauss-Hermite quadrature to evaluate the marginal likelihood and optimizing that directly. In this model the marginal likelihood is a function of two parameters. If you have access to SAS I would try to fit these data with PROC NLMIXED and see what that does. You may also be able to use Goran Brostrom's package for R on this model. As a third option you could set up evaluation of the marginal likelihood using either the Laplacian approximation to the integral or your own version of adaptive Gauss-Hermite and look at the contours of the marginal log-likelihood.
______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html