[R] Do Users of Nonlinear Mixed Effects Models Know Whether Their Software Really Works?

2005-10-14 Thread Hans Julius Skaug

Dear Andrew and R-list,

I guess Fournier is addressing the properties of the numerical routines
underlying the various packages, not the statistical properties of the MLE 
itself.
For this purpose using a small tricky dataset makes sense. Clearly,
a true unique MLE exists (except in pathological cases), defined
as the maximizer of the marginal likelihood, evaluated using perfect precision 
numerical integration.
Since all the packages are aiming at calculating the MLE, it makes sense to 
compare them 
on this ground. I think the point in Lesaffre et al is that the default 
settings of many packages may 
give you something very different from the true MLE.


best regards,

hans


 1) If I understand correctly, you're trying to estimate parameters
from a real dataset.  Why not try a simulated dataset, where you
know exactly what the true values (and parameter distributions)
are?
 
 2) Furthermore, an argument from one dataset isn't very
convincing. The sample size for inference is too small.  Why not
repeat this procedure many times, sampling from the same base
model? 
 
 3) Then, you could also vary the structure of the underlying model
systematically, and assess the comparison of fits as a function of
the underlying model/dataset nexus.
 
 4) Next, a problem with the example (as I understand it) is that
although you've computed what you call exact MLE's, I think that
they're exact when conditioned on the model.  Are they very robust
to model misspecification?  (I mean beyond large-sample theory).
 
 5) Finally, of course, then making the scripts available for forsenic
investigations.
 
 Cheers,
 
 Andrew

_
Hans Julius Skaug

Department of Mathematics
University of Bergen
Johannes Brunsgate 12
5008 Bergen
Norway
ph. (+47) 55 58 48 61

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Do Users of Nonlinear Mixed Effects Models Know Whether Their Software Really Works?

2005-10-14 Thread Andrew Robinson
Dear Hans,

these are interesting points.  I guess that I'm approaching it from
the point of view of a decision: I'd be more comfortable using a
fitting routine that has stability under a wide range of identifiable
circumstances. Obtaining the MLE exactly in any instance is a function
of the data and the model.  So, to me, obtaining it well in one
instance is less interesting than obtaining it well in a wide array of
instances. 

In short, I guess that I'm connecting the numerical routines with the
actual data, in the sense that that's what they operate on, and
therefore the statistical properties of the overall approach.  Perhaps
I'm being naive!

Cheers,

Andrew

On Fri, Oct 14, 2005 at 02:55:59PM +0200, Hans Julius Skaug wrote:
 
 Dear Andrew and R-list,
 
 I guess Fournier is addressing the properties of the numerical routines
 underlying the various packages, not the statistical properties of the MLE 
 itself.
 For this purpose using a small tricky dataset makes sense. Clearly,
 a true unique MLE exists (except in pathological cases), defined
 as the maximizer of the marginal likelihood, evaluated using perfect 
 precision numerical integration.
 Since all the packages are aiming at calculating the MLE, it makes sense to 
 compare them 
 on this ground. I think the point in Lesaffre et al is that the default 
 settings of many packages may 
 give you something very different from the true MLE.
 
 
 best regards,
 
 hans
 
 
  1) If I understand correctly, you're trying to estimate parameters
 from a real dataset.  Why not try a simulated dataset, where you
 know exactly what the true values (and parameter distributions)
 are?
  
  2) Furthermore, an argument from one dataset isn't very
 convincing. The sample size for inference is too small.  Why not
 repeat this procedure many times, sampling from the same base
 model? 
  
  3) Then, you could also vary the structure of the underlying model
 systematically, and assess the comparison of fits as a function of
 the underlying model/dataset nexus.
  
  4) Next, a problem with the example (as I understand it) is that
 although you've computed what you call exact MLE's, I think that
 they're exact when conditioned on the model.  Are they very robust
 to model misspecification?  (I mean beyond large-sample theory).
  
  5) Finally, of course, then making the scripts available for forsenic
 investigations.
  
  Cheers,
  
  Andrew
 
 _
 Hans Julius Skaug
 
 Department of Mathematics
 University of Bergen
 Johannes Brunsgate 12
 5008 Bergen
 Norway
 ph. (+47) 55 58 48 61
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Andrew Robinson
Senior Lecturer in Statistics   Tel: +61-3-8344-9763
Department of Mathematics and StatisticsFax: +61-3-8344-4599
University of Melbourne, VIC 3010 Australia
Email: [EMAIL PROTECTED]Website: http://www.ms.unimelb.edu.au

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Do Users of Nonlinear Mixed Effects Models Know Whether Their Software Really Works?

2005-10-13 Thread dave fournier
   Do Users of Nonlinear Mixed Effects Models Know

  Whether Their Software Really Works?


   Lesaffre et. al. (Appl. Statist. (2001) 50, Part3, pp 325-335)
   analyzed
   some simple clinical trials data using a logistic random effects
   model. Several packages and methods MIXOR, SAS NLMIXED were employed.
   They reported obtaining very different parameter estimates and
   P values for the log-likelihood with the different packages and
   methods. We thought it  would be interesting to revisit this example
   using the AD Model Builder random effects module which we feel is
   the most stable software available for this problem at this time.

  http://otter-rsch.com/admodel.htm

   You can get Table 2 from Lesaffre et al at

   http://otter-rsch.com/downloads/other/lesaffre.pdf

   The data and more information are available
   from the publisher.

   http://www.blackwellpublishers.co.uk/rss/Volumes/Cv50p3.htm

   We considered three questions:

 1.) What are the  estimates using the Laplace approximation
 for integrating out the random effects.

 2.) What are the exact MLE's.

 3.) How well does hypothesis testing (likelihood-ratio) using
 the Laplace approximation compare with the exact MLE's.

   We first fit the data using ADMB-RE's Laplace approximation
   option.

   Laplace approximation estimates:

  # Number of parameters = 4  log-likelihood = -629.817
   value  std dev  P value
b_1 -2.3321e+00 7.6973e-01   0.0024
b_2 -6.8795e-01 6.6185e-010.298
b_3 -4.6134e-01 4.e-02   0.001
  sigma  4.5738e+00 7.0970e-01


The parameter of interest here the treatment effect b_2 which is the
   parameter reported in Lesaffre et. al.

To calculate the exact MLE we fit the model using 100 point adaptive
Gaussian integration. The ADMB-RE results were:

Gaussian integration estimates:

  # Number of parameters = 4  log-likelihood = -627.481

  name value  std devP value
   b_1 -1.4463e+00 4.2465e-010.001
   b_2 -5.2225e-01 5.5716e-01 0.348
   b_3 -4.5150e-01 3.6663e-020.001
 sigma  4.0137e+00 3.8083e-01

   Of the estimates reported in Lesaffre et al. in table 2 only the
   50 point quadrature for the program MIXOR appear to be correct
   for both the log-likelihood value and the parameter estimates
   while the authors concluded that the SAS NLMIXED parameter estimates
   they obtained were correct.  So even though these authors were looking
   for pathological behaviour and were presumably very careful, and their
   paper was presumably peer-reviewed, they came to the wrong conclusion
   using SAS NLMIXED.

   How do we know that our exact MLE's are correct? To confirm our
   results we used our parameter estimates as initial values in the SAS
   NLMIXED procedure using 100 point adaptive quadrature. The procedure
   returned our values, that is it agreed that these are the maximum
   likelihood estimates.  However we verified that to get these estimates
   from the SAS NLMIXED procedure one must begin with fairly good
   starting values. In contrast the ADMB-RE procedure is very insensitive
   to the starting values used. Our conclusion is that while SAS NLMIXED
   might work for this very simple problem it probably begins to break
   down when the problem is a bit more difficult.

   The ADMB-RE software is more stable because it calculates exact higher
   oreder derivatives by automatic differentiation for use in its
   optimization procedure and calculations while other packages do not.

   Gauss-Hermite integration for the random effects can
   be used for this model because the Hessian for the random effects is
   diagonal which permits one dimensional integration over the random
   effects to great accuracy.  However this procedure does not scale well
   to problems where the Hessian is not diagonal.  Suppose that it takes
   a 20 point quadrature to obtain reliable parameter estimates with a
   diagonal Hessian.  Then with a block diagonal Hessian where the blocks
   are of size 4x4 it would take 160,000 points.

   Results using R

   We fit the model using what appear to be the currently available
   procedures in R.  The two routines lmer (lme4 package) and glmmPQL
   (MASS library) were tried.

   The call


lmer(y ~ treat + time + (1|subject),data=lesaffre,family=binomial)

   resulted in a warning message from lme4() but both routines produced
   the same results.

 Generalized linear mixed model fit using PQL
 Formula: y ~ treat + time + (1 | subject)
Data: lesaffre
  Family: binomial(logit link)
   AIC  BIClogLik deviance
  1305.859 1333.628 -647.9295 1295.859
 Random effects:
  GroupsNameVarianceStd.Dev.
 subject (Intercept)  

Re: [R] Do Users of Nonlinear Mixed Effects Models Know Whether Their Software Really Works?

2005-10-13 Thread Prof Brian Ripley
On Thu, 13 Oct 2005, dave fournier wrote:

 Internal Virus Database is out-of-date.

Talk about not being careful!

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Do Users of Nonlinear Mixed Effects Models Know Whether Their Software Really Works?

2005-10-13 Thread Andrew Robinson
Dave,

that's an interesting start for a comparison.  Let me point out some
ways that you might construct a compelling argument.  Of course, these
aren't exhaustive, and others may well provide further depth.

1) If I understand correctly, you're trying to estimate parameters
   from a real dataset.  Why not try a simulated dataset, where you
   know exactly what the true values (and parameter distributions)
   are?

2) Furthermore, an argument from one dataset isn't very
   convincing. The sample size for inference is too small.  Why not
   repeat this procedure many times, sampling from the same base
   model? 

3) Then, you could also vary the structure of the underlying model
   systematically, and assess the comparison of fits as a function of
   the underlying model/dataset nexus.

4) Next, a problem with the example (as I understand it) is that
   although you've computed what you call exact MLE's, I think that
   they're exact when conditioned on the model.  Are they very robust
   to model misspecification?  (I mean beyond large-sample theory).

5) Finally, of course, then making the scripts available for forsenic
   investigations.

Cheers,

Andrew

On Thu, Oct 13, 2005 at 01:19:24PM -0700, dave fournier wrote:
Do Users of Nonlinear Mixed Effects Models Know
 
   Whether Their Software Really Works?
 

-- 
Andrew Robinson
Senior Lecturer in Statistics   Tel: +61-3-8344-9763
Department of Mathematics and StatisticsFax: +61-3-8344-4599
University of Melbourne, VIC 3010 Australia
Email: [EMAIL PROTECTED]Website: http://www.ms.unimelb.edu.au

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html