Re: [R] Problem trying to use boot and lme together
At 23:09 21/06/05, Prof Brian Ripley wrote: >On Tue, 21 Jun 2005, Douglas Bates wrote: > >>On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote: Thanks everyone for your help, more comments at the foot >>>The problem with simulate.lme is that it only returns logL for a given >>>model fitted to a simulated data set - not the simulated data set >>>itself (which one might have expected a function with that name to >>>do...). It would be nice with that functionality... >>>Søren >> >>You could add it. You just need to create a matrix that will be large >>enough to hold all the simulated data sets and fill a column (or row >>if you prefer but column is probably better because of the way that >>matrices are stored) during each iteration of the simulation and >>remember to include that matrix in the returned object. > >Note: you don't need to store it: you can do the analysis at that point >and return the statistics you want, rather than just logL. > >I did say `see simulate.lme', not `use simulate.lme'. I know nlme is no >longer being developed, but if it were I would be suggesting/contributing >a modification that allowed the user to specify an `extraction' function >from the fit -- quite a few pieces of bootstrap code work that way. > >-- >Brian D. Ripley, [EMAIL PROTECTED] >Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >University of Oxford, Tel: +44 1865 272861 (self) >1 South Parks Road, +44 1865 272866 (PA) >Oxford OX1 3TG, UKFax: +44 1865 272595 This has been most informative for me. Given the rather stern comments in Pinheiro and Bates that most things do not have the reference distribution you might naively think I now realise that simulate.lme is more important than its rather cursory treatment in the book. As suggested I have looked at the code but although I can see broadly what each section does I lack the skill to modify it myself. I will have to wait for someone more gifted. If there is to be a successor edition to Pinheiro and Bates perhaps I could suggest that this topic merits a bit more discussion? Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Problem trying to use boot and lme together
On Tue, 21 Jun 2005, Douglas Bates wrote: On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote: The problem with simulate.lme is that it only returns logL for a given model fitted to a simulated data set - not the simulated data set itself (which one might have expected a function with that name to do...). It would be nice with that functionality... Søren You could add it. You just need to create a matrix that will be large enough to hold all the simulated data sets and fill a column (or row if you prefer but column is probably better because of the way that matrices are stored) during each iteration of the simulation and remember to include that matrix in the returned object. Note: you don't need to store it: you can do the analysis at that point and return the statistics you want, rather than just logL. I did say `see simulate.lme', not `use simulate.lme'. I know nlme is no longer being developed, but if it were I would be suggesting/contributing a modification that allowed the user to specify an `extraction' function from the fit -- quite a few pieces of bootstrap code work that way. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Problem trying to use boot and lme together
On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote: > The problem with simulate.lme is that it only returns logL for a given model > fitted to a simulated data set - not the simulated data set itself (which > one might have expected a function with that name to do...). It would be nice > with that functionality... > Søren You could add it. You just need to create a matrix that will be large enough to hold all the simulated data sets and fill a column (or row if you prefer but column is probably better because of the way that matrices are stored) during each iteration of the simulation and remember to include that matrix in the returned object. The reason that we didn't do that in the original design is because that matrix can become rather large when you have either a lot of data or a lot of simulations and we were working on the computers or 5 to 10 years ago. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Problem trying to use boot and lme together
The problem with simulate.lme is that it only returns logL for a given model fitted to a simulated data set - not the simulated data set itself (which one might have expected a function with that name to do...). It would be nice with that functionality... Søren Fra: [EMAIL PROTECTED] på vegne af Prof Brian Ripley Sendt: ti 21-06-2005 18:53 Til: Michael Dewey Cc: r-help-stat.math.ethz.ch Emne: Re: [R] Problem trying to use boot and lme together We don't have the structure of your dataset. But it seems pretty clear that your resampling is not preserving the random effects structure: you are always fitting to the same clusters labelled by gp and so missing the major source of variability. You either need to resample clusters, or resample the cluster random effects, as well as resample the within-cluster effects. I doubt if there is much theoretical support for such bootstrapping schemes, nor software support: I would prefer a so-called parametric bootstrapping approach, that is resample from the fitted model -- see simulate.lme. On Tue, 21 Jun 2005, Michael Dewey wrote: > The outcome variable in my dataset is cost. I prefer not to transform it as > readers will really be interested in pounds, not log(pounds) or > sqrt(pounds). I have fitted my model using lme and now wish to use boot on > it. I have therefore plagiarised the example in the article in RNews 2/3 > December 2002 > > after loading the dataset I source this file > > > require(boot) > require(nlme) > model.0 <- lme(tot ~ time + timepost + pct + totpat >+ (time + timepost) : single + single >+ (time + timepost) : train + train >+ time:gpattend + timepost:gpattend + gpattend, >data = common, >random = ~time + timepost | gp > ) > ints.9 <- intervals(model.0, which="fixed")$fixed[9,] > # > common$fit <- fitted(model.0) > common$res <- resid(model.0, type = "response") > cats.fit <- function(data) { >mod <- lme(tot ~ time + timepost + pct + totpat > + (time + timepost) : single + single > + (time + timepost) : train + train > + time:gpattend + timepost:gpattend + gpattend, >data = data, >random = ~ time + timepost | gp) >summ <- summary(mod) >c(fixef(summ), diag(summ$varFix)) > } > model.fun <- function(d, i) { >d$tot <- d$fit+d$res[i] >cats.fit(d) > } > tot.boot <- boot(common, model.fun, R=999) > > So I fit the model and then generate fitted values and residuals which I > use within the model.fun function to generate the bootstrap resample. > > If I do this the plot looks fine as does the jack.after.boot plot but the > confidence intervals are about 10% of the width of the ones from the lme > output. I wondered whether I was using the wrong fitted and residuals so I > added level = 0 to the calls of fitted and resid respectively (level = 1 is > the default) but this seems to lead to resamples to which lme cannot fit > the model. > > Can anyone spot what I am doing wrong? > > I would appreciate a cc'ed response as my ISP has taken to bouncing the > digest posts from R-help with probability approximately 0.3. > > FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme > which shipped with the binary. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Problem trying to use boot and lme together
We don't have the structure of your dataset. But it seems pretty clear that your resampling is not preserving the random effects structure: you are always fitting to the same clusters labelled by gp and so missing the major source of variability. You either need to resample clusters, or resample the cluster random effects, as well as resample the within-cluster effects. I doubt if there is much theoretical support for such bootstrapping schemes, nor software support: I would prefer a so-called parametric bootstrapping approach, that is resample from the fitted model -- see simulate.lme. On Tue, 21 Jun 2005, Michael Dewey wrote: > The outcome variable in my dataset is cost. I prefer not to transform it as > readers will really be interested in pounds, not log(pounds) or > sqrt(pounds). I have fitted my model using lme and now wish to use boot on > it. I have therefore plagiarised the example in the article in RNews 2/3 > December 2002 > > after loading the dataset I source this file > > > require(boot) > require(nlme) > model.0 <- lme(tot ~ time + timepost + pct + totpat >+ (time + timepost) : single + single >+ (time + timepost) : train + train >+ time:gpattend + timepost:gpattend + gpattend, >data = common, >random = ~time + timepost | gp > ) > ints.9 <- intervals(model.0, which="fixed")$fixed[9,] > # > common$fit <- fitted(model.0) > common$res <- resid(model.0, type = "response") > cats.fit <- function(data) { >mod <- lme(tot ~ time + timepost + pct + totpat > + (time + timepost) : single + single > + (time + timepost) : train + train > + time:gpattend + timepost:gpattend + gpattend, >data = data, >random = ~ time + timepost | gp) >summ <- summary(mod) >c(fixef(summ), diag(summ$varFix)) > } > model.fun <- function(d, i) { >d$tot <- d$fit+d$res[i] >cats.fit(d) > } > tot.boot <- boot(common, model.fun, R=999) > > So I fit the model and then generate fitted values and residuals which I > use within the model.fun function to generate the bootstrap resample. > > If I do this the plot looks fine as does the jack.after.boot plot but the > confidence intervals are about 10% of the width of the ones from the lme > output. I wondered whether I was using the wrong fitted and residuals so I > added level = 0 to the calls of fitted and resid respectively (level = 1 is > the default) but this seems to lead to resamples to which lme cannot fit > the model. > > Can anyone spot what I am doing wrong? > > I would appreciate a cc'ed response as my ISP has taken to bouncing the > digest posts from R-help with probability approximately 0.3. > > FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme > which shipped with the binary. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem trying to use boot and lme together
The outcome variable in my dataset is cost. I prefer not to transform it as readers will really be interested in pounds, not log(pounds) or sqrt(pounds). I have fitted my model using lme and now wish to use boot on it. I have therefore plagiarised the example in the article in RNews 2/3 December 2002 after loading the dataset I source this file require(boot) require(nlme) model.0 <- lme(tot ~ time + timepost + pct + totpat + (time + timepost) : single + single + (time + timepost) : train + train + time:gpattend + timepost:gpattend + gpattend, data = common, random = ~time + timepost | gp ) ints.9 <- intervals(model.0, which="fixed")$fixed[9,] # common$fit <- fitted(model.0) common$res <- resid(model.0, type = "response") cats.fit <- function(data) { mod <- lme(tot ~ time + timepost + pct + totpat + (time + timepost) : single + single + (time + timepost) : train + train + time:gpattend + timepost:gpattend + gpattend, data = data, random = ~ time + timepost | gp) summ <- summary(mod) c(fixef(summ), diag(summ$varFix)) } model.fun <- function(d, i) { d$tot <- d$fit+d$res[i] cats.fit(d) } tot.boot <- boot(common, model.fun, R=999) So I fit the model and then generate fitted values and residuals which I use within the model.fun function to generate the bootstrap resample. If I do this the plot looks fine as does the jack.after.boot plot but the confidence intervals are about 10% of the width of the ones from the lme output. I wondered whether I was using the wrong fitted and residuals so I added level = 0 to the calls of fitted and resid respectively (level = 1 is the default) but this seems to lead to resamples to which lme cannot fit the model. Can anyone spot what I am doing wrong? I would appreciate a cc'ed response as my ISP has taken to bouncing the digest posts from R-help with probability approximately 0.3. FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme which shipped with the binary. Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html