Thank you, Professor Ripley. cbind(log(pr1$fit) - 1.96*pr1$se.fit/pr1$fit, log(pr1$fit) + 1.96*pr1$se.fit/pr1$fit)
... is precisely what had eluded me, self-evident though it appears after you have illuminated the way. Again, thank you. Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley Sent: Tuesday, May 03, 2005 2:55 AM To: Charles Annis, P.E. Cc: R-help@stat.math.ethz.ch Subject: Re: [R] comparing lm(), survreg( ... , dist="gaussian") and survreg(... , dist="lognormal") On Mon, 2 May 2005, Charles Annis, P.E. wrote: > I have tried everything I can think of and hope not to appear too foolish > when my error is pointed out to me. > > I have some real data (18 points) that look linear on a log-log plot so I > used them for a comparison of lm() and survreg. There are no suspensions. > > survreg.df <- data.frame(Cycles=c(2009000, 577000, 145000, 376000, 37000, > 979000, 17420000, 71065000, 46397000, 70168000, 69120000, 68798000, > 72615000, 133051000, 38384000, 15204000, 1558000, 14181000), stress=c(90, > 100, 110, 90, 100, 80, 70, 60, 56, 62, 62, 59, 56, 53, 59, 70, 90, 70), > event=rep(1, 18)) > > > sN.lm<- lm(log(Cycles) ~ log10(stress), data=survreg.df) > > and > vvvvvvvvvvv > gaussian.survreg<- survreg(formula=Surv(time=log(Cycles), event) ~ > log10(stress), dist="gaussian", data=survreg.df) > > produce identical parameter estimates and differ slightly in the residual > standard error and scale, which is accounted for by scale being the MLE and > thus biased. Correcting by sqrt(18/16) produces agreement. Using predict() > for the lm, and predict.survreg() for the survreg model and correcting for > the differences in stdev, produces identical plots of the fit and the upper > and lower confidence intervals. All of this is as it should be. I trust you called predict() on both and let R choose the method. > And, > vvvvvv > lognormal.survreg<- survreg(formula=Surv(time=(Cycles), event) ~ > log10(stress), dist="lognormal", data=survreg.df) > > produces summary() results that are identical to the earlier call to > survreg(), except for the call, of course. The parameter estimates and SE > are identical. Again this is as I would expect it. > > But since the call uses Cycles, rather than log(Cycles) predict.survreg() > returns $fit in Cycles units, rather than logs, and of course the fits are > identical when plotted on a log-log grid and also agree with lm() > > Here is the fly in the ointment: The upper and lower confidence intervals, > based on the $se.fit for the dist="lognormal" are quite obviously different > from the other two methods, and although I have tried everything I could > imagine I cannot reconcile the differences. How did you do this? (BTW, I assume you mean upper and lower confidence >limits< for the predicted means.) For the predictions and standard errors are (or should be) on the response scale, a non-linear function of the parameters. In that case it is normal to form confidence limits on the linear predictor scale and transform. > I believe that the confidence bounds for both models should agree. After > all, both calls to survreg() produce identical parameter estimates. They will, if computed on the same basis. On log-scale (to avoid large numbers) pr1 <- predict(lognormal.survreg, se.fit=T) log(cbind(pr1$fit - 1.96*pr1$se.fit, pr1$fit + 1.96*pr1$se.fit)) pr2 <- predict(gaussian.survreg, se.fit=T) cbind(pr2$fit - 1.96*pr2$se.fit, pr2$fit + 1.96*pr2$se.fit) are really pretty close. The main difference is a slight shift, which comes about because the mean of a log(X) is not log(mean(X)). Note that the second set at the preferred ones. Transforming to log scale before making the confidence limits: cbind(log(pr1$fit) - 1.96*pr1$se.fit/pr1$fit, log(pr1$fit) + 1.96*pr1$se.fit/pr1$fit) does give identical answers. Consider care is needed in interpreting what predict() is actually predicting in non-linear models. For both glm() and survreg() it is closer to the median of the uncertainty in the predictions than to the mean. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html