Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-26 Thread Terry Therneau
1. survreg() does NOT fit a proportional hazards model, a mistake
repeated multiple times in your post

2. The coxph function operates on the risk scale: large values of Xbeta
= large death rates = bad
   The survreg operates on the time scale: large values of xbeta =
longer liftetime = good.

3. predict(fit, type='risk') = exp(predict(fit, type='linear')) in a Cox
model returns an estimate of the relative risk for each subject.  That
is, his/her predicted death rate as compared to the others in the
sample.  It has no units of years or days or anything else.  The
predicted survival TIME for a subject is something else entirely.

  predict(fit, type='response') in a survreg model does give predicted
survvival times.  

  If you really want to understand the interrelationships of these
things more deeply I think you need some textbook time.  Read the book
by Kalbfleisch and Prentice for accelerated failure time models, or even
better Escobar and Meeker which comes from the industrial reliability
view.  For predicted survival from a Cox model see Chapter 10 of
Therneau and Grambsch.  The answers to your specific questions would be
a document rather than an email.

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-26 Thread Ben Rhelp
 Hi Terry, David, and Thomas,
 
Thank you for all your emails and the time you to took to clarify my 
misunderstanding on survival analysis. I will need a bit of time to digest all 
this information and to do some more reading.
 
Best regards,
 
Ben


 From: Terry Therneau
 
 1. survreg() does NOT fit a proportional hazards model, a mistake
 repeated  multiple times in your post
 
 2. The coxph function operates on the risk  scale: large values of Xbeta
 = large death rates = bad
The survreg  operates on the time scale: large values of xbeta =
 longer liftetime =  good.
 
 3. predict(fit, type='risk') = exp(predict(fit, type='linear')) in  a Cox
 model returns an estimate of the relative risk for each subject.   That
 is, his/her predicted death rate as compared to the others in  the
 sample.  It has no units of years or days or anything  else.  The
 predicted survival TIME for a subject is something else  entirely.
 
   predict(fit, type='response') in a survreg model does  give predicted
 survvival times.  
 
   If you really want to  understand the interrelationships of these
 things more deeply I think you  need some textbook time.  Read the book
 by Kalbfleisch and Prentice for  accelerated failure time models, or even
 better Escobar and Meeker which  comes from the industrial reliability
 view.  For predicted survival from  a Cox model see Chapter 10 of
 Therneau and Grambsch.  The answers to  your specific questions would be
 a document rather than an  email.
 
 Terry Therneau
 
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-25 Thread Ben Rhelp
I manage to achieve similar results with a Cox model as follows but I don't 
really understand why we have to take the inverse of the linear prediction with 
the Cox model and why we do not need to divide by the number of days in the 
year 
anymore?

Am I getting a similar result out of pure luck?

thanks in advance,

Ben

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps - survreg(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))

zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = linear)
plot(0:82, exp(zz$fit)/365.25, type = l, ylim = c(0, 2), xlab = age, ylab = 
expected lifetime (years))
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# same example but with a Cox model instead
(aids.pscp - coxph(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = lp)
plot(0:82, 1/exp(zzcp$fit), type = l, ylim = c(0, 2), xlab = age, ylab = 
expected lifetime (years))
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-25 Thread David Winsemius


On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:

I manage to achieve similar results with a Cox model as follows but  
I don't
really understand why we have to take the inverse of the linear  
prediction with

the Cox model


Different parameterization. You can find expanded answer(s) in the  
archives and in the documentation of survreg.distributions.




and why we do not need to divide by the number of days in the year
anymore?


Here I'm guessing (since you don't offer enough evidence to confirm)  
that the difference is in the time scales used in your Aidsp$survtime  
versus some other example to which you are comparing .


 data(Aidsp)
Warning message:
In data(Aidsp) : data set 'Aidsp' not found



Am I getting a similar result out of pure luck?

thanks in advance,

Ben

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps - survreg(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))

zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = linear)
plot(0:82, exp(zz$fit)/365.25, type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# same example but with a Cox model instead
(aids.pscp - coxph(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = lp)
plot(0:82, 1/exp(zzcp$fit), type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)




David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-25 Thread Ben Rhelp
Hi David,

Thank you for your reply. See below for more information.


 From: David Winsemius 
 
 
 On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:
 
  I manage to  achieve similar results with a Cox model as follows but I don't
  really  understand why we have to take the inverse of the linear prediction 
with
   the Cox model
 
 Different parameterization. You can find expanded answer(s)  in the archives 
and in the documentation of  survreg.distributions.
 

I understand (i think) the difference in model structures between a Cox (coxph) 
and Proportional hazard model (survreg). 


 
  and why we do not need to divide by the  number of days in the year
  anymore?
 
 Here I'm guessing (since you  don't offer enough evidence to confirm) that 
 the 
difference is in the time  scales used in your Aidsp$survtime versus some 
other 
example to which you are  comparing .

Both models are run from the same data, so I am not expecting any differences 
in 
time scales. 

To get similar results, I need to actually run the following equations:
expected_lifetime_in_years = exp(fit)/365.25--- Linear 
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)--- Linear 
prediction of the Cox model
where fit come from the linear prediction of each models, respectively.

Actually, in the code below, I re-run the models and predictions based on a 
yearly sampling time (instead of daily).
Again, to get similar results, I now need to actually run the following 
equations:
expected_lifetime_in_years = exp(fit)   --- Linear 
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)--- Linear 
prediction of the Cox model

I think I understand the logic behind the results of the proportional hazard 
model, but not for the prediction of the Cox model.

Thank you for your help. I hope this is not a too stupid hole in my logic.

Here is the self contained R code to produce the charts:

library(MASS);
library(survival);

#Same data but parametric fit
make.aidsp - function(){
   cutoff - 10043 # July 1987 in julian days
   btime - pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
   atime - pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
   survtime - btime + 0.5*atime
   status - as.numeric(Aids2$status)
   data.frame(survtime, status = status - 1, state = Aids2$state,
   T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
}
Aidsp - make.aidsp()

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps - survreg(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = linear)
plot(0:82, exp(zz$fit)/365.25, type = l, ylim = c(0, 2), xlab = age, ylab = 
expected lifetime (years))
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Same example but with a Cox model instead
(aids.pscp - coxph(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = lp)
plot(0:82, 1/exp(zzcp$fit), type = l, ylim = c(0, 2), xlab = age, ylab = 
expected lifetime (years))
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Change the sampling time from daily to yearly
par(mfrow = c(1, 1));
(aids.ps - survreg(Surv((survtime + 0.9)/365.25, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = linear)
plot(0:82, exp(zz$fit), type = l, ylim = c(0, 2), xlab = age, ylab = 
expected lifetime (years))

(aids.pscp - coxph(Surv((survtime + 0.9)/365.25, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83), levels = 
levels(Aidsp$state)),
T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = lp)
lines(0:82, 1/exp(zzcp$fit),col=4)





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-25 Thread David Winsemius


On Nov 25, 2010, at 10:08 AM, Ben Rhelp wrote:


Hi David,

Thank you for your reply. See below for more information.



From: David Winsemius




On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:

I manage to  achieve similar results with a Cox model as follows  
but I don't
really  understand why we have to take the inverse of the linear  
prediction

with

the Cox model


Different parameterization. You can find expanded answer(s)  in the  
archives

and in the documentation of  survreg.distributions.



I understand (i think) the difference in model structures between a  
Cox (coxph)

and Proportional hazard model (survreg).


I couldn't tell whether this means you decided that those citations  
answered your question. If not, then refer to Therneau's or Lumley's  
replies in rhelp to a similar question earlier this month.:


https://stat.ethz.ch/pipermail/r-help/2010-November/259796.html
https://stat.ethz.ch/pipermail/r-help/2010-November/259747.html







and why we do not need to divide by the  number of days in the year
anymore?


Here I'm guessing (since you  don't offer enough evidence to  
confirm) that the
difference is in the time  scales used in your Aidsp$survtime  
versus some other

example to which you are  comparing .


Both models are run from the same data, so I am not expecting any  
differences in

time scales.

To get similar results, I need to actually run the following  
equations:
expected_lifetime_in_years = exp(fit)/365.25---  
Linear

prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit) 
--- Linear

prediction of the Cox model
where fit come from the linear prediction of each models,  
respectively.


Actually, in the code below, I re-run the models and predictions  
based on a

yearly sampling time (instead of daily).
Again, to get similar results, I now need to actually run the  
following

equations:
expected_lifetime_in_years = exp(fit)
--- Linear

prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit) 
--- Linear

prediction of the Cox model

I think I understand the logic behind the results of the  
proportional hazard

model, but not for the prediction of the Cox model.


 Cox models are PH models.



Thank you for your help. I hope this is not a too stupid hole in my  
logic.


Here is the self contained R code to produce the charts:

library(MASS);
library(survival);

#Same data but parametric fit
make.aidsp - function(){
  cutoff - 10043 # July 1987 in julian days
  btime - pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
  atime - pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
  survtime - btime + 0.5*atime
  status - as.numeric(Aids2$status)
  data.frame(survtime, status = status - 1, state = Aids2$state,
  T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
}
Aidsp - make.aidsp()

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps - survreg(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = linear)
plot(0:82, exp(zz$fit)/365.25, type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Same example but with a Cox model instead
(aids.pscp - coxph(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = lp)
plot(0:82, 1/exp(zzcp$fit), type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Change the sampling time from daily to yearly
par(mfrow = c(1, 1));
(aids.ps - survreg(Surv((survtime + 0.9)/365.25, status) ~ state +  
T.categ +

pspline(age, df=6), data = Aidsp))
zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = linear)
plot(0:82, exp(zz$fit), type = l, ylim = c(0, 2), xlab = age,  
ylab =

expected lifetime (years))

(aids.pscp - coxph(Surv((survtime + 0.9)/365.25, status) ~ state +  
T.categ +

pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
   

Re: [R] Is there an equivalent to predict(..., type=linear) of a Proportional hazard model for a Cox model instead?

2010-11-25 Thread David Winsemius
I hit the send button on my second reply before I intended to. Since  
then I have noticed that the question I thought you were asking is not  
at all a good match to the Subject line of your message. There is a  
type =lp in predict.coxph and that is the linear predictor, although  
it is not a value that is on any transformation of time as would be  
the linear predictor of an accelerated failure time estimate. If you  
are looking for another method (in addition to predict.coxph) to  
produce expected survival from a coxph fit, you could also look at  
survexp. The ratetable argument accepts a fitted Cox model.


I have still not found a good answer to the question that I thought  
the body of your second posting was posing: namely why days and years  
are not handled the same in predict.survreg and predict.coxph.


--
David.
On Nov 25, 2010, at 12:35 PM, David Winsemius wrote:



On Nov 25, 2010, at 10:08 AM, Ben Rhelp wrote:


Hi David,

Thank you for your reply. See below for more information.



From: David Winsemius




On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:

I manage to  achieve similar results with a Cox model as follows  
but I don't
really  understand why we have to take the inverse of the linear  
prediction

with

the Cox model


Different parameterization. You can find expanded answer(s)  in  
the archives

and in the documentation of  survreg.distributions.



I understand (i think) the difference in model structures between a  
Cox (coxph)

and Proportional hazard model (survreg).


I couldn't tell whether this means you decided that those citations  
answered your question. If not, then refer to Therneau's or Lumley's  
replies in rhelp to a similar question earlier this month.:


https://stat.ethz.ch/pipermail/r-help/2010-November/259796.html
https://stat.ethz.ch/pipermail/r-help/2010-November/259747.html







and why we do not need to divide by the  number of days in the year
anymore?


Here I'm guessing (since you  don't offer enough evidence to  
confirm) that the
difference is in the time  scales used in your Aidsp$survtime  
versus some other

example to which you are  comparing .


Both models are run from the same data, so I am not expecting any  
differences in

time scales.

To get similar results, I need to actually run the following  
equations:
expected_lifetime_in_years = exp(fit)/365.25--- 
 Linear

prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit) 
--- Linear

prediction of the Cox model
where fit come from the linear prediction of each models,  
respectively.


Actually, in the code below, I re-run the models and predictions  
based on a

yearly sampling time (instead of daily).
Again, to get similar results, I now need to actually run the  
following

equations:
expected_lifetime_in_years = exp(fit)
--- Linear

prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit) 
--- Linear

prediction of the Cox model

I think I understand the logic behind the results of the  
proportional hazard

model, but not for the prediction of the Cox model.


Cox models are PH models.



Thank you for your help. I hope this is not a too stupid hole in my  
logic.


Here is the self contained R code to produce the charts:

library(MASS);
library(survival);

#Same data but parametric fit
make.aidsp - function(){
 cutoff - 10043 # July 1987 in julian days
 btime - pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
 atime - pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
 survtime - btime + 0.5*atime
 status - as.numeric(Aids2$status)
 data.frame(survtime, status = status - 1, state = Aids2$state,
 T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
}
Aidsp - make.aidsp()

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps - survreg(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zz - predict(aids.ps, data.frame(state = factor(rep(NSW, 83),  
levels =

levels(Aidsp$state)),
  T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = linear)
plot(0:82, exp(zz$fit)/365.25, type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize =  
0.015)



# Same example but with a Cox model instead
(aids.pscp - coxph(Surv(survtime + 0.9, status) ~ state + T.categ +
pspline(age, df=6), data = Aidsp))
zzcp - predict(aids.pscp, data.frame(state = factor(rep(NSW,  
83), levels =

levels(Aidsp$state)),
  T.categ = factor(rep(hs, 83), levels = levels(Aidsp$T.categ)),  
age =

0:82), se = T, type = lp)
plot(0:82, 1/exp(zzcp$fit), type = l, ylim = c(0, 2), xlab =  
age, ylab =

expected lifetime (years))
lines(0:82,