On Nov 11, 2010, at 12:14 PM, Michael Haenlein wrote:
Thanks for the comment, James!
The problem is that my initial sample (Dataset 1) is truncated. That means
I only observe "time to death" for those individuals who actually died
before the end of my observation period. It is my understanding that this
type of truncation creates a bias when I use a "normal" regression
analysis. Hence my idea to use some form of survival model.
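The bias described here can be illustrated with a small simulation. This is only a sketch on made-up data, not Dataset 1: the sample size, the true mean of 300 days and the 300-day observation window are assumptions, and it assumes that people still alive at the end of the window are kept in the data as censored rows.

```r
library(survival)

set.seed(42)
n         <- 5000
true_mean <- 300   # assumed true mean time to death, in days
window    <- 300   # assumed length of the observation period, in days

t_death <- rexp(n, rate = 1 / true_mean)  # latent time to death
died    <- t_death <= window              # TRUE if death was seen in the window
obs     <- pmin(t_death, window)          # follow-up time actually recorded

# Naive estimate: average only over individuals whose death was observed
naive_mean <- mean(obs[died])

# Censoring-aware estimate: exponential survreg fitted to everyone
fit      <- survreg(Surv(obs, died) ~ 1, dist = "exponential")
aft_mean <- unname(exp(coef(fit)))        # mean of the fitted exponential

print(c(naive = naive_mean, survreg = aft_mean, truth = true_mean))
```

Averaging only the observed deaths has to come out below the window length, while the censoring-aware fit should land near the assumed true mean.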
I had another look at predict.survreg and I think the option "response"
could work for me.
When I run the following code I get ptime = 290.3648.
I assume this means that an individual with ph.ecog=2 can be expected to
live another 290.3648 days before death occurs (days is the time scale of
the time variable).
It is a prediction under specific assumptions underpinning a
parametric estimate.
Could someone confirm whether this makes sense?
You ought to confirm that it "makes sense" by comparing to your data:
require(Hmisc); require(survival)
<your code>
> describe(lung[lung$status==1&lung$ph.ecog==2,"time"])
lung[lung$status == 1 & lung$ph.ecog == 2, "time"]
      n missing  unique    Mean
      6       0       6   293.7

           92 105 211 292 511 551
Frequency   1   1   1   1   1   1
%          17  17  17  17  17  17
> ?lung
So status==1 marks a censored case, and the observed death times are the
rows with status==2:
> describe(lung[lung$status==2&lung$ph.ecog==2,"time"])
lung[lung$status == 2 & lung$ph.ecog == 2, "time"]
      n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
     44       1      44   226.0   14.95   36.90   94.50  178.50  295.75  500.00  635.85

lowest :  11  12  13  26  30, highest: 524 533 654 707 814
And the mean time to death (in a group that had only 6 censored
individuals, at times from 92 to 551) was 226, and the median time to
death among the 44 individuals who died was about 178, with a right-skewed
distribution. You need to decide whether you want to make that particular
prediction when you know that you forced a specific distributional form on
the regression machinery by accepting the default.
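One way to see how much the distributional choice matters is to refit under several `dist` values and set the predicted medians beside a Kaplan-Meier median for the same group. This is a rough sketch, not part of the original exchange; the particular list of distributions is an arbitrary choice for illustration.

```r
library(survival)

# Nonparametric reference: Kaplan-Meier median for the ph.ecog == 2 group
grp       <- subset(lung, ph.ecog == 2)
km        <- survfit(Surv(time, status) ~ 1, data = grp)
km_median <- unname(summary(km)$table["median"])

# Predicted median for ph.ecog = 2 under different parametric forms
dists <- c("weibull", "lognormal", "loglogistic", "exponential")
pred_median <- sapply(dists, function(d) {
  fit <- survreg(Surv(time, status) ~ ph.ecog, data = lung, dist = d)
  as.numeric(predict(fit, newdata = data.frame(ph.ecog = 2),
                     type = "quantile", p = 0.5))
})

print(km_median)
print(round(pred_median, 1))
```

If the parametric medians differ a lot from each other or from the Kaplan-Meier value, that suggests the prediction is being driven by the assumed distribution rather than by the data.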
lfit <- survreg(Surv(time, status) ~ ph.ecog, data=lung)
ptime <- predict(lfit, newdata=data.frame(ph.ecog=2), type='response')
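For the default Weibull fit, type='response' returns exp(linear predictor), which is the Weibull scale parameter on the time axis, not the mean or the median. The sketch below computes all three for comparison. The mean and median formulas are the standard ones for survreg's Weibull parameterisation, and the variable names `nd`, `med` and `mean_time` are introduced here for illustration.

```r
library(survival)

lfit <- survreg(Surv(time, status) ~ ph.ecog, data = lung)  # dist = "weibull" by default
nd   <- data.frame(ph.ecog = 2)

# What type = "response" returns: exp(linear predictor)
ptime <- as.numeric(predict(lfit, newdata = nd, type = "response"))
lp    <- as.numeric(predict(lfit, newdata = nd, type = "lp"))

# Predicted median survival time (50th percentile)
med <- as.numeric(predict(lfit, newdata = nd, type = "quantile", p = 0.5))

# Mean of the fitted Weibull: exp(lp) * gamma(1 + scale)
mean_time <- exp(lp) * gamma(1 + lfit$scale)

print(c(response = ptime, median = med, mean = mean_time))
```

The value reported in the thread (290.3648) is the `response` entry. The median is smaller than it, because it equals the response times log(2)^scale.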
On Thu, Nov 11, 2010 at 5:26 PM, James C. Whanger <james.whan...@gmail.com> wrote:
Michael,
You are looking to compute an estimated time to death -- rather than the
odds of death conditional upon time. Thus, you will want to use "time to
death" as your dependent variable rather than a dichotomous outcome
(0=alive, 1=death). You can accomplish this with a straightforward
regression analysis.
Best,
Jim
On Thu, Nov 11, 2010 at 3:44 AM, Michael Haenlein <haenl...@escpeurope.eu> wrote:
Dear all,
I'm struggling with predicting "expected time until death" for a coxph and
survreg model.
I have two datasets. Dataset 1 includes a certain number of people for
whom I know a vector of covariates (age, gender, etc.) and their event
times (i.e., I know whether they have died and when, if death occurred
prior to the end of the observation period). Dataset 2 includes another
set of people for whom I only have the covariate vector. I would like to
use Dataset 1 to calibrate either a coxph or survreg model and then use
this model to determine an "expected time until death" for the individuals
in Dataset 2.
For example, I would like to know when a person in Dataset 2 will die,
given his/her age and gender.
I checked predict.coxph and predict.survreg as well as the document "A
Package for Survival Analysis in S" written by Terry M. Therneau, but I
have to admit that I'm a bit lost here.
Could anyone give me some advice on how this could be done?
Thanks very much in advance,
Michael
Michael Haenlein
Professor of Marketing
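For the coxph half of the question, a hedged sketch: a Cox model leaves the baseline hazard unspecified, so an "expected time" is usually read off the predicted survival curve, as a median or as a mean restricted to some horizon. The example uses the lung data as a stand-in for Dataset 1 and a one-row data frame as a stand-in for Dataset 2. The 365-day horizon `tau` is an arbitrary assumption.

```r
library(survival)

cfit <- coxph(Surv(time, status) ~ ph.ecog, data = lung)

# Predicted survival curve for a new individual with ph.ecog = 2
sf <- survfit(cfit, newdata = data.frame(ph.ecog = 2))

# Predicted median survival time, read off the curve
med <- as.numeric(quantile(sf, probs = 0.5, conf.int = FALSE))

# Restricted mean survival time up to tau: area under the step curve on [0, tau]
tau  <- 365                               # assumed horizon, in days
keep <- sf$time <= tau
tt   <- c(0, sf$time[keep], tau)          # interval boundaries
ss   <- c(1, as.numeric(sf$surv)[keep])   # curve height on each interval
rmst <- sum(diff(tt) * ss)

print(c(median = med, restricted_mean = rmst))
```

An unrestricted mean is generally not available from this curve unless it drops to zero within the follow-up period, which is why the sketch reports the restricted version.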
David Winsemius, MD
West Hartford, CT