Re: [R] time dependency of Cox regression
Hi,

to fit a Cox model with time-dependent coefficients you have to restructure your data frame. For instance, you can use the counting-process formulation (start, stop, status). Some years ago I wrote a function (reCox(), below) to do the job. It seems to work when there are no ties; with ties it still runs, but the results do not match exactly. Let me know if you can fix this problem.

Note that in the restructured dataset, a (linear) time-varying effect for a variable X, say, may be included through the X:stop term in the model.

Hope this helps,
regards,
vito

library(survival)
#NOTE: run the reCox() definition given further down before this example

#example taken from ?cph (Design package, now rms)
n <- 100
set.seed(731)
age <- 50 + 12*rnorm(n)
sex <- factor(sample(c('Male','Female'), n, replace=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
ttime <- -log(runif(n))/h
status <- ifelse(ttime <= cens,1,0)
ttime <- pmin(ttime, cens)
d <- data.frame(ttime=ttime,status=status,age=age,sex=sex)

#restructure the original data frame
dd <- reCox(d)

#compare fitted models without ties (minor (negligible?) differences)
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)

#data with ties (major differences)
d$ttime[d$status==1] <- round(d$ttime[d$status==1],1)
dd <- reCox(d,k=1:2,epss=0.01) #here you have to specify epss>0 to allow coxph() to work
coxph(Surv(ttime,status) ~ age + sex,data=d)
coxph(Surv(start,stop,status) ~ age + sex,data=dd)

reCox <- function(data,k=1:2,epss=0){
#WORKS ONLY IF THERE ARE NO TIES :-(
#(Preliminary) function to reshape a data frame according to the counting-process formulation
# author: [EMAIL PROTECTED]
#data: the data frame to be transformed
#k: indices of the SurvTime and Status variables in data
#epss: if for any new record start==stop, then stop is incremented by epss
    #add a dummy column so that data[,-k] stays a data frame
    if(ncol(data)<=3) data[,"tmp"] <- rep(99,nrow(data))
    dati <- data[order(data[,k[1]]),] #order(unique(surv.time))#???
    status <- dati[,k[2]]
    b <- dati[,-k]
    dati[,"start"] <- rep(0,nrow(dati))
    names(dati)[k[1]] <- "stop"
    n <- nrow(dati)
    #the i-th subject (in time order) gets i records: n*(n+1)/2 rows in total
    a <- matrix(-99,(n*(n-1)/2+n),3)
    a[1,] <- c(1,as.numeric(as.matrix(dati)[1,c("start","stop")]))
    colnames(a) <- c('id.new','start','stop')
    a[,"id.new"] <- rep(1:n,1:n)
    for(i in 2:nrow(dati)){
        #intervals of subject i-1, plus one more up to subject i's own time
        a[a[,1]==i,-1] <- rbind(a[a[,1]==(i-1),2:3],c(dati[(i-1),"stop"],dati[i,"stop"]))
        }
    a <- cbind(a,status=rep(0,nrow(a)))
    #the status is assigned to the last record of each subject
    a[cumsum(1:n),"status"] <- status
    bb <- sapply(b,function(x)rep(x,1:n)) #factors are converted to numbers
    #bb <- lapply(b,function(x)rep(x,1:n))
    #bb <- apply(b,2,function(x)rep(x,1:n)) NO!!!
    A <- data.frame(cbind(a,bb),row.names=NULL)
    #if(!missing(epss)) A[,"stop"] <- A[,"stop"]+ifelse(A[,"stop"]==A[,"start"],epss,0)
    if(epss>0) A[,"stop"] <- A[,"stop"]+ifelse(A[,"stop"]==A[,"start"],epss,0)
    A$tmp <- NULL
    return(A)
    }

----- Original Message -----
From: array chip [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, November 03, 2004 12:41 AM
Subject: [R] time dependency of Cox regression

Hi, How can I specify a Cox proportional hazards model with a covariate whose effect on survival I believe changes (diminishes) with time? The value of the covariate was recorded only once, at the beginning of the study, for each individual (e.g. at the diagnosis of the disease), so I do not have time-course data of the covariate for any given individual.

For example, I want to state at the end of the analysis that the hazard ratio of the covariate is 6 at the beginning, decreases to 3 after 2 years and decreases to 1.5 after 5 years.

Is this a so-called time-dependent covariate? I guess not, because it's really about the influence of the covariate (which was measured once at the beginning) on survival changing over time.

Thanks for any input.

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
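The counting-process restructuring described in this message can also be sketched with survSplit() from the survival package, cutting every subject's follow-up at each observed event time. This is only an illustration under assumptions, not the reCox() code: the simulated data repeat the example above, and the names cuts, t_stop, fit0, fit1 and fit2 are made up here. The copy t_stop exists only so that the stop time can enter the right-hand side without reusing a variable from the Surv() response.

```r
library(survival)

# Simulated data, following the example in the message above
set.seed(731)
n      <- 100
age    <- 50 + 12 * rnorm(n)
sex    <- factor(sample(c("Male", "Female"), n, replace = TRUE, prob = c(.6, .4)))
cens   <- 15 * runif(n)
h      <- .02 * exp(.04 * (age - 50) + .8 * (sex == "Female"))
ttime  <- -log(runif(n)) / h
status <- ifelse(ttime <= cens, 1, 0)
ttime  <- pmin(ttime, cens)
d <- data.frame(ttime = ttime, status = status, age = age, sex = sex)

# Cut each subject's follow-up at every distinct event time
cuts <- sort(unique(d$ttime[d$status == 1]))
dd   <- survSplit(Surv(ttime, status) ~ ., data = d, cut = cuts)
# dd has one row per (subject, interval): tstart = start, ttime = stop

# The same model on the original and on the split data
fit0 <- coxph(Surv(ttime, status) ~ age + sex, data = d)
fit1 <- coxph(Surv(tstart, ttime, status) ~ age + sex, data = dd)
print(rbind(original = coef(fit0), split = coef(fit1)))

# Linear time-varying effect of age, in the spirit of the X:stop term:
# log hazard ratio for age at time t = b[age] + b[age:t_stop] * t
dd$t_stop <- dd$ttime
fit2 <- coxph(Surv(tstart, ttime, status) ~ age + sex + age:t_stop, data = dd)
print(coef(fit2))
```

Splitting only cuts follow-up into pieces and does not change who is at risk at each event time, so the first two fits should agree up to numerical tolerance.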
Re: [R] time dependency of Cox regression
array chip wrote:

> Hi, How can I specify a Cox proportional hazards model with a covariate whose effect on survival I believe changes (diminishes) with time? The value of the covariate was recorded only once, at the beginning of the study, for each individual (e.g. at the diagnosis of the disease), so I do not have time-course data of the covariate for any given individual. For example, I want to state at the end of the analysis that the hazard ratio of the covariate is 6 at the beginning, decreases to 3 after 2 years and decreases to 1.5 after 5 years.

If you fit a Cox model with the fixed covariate, plot(cox.zph(model)) will show you an estimate of how the log hazard ratio changes over time, with pointwise confidence intervals.

If you want more precise estimates and confidence intervals you can split up your covariate into a set of time-dependent covariates. If you wanted a time period for each year up to 6 years you would make 6 time-dependent covariates, looking like

x 0 0 0 0 0
0 x 0 0 0 0
0 0 x 0 0 0
0 0 0 x 0 0
0 0 0 0 x 0
0 0 0 0 0 x

and have (up to) six records per person. The survSplit() function in the survival package will do the splitting; you then need to set the appropriate terms to zero and fit the model.

-thomas
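The two steps described in this reply might look roughly as follows. This is a sketch under assumptions, not the poster's analysis: it uses the veteran data set that ships with the survival package, with karno as the baseline covariate, and three periods cut at 60 and 120 days instead of six yearly ones. The names cuts, period and karno1 to karno3 are invented for the illustration.

```r
library(survival)

# Step 1: fixed-covariate fit, then look at how the log hazard ratio drifts
fit <- coxph(Surv(time, status) ~ karno, data = veteran)
zph <- cox.zph(fit)
print(zph)
if (interactive()) plot(zph)  # estimate of beta(t) with pointwise confidence bands

# Step 2: split each person's follow-up at the chosen cut points
cuts <- c(60, 120)
vet2 <- survSplit(Surv(time, status) ~ ., data = veteran,
                  cut = cuts, episode = "period")

# Step 3: one covariate per period, equal to karno inside its own period
# and zero elsewhere (the diagonal pattern shown above)
vet2$karno1 <- vet2$karno * (vet2$period == 1)
vet2$karno2 <- vet2$karno * (vet2$period == 2)
vet2$karno3 <- vet2$karno * (vet2$period == 3)

fit2 <- coxph(Surv(tstart, time, status) ~ karno1 + karno2 + karno3, data = vet2)
print(exp(coef(fit2)))  # one hazard ratio per period
```

Each coefficient of fit2 is the log hazard ratio for karno within one period, so the three exponentiated values can be read as a step-function version of the curve drawn by plot(cox.zph()).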
Re: [R] time dependency of Cox regression
Thanks very much for the suggestion. I still have some questions.

In your example of splitting the covariate into 6 time-dependent covariates (6 records per person), will the survival time and censoring status be the same for each of the 6 records? If that's the case, how does the model know that each of the 6 time-dependent covariates corresponds to 6 consecutive time points?

I am thinking about creating a dummy factor variable called time to indicate which time interval each patient's survival time is in. For example, if a patient's survival time is less than 2 years, then the dummy variable is 2, and so on for each patient. Then I specify a covariate x time interaction term in the Cox regression. I would assume the Cox regression will return a separate hazard ratio for each level of the dummy factor variable, which corresponds to the hazard ratio of each year. Is this a reasonable way to do it?

Thanks
Re: [R] time dependency of Cox regression
On Wed, 3 Nov 2004, array chip wrote:

> Thanks very much for the suggestion. I still have some questions. In your example of splitting the covariate into 6 time-dependent covariates (6 records per person), will the survival time and censoring status be the same for each of the 6 records? If that's the case, how does the model know that each of the 6 time-dependent covariates corresponds to 6 consecutive time points?

No, it won't. This is what survSplit handles. If someone dies in the middle of year 4 they will have records

start  stop  event
0      1     0
1      2     0
2      3     0
3      4     0
4      4.5   1

and no record for subsequent years.

> I am thinking about creating a dummy factor variable called time to indicate which time interval each patient's survival time is in. For example, if a patient's survival time is less than 2 years, then the dummy variable is 2, and so on for each patient. Then I specify a covariate x time interaction term in the Cox regression. I would assume the Cox regression will return a separate hazard ratio for each level of the dummy factor variable, which corresponds to the hazard ratio of each year. Is this a reasonable way to do it?

No. This doesn't work unless you also split the records. The problem is that the person is labelled as dying in year 2 at every time point, but you only want to change the risk during year 2.

It is quite possible in principle to allow for arbitrary functional time dependence in a Cox model, but the R implementation doesn't. I have an implementation that does, but it's in Xlispstat.

-thomas
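The record layout in this reply can be reproduced for a single made-up subject, as a minimal sketch: the data frame one, its covariate x and the yearly cut points are assumptions for illustration. The second half goes beyond the 2004 thread: later releases of the survival package added a tt argument to coxph() for a user-specified functional form of time dependence, and the particular form x * log(t + 20) on the veteran data is borrowed from that package's own examples, not from the posters.

```r
library(survival)

# One hypothetical subject who dies at 4.5 years, with baseline covariate x
one <- data.frame(id = 1, time = 4.5, event = 1, x = 2.3)

# Cut follow-up at the end of each year; years after death produce no record
recs <- survSplit(Surv(time, event) ~ ., data = one, cut = 1:6)
print(recs[, c("tstart", "time", "event")])

# Functional time dependence via tt(): the log hazard ratio for karno
# at time t is modelled as b1 + b2 * log(t + 20)
fit_tt <- coxph(Surv(time, status) ~ karno + tt(karno), data = veteran,
                tt = function(x, t, ...) x * log(t + 20))
print(coef(fit_tt))
```

With tt() the data do not have to be split by hand, but the functional form has to be chosen in advance, whereas the per-period covariates leave the shape free within each interval.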
[R] time dependency of Cox regression
Hi,

How can I specify a Cox proportional hazards model with a covariate whose effect on survival I believe changes (diminishes) with time? The value of the covariate was recorded only once, at the beginning of the study, for each individual (e.g. at the diagnosis of the disease), so I do not have time-course data of the covariate for any given individual.

For example, I want to state at the end of the analysis that the hazard ratio of the covariate is 6 at the beginning, decreases to 3 after 2 years and decreases to 1.5 after 5 years.

Is this a so-called time-dependent covariate? I guess not, because it's really about the influence of the covariate (which was measured once at the beginning) on survival changing over time.

Thanks for any input.
Re: [R] time dependency of Cox regression
array chip wrote:

> Hi, How can I specify a Cox proportional hazards model with a covariate whose effect on survival I believe changes (diminishes) with time? The value of the covariate was recorded only once, at the beginning of the study, for each individual (e.g. at the diagnosis of the disease), so I do not have time-course data of the covariate for any given individual. For example, I want to state at the end of the analysis that the hazard ratio of the covariate is 6 at the beginning, decreases to 3 after 2 years and decreases to 1.5 after 5 years. Is this a so-called time-dependent covariate? I guess not, because it's really about the influence of the covariate (which was measured once at the beginning) on survival changing over time. Thanks for any input.

You might try a log-normal or log-logistic accelerated failure time model. These models dictate decreasing hazard ratios over time for baseline covariates.

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
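A log-normal accelerated failure time model of the kind suggested here can be fitted with survreg() from the survival package. The sketch below is an illustration under assumptions: it uses the veteran data set from that package with karno as the baseline covariate, and the helper haz(), the comparison of karno 40 against 80, and the chosen time points are all made up for the example. The hazard is computed from the log-normal density and survivor function using the fitted intercept, slope and scale.

```r
library(survival)

# Log-normal AFT model: log(T) = intercept + b * karno + scale * error
aft <- survreg(Surv(time, status) ~ karno, data = veteran, dist = "lognormal")
print(coef(aft))

# Hazard implied by the fitted model at time t for a given karno value
haz <- function(t, karno) {
  mu <- coef(aft)[["(Intercept)"]] + coef(aft)[["karno"]] * karno
  z  <- (log(t) - mu) / aft$scale
  dnorm(z) / (aft$scale * t * pnorm(z, lower.tail = FALSE))
}

# Hazard ratio of karno = 40 versus karno = 80 at a few follow-up times (days)
times <- c(30, 180, 365)
hr    <- haz(times, 40) / haz(times, 80)
print(data.frame(time = times, hazard_ratio = hr))
```

Under this model the hazard ratio between two covariate values moves toward 1 as time goes on, so a diminishing effect is built into the model rather than estimated freely as in the split-record Cox approach.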