Re: [R] How to specify "newdata" in a Cox-Modell with a time dependent interaction term?

Terry Therneau Tue, 26 Jun 2012 12:10:58 -0700

I'm finally back from vacation and looking at your email.

1. The primary mistake is in your call, where you say
     fit <- survfit(mod.allison.5, newdata.1, id="Id")


This will use the character string "Id" as the value of the identifier, 
not the data.  The effect is exactly the same as the difference between 
print(x) and print('x').

2. In reply to John's comment that "all the id values are the same".  It 
is correct.  Normally the survfit routine is used to produce multiple 
curves, one curve per line of the input data, for time-independent 
variables.   The presence of an id argument is used to tell it that 
there are multiple lines per subject in the data, e.g. time-dependent 
covariates.  So even though there is only one curve being produced we 
need an id statement to trigger the behavior.
   If you only want one curve for one individual, then "individual=TRUE" 
is an alternate, as John pointed out.

3. "It's very important to specify the Surv object and the formula 
directly in the coxph function ..."
Yes, I agree.  I always use your suggested form because it gives better 
documentation -- variable names are directly visible in the coxph call.  
I don't understand the attraction of the other form, but lot's of people 
use it.
    Why did it go wrong?  Because the survfit function was evaluating
         Surv(Rossi.2$start, Rossi.2$stop, Rossi.2$arrest.time) ~ fin + 
age + age:stop + pro, data=newdata.1

The length of the variables will be different.  The error message comes 
from the R internals, not my program.

Terry Therneau


On 06/16/2012 08:04 AM, Jürgen Biedermann wrote:
>
> Dear Mr. Therneau, Mr. Fox, or to whoever, who has some time...
>
> I don't find a solution to use the "survfit" function (package:
> survival)  for a defined pattern of covariates with a Cox-Model
> including a time dependent interaction term. Somehow the definition of
> my "newdata" argument seems to be erroneous.
> I already googled the problem, found many persons having the same or a
> similar problem, but still no solution.
> I want to stress that my time-dependent covariate does not depend on the
> failure of an individual (in this case it doesn't seem sensible to
> predict a survivor function for an individual). Rather one of my effects
> declines with time (time-dependent coefficient).
>
> For illustration, I use the example of John Fox's paper "Cox
> Proportional - Hazards Regression for Survival Data".
> http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
>
> Do you know any help? See code below
>
> Thanks very much in advance
> Jürgen Biedermann
>
> #----------------------------------------
> #Code
>
> Rossi <-
> read.table("http://cran.r-project.org/doc/contrib/Fox-Companion/Rossi.txt";,
> header=T)
>
> Rossi.2 <- fold(Rossi, time='week',
>      event='arrest', cov=11:62, cov.names='employed')
>
> # see below for the fold function from John Fox
>
> # modeling an interaction with time (Page 14)
>
> mod.allison.5 <- coxph(Surv(start, stop, arrest.time) ~
>      fin + age + age:stop + prio,
>      data=Rossi.2)
> mod.allison.5
>
> # Attempt to get the survivor function of a person with age=30, fin=0
> and prio=5
>
> newdata.1 <-
> data.frame(unique(Rossi.2[c("start","stop")]),fin=0,age=30,prio=5,Id=1,arrest.time=0)
> fit <- survfit(mod.allison.5,newdata.1,id="Id")
>
> Error message:
>
> >Fehler in model.frame.default(data = newdata.1, id = "Id", formula =
> Surv(start,  :
>    Variablenlängen sind unterschiedlich (gefunden für '(id)')
>
> --> failure, length of variables are different.
>
> #-----------------------------------------------------------------
> fold <- function(data, time, event, cov,
>      cov.names=paste('covariate', '.', 1:ncovs, sep=""),
>      suffix='.time', cov.times=0:ncov, common.times=TRUE, lag=0){
>      vlag <- function(x, lag) c(rep(NA, lag), x[1:(length(x)-lag)])
>      xlag <- function(x, lag) apply(as.matrix(x), 2, vlag, lag=lag)
>      all.cov <- unlist(cov)
>      if (!is.list(cov)) cov <- list(cov)
>      ncovs <- length(cov)
>      nrow <- nrow(data)
>      ncol <- ncol(data)
>      ncov <- length(cov[[1]])
>      nobs <- nrow*ncov
>      if (length(unique(c(sapply(cov, length), length(cov.times)-1))) > 1)
>          stop(paste(
>              "all elements of cov must be of the same length and \n",
>              "cov.times must have one more entry than each element of
> cov."))
>      var.names <- names(data)
>      subjects <- rownames(data)
>      omit.cols <- if (!common.times) c(all.cov, cov.times) else all.cov
>      keep.cols <- (1:ncol)[-omit.cols]
>      nkeep <- length(keep.cols)
>      if (is.numeric(event)) event <- var.names[event]
>      times <- if (common.times) matrix(cov.times, nrow, ncov+1, byrow=T)
>          else data[, cov.times]
>      new.data <- matrix(Inf, nobs, 3 + ncovs + nkeep)
>      rownames <- rep("", nobs)
>      colnames(new.data) <- c('start', 'stop', paste(event, suffix, 
> sep=""),
>          var.names[-omit.cols], cov.names)
>      end.row <- 0
>      for (i in 1:nrow){
>          start.row <- end.row + 1
>          end.row <- end.row + ncov
>          start <- times[i, 1:ncov]
>          stop <- times[i, 2:(ncov+1)]
>          event.time <- ifelse (stop == data[i, time] & data[i, event] ==
> 1, 1, 0)
>          keep <- matrix(unlist(data[i, -omit.cols]), ncov, nkeep, byrow=T)
>          select <- apply(matrix(!is.na(data[i, all.cov]), ncol=ncovs),
> 1, all)
>          rows <- start.row:end.row
>          cov.mat <- xlag(matrix(unlist(data[i, all.cov]),
> nrow=length(rows)), lag)
>          new.data[rows[select], ] <-
>              cbind(start, stop, event.time, keep, cov.mat)[select,]
>          rownames[rows] <- paste(subjects[i], '.', seq(along=rows), 
> sep="")
>          }
>      row.names(new.data) <- rownames
>      as.data.frame(new.data[new.data[, 1] != Inf &
>          apply(as.matrix(!is.na(new.data[, cov.names])), 1, all), ])
>      }
> #-----------------------------------------------------------------
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to specify "newdata" in a Cox-Modell with a time dependent interaction term?

Reply via email to