I'm finally back from vacation and looking at your email.

1. The primary mistake is in your call, where you say
      fit <- survfit(mod.allison.5, newdata.1, id="Id")
This will use the character string "Id" as the value of the identifier, not the data. The effect is exactly the same as the difference between print(x) and print('x'). Write id=Id (unquoted) so that the Id column of newdata.1 is used.

2. In reply to John's comment that "all the id values are the same": that is correct. Normally the survfit routine is used to produce multiple curves, one curve per line of the input data, for time-independent variables. The presence of an id argument tells it that there are multiple lines per subject in the data, e.g. time-dependent covariates. So even though only one curve is being produced, we need an id statement to trigger that behavior. If you only want one curve for one individual, then "individual=TRUE" is an alternative, as John pointed out.

3. "It's very important to specify the Surv object and the formula directly in the coxph function ..."
Yes, I agree. I always use your suggested form because it gives better documentation -- variable names are directly visible in the coxph call. I don't understand the attraction of the other form, but lots of people use it.
Why did it go wrong? Because the survfit function was evaluating
      Surv(Rossi.2$start, Rossi.2$stop, Rossi.2$arrest.time) ~ fin + age + age:stop + prio, data=newdata.1
The Rossi.2$ variables are taken from the full data set while the others are taken from newdata.1, so the lengths of the variables will be different. The error message comes from the R internals, not my program.

Terry Therneau

On 06/16/2012 08:04 AM, Jürgen Biedermann wrote:
>
> Dear Mr. Therneau, Mr. Fox, or to whoever, who has some time...
>
> I don't find a solution to use the "survfit" function (package:
> survival) for a defined pattern of covariates with a Cox-Model
> including a time dependent interaction term. Somehow the definition of
> my "newdata" argument seems to be erroneous.
> I already googled the problem, found many persons having the same or a
> similar problem, but still no solution.
> I want to stress that my time-dependent covariate does not depend on the
> failure of an individual (in this case it doesn't seem sensible to
> predict a survivor function for an individual). Rather one of my effects
> declines with time (time-dependent coefficient).
>
> For illustration, I use the example of John Fox's paper "Cox
> Proportional - Hazards Regression for Survival Data".
> http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
>
> Do you know any help? See code below
>
> Thanks very much in advance
> Jürgen Biedermann
>
> #----------------------------------------
> #Code
>
> Rossi <-
>   read.table("http://cran.r-project.org/doc/contrib/Fox-Companion/Rossi.txt",
>              header=T)
>
> Rossi.2 <- fold(Rossi, time='week',
>                 event='arrest', cov=11:62, cov.names='employed')
>
> # see below for the fold function from John Fox
>
> # modeling an interaction with time (Page 14)
>
> mod.allison.5 <- coxph(Surv(start, stop, arrest.time) ~
>                        fin + age + age:stop + prio,
>                        data=Rossi.2)
> mod.allison.5
>
> # Attempt to get the survivor function of a person with age=30, fin=0
> # and prio=5
>
> newdata.1 <-
>   data.frame(unique(Rossi.2[c("start","stop")]),fin=0,age=30,prio=5,Id=1,arrest.time=0)
> fit <- survfit(mod.allison.5,newdata.1,id="Id")
>
> Error message:
>
> > Error in model.frame.default(data = newdata.1, id = "Id", formula =
> > Surv(start, :
> >   variable lengths differ (found for '(id)')
>
> --> failure: the lengths of the variables are different.
>
> #-----------------------------------------------------------------
> fold <- function(data, time, event, cov,
>     cov.names=paste('covariate', '.', 1:ncovs, sep=""),
>     suffix='.time', cov.times=0:ncov, common.times=TRUE, lag=0){
>     vlag <- function(x, lag) c(rep(NA, lag), x[1:(length(x)-lag)])
>     xlag <- function(x, lag) apply(as.matrix(x), 2, vlag, lag=lag)
>     all.cov <- unlist(cov)
>     if (!is.list(cov)) cov <- list(cov)
>     ncovs <- length(cov)
>     nrow <- nrow(data)
>     ncol <- ncol(data)
>     ncov <- length(cov[[1]])
>     nobs <- nrow*ncov
>     if (length(unique(c(sapply(cov, length), length(cov.times)-1))) > 1)
>         stop(paste(
>             "all elements of cov must be of the same length and \n",
>             "cov.times must have one more entry than each element of cov."))
>     var.names <- names(data)
>     subjects <- rownames(data)
>     omit.cols <- if (!common.times) c(all.cov, cov.times) else all.cov
>     keep.cols <- (1:ncol)[-omit.cols]
>     nkeep <- length(keep.cols)
>     if (is.numeric(event)) event <- var.names[event]
>     times <- if (common.times) matrix(cov.times, nrow, ncov+1, byrow=T)
>         else data[, cov.times]
>     new.data <- matrix(Inf, nobs, 3 + ncovs + nkeep)
>     rownames <- rep("", nobs)
>     colnames(new.data) <- c('start', 'stop', paste(event, suffix, sep=""),
>         var.names[-omit.cols], cov.names)
>     end.row <- 0
>     for (i in 1:nrow){
>         start.row <- end.row + 1
>         end.row <- end.row + ncov
>         start <- times[i, 1:ncov]
>         stop <- times[i, 2:(ncov+1)]
>         event.time <- ifelse (stop == data[i, time] & data[i, event] == 1, 1, 0)
>         keep <- matrix(unlist(data[i, -omit.cols]), ncov, nkeep, byrow=T)
>         select <- apply(matrix(!is.na(data[i, all.cov]), ncol=ncovs), 1, all)
>         rows <- start.row:end.row
>         cov.mat <- xlag(matrix(unlist(data[i, all.cov]), nrow=length(rows)), lag)
>         new.data[rows[select], ] <-
>             cbind(start, stop, event.time, keep, cov.mat)[select,]
>         rownames[rows] <- paste(subjects[i], '.', seq(along=rows), sep="")
>     }
>     row.names(new.data) <- rownames
>     as.data.frame(new.data[new.data[, 1] != Inf &
>         apply(as.matrix(!is.na(new.data[, cov.names])), 1, all), ])
> }
> #-----------------------------------------------------------------
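
P.S. A minimal base-R sketch of points 1 and 3 in the reply at the top, for anyone who wants to see the mechanism in isolation. It does not call survfit or coxph at all; it calls model.frame directly, which is where the error in the original post is raised. The data frames d and newd and their columns are made up purely for illustration, so treat this as a sketch of the idea rather than of what survfit does internally.

```r
# Toy data, made up for illustration (not the Rossi data).
d <- data.frame(y = c(1, 2, 3, 4), x = c(0, 1, 0, 1), Id = c(1, 1, 2, 2))

# Point 1: an unquoted name is looked up as a column of `data` ...
mf <- model.frame(y ~ x, data = d, id = Id)
mf[["(id)"]]                      # the four values of d$Id

# ... whereas a quoted name is just a length-1 character vector,
# so its length does not match the 4 rows of the other variables.
bad.id <- try(model.frame(y ~ x, data = d, id = "Id"), silent = TRUE)
inherits(bad.id, "try-error")     # TRUE

# Point 3: a term written as d$y ignores `data` and is always taken
# from d, so its length clashes with a shorter new data set.
newd <- data.frame(x = c(0, 1))
bad.dollar <- try(model.frame(d$y ~ x, data = newd), silent = TRUE)
inherits(bad.dollar, "try-error") # TRUE

# With plain variable names, the right-hand side is found in newd.
ok <- model.frame(delete.response(terms(y ~ x)), data = newd)
nrow(ok)                          # 2
```

In an English locale both failing calls report "variable lengths differ", the same message as in the original post (there shown in German).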