Re: [Rd] new behavior in model.response -- Solved
Thanks to multiple readers for comments and patience as I sorted this out. I now have working length and names methods for Surv objects, which do not seem to break anything. I just ran the test suites for 471 packages that depend on survival, but I don't test against Bioconductor so cannot speak to that corpus.

Essentially, if I wanted to think of

    Surv(1:6, rep(1:0, 3))  =  1 2+ 3 4+ 5 6+

as a vector of length 6, although stored in a longer representation, I needed to

 a. add a length method
 b. add names and names<- methods
 c. But, the names method cannot create a length=6 names attribute. Handling of a names attribute is baked into the low-level code, and that code treats the number of elements as the length, no matter what you say.
 d. The solution was to make names.Surv store the results in the rownames.

Terry T.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
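Steps a to d above can be sketched on the toy Durv class from the original post in this thread. This is only an illustration under stated assumptions: the three methods below are hypothetical stand-ins written for Durv, not the code that ships in survival.

```r
# Toy class from the original post: a two-column matrix treated as one vector.
Durv <- function(...) {
    temp <- cbind(...)
    class(temp) <- "Durv"
    temp
}

# (a) Report the number of observations, not the number of storage slots.
length.Durv <- function(x) nrow(x)

# (b) Read the names back from the row names of the underlying matrix.
names.Durv <- function(x) rownames(x)

# (d) Store the names as row names, so the low-level names attribute
#     (which would want one name per storage slot) is never touched.
`names<-.Durv` <- function(x, value) {
    rownames(x) <- value
    x
}

d <- Durv(time = 1:6, status = rep(1:0, 3))
names(d) <- letters[1:6]

length(d)           # observations
length(unclass(d))  # storage slots
names(d)
```

The design choice is that dimnames are validated against dim() rather than length(), so row names of length 6 are accepted even though the object occupies 12 slots.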
Re: [Rd] new behavior in model.response
> On Jun 27, 2018, at 3:58 PM, Achim Zeileis wrote:
>
> On Thu, 28 Jun 2018, Therneau, Terry M., Ph.D. via R-devel wrote:
>
>> I now understand the issue, which leads to a different and deeper issue
>> which is "how to assign a proper length to Surv objects".
>>
>> > Surv(c(1,2,3), c(1,0,1))
>> [1] 1 2+ 3
>>
>> The above prints as 3 elements and is conceptually 3 elements. But if I give
>> it length method to return a 3 then I need a names method, and names<- pays
>> no attention to my defined length. How do we conceive of and manage a vector
>> whose elements happen to require more than one storage slot for their
>> representation? An obvious example is the complex type, but it seems that
>> had to be baked right down into the core.
>
> I think you just have to implement all methods required to make it look like
> a vector even if it is internally a matrix. Thus, you need methods for length
> and for names and names<-. Internally, the names can be stored as row names.

I think a closer look at model.response() would help.

IIUC, the reasoning therein is that length(data[[1L]]) (aka `length(v)') is compared to length(attr(data, "row.names")) (aka `nrows') to decide whether names are sensibly assigned to `v'. I think for Surv objects they are not.

I suppose you could define

    `names<-.Surv` <- function(...) NULL

but that seems so silly I think there must be a better way.

I am low on coffee right now, but I wonder if having a non-exported version of model.response in the survival package would solve this without breaking anything else.

Chuck
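The length comparison described above can be mimicked outside of model.response(). The helper attach_row_names() below is hypothetical, a simplified paraphrase of that one check and not the actual stats source; it only shows how defining a length method flips the test and lets the default names<- pad the result.

```r
# Hypothetical helper: a simplified paraphrase of the check described above.
attach_row_names <- function(v, rows) {
    nrows <- length(rows)
    # length() dispatches on the class of v, so a length method changes this test
    if (length(v) == nrows) names(v) <- rows
    v
}

v <- structure(cbind(time = 1:8, status = rep(0:1, 4)), class = "Durv")
rows <- letters[1:8]

# No length method yet: length(v) counts 16 storage slots, so no names are set.
before <- names(attach_row_names(v, rows))

# With a length method, length(v) is 8 and the default names<- is applied,
# which works on all 16 slots and pads the 8 supplied names with NA.
length.Durv <- function(x) nrow(x)
after <- names(attach_row_names(v, rows))

before
after
```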
Re: [Rd] new behavior in model.response
On Thu, 28 Jun 2018, Therneau, Terry M., Ph.D. via R-devel wrote:

> I now understand the issue, which leads to a different and deeper issue
> which is "how to assign a proper length to Surv objects".
>
> > Surv(c(1,2,3), c(1,0,1))
> [1] 1 2+ 3
>
> The above prints as 3 elements and is conceptually 3 elements. But if I give
> it length method to return a 3 then I need a names method, and names<- pays
> no attention to my defined length. How do we conceive of and manage a vector
> whose elements happen to require more than one storage slot for their
> representation? An obvious example is the complex type, but it seems that
> had to be baked right down into the core.

I think you just have to implement all methods required to make it look like a vector even if it is internally a matrix. Thus, you need methods for length and for names and names<-. Internally, the names can be stored as row names.

Further useful methods for "Surv" objects might include

- as.double/as.integer (presumably just extracting the "time"),
- c
- str

Possibly also a dedicated summary.

An example for such a class is our "paircomp" in "psychotools". But I'm sure there are other/better examples elsewhere.

Best,
Achim
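Two of the suggested methods, c and as.double, might look roughly like this for the toy Durv class from the original post. This is a sketch under assumptions: every argument to c() is taken to be a Durv with the same columns, and the first column is taken to hold the time. Neither method is from survival or psychotools.

```r
# Toy class from the original post, plus the length method discussed above.
Durv <- function(...) {
    temp <- cbind(...)
    class(temp) <- "Durv"
    temp
}
length.Durv <- function(x) nrow(x)

# Combine by stacking rows. Assumes every argument is a Durv with the
# same columns; no checking is done in this sketch.
c.Durv <- function(...) {
    parts <- lapply(list(...), unclass)
    out <- do.call(rbind, parts)
    class(out) <- "Durv"
    out
}

# Extract just the time, assumed here to be the first column.
as.double.Durv <- function(x, ...) as.double(unclass(x)[, 1L])

a <- Durv(time = c(1, 2, 3), status = c(1, 0, 1))
b <- Durv(time = c(4, 5), status = c(0, 1))
ab <- c(a, b)

length(ab)
as.double(ab)
```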
Re: [Rd] new behavior in model.response
I now understand the issue, which leads to a different and deeper issue which is "how to assign a proper length to Surv objects".

    > Surv(c(1,2,3), c(1,0,1))
    [1] 1 2+ 3

The above prints as 3 elements and is conceptually 3 elements. But if I give it length method to return a 3 then I need a names method, and names<- pays no attention to my defined length. How do we conceive of and manage a vector whose elements happen to require more than one storage slot for their representation? An obvious example is the complex type, but it seems that had to be baked right down into the core.
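For contrast, the complex type mentioned above already behaves the way one would like Surv to behave. A minimal illustration using only base R:

```r
# Each complex element needs two numbers, yet length and names count elements.
z <- complex(real = c(1, 2, 3), imaginary = c(0, 1, 0))
names(z) <- c("a", "b", "c")

length(z)
names(z)
```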
Re: [Rd] new behavior of model.response
Charles Berry pointed out an error in my reasoning. In the current survival I forgot the S3method line for length in the NAMESPACE file, so the behavior is really not new. Nonetheless it remains surprising and non-intuitive. Why does model.response sometimes attach spurious names, when the Surv object itself does not have them?

Terry

tmt% R --vanilla
R version 3.4.2 (2017-09-28) -- "Short Summer"

    test <- data.frame(time=1:8, status=rep(0:1, 4), age=60:67)
    row.names(test) <- letters[1:8]
    library(survival)
    mf2 <- model.frame(Surv(time, status) ~ age, data=test)
    names(mf2[[1]])
    # NULL
    names(model.response(mf2))
    # NULL

    length.Surv <- survival:::length.Surv
    names(model.response(mf2))
    # [1] "a" "b" "c" "d" "e" "f" "g" "h" NA NA NA NA NA NA NA NA
[Rd] new behavior of model.response
I am getting some unexplained changes in the latest version of survival, and finally traced it down to this: model.response acts differently for Surv objects. Here is a self-contained example using a made-up class Durv = diagnose survival. I tracked it down by removing methods one by one from Surv; I had just added some new ones so they were my suspects.

    test <- data.frame(time=1:8, status=rep(0:1, 4), age=60:67)
    row.names(test) <- letters[1:8]

    Durv <- function(...) {
        temp <- cbind(...)
        class(temp) <- "Durv"
        temp
    }

    mf1 <- model.frame(Durv(time, status) ~ age, data=test)
    names(model.response(mf1))
    # NULL

    length.Durv <- function(x) nrow(x)
    names(model.response(mf1))
    # [1] "a" "b" "c" "d" "e" "f" "g" "h" NA NA NA NA NA NA NA NA

The length method for Surv objects has been around for some while; this behavior is new. It caused the 'time' component of survfit objects to suddenly have names and so was discovered in my test suite. I had planned to submit an update today, but now need to delay it.

The length of the Surv (Durv) object above is 8, BTW; the fact that its representation requires either 16 elements (right censored) or 24 (interval censored) is a footnote.

Terry Therneau