Meant to respond to this but forgot. I didn't write a new terms() function -- I added an attribute to the terms() object (a vector of the names of the constructed model matrix), thus preserving the information at the point when it was available. I do agree that it would be preferable to have an upstream fix ...
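To make the idea concrete, here is a minimal sketch, not the actual lme4 code. The helper name store_varnames is hypothetical, the attribute name "varnames.fixed" is borrowed from Terry's message below, and dropping the response column is an assumption made for illustration. It only covers simple main-effect terms whose model frame columns correspond one-to-one with the predictors.

```r
# Minimal sketch only: stash the model frame's column names on the terms
# object while they are still available. `store_varnames` is a hypothetical
# helper name, not a function from lme4 or base R.
store_varnames <- function(mf) {
  tt <- attr(mf, "terms")
  nms <- colnames(mf)
  resp <- attr(tt, "response")      # index of the response, 0 if none
  if (resp > 0) nms <- nms[-resp]   # assumption: keep predictor columns only
  attr(tt, "varnames.fixed") <- nms
  tt
}

# A data frame with a non-syntactic column name
d <- data.frame(check.names = FALSE, y = 1 / (1:5), `x y z` = cos(1:5) + 2)
mf <- model.frame(y ~ `x y z`, data = d)
Terms <- store_varnames(mf)

# Subscript the model frame with the stored names rather than with
# attr(Terms, "term.labels"), which carries the backticks
X <- mf[attr(Terms, "varnames.fixed")]
```

The point of the design is that downstream code never has to strip backticks after the fact; it just reads back names that were recorded when the model frame was built.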

On Thu, Mar 8, 2018 at 9:39 AM, Therneau, Terry M., Ph.D. via R-devel
<r-devel@r-project.org> wrote:
> Ben,
>
> Looking at your notes, it appears that your solution is to write your own
> terms() function for lme. It is easy to verify that the "varnames.fixed"
> attribute is not returned by the usual terms function.
>
> Then I also need to write my own terms function for the survival and coxme
> packages? Because of the need to treat strata() terms in a special way I
> manipulate the formula/terms in nearly every routine.
>
> Extrapolating: every R package that tries to examine formulas and partition
> them into bits needs its own terms function? This does not look like a good
> solution to me.
>
> On 03/07/2018 07:39 AM, Ben Bolker wrote:
>>
>> I knew I had seen this before but couldn't previously remember where.
>> https://github.com/lme4/lme4/issues/441 ... I initially fixed with
>> gsub(), but (pushed by Martin Maechler to do better) I eventually
>> fixed it by storing the original names of the model frame (without
>> backticks) as an attribute for later retrieval:
>>
>> https://github.com/lme4/lme4/commit/56416fc8b3b5153df7df5547082835c5d5725e89.
>>
>>
>> On Wed, Mar 7, 2018 at 8:22 AM, Therneau, Terry M., Ph.D. via R-devel
>> <r-devel@r-project.org> wrote:
>>>
>>> Thanks to Bill Dunlap for the clarification. On follow-up it turns out
>>> that this will be an issue for many if not most of the routines in the
>>> survival package: a lot of them look at the terms structure and make use
>>> of the dimnames of attr(terms, 'factors'), which also keeps the unneeded
>>> backquotes. Others use the term.labels attribute. To dodge this I will
>>> need to create a fixterms() routine which I call at the top of every
>>> single routine in the library.
>>>
>>> Is there a chance for a fix at a higher level?
>>>
>>> Terry T.
>>>
>>>
>>> On 03/05/2018 03:55 PM, William Dunlap wrote:
>>>>
>>>> I believe this has to do with terms() making "term.labels" (hence the
>>>> dimnames of "factors") with deparse(), so that the backquotes are
>>>> included for non-syntactic names. The backquotes are not in the column
>>>> names of the input data.frame (nor model frame) so you get a mismatch
>>>> when subscripting the data.frame or model.frame with elements of
>>>> terms()$term.labels.
>>>>
>>>> I think you can avoid the problem by adding right after
>>>>     ll <- attr(Terms, "term.labels")
>>>> the line
>>>>     ll <- gsub("^`|`$", "", ll)
>>>>
>>>> E.g.,
>>>>
>>>> > d <- data.frame(check.names=FALSE, y=1/(1:5), `b$a$d`=sin(1:5)+2, `x y z`=cos(1:5)+2)
>>>> > Terms <- terms( y ~ log(`b$a$d`) + `x y z` )
>>>> > m <- model.frame(Terms, data=d)
>>>> > colnames(m)
>>>> [1] "y"            "log(`b$a$d`)" "x y z"
>>>> > attr(Terms, "term.labels")
>>>> [1] "log(`b$a$d`)" "`x y z`"
>>>> > ll <- attr(Terms, "term.labels")
>>>> > gsub("^`|`$", "", ll)
>>>> [1] "log(`b$a$d`)" "x y z"
>>>>
>>>> It is a bit of a mess.
>>>>
>>>>
>>>> Bill Dunlap
>>>> TIBCO Software
>>>> wdunlap tibco.com <http://tibco.com>
>>>>
>>>> On Mon, Mar 5, 2018 at 12:55 PM, Therneau, Terry M., Ph.D. via R-devel
>>>> <r-devel@r-project.org> wrote:
>>>>
>>>>     A user reported a problem with the survdiff function and the use of
>>>>     variables that contain a space. Here is a simple example. The same
>>>>     issue occurs in survfit for the same reason.
>>>>
>>>>     lung2 <- lung
>>>>     names(lung2)[1] <- "in st"   # old name is inst
>>>>     survdiff(Surv(time, status) ~ `in st`, data=lung2)
>>>>     Error in `[.data.frame`(m, ll) : undefined columns selected
>>>>
>>>>     In the body of the code the program wants to send all of the
>>>>     right-hand side variables forward to the strata() function.
>>>>     The code looks more or less like this, where m is the model frame
>>>>
>>>>     Terms <- terms(m)
>>>>     index <- attr(Terms, "term.labels")
>>>>     if (length(index) == 0) X <- rep(1L, n)  # no covariates
>>>>     else X <- strata(m[index])
>>>>
>>>>     For the variable with a space in the name the term.label is
>>>>     "`in st`", and the subscript fails.
>>>>
>>>>     Is this intended behaviour or a bug? The issue is that the name of
>>>>     this column in the model frame does not have the backticks, while
>>>>     the terms structure does have them.
>>>>
>>>>     Terry T.
>>>>
>>>>     ______________________________________________
>>>>     R-devel@r-project.org mailing list
>>>>     https://stat.ethz.ch/mailman/listinfo/r-devel