Re: [Rd] [EXTERNAL] Re: issue with model.frame()

2018-05-01 Thread Berry, Charles
Unfortunately, I spoke too soon.

model.frame calls formula <- terms(formula, data = data) if formula does not 
inherit from class "terms", as in your case.

And that is where the bad term.labels attribute comes from.

So, the fix I suggested won't work.

But maybe you can just supply a terms object to model.frame that has correct 
term.labels.
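
A rough sketch of what supplying such a terms object could look like, using the example data from the original report. It assumes the response is the first variable and that every term is a single variable (no interactions), so the labels can simply be overwritten with the names that model.frame.default builds for its columns. The "factors" attribute keeps its old dimnames here, which downstream code may or may not care about.

```r
# Rebuild the example data from the original report (26 long variable names).
n <- 30
vname <- vapply(1:26, function(i) paste(rep(letters[1:i], 2), collapse = ""), "")
tdata <- data.frame(y = 1:n, matrix(runif(n * 26), nrow = n))
names(tdata) <- c("y", vname)
myform <- paste("y ~ cbind(", paste(vname, collapse = ", "), ")")

# Build the terms object ourselves instead of letting model.frame do it.
tt <- terms(formula(myform), data = tdata)

# Recompute the labels the way model.frame.default names its columns:
# deparse each variable and paste the pieces together with a space.
vars <- attr(tt, "variables")
labs <- vapply(vars,
               function(x) paste(deparse(x, width.cutoff = 500), collapse = " "),
               "")[-1L]                      # drop the leading "list"

# ASSUMPTION: response first, one variable per term, no interactions.
attr(tt, "term.labels") <- labs[-1L]         # drop the response

# model.frame keeps a "terms" object as supplied, so the labels survive.
mf <- model.frame(tt, data = tdata)
match(attr(terms(mf), "term.labels"), names(mf))
```

With the labels replaced this way, the match() call should find the cbind() column rather than return NA.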

Chuck


> On May 1, 2018, at 10:55 AM, Therneau, Terry M., Ph.D. via R-devel 
>  wrote:
> 
> Great catch.  I'm very reluctant to use my own model.frame, since that locks 
> me into tracking all the base R changes, potentially breaking survival in a 
> bad way if I miss one.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: issue with model.frame()

2018-05-01 Thread Therneau, Terry M., Ph.D. via R-devel
Great catch.  I'm very reluctant to use my own model.frame, since that locks me into 
tracking all the base R changes, potentially breaking survival in a bad way if I miss one.


But, this shows me clearly what the issue is and will allow me to think about 
it.

Another solution for the user is to use multiple ridge() calls to break it up; since 
he/she was using a fixed tuning parameter the result is the same.
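
A small sketch of the split, on simulated data rather than the user's: the variable names x1..x4, the sample size and theta = 1 are all made up for illustration. With a fixed theta the ridge penalty is a sum over coefficients, so one ridge() term and two shorter ones should describe the same penalized fit.

```r
library(survival)

# Simulated data, purely for illustration.
set.seed(1)
n <- 200
d <- data.frame(time = rexp(n), status = rbinom(n, 1, 0.7),
                x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))

# One long ridge() term ...
fit_one <- coxph(Surv(time, status) ~ ridge(x1, x2, x3, x4, theta = 1),
                 data = d)

# ... versus the same variables split over two shorter ridge() terms,
# each with the same fixed tuning parameter.
fit_split <- coxph(Surv(time, status) ~ ridge(x1, x2, theta = 1) +
                       ridge(x3, x4, theta = 1),
                   data = d)

# Side-by-side coefficients; they should agree up to convergence tolerance.
cbind(one = unname(coef(fit_one)), split = unname(coef(fit_split)))
```

Splitting keeps each deparsed term short, which is the point of the workaround.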


Terry T.


On 05/01/2018 11:43 AM, Berry, Charles wrote:




On May 1, 2018, at 6:11 AM, Therneau, Terry M., Ph.D. via R-devel 
 wrote:

A user sent me an example where coxph fails, and the root of the failure is a 
case where names(mf) is not equal to the term.labels attribute of the formula 
-- the latter has an extraneous newline. Here is an example that does not use 
the survival library.

# first create a data set with many long names
n <- 30  # number of rows for the dummy data set
vname <- vector("character", 26)
for (i in 1:26) vname[i] <- paste(rep(letters[1:i],2), collapse='')  # long variable names

tdata <- data.frame(y=1:n, matrix(runif(n*26), nrow=n))
names(tdata) <- c('y', vname)

# Use it in a formula
myform <- paste("y ~ cbind(", paste(vname, collapse=", "), ")")
mf <- model.frame(formula(myform), data=tdata)

match(attr(terms(mf), "term.labels"), names(mf))   # gives NA



In the user's case the function is ridge(x1, x2, ) rather than cbind, but 
the effect is the same.
Any ideas for a work around?
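
To see one ingredient of the mismatch in isolation, the sketch below deparses the same long cbind() call on its own. It only illustrates that deparse() returns several strings once a call is this long; it does not claim to reproduce exactly what terms() does internally.

```r
# Same long variable names as in the example above.
vname <- vapply(1:26, function(i) paste(rep(letters[1:i], 2), collapse = ""), "")
long_call <- parse(text = paste("cbind(", paste(vname, collapse = ", "), ")"))[[1]]

# deparse() breaks a long call into several strings, even at the
# maximum width.cutoff of 500.
pieces <- deparse(long_call, width.cutoff = 500)
length(pieces)   # more than one piece for a call of this length
nchar(pieces)    # where the break fell

# Pasting the pieces back together leaves the continuation-line
# indentation embedded in the name.
joined <- paste(pieces, collapse = " ")
grepl("  ", joined, fixed = TRUE)
```

Any code that joins the pieces differently (a newline versus a space, say) ends up with a different string for the same term.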


Maybe add a `yourclass' class to mf and dispatch to a model.frame.yourclass 
method where the width cutoff arg here (around lines 57-58 of 
model.frame.default) is made larger:

varnames <- sapply(vars, function(x) paste(deparse(x, width.cutoff = 500),
 collapse = " "))[-1L]

??



Aside: the ridge() function is very simple, it was added as an example to show 
how a user can add their own penalization to coxph.  I never expected serious 
use of it.  For this particular user the best answer is to use glmnet instead.
He/she is trying to apply an L2 penalty to a large number of SNP * covariate 
interactions.

Terry T.




HTH,

Chuck


