On 25/11/2011 10:37 AM, Terry Therneau wrote:
On Fri, 2011-11-25 at 09:50 -0500, Duncan Murdoch wrote:
>  I think the general idea in formulas is that it is up to the user to
>  define the meaning of functions used in them.  Normally the user has
>  attached the package that is working on the formula, so the package
>  author can provide useful things like s(), but if a user wanted to
>  redefine s() to their own function, that should be possible.
>  Formulas do have environments attached, so both variables and
>  functions should be looked up there.
>

I don't agree that this is the best way.  A function like coxph could
easily have in its documentation a list of the "formula specials" that
it defines internally.  If the user wants something of their own they can
easily use a different word.  In fact, I would strongly recommend that
they don't use one of these key names.  For things that work across
multiple packages like ns(), what user in his right mind would redefine
it?

Yes, that's what I described in the second part of my answer, and you can do it too in coxph. It requires some work to do special processing of symbols in a formula, but it is already being done for + and : and *, so doing it as well for some other functions would be reasonable. If you don't mind some programming on the formula object, it's not even very hard.
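As a rough illustration of that kind of special processing, terms() in the stats package accepts a specials argument that flags calls to named functions in a formula. The formula and variable names below are made up for the example, and ridge() does not have to exist for terms() to parse it; this is only a sketch of the mechanism, not a description of what coxph actually does.

```r
# Hypothetical formula; y, x and z are illustrative names only.
f <- y ~ x + ridge(z, theta = 1)

# Ask terms() to flag any call to ridge() as a "special".
tt <- terms(f, specials = "ridge")

# Non-NULL when the formula contains a ridge() term.
ridge_idx <- attr(tt, "specials")$ridge

# A formula with no ridge() term gives NULL for the same lookup.
no_ridge <- attr(terms(y ~ x, specials = "ridge"), "specials")$ridge
```

The modelling function can then treat the flagged terms itself rather than relying on whatever ridge() the user has on the search path.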

As to a user defining their own ns() function: that seems like it's not something we should disallow, especially if it was done in a context where natural splines weren't being used. It might have nothing to do with the ns() function in the splines package, but it might mean something to the user in terms of his own data. The splines package is a base package so it's not a great idea to re-use the name, but many users would not have splines attached, and wouldn't notice that they had just masked the splines::ns function.

   So I re-raise the question.  Is there a reasonably simple way to make
the survival ridge() function specific to survival formulas?  It sets up
structures that have no meaning anywhere else, and its global definition
stands in the way of other sensible uses.  Having it be not exported +
obey namespace type semantics would be a plus all around.

Yes, there is a way to do what you want. Don't export the function from the package, but preprocess formulas coming into coxph to substitute things that look like calls to ridge() with calls to something local.

For example, this does the substitution. I haven't checked it much, so it might mess up something else (and there might be more elegant ways to write it, using e.g. rapply). It is definitely slightly more elaborate than it needs to be (no need for the separate local function), but that's so you can make the outer function do a bit more than the recursive part does.

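fixRidge <- function( formula ) {

  # Walk the expression tree.  Only the function position of a call is
  # rewritten, so a variable that happens to be named ridge is left alone.
  recurse <- function( e ) {
    if (is.call(e)) {
      if (identical(e[[1]], quote(ridge))) e[[1]] <- quote(survival:::ridge)
      for (i in seq_along(e))
        # skip NULL elements: assigning NULL would delete the argument
        if (!is.null(e[[i]])) e[[i]] <- recurse(e[[i]])
    }
    e
  }

  recurse(formula)
}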

This replaces calls to ridge in the formula with calls to survival:::ridge.
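A different way to get a similar effect, offered only as a sketch, is to leave the formula expression alone and give it a child environment in which the special name is bound to the internal function. The helper with_local_special() and the stand-in local_ridge() below are hypothetical names invented for this example; in a real package the stand-in would be the unexported function.

```r
# Hypothetical stand-in for an unexported package function.
local_ridge <- function(x) x * 10

# Bind `name` to `fun` in a new environment whose parent is the formula's
# own environment, then attach that environment to the formula.
with_local_special <- function(formula, name, fun) {
  env <- new.env(parent = environment(formula))
  assign(name, fun, envir = env)
  environment(formula) <- env
  formula
}

f  <- y ~ ridge(x)
f2 <- with_local_special(f, "ridge", local_ridge)

# model.frame() evaluates the terms in the data, then in environment(f2),
# so ridge() resolves to local_ridge here.
d  <- data.frame(y = 1:3, x = c(1, 2, 3))
mf <- model.frame(f2, data = d)
```

Everything else in the formula is still looked up through the parent environment, so the user's own variables and functions stay visible; only the one name is overridden.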


Philosophical aside:
   I have discovered to my dismay that formulas do have environments
attached, and that variables/functions are looked up there.  This made
sensible semantics for predict() within a function impossible for some
of the survival functions, unless I were to change all the routines to a
model=TRUE default.  (And a change of that magnitude to survival, with
its long list of dependencies, is not fun to contemplate.  A very quick
survey reveals several dependent packages will break.) So I don't agree
nearly so fully with the "should" part of your last sentence.  The out
of context evaluations allowed by environments are, I find, always
tricky and often lead to intricate special cases.
   Thus, moving back and forth between how it seems that a formula should
work, and how it actually does work, sometimes leaves my head
spinning.


It all comes down to the question: who owns the name? Generally the caller owns the name. So you should look it up in the context of the caller. In R, that means you need to carry along the environment of the caller.
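A small sketch of what carrying the caller's environment means in practice: the formula below is created inside a function that defines its own s(), and model.frame() still finds that s() later because the formula remembers where it was created. The names make_formula() and s(), and the data, are invented for this illustration.

```r
make_formula <- function() {
  # A user-defined s(), visible only inside this function.
  s <- function(x) x^2
  y ~ s(x)
}

f <- make_formula()

# The formula carries the environment of the call that created it.
env <- environment(f)

# model.frame() looks up s() via that environment, not the global one.
d  <- data.frame(y = 1:3, x = c(2, 3, 4))
mf <- model.frame(f, data = d)
```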

Duncan Murdoch

Terry T.


Terry Therneau


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
