I don't see a big downside, but I will say that there's always a bit of a tradeoff between "train the users to do it right" (by writing clear documentation and informative error messages) and "make things easy for the user" (by making the code more complicated to handle things for them automatically).

For example, part of me wishes that (1) there were only one way to specify a binomial response with N>1 (preferably as proportions plus a weights argument) and (2) grouping variables in lme4/nlme/et al. always had to be specified as factors (rather than being coerced to factors automatically). Making those decisions would avoid so much code complexity ... (and eliminate one class of errors, i.e. people including a continuous covariate as a random-effect grouping variable because they think of 'random effect' and 'nuisance variable' as synonyms ...)
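
To illustrate point (1), here is a minimal sketch of the two common ways to give glm() a binomial response with N>1. The data and the names x, n and s are made up for illustration; both calls describe the same model and should return the same coefficients.

```r
# Hypothetical data: s successes out of n trials at each value of x
x <- c(1, 2, 3, 4, 5)
n <- rep(20, 5)
s <- c(2, 5, 9, 14, 17)

# Way 1: two-column matrix of (successes, failures)
fit_cbind <- glm(cbind(s, n - s) ~ x, family = binomial)

# Way 2: proportions as the response, number of trials as weights
fit_prop <- glm(s / n ~ x, family = binomial, weights = n)

# The two specifications describe the same model
print(coef(fit_cbind))
print(coef(fit_prop))
```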

But taking the "train the users to do it right" path also involves more discussion with users ("if your software knows what I should be doing, why can't it just do it for me?").

  cheers
   Ben Bolker

On 2024-08-27 9:43 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:
You are right of course, Peter, but I can see where some will get confused. In a formula some symbols and functions are special operators, and others are simple functions. That is the reason one needs I(events/time) to put a rate in as a variable. Someone who types 'offset' at the command line will see that there actually IS a function behind the scenes.

Does anyone see a downside to Bill Dunlap's suggestion where the first step of my formula processing would be to "clean off" any survival:: modifiers? That is, something that will break? After all, the code already has a lot of "if (....) " lines for other common user errors. I could view it as just saving me the time to deal with the 'we found an error' emails. I would output the corrected version as the "call" component.
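
A rough sketch of what such a "clean off" step might look like. This is not code from the survival package: strip_pkg() and its pkg and funs arguments are hypothetical names, and the sketch assumes only the specials named in this thread (strata, cluster, tt) should be rewritten.

```r
# Hypothetical helper (not part of the survival package): walk a formula and
# rewrite pkg::fun(...) as fun(...) for a chosen set of function names.
strip_pkg <- function(expr, pkg = "survival",
                      funs = c("strata", "cluster", "tt")) {
  if (!is.call(expr)) return(expr)
  fn <- expr[[1]]
  # A qualified call has the call `::`(pkg, fun) in its function position
  if (is.call(fn) &&
      identical(fn[[1]], as.name("::")) &&
      identical(fn[[2]], as.name(pkg)) &&
      as.character(fn[[3]]) %in% funs) {
    expr[[1]] <- as.name(as.character(fn[[3]]))
  }
  # Recurse into arguments that are themselves calls
  for (i in seq_along(expr)[-1]) {
    if (is.call(expr[[i]])) expr[[i]] <- strip_pkg(expr[[i]], pkg, funs)
  }
  expr
}

f <- y ~ x + survival::strata(g)
print(strip_pkg(f))
```

Restricting the rewrite to the names in funs leaves a call such as survival::Surv() untouched, so it still resolves when the package is not attached.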

Terry

On 8/27/24 03:38, peter dalgaard wrote:
In my view, that's just plain wrong, because strata() is not a function but a special operator in a model formula. Wouldn't it also blow up on stats::offset()?

Oh, yes it would:

lm(y~x+offset(z))
Call:
lm(formula = y ~ x + offset(z))

Coefficients:
(Intercept)            x
       0.7350       0.0719

lm(y~x+stats::offset(z))
Call:
lm(formula = y ~ x + stats::offset(z))

Coefficients:
       (Intercept)                 x  stats::offset(z)
            0.6457            0.1078            0.8521
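
The same effect can be seen without fitting anything, by asking terms() which variable it treats as an offset. This is only a sketch: whether the qualified form is recognized may depend on the R version, so no expected output is given for it.

```r
# terms() does not evaluate the formula, so y, x and z need not exist here
tt_plain <- terms(y ~ x + offset(z))
tt_qual  <- terms(y ~ x + stats::offset(z))

# Position of the offset among the variables (y, x, offset(z))
print(attr(tt_plain, "offset"))

# In the output quoted above, stats::offset(z) was fitted as an ordinary
# term; compare the offset attribute and the term labels for this form
print(attr(tt_qual, "offset"))
print(attr(tt_qual, "term.labels"))
```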


Or, to be facetious:

lm(y~base::"+"(x,z))
Call:
lm(formula = y ~ base::"+"(x, z))

Coefficients:
      (Intercept)  base::"+"(x, z)
           0.4516           0.4383



-pd

On 26 Aug 2024, at 16:42, Therneau, Terry M., Ph.D. via R-devel <r-devel@r-project.org> wrote:

The survival package makes significant use of the "specials" argument of terms(), before calling model.frame; it is part of nearly every modeling function. The reason is that strata arguments simply have to be handled differently than other things on the right hand side. Likewise for tt() and cluster(), though those are much less frequent.

I now get "bug reports" from the growing segment that believes one should put packagename:: in front of every single instance. For instance
        fit <- survival::survdiff(survival::Surv(time, status) ~ ph.karno +
                                  survival::strata(inst), data = survival::lung)

This fails to give the correct answer because it fools terms(formula, specials = "strata"). I've stood firm in my response of "that's your bug, not mine", but I begin to believe I am swimming uphill. One person responded that it was company policy to qualify everything.
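
A minimal sketch of the mechanism, using only base R's terms(); the survival package is not needed because the formula is never evaluated. The object names sp_plain and sp_qual are made up for illustration, and the result for the qualified form may differ across R versions, so it is printed rather than asserted.

```r
# Only terms() from base R is used; none of the variables need to exist.
sp_plain <- terms(Surv(time, status) ~ ph.karno + strata(inst),
                  specials = "strata")
sp_qual  <- terms(Surv(time, status) ~ ph.karno + survival::strata(inst),
                  specials = "strata")

# Unqualified: strata(inst) is flagged by its position among the variables
# (Surv(time, status), ph.karno, strata(inst))
print(attr(sp_plain, "specials")$strata)

# Qualified: per the report above, the term is not flagged as special
print(attr(sp_qual, "specials")$strata)
```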

I don't see an easy way to fix survival, and even if I did it would be a tremendous amount of work. What are others' thoughts?

Terry



--

Terry M Therneau, PhD
Department of Quantitative Health Sciences
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"


______________________________________________
R-devel@r-project.org  mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of working hours.
