Given how much documentation is available on R coding in general, it is
surprising how little is available specifically on writing model code.
Researchers who come up with a new method of regression, and who want to
write an S3 model for that method, must currently go all the way back to
Venables and Ripley's book "S Programming".
On 26.06.2015 14:09, Stephen Milborrow wrote:
> Once we have built a regression model, we typically want to use the
> model for further processing, such as making predictions from the model
> or plotting the residuals. Unfortunately, for many packages on CRAN
> this can be difficult.
>
> For example, some models don't have a residuals method and don't save
> the call or data --- so you can't tell how to generate the residuals
> from the model object itself.
>
> A common snag is that for some models the new data for predict() has to
> be a matrix; for others it has to be a data.frame. This places an
> unnecessary burden on the user when both data.frames and matrices can
> easily be supported by predict.
>
> To mitigate such issues, I'm going out on a limb and presenting some
> guidelines for writers of S3 regression model functions (this document
> is currently part of the plotmo package):
> http://www.milbo.org/doc/modguide.pdf
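
To make the predict() point above concrete, here is a minimal sketch of a predict method that accepts either a data.frame or a matrix. The class "toyfit" and its constructor are hypothetical and exist only for this illustration; the sketch assumes the training matrix has column names. It is not taken from the guidelines document.

```r
# Hypothetical constructor: least squares on a matrix or data.frame x.
toyfit <- function(x, y) {
  x <- as.matrix(x)
  stopifnot(!is.null(colnames(x)))   # assumption: predictors are named
  z <- lm.fit(cbind(1, x), y)
  structure(list(coefficients = z$coefficients, xnames = colnames(x)),
            class = "toyfit")
}

# predict method that supports both data.frames and matrices.
predict.toyfit <- function(object, newdata, ...) {
  if (is.data.frame(newdata)) {
    # Select the predictors by name and convert to a numeric matrix.
    newdata <- data.matrix(newdata[, object$xnames, drop = FALSE])
  } else {
    newdata <- as.matrix(newdata)
    # If the matrix has column names, use them to select and order columns.
    if (!is.null(colnames(newdata)))
      newdata <- newdata[, object$xnames, drop = FALSE]
  }
  stopifnot(ncol(newdata) == length(object$xnames))
  drop(cbind(1, newdata) %*% object$coefficients)
}

x <- as.matrix(mtcars[, c("wt", "hp")])
fit <- toyfit(x, mtcars$mpg)
predict(fit, x[1:5, ])        # matrix input
predict(fit, mtcars[1:5, ])   # data.frame input, extra columns ignored
```

The coercion costs only a few lines in the method, so the user does not have to remember which input type a given model expects.
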
On 26.06.2015 16:41, Achim Zeileis wrote:
I think this is a nice and useful starting point. It's probably not
comprehensive (yet) but will surely help.
You could add something more about writing the formula interface and the
correct processing of model.frame, terms, model.response, model.matrix,
model.weights, model.offset. Especially for models with linear predictors
the latter two can be very useful and are often not hard to implement. In
case the model has multiple parts or multiple responses, the "Formula"
package (and its vignette) might also be helpful.
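
As a rough sketch of that processing, the following follows the idiom used by lm() to build the model frame and then extracts the pieces with the functions named above. The function name mymodel() and the class "mymodel" are hypothetical; the fit itself is plain weighted least squares via stats::lm.wfit(), chosen only to keep the example short.

```r
# Hypothetical fitting function (not from any package): weighted least
# squares with an optional offset, written to show the formula processing.
mymodel <- function(formula, data, weights, offset, ...) {
  cl <- match.call()

  # Rebuild the call as a call to model.frame(), keeping only the
  # arguments that model.frame() understands (the idiom used by lm()).
  mf <- match.call(expand.dots = FALSE)
  keep <- match(c("formula", "data", "weights", "offset"), names(mf), 0L)
  mf <- mf[c(1L, keep)]
  mf[[1L]] <- quote(stats::model.frame)
  mf <- eval(mf, parent.frame())

  mt  <- attr(mf, "terms")               # terms object, useful later for predict()
  y   <- model.response(mf, "numeric")   # response vector
  x   <- model.matrix(mt, mf)            # design matrix (offset terms excluded)
  w   <- model.weights(mf)               # NULL if no weights were given
  off <- model.offset(mf)                # NULL if no offset was given
  if (is.null(w)) w <- rep(1, NROW(y))

  z <- lm.wfit(x, y, w, offset = off)
  structure(
    list(coefficients = z$coefficients, residuals = z$residuals,
         fitted.values = z$fitted.values, df.residual = z$df.residual,
         weights = w, offset = off, terms = mt, call = cl),
    class = "mymodel"
  )
}

fit <- mymodel(mpg ~ wt + offset(hp / 100), data = mtcars, weights = cyl)
coef(fit)
```

Because model.offset() collects offsets given either as an argument or as an offset() term in the formula, both forms are supported without extra code.
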
As for the S3 methods, I would omit coefficients, fitted.values, and resid
from the list. These dispatch to coef, fitted, and residuals anyway. For
inference it would also be very useful to add nobs(), df.residual(),
vcov(), and logLik() and/or deviance() where applicable. An overview which
lists some (but not all) useful methods is in Table 1 of
vignette("betareg", package = "betareg").
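
A minimal sketch of such a method set, for a hypothetical class "toyreg" fitted by ordinary least squares: if the object stores components named coefficients, residuals, fitted.values, and df.residual, the default methods for coef(), residuals(), fitted(), and df.residual() already work, so only the inference-related methods need to be written. The Gaussian log-likelihood below is an assumption tied to this toy model.

```r
# Hypothetical constructor for class "toyreg" (ordinary least squares).
toyreg <- function(x, y) {
  x <- cbind("(Intercept)" = 1, as.matrix(x))
  z <- lm.fit(x, y)
  structure(list(coefficients = z$coefficients, residuals = z$residuals,
                 fitted.values = z$fitted.values,
                 df.residual = z$df.residual, x = x, y = y),
            class = "toyreg")
}

nobs.toyreg <- function(object, ...) length(object$residuals)

# Residual sum of squares.
deviance.toyreg <- function(object, ...) sum(object$residuals^2)

vcov.toyreg <- function(object, ...) {
  sigma2 <- deviance(object) / df.residual(object)
  v <- sigma2 * solve(crossprod(object$x))
  # Name rows and columns exactly like the coefficients.
  dimnames(v) <- list(names(coef(object)), names(coef(object)))
  v
}

# Gaussian log-likelihood at the ML estimate of the error variance.
logLik.toyreg <- function(object, ...) {
  n <- nobs(object)
  ll <- -n / 2 * (log(2 * pi) + log(deviance(object) / n) + 1)
  structure(ll, df = length(coef(object)) + 1, nobs = n, class = "logLik")
}

fit <- toyreg(mtcars[, c("wt", "hp")], mtcars$mpg)
coef(fit); nobs(fit); df.residual(fit); vcov(fit); logLik(fit)
```

With logLik() and nobs() in place, generics such as AIC() and BIC() work on the object without further code.
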
For coef() and vcov() it is useful/important that the names and dimensions
match. Then Wald tests can be easily computed in functions like
car::linearHypothesis(), car::deltaMethod(), lmtest::waldtest(), or
lmtest::coeftest().
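
To illustrate why the matching matters, here is a small base-R sketch of what such functions do internally. The helper wald_table() is hypothetical and is not the implementation used by car or lmtest; it only needs coef() and vcov() methods, and it fails early if their names or dimensions disagree.

```r
# Hypothetical helper: Wald z tests for any object with coef() and vcov().
wald_table <- function(object) {
  est <- coef(object)
  v <- vcov(object)
  # Names and dimensions must line up, otherwise each estimate cannot be
  # paired with its variance.
  stopifnot(
    is.matrix(v),
    nrow(v) == length(est), ncol(v) == length(est),
    identical(rownames(v), names(est)),
    identical(colnames(v), names(est))
  )
  se <- sqrt(diag(v))
  z <- est / se
  cbind(Estimate = est, `Std. Error` = se, `z value` = z,
        `Pr(>|z|)` = 2 * pnorm(-abs(z)))
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
wald_table(fit)
```

The same call works unchanged for any model class whose coef() and vcov() methods follow this convention.
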
Thanks for these, I'll update the document.
Stephen Milborrow
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel