Re: [Rd] request for patch in drop1 (add.R)
Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 19:36:43 -0500 writes: On 11-02-23 06:12 PM, Prof Brian Ripley wrote: residuals() and $residuals are often very different: residuals() is generic, but even the default method is *not* simple extraction. Their values can be of different lengths: think about an lm fit with na.action = na.exclude. That is precisely the sort of thing the tests in add.R were designed to detect, hence the use of $residuals. None of this is used in drop1()! One of the places $residuals was used in that file was the default method for drop1(), others being step() and the default method for add1(). As default methods these have to continue to work for any class of object that has previously been thrown at them over the last 10+ years, and even all CRAN packages will not be a good enough test suite. In any case, the current code in R-devel makes use of the new generic nobs(), which is intended to help sort out the many versions of functions called 'BIC in packages, but is also useful here. (It is still under development.) terms() is also generic so there is also some danger that its substitution could give an inappropriate result. But as it used in several other places in add.R the breakage will probably occur elsewhere already. :-) yes. I think it is very good change to have drop1.default() and other R functions that do generic model analysis work via calls to simpler generic functions such as residuals(), or the new nobs() . Thanks for introducing that, a really good idea I think (and along which I think I was also musing when I last looked at the BIC() implementations..). I'm glad you've started to clean this up nicely. Thanks Prof. Ripley. (I will say that, while I understand why residuals(x) and x$residuals could be different, I am happy that a more transparent form of coding is being introduced ...) (indeed, see above). Advertising the use of nobs(), i.e. asking package authors to write nobs() methods for their models will be probably worth doing, as soon as 2.13.0 hits the roads.. Martin On Wed, 23 Feb 2011, Martin Maechler wrote: Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 09:14:37 -0500 writes: By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? drawback: a possible very small performance cut for the cases where the $ access is ok. But that should not count. I like the idea {it's a pity that only S3 methods work that way, because residuals(), terms(), etc... are unfortunately not general (implicit) S4 generics but just S3 ones.. I'm currently testing your change, plus some more in step(). However, for step() to work automagically there is more needed. It currently relies in more places on 'object' being a list to which you can append new components, basically by fit - object fit$new1 - ... fit$new2 - ... That would have to be changed to something like fit - list(obj = object) fit$new1 - ... fit$new2 - ... and more changes where 'fit' has to be replaced by 'fit$obj'. Would that not be of interest as well? Martin Ben Bolker -- Index: add.R === --- add.R (revision 54562) +++ add.R (working copy) @@ -330,7 +330,7 @@ drop1.default - function(object, scope, scale = 0, test=c(none, Chisq), k = 2, trace = FALSE, ...) { - tl - attr(object$terms, term.labels) + tl - attr(terms(object), term.labels) if(missing(scope)) scope - drop.scope(object) else {
[Rd] request for patch in drop1 (add.R)
By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? Ben Bolker Index: add.R === --- add.R (revision 54562) +++ add.R (working copy) @@ -330,7 +330,7 @@ drop1.default - function(object, scope, scale = 0, test=c(none, Chisq), k = 2, trace = FALSE, ...) { -tl - attr(object$terms, term.labels) +tl - attr(terms(object), term.labels) if(missing(scope)) scope - drop.scope(object) else { if(!is.character(scope)) @@ -344,7 +344,7 @@ ans - matrix(nrow = ns + 1L, ncol = 2L, dimnames = list(c(none, scope), c(df, AIC))) ans[1, ] - extractAIC(object, scale, k = k, ...) -n0 - length(object$residuals) +n0 - length(residuals(object)) env - environment(formula(object)) for(i in seq(ns)) { tt - scope[i] @@ -356,7 +356,7 @@ evaluate = FALSE) nfit - eval(nfit, envir=env) # was eval.parent(nfit) ans[i+1, ] - extractAIC(nfit, scale, k = k, ...) -if(length(nfit$residuals) != n0) +if(length(residuals(nfit)) != n0) stop(number of rows in use has changed: remove missing values?) } dfs - ans[1L , 1L] - ans[, 1L] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] request for patch in drop1 (add.R)
Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 09:14:37 -0500 writes: By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? drawback: a possible very small performance cut for the cases where the $ access is ok. But that should not count. I like the idea {it's a pity that only S3 methods work that way, because residuals(), terms(), etc... are unfortunately not general (implicit) S4 generics but just S3 ones.. I'm currently testing your change, plus some more in step(). However, for step() to work automagically there is more needed. It currently relies in more places on 'object' being a list to which you can append new components, basically by fit - object fit$new1 - ... fit$new2 - ... That would have to be changed to something like fit - list(obj = object) fit$new1 - ... fit$new2 - ... and more changes where 'fit' has to be replaced by 'fit$obj'. Would that not be of interest as well? Martin Ben Bolker -- Index: add.R === --- add.R (revision 54562) +++ add.R (working copy) @@ -330,7 +330,7 @@ drop1.default - function(object, scope, scale = 0, test=c(none, Chisq), k = 2, trace = FALSE, ...) { -tl - attr(object$terms, term.labels) +tl - attr(terms(object), term.labels) if(missing(scope)) scope - drop.scope(object) else { if(!is.character(scope)) @@ -344,7 +344,7 @@ ans - matrix(nrow = ns + 1L, ncol = 2L, dimnames = list(c(none, scope), c(df, AIC))) ans[1, ] - extractAIC(object, scale, k = k, ...) -n0 - length(object$residuals) +n0 - length(residuals(object)) env - environment(formula(object)) for(i in seq(ns)) { tt - scope[i] @@ -356,7 +356,7 @@ evaluate = FALSE) nfit - eval(nfit, envir=env) # was eval.parent(nfit) ans[i+1, ] - extractAIC(nfit, scale, k = k, ...) -if(length(nfit$residuals) != n0) +if(length(residuals(nfit)) != n0) stop(number of rows in use has changed: remove missing values?) } dfs - ans[1L , 1L] - ans[, 1L] -- __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] request for patch in drop1 (add.R)
On 11-02-23 03:20 PM, Martin Maechler wrote: Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 09:14:37 -0500 writes: By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? drawback: a possible very small performance cut for the cases where the $ access is ok. But that should not count. I like the idea {it's a pity that only S3 methods work that way, because residuals(), terms(), etc... are unfortunately not general (implicit) S4 generics but just S3 ones.. I'm currently testing your change, plus some more in step(). However, for step() to work automagically there is more needed. It currently relies in more places on 'object' being a list to which you can append new components, basically by fit - object fit$new1 - ... fit$new2 - ... That would have to be changed to something like fit - list(obj = object) fit$new1 - ... fit$new2 - ... and more changes where 'fit' has to be replaced by 'fit$obj'. Would that not be of interest as well? Martin Potentially, but I am personally much more interested in enabling drop1(), which seems to be a much more legitimate tool for testing terms in models than step(), which is so easy to abuse ... Ben Bolker __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] request for patch in drop1 (add.R)
On Feb 23, 2011, at 21:38 , Ben Bolker wrote: Potentially, but I am personally much more interested in enabling drop1(), which seems to be a much more legitimate tool for testing terms in models than step(), which is so easy to abuse ... Yes, although repeated use of drop1() easily leads to backwards elimination However, I have a different point: To make drop1() a better generic, I suspect that something needs to be done about the test = c(none, Chisq) bit. It would be nice if the list of possible tests could vary according to model type, e.g. by doing all tests via anova(model1,model2,...). I'm not quite up to figuring out how complicated that would be. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] request for patch in drop1 (add.R)
residuals() and $residuals are often very different: residuals() is generic, but even the default method is *not* simple extraction. Their values can be of different lengths: think about an lm fit with na.action = na.exclude. That is precisely the sort of thing the tests in add.R were designed to detect, hence the use of $residuals. None of this is used in drop1()! One of the places $residuals was used in that file was the default method for drop1(), others being step() and the default method for add1(). As default methods these have to continue to work for any class of object that has previously been thrown at them over the last 10+ years, and even all CRAN packages will not be a good enough test suite. In any case, the current code in R-devel makes use of the new generic nobs(), which is intended to help sort out the many versions of functions called 'BIC in packages, but is also useful here. (It is still under development.) terms() is also generic so there is also some danger that its substitution could give an inappropriate result. But as it used in several other places in add.R the breakage will probably occur elsewhere already. On Wed, 23 Feb 2011, Martin Maechler wrote: Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 09:14:37 -0500 writes: By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? drawback: a possible very small performance cut for the cases where the $ access is ok. But that should not count. I like the idea {it's a pity that only S3 methods work that way, because residuals(), terms(), etc... are unfortunately not general (implicit) S4 generics but just S3 ones.. I'm currently testing your change, plus some more in step(). However, for step() to work automagically there is more needed. It currently relies in more places on 'object' being a list to which you can append new components, basically by fit - object fit$new1 - ... fit$new2 - ... That would have to be changed to something like fit - list(obj = object) fit$new1 - ... fit$new2 - ... and more changes where 'fit' has to be replaced by 'fit$obj'. Would that not be of interest as well? Martin Ben Bolker -- Index: add.R === --- add.R (revision 54562) +++ add.R (working copy) @@ -330,7 +330,7 @@ drop1.default - function(object, scope, scale = 0, test=c(none, Chisq), k = 2, trace = FALSE, ...) { -tl - attr(object$terms, term.labels) +tl - attr(terms(object), term.labels) if(missing(scope)) scope - drop.scope(object) else { if(!is.character(scope)) @@ -344,7 +344,7 @@ ans - matrix(nrow = ns + 1L, ncol = 2L, dimnames = list(c(none, scope), c(df, AIC))) ans[1, ] - extractAIC(object, scale, k = k, ...) -n0 - length(object$residuals) +n0 - length(residuals(object)) env - environment(formula(object)) for(i in seq(ns)) { tt - scope[i] @@ -356,7 +356,7 @@ evaluate = FALSE) nfit - eval(nfit, envir=env) # was eval.parent(nfit) ans[i+1, ] - extractAIC(nfit, scale, k = k, ...) -if(length(nfit$residuals) != n0) +if(length(residuals(nfit)) != n0) stop(number of rows in use has changed: remove missing values?) } dfs - ans[1L , 1L] - ans[, 1L] -- __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1
Re: [Rd] request for patch in drop1 (add.R)
On 11-02-23 06:12 PM, Prof Brian Ripley wrote: residuals() and $residuals are often very different: residuals() is generic, but even the default method is *not* simple extraction. Their values can be of different lengths: think about an lm fit with na.action = na.exclude. That is precisely the sort of thing the tests in add.R were designed to detect, hence the use of $residuals. None of this is used in drop1()! One of the places $residuals was used in that file was the default method for drop1(), others being step() and the default method for add1(). As default methods these have to continue to work for any class of object that has previously been thrown at them over the last 10+ years, and even all CRAN packages will not be a good enough test suite. In any case, the current code in R-devel makes use of the new generic nobs(), which is intended to help sort out the many versions of functions called 'BIC in packages, but is also useful here. (It is still under development.) terms() is also generic so there is also some danger that its substitution could give an inappropriate result. But as it used in several other places in add.R the breakage will probably occur elsewhere already. Thanks Prof. Ripley. (I will say that, while I understand why residuals(x) and x$residuals could be different, I am happy that a more transparent form of coding is being introduced ...) On Wed, 23 Feb 2011, Martin Maechler wrote: Ben Bolker bbol...@gmail.com on Wed, 23 Feb 2011 09:14:37 -0500 writes: By changing three lines in drop1 from access based on $ to access based on standard accessor methods (terms() and residuals()), it becomes *much* easier to extend drop1 to work with other model types. The use of $ rather than accessors in this context seems to be an oversight rather than a design decision, but maybe someone knows better ... In particular, if one makes these changes (which I am pretty certain will not break anything, as the original code basically mimicked the default methods anyway), it becomes possible to make drop1() work with mer objects (Doug Bates's new mixed model code) merely by defining: terms.mer - function(x, ...) { attr(x@frame,terms) } extractAIC.default - function(fit, scale=0, k=2, ...) { L - logLik(fit) edf - attr(L,df) c(edf,-2*L+2*edf) } Adding this definition of extractAIC.default also makes drop1() work with lme fits ... Comments? Should I submit to the bug database as enhancement request? Are there any hidden downsides to this? drawback: a possible very small performance cut for the cases where the $ access is ok. But that should not count. I like the idea {it's a pity that only S3 methods work that way, because residuals(), terms(), etc... are unfortunately not general (implicit) S4 generics but just S3 ones.. I'm currently testing your change, plus some more in step(). However, for step() to work automagically there is more needed. It currently relies in more places on 'object' being a list to which you can append new components, basically by fit - object fit$new1 - ... fit$new2 - ... That would have to be changed to something like fit - list(obj = object) fit$new1 - ... fit$new2 - ... and more changes where 'fit' has to be replaced by 'fit$obj'. Would that not be of interest as well? Martin Ben Bolker -- Index: add.R === --- add.R(revision 54562) +++ add.R(working copy) @@ -330,7 +330,7 @@ drop1.default - function(object, scope, scale = 0, test=c(none, Chisq), k = 2, trace = FALSE, ...) { -tl - attr(object$terms, term.labels) +tl - attr(terms(object), term.labels) if(missing(scope)) scope - drop.scope(object) else { if(!is.character(scope)) @@ -344,7 +344,7 @@ ans - matrix(nrow = ns + 1L, ncol = 2L, dimnames = list(c(none, scope), c(df, AIC))) ans[1, ] - extractAIC(object, scale, k = k, ...) -n0 - length(object$residuals) +n0 - length(residuals(object)) env - environment(formula(object)) for(i in seq(ns)) { tt - scope[i] @@ -356,7 +356,7 @@ evaluate = FALSE) nfit - eval(nfit, envir=env) # was eval.parent(nfit) ans[i+1, ] - extractAIC(nfit, scale, k = k, ...) -if(length(nfit$residuals) != n0) +if(length(residuals(nfit)) != n0) stop(number of rows in use has changed: remove missing values?) } dfs - ans[1L , 1L] - ans[, 1L] -- __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel