Re: [Rd] request for patch in drop1 (add.R)

2011-02-24 Thread Martin Maechler
 Ben Bolker bbol...@gmail.com
 on Wed, 23 Feb 2011 19:36:43 -0500 writes:

 On 11-02-23 06:12 PM, Prof Brian Ripley wrote:
 residuals() and $residuals are often very different:
 residuals() is generic, but even the default method is *not*
 simple extraction. Their values can be of different lengths:
 think about an lm fit with na.action = na.exclude.  That is
 precisely the sort of thing the tests in add.R were designed to
 detect, hence the use of $residuals.
 
 None of this is used in drop1()!  One of the places $residuals
 was used in that file was the default method for drop1(),
 others being step() and the default method for add1().  As
 default methods these have to continue to work for any class of
 object that has previously been thrown at them over the last
 10+ years, and even all CRAN packages will not be a good enough
 test suite.
 
 In any case, the current code in R-devel makes use of the new
 generic nobs(), which is intended to help sort out the many
 versions of functions called 'BIC in packages, but is also
 useful here.  (It is still under development.)
 
 terms() is also generic so there is also some danger that its
 substitution could give an inappropriate result.  But as it
 used in several other places in add.R the breakage will
 probably occur elsewhere already.

:-) yes.  I think it is very good change to have drop1.default() and
other R functions that do generic model analysis work
via calls to simpler generic functions such as residuals(), or
the new  nobs() .  Thanks for introducing that, a really good
idea I think (and along which I think I was also musing when I
last looked at the BIC() implementations..).
I'm glad you've started to clean this up nicely.

 Thanks Prof. Ripley.  (I will say that, while I understand why
 residuals(x) and x$residuals could be different, I am happy that
 a more transparent form of coding is being introduced ...)

(indeed, see above).
Advertising the use of nobs(), i.e. asking package authors to
write nobs() methods for their models will be probably worth
doing, as soon as 2.13.0 hits the roads..

Martin


 On Wed, 23 Feb 2011, Martin Maechler wrote:
 
 Ben Bolker bbol...@gmail.com
 on Wed, 23 Feb 2011 09:14:37 -0500 writes:
 
By changing three lines in drop1 from access based on $
  to access based on standard accessor methods (terms() and
  residuals()), it becomes *much* easier to extend drop1 to
  work with other model types.  The use of $ rather than
  accessors in this context seems to be an oversight rather
  than a design decision, but maybe someone knows better ...
 
In particular, if one makes these changes (which I am
  pretty certain will not break anything, as the original
  code basically mimicked the default methods anyway), it
  becomes possible to make drop1() work with mer objects
  (Doug Bates's new mixed model code) merely by defining:
 
  terms.mer - function(x, ...) {
  attr(x@frame,terms)
  }
 
  extractAIC.default - function(fit, scale=0, k=2, ...) {
  L - logLik(fit)
  edf - attr(L,df)
  c(edf,-2*L+2*edf)
  }
 
  Adding this definition of extractAIC.default also makes
  drop1() work with lme fits ...
 
  Comments?  Should I submit to the bug database as
  enhancement request?  Are there any hidden downsides to
  this?
 
 drawback: a possible very small performance cut for the cases
 where the $ access is ok.  But that should not count.
 
 I like the idea {it's a pity that only S3 methods work
 that way, because residuals(), terms(), etc... are
 unfortunately not general (implicit) S4 generics but just S3
 ones..
 
 I'm currently testing your change, plus some more in step().
 However, for step() to work automagically there is more
 needed.  It currently relies in more places on 'object' being
 a list to which you can append new components, basically by
 fit - object fit$new1 - ...  fit$new2 - ...
 
 That would have to be changed to something like fit -
 list(obj = object) fit$new1 - ...  fit$new2 - ...  and more
 changes where 'fit' has to be replaced by 'fit$obj'.  Would
 that not be of interest as well?
 
 Martin
 
 
  Ben Bolker
 
 
 --
  Index: add.R
  ===
  --- add.R (revision 54562) +++ add.R (working copy) @@
  -330,7 +330,7 @@ drop1.default - function(object, scope,
  scale = 0,
 test=c(none, Chisq),
  k = 2, trace = FALSE, ...)  { - tl - attr(object$terms,
  term.labels) + tl - attr(terms(object), term.labels)
  if(missing(scope)) scope - drop.scope(object) else {
  

[Rd] request for patch in drop1 (add.R)

2011-02-23 Thread Ben Bolker

  By changing three lines in drop1 from access based on $ to access
based on standard accessor methods (terms() and residuals()), it becomes
*much* easier to extend drop1 to work with other model types.
The use of $ rather than accessors in this context seems to be an
oversight rather than a design decision, but maybe someone knows better ...

  In particular, if one makes these changes (which I am pretty certain
will not break anything, as the original code basically mimicked the
default methods anyway), it becomes possible to make drop1() work with
mer objects (Doug Bates's new mixed model code) merely by defining:

terms.mer - function(x, ...) {
  attr(x@frame,terms)
}

extractAIC.default - function(fit, scale=0, k=2, ...) {
  L - logLik(fit)
  edf - attr(L,df)
  c(edf,-2*L+2*edf)
}

 Adding this definition of extractAIC.default also makes drop1() work
with lme fits ...

  Comments?  Should I submit to the bug database as enhancement
request?  Are there any hidden downsides to this?

  Ben Bolker
Index: add.R
===
--- add.R   (revision 54562)
+++ add.R   (working copy)
@@ -330,7 +330,7 @@
 drop1.default - function(object, scope, scale = 0, test=c(none, Chisq),
  k = 2, trace = FALSE, ...)
 {
-tl - attr(object$terms, term.labels)
+tl - attr(terms(object), term.labels)
 if(missing(scope)) scope - drop.scope(object)
 else {
if(!is.character(scope))
@@ -344,7 +344,7 @@
 ans - matrix(nrow = ns + 1L, ncol = 2L,
   dimnames =  list(c(none, scope), c(df, AIC)))
 ans[1, ] - extractAIC(object, scale, k = k, ...)
-n0 - length(object$residuals)
+n0 - length(residuals(object))
 env - environment(formula(object))
 for(i in seq(ns)) {
tt - scope[i]
@@ -356,7 +356,7 @@
evaluate = FALSE)
nfit - eval(nfit, envir=env) # was  eval.parent(nfit)
ans[i+1, ] - extractAIC(nfit, scale, k = k, ...)
-if(length(nfit$residuals) != n0)
+if(length(residuals(nfit)) != n0)
 stop(number of rows in use has changed: remove missing values?)
 }
 dfs - ans[1L , 1L] - ans[, 1L]
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] request for patch in drop1 (add.R)

2011-02-23 Thread Martin Maechler
 Ben Bolker bbol...@gmail.com
 on Wed, 23 Feb 2011 09:14:37 -0500 writes:

   By changing three lines in drop1 from access based on $
 to access based on standard accessor methods (terms() and
 residuals()), it becomes *much* easier to extend drop1 to
 work with other model types.  The use of $ rather than
 accessors in this context seems to be an oversight rather
 than a design decision, but maybe someone knows better ...

   In particular, if one makes these changes (which I am
 pretty certain will not break anything, as the original
 code basically mimicked the default methods anyway), it
 becomes possible to make drop1() work with mer objects
 (Doug Bates's new mixed model code) merely by defining:

 terms.mer - function(x, ...) {
 attr(x@frame,terms)
 }

 extractAIC.default - function(fit, scale=0, k=2, ...) {
 L - logLik(fit)
 edf - attr(L,df)
 c(edf,-2*L+2*edf)
 }

 Adding this definition of extractAIC.default also makes drop1() work
 with lme fits ...

 Comments?  Should I submit to the bug database as enhancement
 request?  Are there any hidden downsides to this?

drawback: a possible very small performance cut for the cases
where the $ access is ok.  But that should not count.

I like the idea {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..

I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
   fit - object
   fit$new1 - ...
   fit$new2 - ...

That would have to be changed to something like
   fit - list(obj = object)
   fit$new1 - ...
   fit$new2 - ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?

Martin


 Ben Bolker

 --
 Index: add.R
 ===
 --- add.R (revision 54562)
 +++ add.R (working copy)
 @@ -330,7 +330,7 @@
 drop1.default - function(object, scope, scale = 0, test=c(none, 
Chisq),
 k = 2, trace = FALSE, ...)
 {
 -tl - attr(object$terms, term.labels)
 +tl - attr(terms(object), term.labels)
 if(missing(scope)) scope - drop.scope(object)
 else {
 if(!is.character(scope))
 @@ -344,7 +344,7 @@
 ans - matrix(nrow = ns + 1L, ncol = 2L,
 dimnames =  list(c(none, scope), c(df, AIC)))
 ans[1, ] - extractAIC(object, scale, k = k, ...)
 -n0 - length(object$residuals)
 +n0 - length(residuals(object))
 env - environment(formula(object))
 for(i in seq(ns)) {
 tt - scope[i]
 @@ -356,7 +356,7 @@
 evaluate = FALSE)
 nfit - eval(nfit, envir=env) # was  eval.parent(nfit)
 ans[i+1, ] - extractAIC(nfit, scale, k = k, ...)
 -if(length(nfit$residuals) != n0)
 +if(length(residuals(nfit)) != n0)
 stop(number of rows in use has changed: remove missing values?)
 }
 dfs - ans[1L , 1L] - ans[, 1L]

 --
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] request for patch in drop1 (add.R)

2011-02-23 Thread Ben Bolker
On 11-02-23 03:20 PM, Martin Maechler wrote:
 Ben Bolker bbol...@gmail.com
 on Wed, 23 Feb 2011 09:14:37 -0500 writes:
 
By changing three lines in drop1 from access based on $
  to access based on standard accessor methods (terms() and
  residuals()), it becomes *much* easier to extend drop1 to
  work with other model types.  The use of $ rather than
  accessors in this context seems to be an oversight rather
  than a design decision, but maybe someone knows better ...
 
In particular, if one makes these changes (which I am
  pretty certain will not break anything, as the original
  code basically mimicked the default methods anyway), it
  becomes possible to make drop1() work with mer objects
  (Doug Bates's new mixed model code) merely by defining:
 
  terms.mer - function(x, ...) {
  attr(x@frame,terms)
  }
 
  extractAIC.default - function(fit, scale=0, k=2, ...) {
  L - logLik(fit)
  edf - attr(L,df)
  c(edf,-2*L+2*edf)
  }
 
  Adding this definition of extractAIC.default also makes drop1() work
  with lme fits ...
 
  Comments?  Should I submit to the bug database as enhancement
  request?  Are there any hidden downsides to this?
 
 drawback: a possible very small performance cut for the cases
 where the $ access is ok.  But that should not count.
 
 I like the idea {it's a pity that only S3 methods work that way,
 because residuals(), terms(), etc... are unfortunately not
 general (implicit) S4 generics but just S3 ones..
 
 I'm currently testing your change, plus some more in step().
 However, for step() to work automagically there is more needed.
 It currently relies in more places on 'object' being a list to
 which you can append new components, basically by
fit - object
fit$new1 - ...
fit$new2 - ...
 
 That would have to be changed to something like
fit - list(obj = object)
fit$new1 - ...
fit$new2 - ...
 and more changes where 'fit' has to be replaced by 'fit$obj'.
 Would that not be of interest as well?
 
 Martin

  Potentially, but I am personally much more interested in enabling
drop1(), which seems to be a much more legitimate tool for testing terms
in models than step(), which is so easy to abuse ...

Ben Bolker

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] request for patch in drop1 (add.R)

2011-02-23 Thread peter dalgaard

On Feb 23, 2011, at 21:38 , Ben Bolker wrote:

 
  Potentially, but I am personally much more interested in enabling
 drop1(), which seems to be a much more legitimate tool for testing terms
 in models than step(), which is so easy to abuse ...

Yes, although repeated use of drop1() easily leads to backwards elimination 

However, I have a different point: To make drop1() a better generic, I suspect 
that something needs to be done about the test = c(none, Chisq) bit. It 
would be nice if the list of possible tests could vary according to model type, 
e.g. by doing all tests via anova(model1,model2,...). I'm not quite up to 
figuring out how complicated that would be.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] request for patch in drop1 (add.R)

2011-02-23 Thread Prof Brian Ripley
residuals() and $residuals are often very different:  residuals() is 
generic, but even the default method is *not* simple extraction. Their 
values can be of different lengths: think about an lm fit with 
na.action = na.exclude.  That is precisely the sort of thing the tests 
in add.R were designed to detect, hence the use of $residuals.


None of this is used in drop1()!  One of the places $residuals was 
used in that file was the default method for drop1(), others being 
step() and the default method for add1().  As default methods these 
have to continue to work for any class of object that has previously 
been thrown at them over the last 10+ years, and even all CRAN 
packages will not be a good enough test suite.


In any case, the current code in R-devel makes use of the new generic 
nobs(), which is intended to help sort out the many versions of 
functions called 'BIC in packages, but is also useful here.  (It is 
still under development.)


terms() is also generic so there is also some danger that its 
substitution could give an inappropriate result.  But as it used in 
several other places in add.R the breakage will probably occur 
elsewhere already.


On Wed, 23 Feb 2011, Martin Maechler wrote:


Ben Bolker bbol...@gmail.com
on Wed, 23 Feb 2011 09:14:37 -0500 writes:


  By changing three lines in drop1 from access based on $
to access based on standard accessor methods (terms() and
residuals()), it becomes *much* easier to extend drop1 to
work with other model types.  The use of $ rather than
accessors in this context seems to be an oversight rather
than a design decision, but maybe someone knows better ...

  In particular, if one makes these changes (which I am
pretty certain will not break anything, as the original
code basically mimicked the default methods anyway), it
becomes possible to make drop1() work with mer objects
(Doug Bates's new mixed model code) merely by defining:

terms.mer - function(x, ...) {
attr(x@frame,terms)
}

extractAIC.default - function(fit, scale=0, k=2, ...) {
L - logLik(fit)
edf - attr(L,df)
c(edf,-2*L+2*edf)
}

Adding this definition of extractAIC.default also makes drop1() work
with lme fits ...

Comments?  Should I submit to the bug database as enhancement
request?  Are there any hidden downsides to this?

drawback: a possible very small performance cut for the cases
where the $ access is ok.  But that should not count.

I like the idea {it's a pity that only S3 methods work that way,
because residuals(), terms(), etc... are unfortunately not
general (implicit) S4 generics but just S3 ones..

I'm currently testing your change, plus some more in step().
However, for step() to work automagically there is more needed.
It currently relies in more places on 'object' being a list to
which you can append new components, basically by
  fit - object
  fit$new1 - ...
  fit$new2 - ...

That would have to be changed to something like
  fit - list(obj = object)
  fit$new1 - ...
  fit$new2 - ...
and more changes where 'fit' has to be replaced by 'fit$obj'.
Would that not be of interest as well?

Martin


Ben Bolker

--
Index: add.R
===
--- add.R   (revision 54562)
+++ add.R   (working copy)
@@ -330,7 +330,7 @@
drop1.default - function(object, scope, scale = 0, test=c(none, 
Chisq),
k = 2, trace = FALSE, ...)
{
-tl - attr(object$terms, term.labels)
+tl - attr(terms(object), term.labels)
if(missing(scope)) scope - drop.scope(object)
else {
if(!is.character(scope))
@@ -344,7 +344,7 @@
ans - matrix(nrow = ns + 1L, ncol = 2L,
dimnames =  list(c(none, scope), c(df, AIC)))
ans[1, ] - extractAIC(object, scale, k = k, ...)
-n0 - length(object$residuals)
+n0 - length(residuals(object))
env - environment(formula(object))
for(i in seq(ns)) {
tt - scope[i]
@@ -356,7 +356,7 @@
evaluate = FALSE)
nfit - eval(nfit, envir=env) # was  eval.parent(nfit)
ans[i+1, ] - extractAIC(nfit, scale, k = k, ...)
-if(length(nfit$residuals) != n0)
+if(length(residuals(nfit)) != n0)
stop(number of rows in use has changed: remove missing values?)
}
dfs - ans[1L , 1L] - ans[, 1L]

--
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 

Re: [Rd] request for patch in drop1 (add.R)

2011-02-23 Thread Ben Bolker
On 11-02-23 06:12 PM, Prof Brian Ripley wrote:
 residuals() and $residuals are often very different:  residuals() is
 generic, but even the default method is *not* simple extraction. Their
 values can be of different lengths: think about an lm fit with na.action
 = na.exclude.  That is precisely the sort of thing the tests in add.R
 were designed to detect, hence the use of $residuals.
 
 None of this is used in drop1()!  One of the places $residuals was used
 in that file was the default method for drop1(), others being step() and
 the default method for add1().  As default methods these have to
 continue to work for any class of object that has previously been thrown
 at them over the last 10+ years, and even all CRAN packages will not be
 a good enough test suite.
 
 In any case, the current code in R-devel makes use of the new generic
 nobs(), which is intended to help sort out the many versions of
 functions called 'BIC in packages, but is also useful here.  (It is
 still under development.)
 
 terms() is also generic so there is also some danger that its
 substitution could give an inappropriate result.  But as it used in
 several other places in add.R the breakage will probably occur elsewhere
 already.

  Thanks Prof. Ripley.
  (I will say that, while I understand why residuals(x) and x$residuals
could be different, I am happy that a more transparent form of coding is
being introduced ...)
 
 On Wed, 23 Feb 2011, Martin Maechler wrote:
 
 Ben Bolker bbol...@gmail.com
 on Wed, 23 Feb 2011 09:14:37 -0500 writes:

   By changing three lines in drop1 from access based on $
 to access based on standard accessor methods (terms() and
 residuals()), it becomes *much* easier to extend drop1 to
 work with other model types.  The use of $ rather than
 accessors in this context seems to be an oversight rather
 than a design decision, but maybe someone knows better ...

   In particular, if one makes these changes (which I am
 pretty certain will not break anything, as the original
 code basically mimicked the default methods anyway), it
 becomes possible to make drop1() work with mer objects
 (Doug Bates's new mixed model code) merely by defining:

 terms.mer - function(x, ...) {
 attr(x@frame,terms)
 }

 extractAIC.default - function(fit, scale=0, k=2, ...) {
 L - logLik(fit)
 edf - attr(L,df)
 c(edf,-2*L+2*edf)
 }

 Adding this definition of extractAIC.default also makes drop1() work
 with lme fits ...

 Comments?  Should I submit to the bug database as enhancement
 request?  Are there any hidden downsides to this?

 drawback: a possible very small performance cut for the cases
 where the $ access is ok.  But that should not count.

 I like the idea {it's a pity that only S3 methods work that way,
 because residuals(), terms(), etc... are unfortunately not
 general (implicit) S4 generics but just S3 ones..

 I'm currently testing your change, plus some more in step().
 However, for step() to work automagically there is more needed.
 It currently relies in more places on 'object' being a list to
 which you can append new components, basically by
   fit - object
   fit$new1 - ...
   fit$new2 - ...

 That would have to be changed to something like
   fit - list(obj = object)
   fit$new1 - ...
   fit$new2 - ...
 and more changes where 'fit' has to be replaced by 'fit$obj'.
 Would that not be of interest as well?

 Martin


 Ben Bolker


 --
 Index: add.R
 ===
 --- add.R(revision 54562)
 +++ add.R(working copy)
 @@ -330,7 +330,7 @@
 drop1.default - function(object, scope, scale = 0,
 test=c(none, Chisq),
 k = 2, trace = FALSE, ...)
 {
 -tl - attr(object$terms, term.labels)
 +tl - attr(terms(object), term.labels)
 if(missing(scope)) scope - drop.scope(object)
 else {
 if(!is.character(scope))
 @@ -344,7 +344,7 @@
 ans - matrix(nrow = ns + 1L, ncol = 2L,
 dimnames =  list(c(none, scope), c(df, AIC)))
 ans[1, ] - extractAIC(object, scale, k = k, ...)
 -n0 - length(object$residuals)
 +n0 - length(residuals(object))
 env - environment(formula(object))
 for(i in seq(ns)) {
 tt - scope[i]
 @@ -356,7 +356,7 @@
 evaluate = FALSE)
 nfit - eval(nfit, envir=env) # was  eval.parent(nfit)
 ans[i+1, ] - extractAIC(nfit, scale, k = k, ...)
 -if(length(nfit$residuals) != n0)
 +if(length(residuals(nfit)) != n0)
 stop(number of rows in use has changed: remove missing values?)
 }
 dfs - ans[1L , 1L] - ans[, 1L]


 --
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel