If Martin's proposal is accepted, does
that mean that the median method for
data frames would be something like:

function (x, ...)
{
        stop(paste("you probably mean to use the command: sapply(",
                deparse(substitute(x)), ", median)", sep=""))
}
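
A minimal sketch of what a caller would see from such a method. The name median_df_stub is a hypothetical stand-in (so that nothing in base R is overwritten), and tryCatch() is used only to capture the message rather than abort:

```r
# Hypothetical stand-in name; in the proposal this body would be the
# median method for data frames.
median_df_stub <- function(x, ...)
{
    stop(paste("you probably mean to use the command: sapply(",
               deparse(substitute(x)), ", median)", sep = ""))
}

df3.2 <- data.frame(1:3, 7:9)

# Capture the error text instead of aborting the script
msg <- tryCatch(median_df_stub(df3.2),
                error = function(e) conditionMessage(e))
print(msg)
# [1] "you probably mean to use the command: sapply(df3.2, median)"
```

Because of deparse(substitute(x)), the hint names the object the user actually passed.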

Pat


On 29/04/2011 15:25, Martin Maechler wrote:
Paul Johnson <pauljoh...@gmail.com>
     on Thu, 28 Apr 2011 00:20:27 -0500 writes:

     >  On Wed, Apr 27, 2011 at 12:44 PM, Patrick Burns
     >  <pbu...@pburns.seanet.com>  wrote:
     >>  Here are some data frames:
     >>
     >>  df3.2 <- data.frame(1:3, 7:9)
     >>  df4.2 <- data.frame(1:4, 7:10)
     >>  df3.3 <- data.frame(1:3, 7:9, 10:12)
     >>  df4.3 <- data.frame(1:4, 7:10, 10:13)
     >>  df3.4 <- data.frame(1:3, 7:9, 10:12, 15:17)
     >>  df4.4 <- data.frame(1:4, 7:10, 10:13, 15:18)
     >>
     >>  Now here are some commands and their answers:

     >>  > median(df4.4)
     >>  [1]  8.5 11.5
     >>  > median(df3.2[c(1,2,3),])
     >>  [1] 2 8
     >>  > median(df3.2[c(1,3,2),])
     >>  [1]  2 NA
     >>  Warning message:
     >>  In mean.default(X[[2L]], ...) :
     >>    argument is not numeric or logical: returning NA
     >>
     >>
     >>
     >>  The sessionInfo is below, but it looks
     >>  to me like the present behavior started
     >>  in 2.10.0.
     >>
     >>  Sometimes it gets the right answer.  I'd
     >>  be grateful to hear how it does that -- I
     >>  can't figure it out.
     >>

     >  Hello, Pat.

     >  Nice poetry there!  I think I have an actual answer, as opposed to the
     >  usual crap I spew.

     >  I would agree if you said median.data.frame ought to be written to
     >  work columnwise, similar to mean.data.frame.

     >  apply and sapply  always give the correct answer

     >  > apply(df3.3, 2, median)
     >    X1.3   X7.9 X10.12
     >       2      8     11

     [...........]

exactly

     >  mean.data.frame is now implemented as

     >  mean.data.frame <- function(x, ...) sapply(x, mean, ...)

exactly.

My personal opinion is that  mean.data.frame() should never have
been written.
People should know, or learn, to use apply functions for such a
task.
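
A short sketch of that apply-style approach, assuming a small all-numeric data frame (the column names a, b, c are made up for illustration):

```r
# Hypothetical data frame with the same values as df3.3 in the quoted message
df <- data.frame(a = 1:3, b = 7:9, c = 10:12)

# Column-wise summaries using only sapply(); no data-frame-specific
# method is needed for any of these.
col_medians <- sapply(df, median)
col_sds     <- sapply(df, sd)
col_maxes   <- sapply(df, max)

print(col_medians)
```

The same pattern covers any function that takes a single vector, which is the argument against adding one data frame method per summary statistic.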

The unfortunate fact that mean.data.frame() exists makes people
think that median.data.frame() should too,
and then

   var.data.frame()
    sd.data.frame()
   mad.data.frame()
   min.data.frame()
   max.data.frame()
   ...
   ...

all just in order *not* to have to know  sapply()
????

No, rather not.

My vote is for deprecating  mean.data.frame().

Martin


--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
