Re: [R] problems with by()

Thomas Lumley Wed, 29 Jan 2003 16:20:07 -0800

On Wed, 29 Jan 2003, Heberto Ghezzo wrote:

> Hello, another problem.
>  > x<-rep(1,10)
>  > y<-rep(c(1,2),c(5,5))
>  > z<-seq(1:10)
>  > ab<-data.frame(x,y,z)
> #
>     now I want to do some work by the value of 'y'
>  > by(ab,y,mean)
> y: 1
> x y z
> 1 1 3
> ------------------------------------------------------------
> y: 2
> x y z
> 1 2 8
> #
>     I do not want all the means, only the mean of 'z'
>  > by(ab,y,function(x) mean(z))
> y: 1
> [1] 5.5
> ------------------------------------------------------------
> y: 2
> [1] 5.5
>  > by(ab,y,function(x) mean(z,data=x))
> y: 1
> [1] 5.5
> ------------------------------------------------------------
> y: 2
> [1] 5.5
>  >
> #
>     so, how can I get the function(x) to be applied to each level
> of the index variable y.
> Actually I use my own function but the same happens, it is applied to all
> the data and there is no partition of the data according to index


The function you are applying is

        function(x) mean(z)

That is, no matter what x is supplied, it calculates the mean of the
variable z, which is in your global workspace. The mean of z is 5.5.
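
A minimal sketch of the point above, reusing the data from the question. The names ignores_arg, extra_arg and first_group are hypothetical, chosen only for illustration. It also covers the second attempt in the question: mean() has no `data' argument, so `data=x' is absorbed by `...' and ignored.

```r
# Same data as in the question.
x  <- rep(1, 10)
y  <- rep(c(1, 2), c(5, 5))
z  <- 1:10
ab <- data.frame(x, y, z)

# Hypothetical name: this function never uses its argument, so `z`
# is looked up in the global workspace.
ignores_arg <- function(x) mean(z)

# Hypothetical name: mean() has no `data` argument, so `data = x`
# falls into `...` and is ignored. `z` is still the global one.
extra_arg <- function(x) mean(z, data = x)

# The piece of `ab` that by() would hand over for y == 1.
first_group <- ab[ab$y == 1, ]

ignores_arg(first_group)   # 5.5, the mean of all ten global z values
extra_arg(first_group)     # 5.5 again
mean(first_group$z)        # 3, the mean of z within the group only
```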

What you want is
        function(x) mean(x$z)
That is, take a supplied data frame and compute the mean of its `z'
column.

I try to use argument names that remind me what is happening in functions
like by():

        by(ab,y, function(df) mean(df$z))
or even
        by(ab, y, function(subset) mean(subset$z))
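
As a runnable check of the corrected call, here is a small sketch using the data from the question. Removing the global x, y and z after building the data frame is an addition for illustration: it shows that the function depends only on the piece of `ab' it is given. The tapply() line is an alternative that is not part of the original advice, and the names group_means and z_means are hypothetical.

```r
# Build the data frame from the question, then drop the global copies
# so that nothing can be picked up from the workspace by accident.
x  <- rep(1, 10)
y  <- rep(c(1, 2), c(5, 5))
z  <- 1:10
ab <- data.frame(x, y, z)
rm(x, y, z)

# by() splits `ab` on ab$y and passes each piece to the function as `df`.
group_means <- by(ab, ab$y, function(df) mean(df$z))
print(group_means)         # 3 for y == 1, 8 for y == 2

# Alternative when only one column is summarised: tapply() on that column.
z_means <- tapply(ab$z, ab$y, mean)
print(z_means)
```

With the globals removed, the original function(x) mean(z) would fail with an "object not found" error rather than quietly returning 5.5 for every group, which makes the mistake easier to spot.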

> Do not tell me that this version of R is completely buggy, I was waiting
> for the 1.7 to be out before upgrading

I think it's fair to characterise this sort of comment as `unhelpful'.

        -thomas

