On Jul 23, 2009, at 4:18 PM, Alexis Maluendas wrote:

Hi R experts,

I need to know how to calculate a weighted mean by group in a data frame. I have
tried the aggregate() function:

data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5)) -> d
aggregate(d$x,by=list(d$g),weighted.mean,w=d$w)

Generating the following error:

Error in FUN(X[[1L]], ...) : 'x' and 'w' must have the same length

Thanks in advance


Did you not notice the error message when creating the data frame:

> d <- data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
Error in data.frame(x = c(15, 12, 3, 10, 10), g = c(1, 1, 1, 2, 2, 3,  :
  arguments imply differing number of rows: 5, 7

You have 5 elements in 'x' and 7 in each of 'g' and 'w'...

In addition, you are passing all 7 elements of d$w, unchanged, to each of the subsets of d$x created by d$g. No subset has 7 elements, so the lengths never match, hence the error message you get from aggregate().
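To make the length mismatch concrete, here is a small sketch. It assumes a corrected data frame with seven rows in every column (two values added to x), and the names group_sizes and res are purely illustrative:

```r
# Assumed corrected data: seven elements in every column
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Number of x values that each group hands to FUN
group_sizes <- tapply(d$x, d$g, length)
group_sizes          # 3, 2 and 2 for groups 1, 2 and 3

# The weights supplied via 'w = d$w' are not subset by group
length(d$w)          # 7

# Reproducing the mismatch by hand for group 1:
# 3 values of x against 7 weights
res <- try(weighted.mean(d$x[d$g == 1], w = d$w), silent = TRUE)
inherits(res, "try-error")   # TRUE
```

The extra argument is simply handed on as-is to every call of the function, so the weights have to be subset together with the values.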

This is one of those cases where you may be better served by using split() directly to break up the data frame into groups and then use sapply() over the subsets:

# I am adding two values to 'x' here so that all three columns have 7 elements
d <- data.frame(x=c(15,12,3,10,10,12,12),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))

> d
   x g w
1 15 1 2
2 12 1 3
3  3 1 1
4 10 2 5
5 10 2 5
6 12 3 2
7 12 3 5

> split(d, d$g)
$`1`
   x g w
1 15 1 2
2 12 1 3
3  3 1 1

$`2`
   x g w
4 10 2 5
5 10 2 5

$`3`
   x g w
6 12 3 2
7 12 3 5



> sapply(split(d, d$g), function(x) weighted.mean(x$x, w = x$w))
   1    2    3
11.5 10.0 12.0


See ?split, which is used by tapply(), which in turn is used in aggregate().
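As a cross-check, two other ways of getting the same numbers are sketched below. They assume the same seven-row data frame as above; idx, wm_idx and wm_manual are illustrative names only. The first splits the row indices rather than the data frame, the second uses the definition of the weighted mean directly, sum(x * w) / sum(w) within each group:

```r
# Same corrected data as in the example above
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Variant 1: split the row indices by group, then subset x and w together
idx <- split(seq_len(nrow(d)), d$g)
wm_idx <- sapply(idx, function(i) weighted.mean(d$x[i], w = d$w[i]))

# Variant 2: weighted mean by definition, group-wise sums via tapply()
wm_manual <- tapply(d$x * d$w, d$g, sum) / tapply(d$w, d$g, sum)

# Both should agree with the sapply(split(d, d$g), ...) result: 11.5, 10, 12
all.equal(as.numeric(wm_idx), as.numeric(wm_manual))   # TRUE
```

In both variants the weights are subset by the same grouping as the values, which is what the aggregate() call above did not do.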

HTH,

Marc Schwartz

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.