On Jul 23, 2009, at 4:18 PM, Alexis Maluendas wrote:
Hi R experts,
I need to know how to calculate a weighted mean by group in a data frame.
I have tried with the aggregate() function:
data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5)) -> d
aggregate(d$x,by=list(d$g),weighted.mean,w=d$w)
Generating the following error:
Error in FUN(X[[1L]], ...) : 'x' and 'w' must have the same length
Thanks in advance
Did you not notice the error message when creating the data frame:
> d <- data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
Error in data.frame(x = c(15, 12, 3, 10, 10), g = c(1, 1, 1, 2, 2, 3, :
arguments imply differing number of rows: 5, 7
You have 5 elements in 'x' and 7 in each of 'g' and 'w'...
In addition, you are passing all 7 elements in d$w to each of the
subsets created by d$g, hence you are getting the aggregate() error
message.
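
A minimal sketch of why that happens, assuming the completed seven-row data frame used later in this message. The names x_by_group, lengths_x, length_w and res are illustrative only. aggregate() hands FUN one group's 'x' values at a time, while w = d$w is forwarded unchanged through '...':

```r
# Completed data (the last two x values are the ones added later in this reply)
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Each group receives only its own x values ...
x_by_group <- split(d$x, d$g)
lengths_x  <- sapply(x_by_group, length)   # group sizes: 3, 2, 2

# ... but the weights vector is never subset, so it keeps all 7 elements
length_w <- length(d$w)

# Calling weighted.mean() on one group's x with the full weights vector
# reproduces the length-mismatch error; try() captures it instead of stopping
res <- try(weighted.mean(x_by_group[[1]], w = d$w), silent = TRUE)
inherits(res, "try-error")
```

So even with a valid data frame, the original aggregate() call would still fail, because no group has 7 elements.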
This is one of those cases where you may be better served by using
split() directly to break up the data frame into groups and then
applying sapply() over the subsets:
# I am adding data here to create the data frame
d <- data.frame(x=c(15,12,3,10,10,12,12),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
> d
   x g w
1 15 1 2
2 12 1 3
3  3 1 1
4 10 2 5
5 10 2 5
6 12 3 2
7 12 3 5
> split(d, d$g)
$`1`
   x g w
1 15 1 2
2 12 1 3
3  3 1 1

$`2`
   x g w
4 10 2 5
5 10 2 5

$`3`
   x g w
6 12 3 2
7 12 3 5
> sapply(split(d, d$g), function(x) weighted.mean(x$x, w = x$w))
   1    2    3
11.5 10.0 12.0
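
As a cross-check on those values, the weighted mean per group can also be written out from its definition, sum(w * x) / sum(w), using tapply() on the numerator and denominator separately. This is only a sketch; num, den and wm are illustrative names:

```r
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Numerator: sum of weight * value within each group
num <- tapply(d$x * d$w, d$g, sum)

# Denominator: sum of the weights within each group
den <- tapply(d$w, d$g, sum)

# Element-wise ratio gives the weighted mean for each group
wm <- num / den
wm
```

For group 1 this is (15*2 + 12*3 + 3*1) / (2 + 3 + 1) = 69 / 6 = 11.5, which agrees with the sapply() result.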
See ?split, which is used by tapply(), which in turn is used in
aggregate().
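
If you would rather stay with tapply() itself, one possible variant (a sketch, not the only way) is to group the row indices instead of 'x', so that the function can subset both 'x' and 'w' consistently. wm_idx is an illustrative name:

```r
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Group the row numbers by g; inside the function, i holds the rows of one
# group, so x and w are subset together and always have the same length
wm_idx <- tapply(seq_len(nrow(d)), d$g,
                 function(i) weighted.mean(d$x[i], w = d$w[i]))
wm_idx
```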
HTH,
Marc Schwartz