On Thu, 2006-10-05 at 15:44 -0700, Kaom Te wrote:
Hello,
I'm a novice user trying to figure out how to retain NA aggregate
values. For example, given a data frame with data for 3 of the 4
possible factor colors (orange is omitted from the data frame), I want
to calculate the average height by color, but I'd like to retain the
knowledge that orange is a possible factor level; it's just missing.
Here is the example code:
data <- data.frame(color = factor(c("blue","red","red","green","blue"),
                                  levels = c("blue","red","green","orange")),
                   height = c(2,8,4,4,5))
aggregate(data$height, list(color = data$color), mean)
  color   x
1  blue 3.5
2   red 6.0
3 green 4.0
Instead I would like to get
   color   x
1   blue 3.5
2    red 6.0
3  green 4.0
4 orange  NA
Is this possible? I've read as much documentation as I can find, but am
unable to find the solution. It seems like something people would need
to do, so I assume it must be built in somewhere. Or do I need to
write my own version of aggregate?
Thanks in advance,
Kaom
If you review the Details section of ?aggregate, you will note:
Empty subsets are removed, ...
Thus, one approach is:
tmp <- tapply(data$height, data$color, mean, na.rm = TRUE)
tmp
  blue    red  green orange
   3.5    6.0    4.0     NA
DF <- data.frame(color = names(tmp), mean.height = tmp,
                 row.names = seq(along = tmp))
DF
   color mean.height
1   blue         3.5
2    red         6.0
3  green         4.0
4 orange          NA
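
If you would rather keep calling aggregate(), another possibility is to merge its result back onto the full set of factor levels. This is only a sketch under the assumptions of the example above; the names agg, all_levels and result are made up for illustration:

```r
# Example data from the original question
data <- data.frame(color = factor(c("blue", "red", "red", "green", "blue"),
                                  levels = c("blue", "red", "green", "orange")),
                   height = c(2, 8, 4, 4, 5))

# aggregate() drops the empty "orange" subset
agg <- aggregate(data$height, list(color = data$color), mean)

# One row per possible level, including the unused one
all_levels <- data.frame(color = levels(data$color))

# Left join: levels with no aggregated value get NA in column x
result <- merge(all_levels, agg, by = "color", all.x = TRUE)

# merge() may reorder the rows, so restore the original level order
result <- result[match(levels(data$color), as.character(result$color)), ]
rownames(result) <- NULL
result
```

The tapply() version is shorter for a single grouping factor; the merge() version may be easier to extend when there are several columns to carry along.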
HTH,
Marc Schwartz
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.