On 05/01/2010 1:29 PM, Geoffrey Smith wrote:
Hello, does anyone know how to take the mean for a subset of observations?
For example, suppose my data looks like this:
OBS NAME SCORE
1 Tom 92
2 Tom 88
3 Tom 56
4 James 85
5 James 75
6 James 32
7 Dawn 56
8 Dawn 91
9 Clara 95
10 Clara 84
Is there a way to get the mean of the SCORE variable by NAME but only when
the number of observations is equal to 3? In other words, is there a way to
get the mean of the SCORE variable for Tom and James, but not for Dawn and
Clara? Thank you.
You probably want to do it in two steps: first, find which names have
exactly 3 observations and keep only those rows of the dataset; then
compute the mean within each group. This is one way:
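> counts <- table(dataset$NAME)          # number of rows per NAME
> keep <- names(counts)[counts == 3]     # names with exactly 3 rows
> # Call the result 'kept' so that it does not mask base::subset()
> kept <- dataset[dataset$NAME %in% keep, ]
> tapply(kept$SCORE, kept$NAME, mean)    # mean SCORE per NAME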
   Clara     Dawn    James      Tom
      NA       NA 64.00000 78.66667
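
The NA entries for Clara and Dawn appear because NAME is a factor:
taking a subset of the rows leaves the unused levels in place, and
tapply() reports an empty group as NA. A minimal sketch of one way to
suppress them, assuming NAME is a factor as the output above suggests,
is to call droplevels() on the grouping variable. The data frame is
rebuilt here from the example in the question, and the names kept and
result are only illustrative.

```r
# Rebuild the example data from the question; NAME is made a factor
# explicitly (an assumption, inferred from the NA entries in the output).
dataset <- data.frame(
  OBS   = 1:10,
  NAME  = factor(c("Tom", "Tom", "Tom", "James", "James", "James",
                   "Dawn", "Dawn", "Clara", "Clara")),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

counts <- table(dataset$NAME)                  # rows per NAME
keep   <- names(counts)[counts == 3]           # names with exactly 3 rows
kept   <- dataset[dataset$NAME %in% keep, ]    # rows for those names only

# droplevels() removes the now-empty levels, so tapply() returns
# one mean per remaining name and no NA entries.
result <- tapply(kept$SCORE, droplevels(kept$NAME), mean)
print(result)
```

With the data above this should leave only the James and Tom means.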
Duncan Murdoch