I have a data.frame with ~250 observations (rows) in each of ~50 categories (columns). I would like to run t-tests on subsets of the observations within each column, where the subsets are defined by grouping factors held in other columns of the data.frame.

My data.frame looks something like this:

x<-data.frame(matrix(rnorm(200,mean=5,sd=.5),nrow=20))
colnames(x)<-c("site", "status", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8")
x$site<-as.factor(rep(c("A", "A", "B", "B", "C"), 4))
x$status<-as.factor(rep(c("D", "L"), 10))

I want to do t.tests on the numeric observations within the data.frame by "site" and by "status":

t.test(x[x$site == "A" & x$status == "D",]$X1, x[x$site == "A" & x$status == "L",]$X1)
t.test(x[x$site == "B" & x$status == "D",]$X1, x[x$site == "B" & x$status == "L",]$X1)
t.test(x[x$site == "C" & x$status == "D",]$X1, x[x$site == "C" & x$status == "L",]$X1)

t.test(x[x$site == "A" & x$status == "D",]$X2, x[x$site == "A" & x$status == "L",]$X2)
t.test(x[x$site == "B" & x$status == "D",]$X2, x[x$site == "B" & x$status == "L",]$X2)
t.test(x[x$site == "C" & x$status == "D",]$X2, x[x$site == "C" & x$status == "L",]$X2)

etc...

I know I must be able to do this more efficiently using a loop and one of the apply functions, e.g. something like this:

k <- length(levels(x$site))
for (i in 1:k)
{
  site <- levels(x$site)[i]
  x1 <- x[x$site == site, ]
  results[i] <- apply(x1, 2, function(x1) {t.test(x1[x1$status == "D",], x1[x1$status == "L",])})
  results
}

But I can't figure out how to do the apply function correctly...
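
For reference, one way the loop might be made to work is sketched here. apply() coerces the data.frame to a matrix (a character matrix in this case, because of the factor columns) and hands each column to the function as a plain vector, so x1$status is not available inside it. The sketch instead iterates over the names of the numeric columns with sapply(..., simplify = FALSE). The set.seed() call and the names vars and s are assumptions added for illustration; the example data are built as in the post.

```r
# Example data, built as in the post (seed added only for reproducibility)
set.seed(1)
x <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))
colnames(x) <- c("site", "status", paste0("X", 1:8))
x$site <- as.factor(rep(c("A", "A", "B", "B", "C"), 4))
x$status <- as.factor(rep(c("D", "L"), 10))

# Names of the numeric columns to test (hypothetical helper name)
vars <- setdiff(names(x), c("site", "status"))

# One list element per site, each holding one "htest" object per column
results <- list()
for (s in levels(x$site)) {
  x1 <- x[x$site == s, ]
  results[[s]] <- sapply(vars, function(v) {
    t.test(x1[x1$status == "D", v], x1[x1$status == "L", v])
  }, simplify = FALSE)
}

# Example: p-value for X1 at site A
results$A$X1$p.value
```

Each test result can then be reached by site and column name, e.g. results$B$X3.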

I also wonder whether there's a way to use an apply-type function and avoid the loop altogether.
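
A loop-free variant might look like the following sketch, which assumes that only the p-values are wanted. It splits the data.frame by site with split() and then nests two sapply() calls, one over sites and one over the numeric columns. The names vars, by_site and pvals, and the set.seed() call, are assumptions added for illustration.

```r
# Example data, built as in the post (seed added only for reproducibility)
set.seed(1)
x <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))
colnames(x) <- c("site", "status", paste0("X", 1:8))
x$site <- as.factor(rep(c("A", "A", "B", "B", "C"), 4))
x$status <- as.factor(rep(c("D", "L"), 10))

# Names of the numeric columns to test (hypothetical helper name)
vars <- setdiff(names(x), c("site", "status"))

# One data.frame per site
by_site <- split(x, x$site)

# Outer sapply: sites; inner sapply: numeric columns.
# The result is a matrix of p-values, columns = sites, rows = variables.
pvals <- sapply(by_site, function(d) {
  sapply(d[vars], function(col) {
    t.test(col[d$status == "D"], col[d$status == "L"])$p.value
  })
})

pvals
```

To keep the full test objects rather than just the p-values, the inner and outer calls could be swapped for lapply(), which returns a nested list.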

Thanks in advance!

Ali

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.