I have been using ddply to do aggregation, and I frequently define a single aggregation function that I use to aggregate over different groups. For example,

    require(plyr)

    dat <- data.frame(x = sample(3, 100, replace = TRUE),
                      y = sample(3, 100, replace = TRUE),
                      z = rnorm(100))

    f <- function(x) {
      data.frame(mean.z = mean(x$z), sd.z = sd(x$z))
    }

    ddply(dat, "x", f)
    ddply(dat, "y", f)
    ddply(dat, c("x", "y"), f)

I recently discovered the data.table package, which dramatically speeds up the aggregation:

    require(data.table)

    dat <- data.table(dat)

    dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x)]
    dat[, list(mean.z = mean(z), sd.z = sd(z)), list(y)]
    dat[, list(mean.z = mean(z), sd.z = sd(z)), list(x,y)]

But I can't figure out how to save the aggregation function "list(mean.z = mean(z), sd.z = sd(z))" as a variable that I can reuse, similar to the function "f" above. Can someone please explain how to do that?

Thanks.

- Elliot

-- 
Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
134 Mount Auburn Street | Cambridge, MA | 02138
Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com