> In this simple example, it took less than half a second to generate the
> result. That is on a 2.93 GHz MacBook Pro.
>
> So, for your data, the code would look something like this:
>
> system.time(DF.new <- do.call(rbind,
>     lapply(split(patch_summary, patch_summary$UniqueID),
>            function(x) x[sample(nrow(x), 1), ])))
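A minimal self-contained sketch of the quoted split/lapply/rbind approach, using invented toy data in place of the original patch_summary (the UniqueID column and values here are assumptions for illustration only):

```r
# Toy data standing in for patch_summary; UniqueID is the grouping column
set.seed(1)
patch_summary <- data.frame(
  UniqueID = rep(c("a", "b", "c"), each = 4),
  value    = rnorm(12)
)

# One random row per UniqueID: split into groups, sample one row from
# each group, then rbind the pieces back into a single data frame
DF.new <- do.call(rbind,
  lapply(split(patch_summary, patch_summary$UniqueID),
         function(x) x[sample(nrow(x), 1), ]))

nrow(DF.new)  # 3: one sampled row per group
```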
For large data, you can make it even faster with

sample_rows <- function(df, n) {
  df[sample(nrow(df), n), ]
}

library(plyr)
system.time(DF.new <- ddply(DF, "ID", sample_rows, n = 1))

ddply uses some tricks to avoid copying DF which really make a difference for large data (unfortunately they also increase the overhead, so it is currently slower for small data).

Hadley

--
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
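[Editor's example] The ddply snippet above is not self-contained (DF is the reader's own data). A runnable sketch with invented toy data, assuming the plyr package is installed:

```r
library(plyr)  # provides ddply

# Helper from the post: sample n rows from a data frame
sample_rows <- function(df, n) {
  df[sample(nrow(df), n), ]
}

# Toy data frame; the ID column is a stand-in for the real grouping variable
set.seed(1)
DF <- data.frame(ID = rep(1:3, each = 5), x = runif(15))

# ddply splits DF by ID, applies sample_rows (with n = 1) to each piece,
# and reassembles the results into one data frame
DF.new <- ddply(DF, "ID", sample_rows, n = 1)

nrow(DF.new)  # 3: one sampled row per ID
```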