Hello all,

I'm trying to transform a data frame by grouping its rows on the values in one column, ordering each group by another column, and then keeping only the first row of each group.
I'd like to convert a data frame like this:

      x  y  z
      1 10 20
      1 11 19
      2 12 18
      4 13 17

into one with three rows, like this, where I've discarded one row:

        x  y  z
      1 1 11 19
      2 2 12 18
      4 4 13 17

I've got a solution using aggregate, but it gets very slow with any volume of data. The performance seems mostly IO bound, and it never finishes on a data set of about 6 MB.

Here's how I'm currently trying to do this:

      d = data.frame(x=c(1,1,2,4),y=c(10,11,12,13),z=c(20,19,18,17))
      d.ordered = d[order(-d$y),]
      aggregate(d.ordered,by=list(key=d.ordered$x),FUN=function(x){x[1]})

I've tried to use split and unsplit, but unsplit complained about duplicate row names when reassembling the sub frames.

Thanks for your suggestions,
-james

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
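One possible alternative to the aggregate call above, offered as a minimal sketch rather than a tested fix for the 6 MB case: sort once, then use base R's duplicated() to keep the first row seen for each value of x. This swaps in a different technique from the aggregate approach in the post. The name first.rows is introduced here only for illustration, and the sort by x in addition to descending y is an assumption made so that the result comes out ordered by group.

```r
# Example data from the post
d <- data.frame(x = c(1, 1, 2, 4),
                y = c(10, 11, 12, 13),
                z = c(20, 19, 18, 17))

# Sort by x, and within each x by descending y, so the row to keep
# (largest y) is the first one encountered for each group
d.ordered <- d[order(d$x, -d$y), ]

# duplicated() flags every repeat of an x value after its first appearance;
# negating it keeps exactly one row per group, the first in sort order
first.rows <- d.ordered[!duplicated(d.ordered$x), ]

print(first.rows)
```

On the example data this keeps the rows (1, 11, 19), (2, 12, 18) and (4, 13, 17), with the original row names carried over. Because it needs one sort and one vectorised pass instead of a function call per group and per column, it may scale better than aggregate, though that is not measured here.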