I'm trying to drop all rows except the ones with the most recent year. So I split the data frame by NPERMNO and keep just the last record of each group.

datg = t(sapply(split(datgic, datgic$NPERMNO, drop=TRUE), function(x){ return( x[nrow(x),] ) }))

I get something like this:

      GVKEY NPERMNO      GIC year
10001 12994   10001 55102010 2007
10002 19049   10002 40101015 2007
10009 16739   10009 40101010 1999

Has this been made into a proper data frame? How come the row numbers are not 1, 2, 3, 4...?

Thank you so much, and I would really appreciate any help!

Mike
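
A minimal, self-contained sketch of the situation, in case it helps to reproduce the behaviour. The data frame `toy` and all of its values are made up for illustration; only the column names come from the output above. The helper `last_row` is also a name introduced here. The sketch suggests that `t(sapply(...))` returns a list-mode matrix rather than a data frame, and it shows `do.call(rbind, lapply(...))` as one possible alternative, not necessarily the best one.

```r
# Made-up data; only the column names are taken from the post.
toy <- data.frame(
  GVKEY   = c(12994, 12994, 19049, 16739, 16739),
  NPERMNO = c(10001, 10001, 10002, 10009, 10009),
  GIC     = c(55102010, 55102010, 40101015, 40101010, 40101010),
  year    = c(2006, 2007, 2007, 1998, 1999)
)

# Assumption: sort by year so the last row of each group is the most recent.
toy <- toy[order(toy$NPERMNO, toy$year), ]

# Hypothetical helper: keep the last row of a data frame.
last_row <- function(x) x[nrow(x), ]
groups <- split(toy, toy$NPERMNO, drop = TRUE)

# The approach from the post: sapply() simplifies the one-row data frames
# into a matrix whose cells are list elements, not into a data frame.
m <- t(sapply(groups, last_row))
is.data.frame(m)  # FALSE
is.matrix(m)      # TRUE

# One alternative: row-bind the one-row data frames, which keeps a data frame.
datg <- do.call(rbind, lapply(groups, last_row))
is.data.frame(datg)  # TRUE

# Row names are carried over rather than renumbered; reset them to 1, 2, 3, ...
rownames(datg) <- NULL
datg
```

If the original rows are already ordered by year within each NPERMNO, the `order()` step should be unnecessary.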