Hi: This question arose a few days ago. There are two simple ways to do this: (i) using ddply in the plyr package and (ii) using the firstobs() function in the doBy package.
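For anyone who wants to run the two examples, here is a minimal sketch that rebuilds the sample data from the original question as a data frame named x, which is the name the code in this reply assumes. It also shows a base R option, duplicated(), which is not one of the two approaches listed above but should return the same rows without any extra packages. The name first_rows is purely illustrative.

```r
# Sample data copied from the original question; `x` is the name the
# examples in this reply use for it.
x <- data.frame(
  ID    = c(1, 1, 2, 2, 3, 3),
  Time1 = c(3, 4, 3, 5, 4, 2),
  Time2 = c(4, 7, 5, 6, 5, 8)
)

# Base R alternative (an addition, not one of the two approaches in this reply):
# duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps only the first row for each ID.
first_rows <- x[!duplicated(x$ID), ]
print(first_rows)
```

Because this only indexes the data frame, the original row names (1, 3, 5) are kept, as in the firstobs() result. On a data frame with millions of rows, this vectorised form may also be worth timing against the package-based versions.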
(i)
library(plyr)
ddply(x, .(ID), head, n = 1)
  ID Time1 Time2
1  1     3     4
2  2     3     5
3  3     4     5

(ii)
library(doBy)
x[firstobs(x[, 1]), ]
  ID Time1 Time2
1  1     3     4
3  2     3     5
5  3     4     5

HTH,
Dennis

On Sat, Jan 16, 2010 at 2:04 PM, Bryan M Hangartner <[email protected]> wrote:

> To Whomever is Interested,
>
> I have spent several days searching the web, help files, the R wiki and the
> archives of this mailing list for a solution to this problem, but
> nonetheless I apologize in advance if I have missed something obvious.
>
> The problem is this; I have a 5-column data frame with about 4.2 million
> rows, and want to create a new (and hopefully much smaller) data frame that
> contains only the rows which have a unique value in the first column only.
> In other words, I do not care about the uniqueness of the values in the
> other four rows, only the uniqueness of the entries in the first row. The
> "unique" command does not seem to have this option available, at least based
> on what I've read in the help file.
>
> A simplified example matrix (designated as "traveltimes"):
>
> ID Time1 Time2
>  1     3     4
>  1     4     7
>  2     3     5
>  2     5     6
>  3     4     5
>  3     2     8
>
> When I use a command such as
>
> matches <- unique(traveltimes, incomparables = FALSE, fromLast = FALSE)
>
> I will end up with a 6-row matrix, exactly what I already have. What I
> would like to do is to remove the duplicate values in the column labeled
> "ID" and their associated Time1 and Time2 entries. This will give me a 3x3
> matrix which contains only one instance of each "ID" variable. For the
> purposes of this particular problem, the uniqueness of the Time1 and Time2
> rows is not relevant.
>
> If this question is not clear enough please let me know. Thank you for your
> time.
>
> --
> Bryan Hangartner
> [email protected]
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

