Spencer's solution is considerably less efficient than using duplicated() and subscripting: in a small example with 3 columns and 10000 rows, it took 5 times as long on my Windows setup.
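For anyone who wants to check the timing on their own machine, here is a rough sketch. It is not the exact code behind the figure above: the data frame, its column names, the seed, and the number of distinct keys (1000) are all made up for illustration, and the ratio you see will depend on your R version and platform.

```r
# Illustrative data: 3 columns, 10000 rows. The names, the seed, and the
# number of distinct keys (1000) are assumptions, not the original test case.
set.seed(1)
n  <- 10000
DF <- data.frame(a = sample(1000, n, replace = TRUE),
                 b = seq_len(n),
                 c = rnorm(n))

# Approach 1: aggregate(), keeping the first value of each column per key.
t_agg <- system.time(
  res_agg <- aggregate(DF[2:3], DF[1], function(x) x[1])
)

# Approach 2: duplicated() plus subscripting keeps the first row per key.
t_dup <- system.time(
  res_dup <- DF[!duplicated(DF$a), ]
)

# Both results should have exactly one row per distinct value of 'a'.
stopifnot(nrow(res_agg) == length(unique(DF$a)),
          nrow(res_dup) == length(unique(DF$a)))

# Elapsed seconds for each approach (values vary by machine).
print(c(aggregate = t_agg[["elapsed"]], duplicated = t_dup[["elapsed"]]))
```

Note that aggregate() returns its rows sorted by the key, while the duplicated() version keeps the original row order, so the two results match only after reordering.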
The reason is that aggregate() is essentially a wrapper for tapply(), and
tapply() loops in R, whereas duplicated() loops in C (and uses hashing, I
believe).

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
> Sent: Thursday, December 23, 2004 9:06 AM
> To: Göran Broström
> Cc: Rudi Alberts; [email protected]
> Subject: Re: [R] subsetting a data.frame to the 'unique' of a column
>
>       What about "aggregate"?
>
> DF <- data.frame(a=c(1,1,2), b=1:3, c=letters[1:3])
> aggregate(DF[2:3], DF[1], function(x)x[1])
>   a b c
> 1 1 1 1
> 2 2 3 3
>
>       hope this helps.  spencer graves
>
> Göran Broström wrote:
>
> >On Thu, Dec 23, 2004 at 11:28:31AM -0800, Rudi Alberts wrote:
> >
> >>Hi,
> >>
> >>I often run into this problem:
> >>I have a data.frame with one column containing entries that are not
> >>unique. What I then want is a subset of the data.frame in which
> >>the entries in that column have become the 'unique' of the original
> >>column.
> >>Normally I program around it by taking the unique of the column and
> >>making a new data.frame with it and filling the rest of the data.
> >>
> >>(By the way, when moving to the smaller data.frame for example 5 rows
> >>with the same value in that column will be replaced by one row for that
> >>value. I don't mind which of the rows now..)
> >>
> >>something like this, however, this gives me the complete df.
> >>
> >>df[df$colname %in% unique(df$colname),]
> >>
> >>or this, which doesn't work
> >>
> >>df[df$colname == unique(df$colname),]
> >>
> >
> > Use 'duplicated':
> >
> >>df[!duplicated(df$colname), ]
> >
>
> --
> Spencer Graves, PhD, Senior Development Engineer
> O:  (408)938-4420;  mobile:  (408)655-4567
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
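To see why the two attempts in the original question behave as described, and why the duplicated() version does what was wanted, here is a small sketch on toy data. The data frame and its 'value' column are hypothetical; 'colname' is simply the placeholder name used in the thread.

```r
# Hypothetical toy data; 'colname' matches the placeholder used in the thread.
df <- data.frame(colname = c("x", "x", "y", "z", "z", "z"),
                 value   = 1:6)

# Every value is, by construction, a member of its own set of unique values,
# so this index is TRUE for all rows and returns the complete data frame.
all_rows <- df[df$colname %in% unique(df$colname), ]

# '==' compares element by element and recycles the shorter vector
# (here "x", "y", "z"), so which rows survive depends on how the two
# vectors happen to line up, not on uniqueness.
recycled <- df[df$colname == unique(df$colname), ]

# duplicated() flags the second and later occurrences of each value, so
# negating it keeps the first row for each distinct value.
first_rows <- df[!duplicated(df$colname), ]

print(nrow(all_rows))
print(recycled)
print(first_rows)
```

In this toy case the '==' version keeps only rows 1 and 6 and loses "y" altogether, while the duplicated() version keeps rows 1, 3 and 4, one per distinct value.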
