Hi:

This question arose a few days ago. There are two simple ways to do this:
(i) using ddply() in the plyr package and (ii) using the firstobs() function
in the doBy package. In both examples, x is the data frame from your post.

(i)  library(plyr)

> ddply(x, .(ID), head, n = 1)
  ID Time1 Time2
1  1     3     4
2  2     3     5
3  3     4     5

(ii) library(doBy)

> x[firstobs(x[, 1]), ]
  ID Time1 Time2
1  1     3     4
3  2     3     5
5  3     4     5
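
If you would rather avoid an extra package, base R can do the same thing.
This is only a minimal sketch, not one of the two options above: it
assumes the data frame is called x and is rebuilt here from the example
in your post, and the name first_rows is just for illustration.

```r
# Rebuild the example data from the original question
# (called "traveltimes" there; named x here to match the calls above).
x <- data.frame(ID    = c(1, 1, 2, 2, 3, 3),
                Time1 = c(3, 4, 3, 5, 4, 2),
                Time2 = c(4, 7, 5, 6, 5, 8))

# duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps only the first row for each ID.
first_rows <- x[!duplicated(x$ID), ]
first_rows
#   ID Time1 Time2
# 1  1     3     4
# 3  2     3     5
# 5  3     4     5
```

As with firstobs(), the original row names (1, 3, 5) are kept, so you can
see which rows of x were retained.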

HTH,
Dennis

On Sat, Jan 16, 2010 at 2:04 PM, Bryan M Hangartner
<[email protected]> wrote:

> To Whomever is Interested,
>
> I have spent several days searching the web, help files, the R wiki and the
> archives of this mailing list for a solution to this problem, but
> nonetheless I apologize in advance if I have missed something obvious.
>
> The problem is this: I have a 5-column data frame with about 4.2 million
> rows, and want to create a new (and hopefully much smaller) data frame that
> contains only one row for each unique value in the first column.
> In other words, I do not care about the uniqueness of the values in the
> other four columns, only the uniqueness of the entries in the first column.
> The "unique" command does not seem to have this option available, at least
> based on what I've read in the help file.
>
> A simplified example matrix (designated as "traveltimes"):
>
> ID Time1 Time2
> 1    3     4
> 1    4     7
> 2    3     5
> 2    5     6
> 3    4     5
> 3    2     8
>
> When I use a command such as
>
> matches <- unique(traveltimes, incomparables = FALSE, fromLast = FALSE)
>
> I will end up with a 6-row matrix, exactly what I already have. What I
> would like to do is to remove the duplicate values in the column labeled
> "ID" and their associated Time1 and Time2 entries. This will give me a 3x3
> matrix which contains only one instance of each "ID" value. For the
> purposes of this particular problem, the uniqueness of the Time1 and Time2
> columns is not relevant.
>
> If this question is not clear enough please let me know. Thank you for your
> time.
>
>
> --
> Bryan Hangartner
> [email protected]
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
