[R] Removing duplicates without a for loop

2012-09-26 Thread wwreith
I have several thousand rows of shipment data imported into R as a data frame, with two columns of particular interest: column 1 is the entry date, and column 2 is the tracking number (column name REQ.NR). Tracking numbers should be unique, but occasionally they aren't because they get entered more than once.

Re: [R] Removing duplicates without a for loop

2012-09-26 Thread Jean V Adams
This might be quicker.

    Para.5C.sorted <- Para.5C[order(Para.5C[, 1]), ]
    Para.5C.final <- Para.5C.sorted[!duplicated(Para.5C.sorted$REQ.NR), ]

If your data are already sorted by date, then you can skip the first step and just run

    Para.5C.final <- Para.5C[!duplicated(Para.5C$REQ.NR), ]

Jean
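A minimal sketch of the sort-then-deduplicate idea above, run on a made-up four-row data frame. Only the names Para.5C and REQ.NR come from the thread; the column name entry.date and all the values are assumptions for illustration.

```r
# Hypothetical stand-in for the poster's data; entry.date and the values
# are invented, only Para.5C and REQ.NR appear in the thread.
Para.5C <- data.frame(
  entry.date = as.Date(c("2012-09-03", "2012-09-01", "2012-09-02", "2012-09-04")),
  REQ.NR     = c("A1", "B2", "A1", "B2"),
  stringsAsFactors = FALSE
)

# Sort by column 1 (the entry date) so the earliest entry for each
# tracking number comes first.
Para.5C.sorted <- Para.5C[order(Para.5C[, 1]), ]

# duplicated() marks every repeat after the first occurrence, so negating
# it keeps exactly one row (the earliest) per tracking number.
Para.5C.final <- Para.5C.sorted[!duplicated(Para.5C.sorted$REQ.NR), ]

print(Para.5C.final)
```

Because the rows are sorted in ascending date order first, this keeps the earliest entry for each tracking number. Whether the earliest or the latest entry is the one wanted is not stated in the question as shown here.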

Re: [R] Removing duplicates without a for loop

2012-09-26 Thread Rui Barradas
Sorry, but in my previous post I've confused the columns. It's by REQ.NR, not by date.

    REQ.NR <- 1:4
    REQ.NR <- c(REQ.NR, sample(REQ.NR, 2))
    dat <- data.frame(date = Sys.Date() + 1:6, REQ.NR = REQ.NR, value = rnorm(6))
    aggregate(dat, by = list(dat$REQ.NR), FUN = tail, 1)

Rui Barradas

Re: [R] Removing duplicates without a for loop

2012-09-26 Thread Rui Barradas
Hello,

If I understand it correctly, something like this will get you what you want.

    d <- Sys.Date() + 1:4
    d2 <- sample(d, 2)
    dat <- data.frame(id = 1:6, date = c(d, d2), value = rnorm(6))
    aggregate(dat, by = list(dat$date), FUN = tail, 1)

Hope this helps,
Rui Barradas

On 26-09-2012 16:19,

Re: [R] Removing duplicates without a for loop

2012-09-26 Thread Clint Bowman
?duplicated

Clint Bowman            INTERNET: cl...@ecy.wa.gov
Air Quality Modeler     INTERNET: cl...@math.utah.edu
Department of Ecology   VOICE: (360) 407-6815
PO Box 47600            FAX: (360) 407-7534
Olympia, WA 98504-7600
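For readers who have not opened that help page, here is a short sketch of what duplicated() returns, using a made-up vector of tracking numbers. The fromLast argument is part of base R's duplicated(); the values are invented for illustration.

```r
# Made-up tracking numbers; "A1" and "B2" each appear twice.
x <- c("A1", "B2", "A1", "C3", "B2")

# TRUE marks an element that has already appeared earlier in the vector.
first.kept <- duplicated(x)

# With fromLast = TRUE the scan runs from the end, so the last occurrence
# of each value is the one that is not flagged.
last.kept <- duplicated(x, fromLast = TRUE)

print(x[!first.kept])  # keeps the first occurrence of each value
print(x[!last.kept])   # keeps the last occurrence of each value
```

Indexing a data frame's rows with the negated result, as in Jean's reply, gives one row per tracking number without a for loop.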

Re: [R] Removing duplicates without a for loop

2012-09-26 Thread David Winsemius
On Sep 26, 2012, at 11:23 AM, Rui Barradas wrote:

Hello,

If I understand it correctly, something like this will get you what you want.

    d <- Sys.Date() + 1:4
    d2 <- sample(d, 2)
    dat <- data.frame(id = 1:6, date = c(d, d2), value = rnorm(6))
    aggregate(dat, by = list(dat$date), FUN =