On Tue, Apr 29, 2014 at 10:04 AM, Michael Smith <[email protected]> wrote:
> All,
>
> Is there some data.table-idiomatic way to filter based on a previous
> observation/row? For example, I want to remove a row if
> DT$a[row]==DT$a[row-1].
>
> It could be done by first calculating the lag and then filtering based
> on that, but I wonder if there's a more direct way.
>
> The following example works, but my feeling is there should be a more
> elegant solution:
>
> ( DT <- data.table(a = c(1, 2, 2, 3), b = 8:5) )
> DT[, L.a := c(NA, head(a, -1))][a != L.a | is.na(L.a)][, L.a := NULL][]

If equal values of `a` always appear consecutively, then the following
would work. (For example, this holds if `a` is in ascending order, as in
the example, or in descending order. If DT were keyed on `a` then this
would always be the case.)

  DT[ !duplicated(a) ]

Note that `a` need not be numeric.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
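
When equal values of `a` are not guaranteed to be adjacent, `!duplicated(a)` and the row-versus-previous-row filter give different results. The following is only a minimal sketch of that difference, not a recommended idiom: the extra fifth row and the names `res_dup` and `res_lag` are made up for illustration, and the one-step filter assumes DT has at least one row.

```r
library(data.table)

# Illustrative data: the value 1 appears again in row 5, not adjacent to row 1.
DT <- data.table(a = c(1, 2, 2, 3, 1), b = 8:4)

# Drops every later repeat of a value, including the non-adjacent 1 in row 5.
res_dup <- DT[!duplicated(a)]

# Keeps row 1, then every row whose `a` differs from the row just before it.
# Assumes DT has at least one row; no temporary lag column is created.
res_lag <- DT[c(TRUE, a[-1] != head(a, -1))]

print(res_dup)
print(res_lag)
```

Here `res_dup` keeps rows 1, 2 and 4, while `res_lag` keeps rows 1, 2, 4 and 5. On the original four-row example both return the same three rows. Later data.table releases also provide `shift()` and `rleid()`, which may express the same comparison more directly.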
