Good. So that is all correct, known and expected. There is a feature request to optimize .SD[i] in DT[,.SD[i],by=...] so that it does not build the whole .SD for each group just to get the first or last row (or indeed any subset), since that's the most natural syntax. I would often like that myself. In the meantime, the other suggestions from Michael should be fast; as he said, the one using .I[.N] should be fast.
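
For reference, a minimal sketch of the .I[.N] idiom as I understand it (Michael's exact code is not quoted in this message, so the toy table and the names idx, slow and fast below are only illustrative assumptions). The idea is to collect one row number per group and then subset the table once, rather than materialising .SD for every group:

```r
library(data.table)

# Hypothetical toy data standing in for the real `frame`
frame <- data.table(id = c("a", "a", "b", "b", "b"), x = 1:5)

# Natural syntax, but .SD is built for every group
slow <- frame[, .SD[.N], by = id]

# .I holds the row numbers of the current group within `frame`,
# so .I[.N] is the position of each group's last row.
# An unnamed j result lands in a column called V1.
idx <- frame[, .I[.N], by = id]$V1

# Single subset of the original table by those row numbers
fast <- frame[idx]

# For the first row of each group, use .I[1] instead
first <- frame[frame[, .I[1], by = id]$V1]
```

Both forms should return the same rows; the second one only does cheap integer work inside the grouping step.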
Matthew

On 24.04.2013 20:26, Sam Steingold wrote:
* Michael Nelson <[email protected]> [2013-04-24 00:41:33 +0000]:

frame[, .SD[.N], by = id]

I tried
--8<---------------cut here---------------start------------->8---
dt <- frame[, .SD[1] ,by=id]
--8<---------------cut here---------------end--------------->8---
(I don't care whether I take first or last, see another message).
and I got the note
--8<---------------cut here---------------start------------->8---
Finding groups (bysameorder=TRUE) ... done in 0.121secs.
bysameorder=TRUE and o__ is length 0
Optimization is on but j left unchanged as '.SD[1]'
Starting dogroups ... The result of j is a named list. It's very
inefficient to create the same names over and over again for each
group. When j=list(...), any names are detected, removed and put back
after grouping has completed, for efficiency. Using j=transform(), for
example, prevents that speedup (consider changing to :=).
--8<---------------cut here---------------end--------------->8---
and indeed it runs unbelievably slowly (as if I were using data.table)

thanks a lot for your detailed reply!

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
