Hello, I have been learning to use data.table and studying the vignette located here...
https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html

Section 2f shows how to subset a data.table to select an arbitrary number of rows in each .SD. That's really handy.

    2. Aggregations
    f. Subset .SD for each group:

    ans <- flights[, head(.SD, 2), by=month]

In a similar way, I can get the last row of each .SD using tail, nrow or dim (I don't think it matters much, but dim seems to be a bit faster*).

    ans <- flights[,.SD[dim(.SD)[1]], by=month]

I got to wondering if the number of rows in .SD might be exposed in each grouping iteration. Is there an equivalent to .N for the subset data.table, .SD? Something like .SDN or the like?

Thanks for data.table!
Ben

* After reading this discussion
http://r.789695.n4.nabble.com/What-is-the-fastest-way-to-determine-that-data-table-is-empty-td4638348.html#a4638451
I tried out three methods for getting the last element of a grouping, using tail(), dim() and nrow().

    # using tail
    > microbenchmark( last1 <- flights[, tail(.SD, 1), by=month] )
    Unit: milliseconds
                                             expr      min       lq     mean   median       uq      max neval
     last1 <- flights[, tail(.SD, 1), by = month] 16.65898 16.89704 18.26415 17.37007 19.20147 40.12966   100

    # using dim
    > microbenchmark( last2 <- flights[,.SD[dim(.SD)[1]], by=month] )
    Unit: milliseconds
                                                 expr      min       lq     mean   median       uq      max neval
     last2 <- flights[, .SD[dim(.SD)[1]], by = month] 15.51243 15.87788 17.40978 16.19426 17.83308 59.22429   100

    # using nrow
    > microbenchmark( last3 <- flights[,.SD[nrow(.SD)], by=month] )
    Unit: milliseconds
                                               expr      min       lq     mean   median       uq      max neval
     last3 <- flights[, .SD[nrow(.SD)], by = month] 15.63919 15.92073 17.28836 16.52588 18.33867 24.92624   100

    > identical(last1, last2)
    [1] TRUE
    > identical(last1, last3)
    [1] TRUE

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
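
P.S. For anyone who wants to try the three approaches without the flights data, here is a minimal sketch on a made-up table. The table dt and its values are hypothetical stand-ins, not taken from the vignette; only the tail/dim/nrow expressions mirror the ones above.

```r
library(data.table)

# Hypothetical toy table standing in for flights (values are invented)
dt <- data.table(
  month     = c(1L, 1L, 1L, 2L, 2L),
  dep_delay = c(5, -2, 10, 0, 7)
)

# Last row of each group, three ways, as in the benchmarks above
last_tail <- dt[, tail(.SD, 1), by = month]
last_dim  <- dt[, .SD[dim(.SD)[1]], by = month]
last_nrow <- dt[, .SD[nrow(.SD)], by = month]

# Compare the results of the three approaches
print(last_tail)
print(all.equal(last_tail, last_dim))
print(all.equal(last_tail, last_nrow))
```

Each expression should return one row per month, so this small table makes it easy to check by eye that the row picked is the last one in its group.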
