One idiom for testing group-level conditions is:

    data[, if (mean(x) < 10) .SD, by=g]

This might be slower in the special case of taking a mean. See ?GForce.

There's a request for an idiom like SQL HAVING over here:
https://github.com/Rdatatable/data.table/issues/788

--Frank

On Wed, Aug 16, 2017 at 4:44 PM, Bernstein, Elliot J <[email protected]> wrote:

> Is there a way to subset a data table by the result of a grouped
> aggregation without adding an interim column to the table? For example, if
> I want to select all rows for which the group mean value of x is less than
> 10, I can do the following:
>
>     data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))
>     data[, mean.x := mean(x), by = .(g)]
>     data[mean.x < 10,]
>
> But I’m not really interested in “mean.x”. Can I do the same thing without
> adding it to the table?
>
> Thanks.
>
> - Elliot
>
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
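For reference, here is a minimal sketch of the if (...) .SD idiom applied to the example data from the quoted question. The names res, idx, res2, keep and res3 are made up for illustration, and the two alternatives are not from the original reply; they are only possible variants, and whether either is faster on a given table would need to be measured.

```r
library(data.table)

# Example data from the quoted question
data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))

# Idiom from the reply: return .SD only for groups whose mean of x is
# below 10; groups where the condition fails return NULL and are dropped.
res <- data[, if (mean(x) < 10) .SD, by = g]

# Variant (assumption, not in the original reply): collect the row numbers
# of the qualifying groups with .I, then subset the table once.
idx <- data[, if (mean(x) < 10) .I, by = g]$V1
res2 <- data[idx]

# Variant (assumption): aggregate by group first, then filter on the
# groups that pass. The plain mean(x) aggregate is the kind of call
# that ?GForce describes.
keep <- data[, .(m = mean(x)), by = g][m < 10, g]
res3 <- data[g %in% keep]
```

In all three cases no mean.x column is added to data. Note that the .SD form puts the grouping column g first in the result, while the other two keep the original column order.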
