Just posted to that Stack Overflow thread, showing a worst, better and best use of data.table. Hope that gets the point across. I'm sure plyr can be used better too, so I sent Hadley the link.
New feature requests for data.table (any thoughts, anyone?):

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1230&group_id=240&atid=978
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1231&group_id=240&atid=978

Matthew

On Tue, 2010-12-07 at 14:37 -0500, Tom Short wrote:
> Forgot to reply to the list...
>
> On Tue, Dec 7, 2010 at 2:07 PM, Matthew Dowle <[email protected]> wrote:
> >
> > Does anyone have time to see if this post uses data.table correctly :
> >
> > http://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r
>
> Not enough time to do it justice. On my system, I get the following:
>
> > system.time(aggregate <- aggregate(d[c("x", "y")], d[c("g1", "g2")], mean))
>    user  system elapsed
>    6.72    0.08    6.65
>
> > system.time(dt1 <- dt[, list(x=mean(x), y=mean(y)), by = "g1,g2"])
>    user  system elapsed
>    3.95    0.02    3.87
>
> > system.time(dt2 <- dt[, list(x=.Internal(mean(x)), y=.Internal(mean(y))), by = "g1,g2"])
>    user  system elapsed
>    0.12    0.01    0.19
>
> This is a "many groups" case.
>
> - Tom
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
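For anyone who wants to rerun the first two timings from Tom's reply, here is a minimal sketch. The thread does not show how `d` was built, so the data below is purely an assumption: the row count, the number of distinct `g1`/`g2` values and the seed are made up to give a "many groups" case. The timings will differ from the ones quoted, and the `.Internal(mean(x))` variant is left out because calling `.Internal` from user code is discouraged.

```r
library(data.table)

# Hypothetical test data: sizes and group counts are assumptions,
# chosen only so that there are many small groups.
set.seed(1)
n <- 1e5
d <- data.frame(
  g1 = sample(1:300, n, replace = TRUE),
  g2 = sample(1:300, n, replace = TRUE),
  x  = rnorm(n),
  y  = rnorm(n)
)
dt <- as.data.table(d)

# Base R: group means via aggregate()
t_agg <- system.time(
  agg <- aggregate(d[c("x", "y")], d[c("g1", "g2")], mean)
)

# data.table: the same group means computed in j with by
t_dt <- system.time(
  res <- dt[, list(x = mean(x), y = mean(y)), by = "g1,g2"]
)

# Put both results in the same group order and check that they agree
agg <- agg[order(agg$g1, agg$g2), ]
res <- res[order(g1, g2)]
stopifnot(
  nrow(agg) == nrow(res),
  isTRUE(all.equal(agg$x, res$x)),
  isTRUE(all.equal(agg$y, res$y))
)

print(t_agg)
print(t_dt)
```

The equality check is there so that the comparison is between two ways of computing the same result, not just two timings.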
