I find myself using setnames(...,"V1","...") very often, because naming the result column inside the aggregation itself is expensive:
--8<---------------cut here---------------start------------->8---
> delays.short <- delays.dt[,sum(count),by="delay"]
Finding groups (bysameorder=TRUE) ... done in 1.262secs. bysameorder=TRUE and o__ is length 0
Detected that j uses these columns: count
Optimization is on but j left unchanged as 'sum(count)'
Starting dogroups ... done dogroups in 8.612 secs
> delays.short <- delays.dt[,list(count=sum(count)),by="delay"]
Finding groups (bysameorder=TRUE) ... done in 1.051secs. bysameorder=TRUE and o__ is length 0
Detected that j uses these columns: count
Optimization is on but j left unchanged as 'list(sum(count))'
Starting dogroups ... done dogroups in 11.918 secs
--8<---------------cut here---------------end--------------->8---

A 38% difference in dogroups time (8.612 vs. 11.918 secs) is a lot. 3 seconds is not a big deal, but this is just a toy dataset.

I seem to recall asking this question before. Is this still the state of the art as of data.table 1.8.10, or am I doing something stupid?

Thanks!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 13.04 (raring) X 11.0.11303000
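
For concreteness, here is a minimal sketch of the workaround I mean: aggregate without naming the result, then rename the default V1 column by reference. The tiny delays.dt below is made-up illustration data (the real table is much larger), so it shows the pattern, not the timings.

```r
library(data.table)

# Hypothetical toy data standing in for the real delays.dt
delays.dt <- data.table(delay = c(1L, 1L, 2L, 2L, 3L),
                        count = c(10, 20, 30, 40, 50))

# Aggregate without naming the result; the unnamed column comes back as V1
delays.short <- delays.dt[, sum(count), by = "delay"]

# Rename afterwards; setnames() works by reference, so the data is not copied
setnames(delays.short, "V1", "count")

print(delays.short)
```

The result should have the same columns as the list(count=sum(count)) form; only the point at which the name is attached differs.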
