[datatable-help] Summing over many variables

Joseph Voelkel Thu, 23 Dec 2010 11:35:19 -0800

Consider this set of code:

library(data.table)
DT1 <- data.table(A1=1:100, A2=1:100, A3=1:100,
                  B1=101:200, B2=101:200, B3=101:200,
                  C1=301:400, D1=301:400, grp=rep(1:5, each=20))
setkey(DT1, grp)
(DT2 <- DT1[, lapply(.SD, sum), by=grp]) # from data.table FAQ; sums every column within each grp



I have two questions:
1. I have many columns like C1 and D1 that I don't want in the new data.table 
(nor do I want grp.1 in it). Is there a clean way to exclude them from the 
result? (If it helps, I know the indices of the A and the B columns.)
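
One possible approach to question 1, offered as a minimal sketch rather than a definitive answer: it assumes a data.table version that supports the `.SDcols` argument, which restricts `.SD` to chosen columns. The name `keep` is hypothetical and simply holds the known indices of the A and B columns.

```r
library(data.table)

DT1 <- data.table(A1 = 1:100, A2 = 1:100, A3 = 1:100,
                  B1 = 101:200, B2 = 101:200, B3 = 101:200,
                  C1 = 301:400, D1 = 301:400,
                  grp = rep(1:5, each = 20))

# Hypothetical name: indices of the A and B columns, known up front
keep <- 1:6

# Assumption: .SDcols is available. It limits .SD to the columns in `keep`,
# so C1 and D1 are never summed and do not appear in the result.
DT2 <- DT1[, lapply(.SD, sum), by = grp, .SDcols = keep]
print(names(DT2))
```

In data.table versions where `.SD` excludes the grouping column, this also avoids a duplicated `grp.1` column.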

2. However, in addition to (and sometimes instead of) DT2, what I want is this 
result:
DT2[,list(sum(A1+A2+A3),sum(B1+B2+B3)),by=grp]

Now, in the actual data set, it's more like A1 to A30 and B1 to B20, and I will 
be doing this for many subsets of the A's and B's.
So I would like a way to compute the sum (or sd, or ...) without referencing 
each column name; using column indices would be nice for the actual problem. 
I know that columns can be selected by number with with=FALSE, but I don't 
really see how to use that in this problem.
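
For question 2, one way this might be done by column index is sketched below. It again assumes `.SDcols` is available. The helper `block_stat` and its arguments are hypothetical names, not part of data.table. Within each group it flattens the selected columns into a single vector with `unlist(.SD)` and applies one function to it, which for `sum` matches `sum(A1+A2+A3)`.

```r
library(data.table)

DT1 <- data.table(A1 = 1:100, A2 = 1:100, A3 = 1:100,
                  B1 = 101:200, B2 = 101:200, B3 = 101:200,
                  C1 = 301:400, D1 = 301:400,
                  grp = rep(1:5, each = 20))

# Hypothetical helper: apply `fun` to all values of the columns at
# positions `idx`, pooled together within each level of grp.
block_stat <- function(DT, idx, fun = sum) {
  DT[, list(value = fun(unlist(.SD))), by = grp, .SDcols = idx]
}

resA <- block_stat(DT1, 1:3)        # pooled sum of A1..A3 per grp
resB <- block_stat(DT1, 4:6)        # pooled sum of B1..B3 per grp
sdA  <- block_stat(DT1, 1:3, sd)    # pooled sd of A1..A3 per grp
```

Note that for statistics such as sd, pooling all values of the block is one interpretation; a per-column statistic would instead use `lapply(.SD, fun)`.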

Any ideas? Thanks.

