Thanks Matthew,

I am not sure I understand the code (actually, I am sure I do not :-( ). More specifically, I would expect the two expressions below to yield tables of the same dimension (basically all combinations of wdpaint and pnvid):

    aa <- SPFdt[, .N, by=list(sample(wdpaint,replace=FALSE),pnvid)]
    dim(aa)
    > 254 3
    bb <- SPFdt[, .N, by=list(wdpaint,pnvid)]
    dim(bb)
    > 170 3

What I am looking for is a cross table of pnvid and wdpaint, i.e., the frequency (number of occurrences) of each combination of pnvid and wdpaint. Shuffling wdpaint should in that case give a different frequency distribution, as in the example below:

    table(c(1,1,2,2), c(3,3,4,4))
    table(c(2,2,1,1), c(3,3,4,4))

Basically, what I want to do is run X permutations on a data set, which I will then use to create a confidence interval on the frequency distribution of sample points over wdpaint and pnvid.

Cheers,

Paulo

On Tue, Jun 19, 2012 at 3:30 PM, Matthew Dowle <[email protected]> wrote:
>
> Hi,
>
> Welcome to the list.
>
> Rather than picking a column and calling length() on it, .N is a little
> more convenient (and faster if that column isn't otherwise used, as in
> this example). Search ?data.table for the string ".N" to find out more.
>
> And to group by expressions of column names, wrap with list(). So,
>
>     SPF[, .N, by=list(sample(wdpaint,replace=FALSE),pnvid)]
>
> But that won't calculate any different statistics, just return the groups
> in a different order. Seems like just an example, rather than the real
> task, iiuc, which is fine of course.
>
> Matthew
>
> > Hi, I am new to this package and not sure how to implement the sample()
> > function with data.table.
> >
> > I have a data frame SPF with three columns: cat, pnvid and wdpaint. The
> > pnvid variable has values 1:3, the wdpaint variable has values 1:10. I am
> > interested in the count of all combinations of wdpaint and pnvid in my
> > data set, which can be calculated using table or tapply (I use the latter
> > in the example code below).
> >
> > Normally I would use something like:
> >
> >     c <- tapply(SPF$cat, list(as.factor(SPF$pnvid), as.factor(SPF$wdpaint)),
> >                 function(x) length(x))
> >
> > If I understand correctly, I would use the below when working with data
> > tables:
> >
> >     f <- SPF[,length(cat),by="wdpaint,pnvid"]
> >
> > But what if I want to reshuffle the column wdpaint first? When using
> > tapply, it would be something along the lines of:
> >
> >     a <- list(as.factor(SPF$pnvid), as.factor(sample(SPF$wdpaint, replace=F)))
> >     c <- tapply(SPF$cat, a, function(x) length(x))
> >
> > But how to do this with data.table?
> >
> > Paulo
> > _______________________________________________
> > datatable-help mailing list
> > [email protected]
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
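
A note on the dimension mismatch above, offered as a sketch rather than a definitive answer. Grouping with by= returns one row per combination that actually occurs in the data, so a shuffled wdpaint can produce combinations that are absent from the original data, which may explain the different row counts. table() keeps zero-count cells, so its result has the same shape for every permutation and is easier to stack for a confidence interval. The data below is a made-up stand-in for SPFdt (assumption), and n_perm, sims and ci are hypothetical names introduced only for illustration.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only for reproducibility

# Made-up stand-in for SPFdt (assumption): wdpaint depends on pnvid,
# so some wdpaint/pnvid combinations never occur in the observed data.
SPFdt <- data.table(
  pnvid   = rep(1:3, each = 100),
  wdpaint = c(sample(1:4,  100, replace = TRUE),
              sample(4:7,  100, replace = TRUE),
              sample(7:10, 100, replace = TRUE))
)

# Observed counts: one row per combination present in the data.
obs <- SPFdt[, .N, by = list(wdpaint, pnvid)]

# One permutation: shuffling wdpaint can create combinations that were
# not observed, so the number of rows may differ from nrow(obs).
perm <- SPFdt[, .N, by = list(wdpaint = sample(wdpaint), pnvid)]
nrow(obs)
nrow(perm)

# For repeated permutations, table() keeps zero cells, so every
# permutation yields a vector of the same length.
n_perm <- 200
sims <- replicate(
  n_perm,
  as.vector(table(sample(SPFdt$wdpaint), SPFdt$pnvid))
)

# sims has one row per wdpaint/pnvid cell and one column per permutation.
# Per-cell 2.5% and 97.5% quantiles give a simple permutation interval.
ci <- apply(sims, 1, quantile, probs = c(0.025, 0.975))
```

The interval in ci describes the counts expected if wdpaint and pnvid were unrelated; comparing the observed table(SPFdt$wdpaint, SPFdt$pnvid) against it cell by cell is one possible next step.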
