For 10 million data points,
    table(interaction(vec_D, vec_C, vec_B, vec_A))
took my laptop 11.45 seconds, and the following function required 0.18 seconds:
    f0 <- function (vec_A, vec_B, vec_C, vec_D) {
        x <- 1 + vec_A + 2 * (vec_B + 2 * (vec_C + 2 * vec_D))
        tab <- tabulate(x, nbins = 16)
        names(tab) <- do.call(paste0, rev(expand.grid(0:1, 0:1, 0:1, 0:1)))
        tab
    }
Aside from the order of the entries in the output tables, they gave the same results.
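A small check of the claim that the two approaches agree might look like the sketch below. It is only an illustration under a few assumptions: the seed, the sample size n = 1000 and the names slow and fast are made up here, f0 is repeated so the snippet runs on its own, and sep = "" is passed to interaction() so that its labels look like "0101" rather than "0.1.0.1".

```r
# f0 as posted above, repeated so this snippet is self-contained.
f0 <- function (vec_A, vec_B, vec_C, vec_D) {
    # Pack the four 0/1 values into one integer in 1..16 (A is the low bit).
    x <- 1 + vec_A + 2 * (vec_B + 2 * (vec_C + 2 * vec_D))
    tab <- tabulate(x, nbins = 16)
    # Label the bins "DCBA": "0000", "0001", ..., "1111".
    names(tab) <- do.call(paste0, rev(expand.grid(0:1, 0:1, 0:1, 0:1)))
    tab
}

# Hypothetical test data: four random 0/1 vectors of equal length.
set.seed(1)
n <- 1000L
vec_A <- sample(0:1, n, replace = TRUE)
vec_B <- sample(0:1, n, replace = TRUE)
vec_C <- sample(0:1, n, replace = TRUE)
vec_D <- sample(0:1, n, replace = TRUE)

# Slow reference: explicit factors so both levels exist even if a value is absent.
as01 <- function(v) factor(v, levels = 0:1)
slow <- table(interaction(as01(vec_D), as01(vec_C), as01(vec_B), as01(vec_A),
                          sep = ""))
fast <- f0(vec_A, vec_B, vec_C, vec_D)

# The entries come out in a different order, so compare after matching on the labels.
same <- all(as.vector(slow[names(fast)]) == fast)
print(same)
```

Matching on names(fast) is what takes care of the different ordering of the two tables; the counts themselves should be identical.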

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of Sridhar Iyer
> Sent: Saturday, June 01, 2013 2:57 PM
> To: r-help@r-project.org
> Subject: [R] Frequency count of Boolean pattern in 4 vectors.
>
> I need to do this on very large datasets ( > a few million data points). So
> seeking help in figuring out an implementation of the task.
>
> Input 4 vectors which contain values as 0 or 1. (as integers, not boolean
> bits)
> vec_A = ( 0, 1, 0, 0, ...... 1, 0, 1, 0) etc
> vec_B = (0,0,1,1.....)
> vec_C, vec_D (similar to above)
> All four vectors are same length.
>
> I need to compute frequency count of the boolean literals for DCBA,
> DCBA
> 0000
> 0001
> 0010
> 0011
> ..
> ..
> 1111
>
> Questions:
> a) Is there a mechanism for combining the 4 vectors (in integer formats)
> into 4 bits of a new vector or some other
> type? (or treat them as boolean values true/false instead of 0 or 1
> integers).
> b) what is the most efficient mechanism for obtaining the frequency count of
> each of the sixteen Boolean
> combinations?
>
> I need to do this frequently on large datasets. So am trying to get an
> efficient implementation (instead of
> a quick and dirty scheme). Thank you very very much in advance.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.