Andrew,

Is this what you're looking for? Most likely a more elegant solution exists, but maybe this is good enough.

## BEGIN R SAMPLE CODE

## sample data frame, 3 factors
tmp <- data.frame(f1 = sample(gl(2, 50, labels = c("Male", "Female"))),
                  f2 = sample(gl(4, 25, labels = c("White", "Black", "Hispanic", "Other"))),
                  f3 = sample(gl(4, 25, labels = c("0-20", "21-40", "41-60", "61-80"))))

summary(tmp)

## the function
test <- function(...) {
  ## one combined factor whose levels look like "Male!White!0-20"
  ## (this assumes that no level label itself contains "!")
  tbl <- table(interaction(..., sep = "!"))
  ## keep only the combinations that actually occur
  tbl.nozero <- tbl[tbl > 0]
  ## split the combined labels back into one label per factor
  nms <- strsplit(names(tbl.nozero), "!")
  ## one row per combination: the level names, then the count
  cb <- cbind(t(do.call(data.frame, nms)), tbl.nozero)
  dimnames(cb) <- NULL
  cb
}

## test calling the function, does this produce what you want?
with(tmp, test(f1, f2, f3))

## END R SAMPLE CODE

Best Regards,
Erik Iverson

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Andrew Spence
> Sent: Friday, October 02, 2009 1:15 PM
> To: r-help@r-project.org
> Subject: [R] Tabulating using arbitrary numbers of factors
>
> Dear R-help,
>
> First of all, thank you VERY much for any help you have time to offer. I
> greatly appreciate it.
>
> I would like to write a function that, given an arbitrary number of factors
> from a data frame, tabulates the number of occurrences of each unique
> combination of the factors. Clearly, this works:
>
> > table(horse,date,surface)
>
> <SNIP>
>
> , , surface = TURF
>
>                   date
> horse              20080404 20080514 20081015 20081025 20081120 20081203 20090319
>   Bedevil                 0        0        0        0        0        0        0
>   Cut To The Point      227        0        0        0        0        0        0
>
> <SNIP>
>
> But I would prefer output that skips all the zeros, flattens any dimensions
> greater than 2, and gives the level names rather than codes.
> I can write code specifically for n factors like this (here, 2 factors):
>
> ft <- function(x,y) {cbind(
>   levels(x)[unique(cbind(x,y))[,1]],levels(y)[unique(cbind(x,y))[,2]],
>   table(x,y)[unique(cbind(x,y))])}
>
> which gives the lovely output I'm looking for:
>
> #      [,1]               [,2]       [,3]
> # [1,] "Cut To The Point" "20080404" "227"
> # [2,] "Prairie Wolf"     "20080404" "364"
> # [3,] "Bedevil"          "20080514" "319"
> # [4,] "Prairie Wolf"     "20080514" "330"
>
> But my attempts to make this into a function that handles arbitrary numbers
> of factors as separate input arguments have failed. The closest I can get is:
>
> ft2 <- function (...) { cbind( unique(cbind(...)),
>   table(...)[unique(cbind(...))] ) }
>
> giving:
>
> > ft2(horse,date)
>
>       horse date
>  [1,]     2    1 227
>  [2,]     9    1 364
>  [3,]     1    2 319
>  [4,]     9    2 330
>  [5,]     9    3 291
>  [6,]    12    3 249
>  [7,]    10    3 286
>  [8,]     5    4 217
>  [9,]     3    4 426
> [10,]     8    4 468
> [11,]     9    5 319
> [12,]    13    5 328
> [13,]    12    5 138
> [14,]     7    6 375
> [15,]    11    6 366
> [16,]     4    7 255
> [17,]     6    7 517
>
> I would be greatly in debt to anyone willing to show me how to make the
> above function take arbitrary inputs and still produce output displaying
> factor level names instead of the underlying coded numbers.
>
> Cheers and thanks for your time!
>
> Andrew Spence
> RCUK Academic Research Fellow
> Structure and Motion Laboratory
> Royal Veterinary College
> Hawkshead Lane
> North Mymms, Hatfield
> Hertfordshire AL9 7TA
> +44 (0) 1707 666988
>
> mailto:aspe...@rvc.ac.uk
>
> http://www.rvc.ac.uk/sml/People/andrewspence.cfm
>
>       [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
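
A possibly more compact route to the same kind of output, sketched here under the assumption that a data frame of level names plus counts is acceptable: as.data.frame() on a table already flattens every dimension into one row per combination and keeps the level labels, so only the zero rows need dropping. The function name tabulate_nonzero and the toy factors f1 and f2 are made up for illustration and are not from the thread.

```r
## Sketch only: tabulate_nonzero is a hypothetical name, not from the thread.
## Takes any number of factors and returns one row per combination that
## actually occurs, with the level labels and a count column called Freq.
tabulate_nonzero <- function(...) {
  ## as.data.frame() on a table gives one row per cell, labels as strings
  counts <- as.data.frame(table(...), stringsAsFactors = FALSE)
  ## drop the combinations that never occur
  counts[counts$Freq > 0, , drop = FALSE]
}

## toy data, made up for illustration
f1 <- factor(c("a", "a", "b", "b", "b"))
f2 <- factor(c("x", "y", "y", "y", "y"))
res <- tabulate_nonzero(f1, f2)
print(res)
```

If a character matrix is preferred over a data frame, wrapping the result in as.matrix() should get close to the layout requested in the original message.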