Dear R-help,

 

First of all, thank you VERY much for any help you have time to offer. I
greatly appreciate it.

 

I would like to write a function that, given an arbitrary number of factors
from a data frame, tabulates the number of occurrences of each unique
combination of the factors. Clearly, this works:

 

> table(horse,date,surface)

<SNIP>

, , surface = TURF

 

                   date
horse               20080404 20080514 20081015 20081025 20081120 20081203 20090319
  Bedevil                  0        0        0        0        0        0        0
  Cut To The Point       227        0        0        0        0        0        0

<SNIP>

 

But I would prefer output that skips all the zeros, flattens any dimensions
greater than 2, and gives the level names rather than codes. I can write
code for a specific number of factors like this (here, 2 factors):

 

ft <- function(x, y) {
  idx <- unique(cbind(x, y))   # distinct pairs of integer level codes
  cbind(levels(x)[idx[, 1]],   # level names for x
        levels(y)[idx[, 2]],   # level names for y
        table(x, y)[idx])      # count for each observed pair
}

 

which gives the lovely output I'm looking for:

 

#      [,1]                [,2]       [,3]

# [1,] "Cut To The Point"  "20080404" "227"

# [2,] "Prairie Wolf"      "20080404" "364"

# [3,] "Bedevil"           "20080514" "319"

# [4,] "Prairie Wolf"      "20080514" "330"

 

But my attempts to make this into a function that handles an arbitrary number
of factors as separate input arguments have failed. The closest I can get is:

 

ft2 <- function (...) { cbind( unique(cbind(...)),
table(...)[unique(cbind(...))] ) }

 

giving:

> ft2(horse,date)

      horse date    

 [1,]     2    1 227

 [2,]     9    1 364

 [3,]     1    2 319

 [4,]     9    2 330

 [5,]     9    3 291

 [6,]    12    3 249

 [7,]    10    3 286

 [8,]     5    4 217

 [9,]     3    4 426

[10,]     8    4 468

[11,]     9    5 319

[12,]    13    5 328

[13,]    12    5 138

[14,]     7    6 375

[15,]    11    6 366

[16,]     4    7 255

[17,]     6    7 517
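As far as I can tell, the codes appear because cbind() strips the factor attributes and keeps only the underlying integer codes. A small illustration on made-up toy factors (hypothetical values, not my real data):

```r
# Hypothetical toy factors, not the real data set
horse <- factor(c("Prairie Wolf", "Bedevil", "Prairie Wolf"))
date  <- factor(c("20080404", "20080514", "20080404"))

# cbind() on factors returns a numeric matrix of the underlying level codes
m <- cbind(horse, date)
is.numeric(m)   # the level labels are gone at this point
unique(m)       # distinct code pairs, as in the ft2() output

# The labels can be recovered one column at a time via levels()
levels(horse)[m[, "horse"]]
```

Doing that last step for each of an arbitrary number of columns is the part I cannot work out.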

 

I would be greatly indebted to anyone willing to show me how to make the
above function take arbitrary inputs and still produce output displaying
factor level names instead of the underlying coded numbers.
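One direction that might get close, sketched on hypothetical toy data: as.data.frame() applied to a table returns one row per cell with the level labels alongside a Freq column, so filtering on Freq > 0 should drop the empty combinations. The name count_combos is my own invention for illustration, and I have not checked whether this is the idiomatic approach or how it behaves on my full data set.

```r
# Sketch only: count_combos is a hypothetical name, not an existing R function
count_combos <- function(...) {
  # One row per cell of the cross-tabulation, labels kept as character strings
  counts <- as.data.frame(table(...), stringsAsFactors = FALSE)
  # Keep only the combinations that actually occur
  counts <- counts[counts$Freq > 0, , drop = FALSE]
  rownames(counts) <- NULL
  counts
}

# Hypothetical toy data, not the real data set
horse   <- factor(c("Bedevil", "Cut To The Point", "Cut To The Point", "Prairie Wolf"))
date    <- factor(c("20080514", "20080404", "20080404", "20080404"))
surface <- factor(c("TURF", "TURF", "TURF", "DIRT"))

res <- count_combos(horse, date, surface)
res
```

The result here is a data frame rather than the character matrix that ft() produces, which keeps the counts numeric.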

 

Cheers and thanks for your time!

 

Andrew Spence
RCUK Academic Research Fellow
Structure and Motion Laboratory
Royal Veterinary College
Hawkshead Lane
North Mymms, Hatfield
Hertfordshire AL9 7TA
+44 (0) 1707 666988

mailto:aspe...@rvc.ac.uk

http://www.rvc.ac.uk/sml/People/andrewspence.cfm

 

 



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
