Re: [R] Multiple binomial tests on a large table

Dennis Murphy Thu, 29 Jul 2010 03:32:44 -0700

Hi:

As it turns out, this is pretty straightforward using plyr's ldply()
function. Here's a toy example:


d1 <- structure(list(X = c(11L, 9L, 13L, 13L, 18L),
                     N = c(19L, 26L, 21L, 27L, 30L)),
                .Names = c("X", "N"), class = "data.frame",
                row.names = c(NA, -5L))
# d2 and d3 are simulated (no seed is set), so their p-values will differ
# from run to run:
w <- sample(1:50, 5)
d2 <- data.frame(X = mapply(rbinom, 1, w, 0.5), N = w)
w <- sample(1:50, 5)
d3 <- data.frame(X = mapply(rbinom, 1, w, 0.5), N = w)

# Combine data frames into a list - since these are already R objects, the
# call is easy:
l <- list(d1, d2, d3)

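# the function: run binom.test() on each row (defaults: p = 0.5, two-sided)
# and keep row 3 of the result, which holds the p.value component:
f <- function(df)
  do.call(c, with(df, mapply(binom.test, x = X, n = N))[3, ])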

# do.call + lapply:
do.call(rbind, lapply(l, f))
          [,1]      [,2]       [,3]      [,4]      [,5]
[1,] 0.6476059 0.1686375 0.38331032 1.0000000 0.3615946
[2,] 0.3019956 0.6515878 0.02944937 0.5600646 1.0000000
[3,] 1.0000000 1.0000000 0.81452942 0.0390625 0.4050322

# plyr approach:
library(plyr)
ldply(l, f)
         V1        V2         V3        V4        V5
1 0.6476059 0.1686375 0.38331032 1.0000000 0.3615946
2 0.3019956 0.6515878 0.02944937 0.5600646 1.0000000
3 1.0000000 1.0000000 0.81452942 0.0390625 0.4050322
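
A note on the [3, ] index in f: it works because p.value is the third
component of the object binom.test() returns. The following is only a
minimal sketch that picks the p-value by name instead and, for a single
table as in the original question, keeps the result as one vector. The
name f_named is hypothetical (it is not from plyr or base R), and the
explicit p = 0.5 simply restates the default.

```r
d1 <- data.frame(X = c(11L, 9L, 13L, 13L, 18L),
                 N = c(19L, 26L, 21L, 27L, 30L))

# f_named is a hypothetical helper: one two-sided binomial test per row,
# returning only the p-values as a plain numeric vector
f_named <- function(df) {
  mapply(function(x, n) binom.test(x, n, p = 0.5)$p.value, df$X, df$N)
}

p_values <- f_named(d1)     # one p-value per row of d1
d1$p.value <- p_values      # optionally keep them next to the data
print(d1)
```

For d1 this should give the same values as the first row of the output
above, and the same call ought to work unchanged on a 600-row table.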

ldply() takes a list and a function, applies the function to each element
of the list (as lapply() does) and returns the results as a data frame,
here with one row per input data frame. So the plyr approach can be
summarized as:

1. Create a list of data frames.
2. Create a function to apply to each data frame.
3. Load the plyr package.
4. Run ldply().

Essentially, the plyr package provides a number of convenient 'wrapper'
functions to simplify the 'split-apply-combine' strategy of data analysis
for various combinations of input and output objects.
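
For comparison, here is a rough base-R sketch of the same
split-apply-combine pattern, assuming the data sit in one long table with
a grouping column. The data frame tab, its group column and the extra
sixth row are made up for illustration and are not from the original
question.

```r
# Hypothetical long table: 'group' marks which rows belong together
tab <- data.frame(
  group = rep(c("a", "b"), each = 3),
  X = c(11L, 9L, 13L, 13L, 18L, 5L),
  N = c(19L, 26L, 21L, 27L, 30L, 12L)
)

# split: one data frame per group
pieces <- split(tab, tab$group)

# apply: a vector of two-sided binomial p-values (p = 0.5) per piece
pvals <- lapply(pieces, function(df)
  mapply(function(x, n) binom.test(x, n)$p.value, df$X, df$N))

# combine: one row per group, one column per test
result <- do.call(rbind, pvals)
print(result)
```

This only binds cleanly because both groups have the same number of rows;
with unequal group sizes a long-format result would be the safer choice.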

HTH,
Dennis


On Thu, Jul 29, 2010 at 1:05 AM, Wilson, Andrew <a.wil...@lancaster.ac.uk> wrote:

> I need to run binomial tests (binom.test) on a large set of data, stored
> in a table - 600 tests in total.
>
> The values of x are stored in a column, as are the values of n.  The
> data for each test are on a separate row.
>
> For example:
>
> X       N
> 11      19
> 9       26
> 13      21
> 13      27
> 18      30
>
> It is a two-tailed test, and P in all cases is 0.5.
>
> My question is:  Is there a quicker way of running these tests without
> having to type an individual command for each test - and ideally also to
> store the resulting p-values in a single data vector?
>
> Many thanks for any pointers,
>
> Andrew Wilson
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
