Dear all,

I recently came across the following behaviour and am not sure whether it is 
intentional. When p.adjust is used to adjust p-values for multiple hypothesis 
testing with the Benjamini and Hochberg method, it removes all NA values from 
the input vector and does not count them in the adjustment. For example, in a 
vector of 23 p-values of which 20 are NA, the 3 non-NA p-values are adjusted 
as if only 3 tests had been performed (see example). I was not aware of this 
behaviour, and other implementations, such as the one in Bioconductor's 
multtest package, handle NAs differently.
If this behaviour is intentional, I would appreciate it if a note about it 
could be added to the help page.

Example:

x <- c( 0.001, 0.01, 0.02, rep( NA, 20 ) )
p.adjust( x, method="BH" )

 [1] 0.003 0.015 0.020    NA    NA    NA    NA    NA    NA    NA    NA    NA
[13]    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

p.adjust( x, method="BH", n=length( x ) )
 [1] 0.0230000 0.1150000 0.1533333        NA        NA        NA        NA
 [8]        NA        NA        NA        NA        NA        NA        NA
[15]        NA        NA        NA        NA        NA        NA        NA
[22]        NA        NA

With the default settings (n not specified, i.e. n=length(p)), the value of n 
is determined only after all NAs have been removed from the p-value vector p, 
so n ends up being the number of non-NA p-values.
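
To make the difference concrete, here is a minimal sketch of the 
Benjamini-Hochberg step-up adjustment with the number of tests passed in 
explicitly. The helper name bh_adjust is hypothetical and is not part of R; 
it only approximates what I understand p.adjust to do for method="BH", and it 
omits the argument checks that p.adjust performs.

```r
# Hypothetical helper (not part of R): Benjamini-Hochberg adjustment of the
# non-NA p-values, with the number of tests n supplied by the caller.
bh_adjust <- function(p, n) {
  ok  <- !is.na(p)
  pv  <- p[ok]
  o   <- order(pv, decreasing = TRUE)      # largest p-value first
  i   <- length(pv):1L                     # matching ranks, largest first
  adj <- pmin(1, cummin(n / i * pv[o]))    # step-up adjustment, capped at 1
  out <- rep(NA_real_, length(p))
  out[ok] <- adj[order(o)]                 # restore the input order
  out
}

x <- c(0.001, 0.01, 0.02, rep(NA, 20))
bh_adjust(x, n = sum(!is.na(x)))   # n = 3: NA entries are not counted as tests
bh_adjust(x, n = length(x))        # n = 23: NA entries are counted as tests
```

For the example vector, the first call should reproduce the default 
p.adjust( x, method="BH" ) result and the second the result obtained with 
n=length( x ).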

cheers, jo

my R:
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin14.0.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
