Dear all,

I have a simple question regarding gene set enrichment analysis.
Say we have a simple numerator and denominator, so the
hypergeometric test looks like:

p = phyper(white - 1, total_white, total_black, drawn)
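
To make the call concrete, here is a minimal sketch with illustrative
numbers. The variable names and values are assumptions, not real data,
and lower.tail = FALSE is my assumption for getting the upper tail
P(X >= white), which is what an enrichment test usually wants (by
default phyper returns the lower tail).

```r
# Illustrative values only (assumed, not real data)
white       <- 50    # genes from my list that belong to the gene set
total_white <- 200   # genes in the gene set
total_black <- 9800  # background genes that are not in the gene set
drawn       <- 500   # size of my gene list

# Upper tail P(X >= white): lower.tail = FALSE gives P(X > q),
# so q is set to white - 1.
p_upper <- phyper(white - 1, total_white, total_black, drawn,
                  lower.tail = FALSE)

# Same quantity summed from the point probabilities, as a sanity check
p_check <- sum(dhyper(white:min(total_white, drawn),
                      total_white, total_black, drawn))
```

Both p_upper and p_check should agree up to floating-point error.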

However, I have a question about the database size. Say my
denominator (total genes on the array) is 10000, but the
database (say, the GO database) covers only 8000 of these 10000 genes.
Should I remove the genes that are not in the database from every
value passed to phyper? In other words:

Original call, e.g.: phyper(50,200,9800,500).

After removing the genes that are not in the database, e.g.:
phyper(50,180,7700,400).
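
The two alternatives can be put side by side as in the sketch below. It
only reuses the example numbers from the two calls; the helper name
enrich_p and its argument names are made up for illustration, and the
upper-tail form (overlap - 1 with lower.tail = FALSE) is my assumption
about how the enrichment p-value should be computed.

```r
# Hypothetical helper: upper-tail hypergeometric p-value P(X >= overlap)
enrich_p <- function(overlap, set_size, rest, list_size) {
  phyper(overlap - 1, set_size, rest, list_size, lower.tail = FALSE)
}

# Background = all genes on the array
p_array <- enrich_p(50, 200, 9800, 500)

# Background = only genes present in the database
p_db <- enrich_p(50, 180, 7700, 400)

# Show both p-values together for comparison
print(c(array_background = p_array, database_background = p_db))
```

The two backgrounds give different p-values, which is why the choice
matters.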

Should I restrict my gene lists to the genes recorded in the database? Which way is correct?

Thank you in advance for the replies.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
