Hi UCSC staff,

I have a set of sequences for which I want to get gene expression (read 
counts). I downloaded two tables for mm9 (refGene from RefSeq Genes and 
ensGene from Ensembl Genes) and counted the overlaps between my data and 
each of the two tables. To my surprise, the results are very different: 
the counts using refGene are much larger than the counts using ensGene.

Since ensGene has 93809 genes and refGene has 28422, I thought the 
ensGene table should contain most (if not all) of the genes in refGene, 
or at least that their intersection should be close to the refGene table. 
But my counting result suggests that the intersection is actually very 
small. My questions are:

  * How was the refGene database built? Was it built based on Ensembl Genes?
  * If the answer is NO, then how big is the intersection between 
refGene and ensGene? How do I get this intersection?
  * If the answer is NO, then which table/dataset should I download to 
get the most genes available (for example, one that contains both 
refGene and ensGene)?
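
To make the second question concrete, here is a minimal sketch of one way 
the intersection could be estimated: count the refGene rows that overlap 
at least one ensGene row on the same chromosome and strand. It assumes 
both files are tab-separated table dumps whose first six columns are bin, 
name, chrom, strand, txStart, txEnd, with 0-based half-open coordinates. 
The function names are made up for illustration, and overlap of 
transcript spans is only a rough proxy for "the same gene".

```python
import bisect
from collections import defaultdict


def parse_genepred(lines):
    """Yield (name, chrom, strand, tx_start, tx_end) from tab-separated rows.

    Assumption: column order is bin, name, chrom, strand, txStart, txEnd, ...
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        f = line.rstrip("\n").split("\t")
        yield f[1], f[2], f[3], int(f[4]), int(f[5])


def build_index(records):
    """Group intervals by (chrom, strand) for fast overlap lookups."""
    groups = defaultdict(list)
    for _name, chrom, strand, start, end in records:
        groups[(chrom, strand)].append((start, end))
    index = {}
    for key, intervals in groups.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        # max_ends[i] is the largest end among intervals[0..i]
        max_ends, running = [], -1
        for _, e in intervals:
            running = max(running, e)
            max_ends.append(running)
        index[key] = (starts, max_ends)
    return index


def overlaps_any(index, chrom, strand, start, end):
    """True if [start, end) overlaps some indexed interval (half-open)."""
    entry = index.get((chrom, strand))
    if entry is None:
        return False
    starts, max_ends = entry
    k = bisect.bisect_left(starts, end)  # number of intervals with start < end
    return k > 0 and max_ends[k - 1] > start


def count_shared(query_records, index):
    """Count query rows that overlap at least one indexed row."""
    return sum(
        1 for _n, chrom, strand, s, e in query_records
        if overlaps_any(index, chrom, strand, s, e)
    )
```

With the real files, this would be used by opening each table, passing 
the file object to parse_genepred, building the index from ensGene, and 
calling count_shared on the refGene records. Note that each row in these 
tables is a transcript, so counts are per transcript rather than per gene.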

Thank you so much in advance,

D.
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
