Hi UCSC staff, I have a set of sequences for which I want to get gene expression (read counts). I downloaded two tables for mm9 (refGene from RefSeq Genes and ensGene from Ensembl Genes) and counted the overlaps between my data and each of the two tables. To my surprise, the results are very different: the counts using refGene are much larger than the counts using ensGene.
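To make the comparison concrete, here is a minimal sketch of the kind of overlap counting described above. It is only an illustration under stated assumptions, not necessarily the exact procedure used: reads and genes are treated as 0-based, half-open intervals (as in the txStart/txEnd columns of the UCSC tables), a read is counted for a gene if the two intervals share at least one base, and the function name, gene names, and toy coordinates are all hypothetical.

```python
from collections import defaultdict


def count_overlaps(reads, genes):
    """Count, for each gene, the reads that overlap its transcribed region.

    reads: iterable of (chrom, start, end)
    genes: iterable of (name, chrom, tx_start, tx_end)
    All coordinates are assumed to be 0-based, half-open.
    """
    # Group reads by chromosome so each gene is only compared
    # against reads on its own chromosome.
    reads_by_chrom = defaultdict(list)
    for chrom, start, end in reads:
        reads_by_chrom[chrom].append((start, end))

    counts = {}
    for name, chrom, tx_start, tx_end in genes:
        n = 0
        for start, end in reads_by_chrom.get(chrom, []):
            # Two half-open intervals overlap if each starts before the other ends.
            if start < tx_end and tx_start < end:
                n += 1
        # Several table rows can share one name; sum their counts.
        counts[name] = counts.get(name, 0) + n
    return counts


# Hypothetical toy data, not taken from refGene or ensGene.
reads = [("chr1", 100, 150), ("chr1", 400, 450), ("chr2", 10, 60)]
genes = [
    ("geneA", "chr1", 120, 300),
    ("geneB", "chr1", 440, 500),
    ("geneC", "chr2", 60, 100),
]
counts = count_overlaps(reads, genes)
print(counts)
```

Running the same function once with rows from refGene and once with rows from ensGene gives the two sets of counts being compared here.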
Since ensGene has 93809 genes and refGene has 28422, I expected the ensGene table to contain most (if not all) of the genes in refGene, or at least the intersection of the two to be close to the size of the refGene table. But my counting results suggest that the intersection is actually very small.

My questions are:

* How was the refGene database built? Was it built based on Ensembl Genes?
* If the answer is NO, how big is the intersection between refGene and ensGene, and how do I get this intersection?
* If the answer is NO, which table/dataset should I download to get as many of the available genes as possible (for example, one that contains both refGene and ensGene)?

Thank you so much in advance,
D.

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
