Hi UCSC Staff,

On 4/13/11 6:40 PM, Duke wrote:
> Hi UCSC Staff,
>
> I have a set of sequences for which I want to get the gene expression (read
> counts). I downloaded two tables (refGene in RefSeq Genes and ensGene in
> Ensembl Genes for mm9), and counted the overlaps between my
> data and the two tables. To my surprise, the results are very different:
> the counts using refGene are way bigger than the counts using ensGene.

I carefully double-checked my counting, and it was my fault: the count for ensGene is bigger than for refGene, which is as expected.

> Since ensGene has 93809 genes and refGene has 28422, I thought the
> ensGene table should contain most (if not all) of the genes in refGene,
> or at least that their intersection should be close to the refGene table.
> But my counting result suggests that the intersection is actually very
> small. My questions are:
>
> * How was the refGene database built? Was it built based on Ensembl Genes?
> * If the answer is NO, then how big is the intersection between
>   refGene and ensGene? How do I get this intersection?
> * If the answer is NO, then which table/dataset should I download to
>   get the most genes available (for example, one that contains both
>   refGene and ensGene)?

But I am still very curious about the above questions.

Any clarification will be greatly appreciated.

Thanks,

D.
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
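
For anyone following the thread, here is a minimal sketch of the kind of coordinate-overlap counting discussed above. It is not the script actually used for the counts in this message. It assumes each gene record has been reduced to a (chrom, strand, txStart, txEnd) tuple with 0-based, half-open coordinates, and the records below are made-up toy values, not real mm9 annotations. The function names are hypothetical.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def count_overlapping(query, reference):
    """Count query records that overlap at least one reference record
    on the same chromosome and strand."""
    # Group reference intervals by (chrom, strand) so only comparable
    # records are tested against each other.
    by_key = defaultdict(list)
    for chrom, strand, start, end in reference:
        by_key[(chrom, strand)].append((start, end))

    hits = 0
    for chrom, strand, start, end in query:
        if any(overlaps(start, end, s, e) for s, e in by_key[(chrom, strand)]):
            hits += 1
    return hits


# Toy records (assumption: illustrative values only, NOT real mm9 data).
ref_gene = [
    ("chr1", "+", 100, 500),
    ("chr1", "-", 900, 1200),
    ("chr2", "+", 50, 80),
]
ens_gene = [
    ("chr1", "+", 450, 700),
    ("chr1", "+", 2000, 2500),
    ("chr1", "-", 1200, 1500),
    ("chr2", "+", 10, 60),
]

shared = count_overlapping(ref_gene, ens_gene)
print(f"{shared} of {len(ref_gene)} refGene records overlap an ensGene record")
```

This linear scan is only meant to make the definition of "intersection" concrete; for full-size tables a sorted or indexed approach would be needed, and overlap by coordinates is just one possible definition of two tables sharing a gene.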
