Hi UCSC Staff,

On 4/13/11 6:40 PM, Duke wrote:
> Hi UCSC Staff,
>
> I have a set of sequences for which I want to get the gene expression
> (read counts). I downloaded two tables (refGene in RefSeq Genes and
> ensGene in Ensembl Genes for mm9), and counted the overlaps between my
> data and the two tables. To my surprise, the results are very different:
> the counts using refGene are way bigger than the counts using ensGene.
>

I carefully double-checked my counting, and the error was mine: the counts 
for ensGene are larger than those for refGene, which is as expected.

> Since the ensGene has 93809 genes and refGene has 28422, I thought the
> ensGene table should contain most (if not all) of the genes in refGene,
> or at least the intersection of them should be close to refGene table.
> But my counting result suggests that the intersection is actually very
> small. My questions are:
>
>    * How was the refGene database built? Was it built based on Ensembl Genes?
>    * If the answer is NO, then how big is the intersection between
> refGene and ensGene? How do I get this intersection?
>    * If the answer is NO, then which table/dataset should I download to
> get the most genes available (for example, one that contains both
> refGene and ensGene)?

But I am still very curious about the above questions. Any clarification 
will be greatly appreciated.
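
To make the "intersection" question concrete, here is a minimal sketch of 
one way it could be estimated: count the entries of one table whose 
transcript interval overlaps at least one entry of the other table on the 
same chromosome and strand. The record layout (name, chrom, strand, start, 
end), the function names, and the toy records are all assumptions for 
illustration; they are not real rows from refGene or ensGene, and overlap 
of transcript spans is only one possible definition of "the same gene".

```python
from collections import defaultdict


def group_by_chrom_strand(records):
    """Group (name, chrom, strand, start, end) records by (chrom, strand).

    Each group is a list of (start, end, name) sorted by start.
    Intervals are assumed to be half-open: [start, end).
    """
    groups = defaultdict(list)
    for name, chrom, strand, start, end in records:
        groups[(chrom, strand)].append((start, end, name))
    for key in groups:
        groups[key].sort()
    return groups


def names_with_overlap(query, target):
    """Return the names in `query` that overlap at least one `target` record."""
    target_groups = group_by_chrom_strand(target)
    hits = set()
    for name, chrom, strand, start, end in query:
        for t_start, t_end, _ in target_groups.get((chrom, strand), []):
            if t_start >= end:
                break  # sorted by start: no later interval can overlap
            if t_end > start:
                hits.add(name)
                break
    return hits


# Hypothetical toy records, NOT real table rows.
ref_toy = [
    ("NM_A", "chr1", "+", 100, 200),
    ("NM_B", "chr1", "-", 300, 400),
    ("NM_C", "chr2", "+", 50, 80),
]
ens_toy = [
    ("ENSMUST_1", "chr1", "+", 150, 250),
    ("ENSMUST_2", "chr1", "+", 500, 600),
    ("ENSMUST_3", "chr1", "+", 300, 400),
    ("ENSMUST_4", "chr2", "+", 80, 120),
]

ref_in_ens = names_with_overlap(ref_toy, ens_toy)
ens_in_ref = names_with_overlap(ens_toy, ref_toy)
print(len(ref_in_ens), "of", len(ref_toy), "toy refGene entries overlap ensGene")
print(len(ens_in_ref), "of", len(ens_toy), "toy ensGene entries overlap refGene")
```

The inner loop is a simple linear scan per chromosome and strand, which is 
fine for a sketch; for full tables an interval index or a sorted sweep 
would scale better. Requiring shared exons rather than overlapping 
transcript spans would give a stricter intersection.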

Thanks,

D.
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
