Hello,

The RefSeq Genes and Ensembl Genes tracks are two separate tracks built on 
separate data sets. You can read about each of them on their respective 
description pages:

http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=refGene
http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=ensGene

These pages can also be accessed by clicking on the blue/gray bar to the 
left of the track in the main display or by clicking on the track title 
above the pulldown menu.

You can see the intersection between any two tables in our Table Browser 
(http://genome.ucsc.edu/cgi-bin/hgTables) using the intersection tool. 
For more information about using the Table Browser, see the OpenHelix
Table Browser training video 
(http://www.openhelix.com//cgi/tutorialInfo.cgi?id=28) and the "User's
Guide" at http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html.

To look at the intersection between RefSeq and Ensembl, for example, 
first select your assembly of interest, then select:

group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome

Then click "intersection: create" and, in the next menu, choose:

group: Genes and Gene Prediction Tracks
track: Ensembl Genes
table: ensGene

and select the parameters you want for intersecting the two tracks.
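
If you would rather check the overlap offline on the tables you have 
already downloaded, the idea can be sketched in Python as below. This is 
only an illustration, not what the Table Browser does internally: it 
assumes tab-separated genePred-style rows with a leading "bin" column 
(bin, name, chrom, strand, txStart, txEnd, ...), compares whole transcript 
spans rather than exons, ignores strand, and uses a simple scan that will 
be slow on full-genome tables. The function names are made up for this 
example.

```python
from collections import defaultdict


def parse_genepred_line(line):
    """Parse one tab-separated row into (name, chrom, txStart, txEnd).

    Assumed column order: bin, name, chrom, strand, txStart, txEnd, ...
    """
    fields = line.rstrip("\n").split("\t")
    return fields[1], fields[2], int(fields[4]), int(fields[5])


def overlapping_names(query, target):
    """Return the names in `query` whose span overlaps any span in `target`.

    Both arguments are iterables of (name, chrom, start, end) using
    0-based, half-open coordinates.
    """
    # Group target spans by chromosome so each query row is only
    # compared against spans on its own chromosome.
    spans_by_chrom = defaultdict(list)
    for _, chrom, start, end in target:
        spans_by_chrom[chrom].append((start, end))

    hits = set()
    for name, chrom, start, end in query:
        # Two half-open intervals overlap when each one starts
        # before the other one ends.
        if any(s < end and start < e for s, e in spans_by_chrom[chrom]):
            hits.add(name)
    return hits
```

Reading refGene.txt and ensGene.txt line by line through 
parse_genepred_line and passing the two lists to overlapping_names would 
give the set of RefSeq transcripts that overlap at least one Ensembl 
transcript under those assumptions.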

We do not have a track that encompasses all of the sets you mention, but 
you may be interested in our UCSC Genes track, which pulls data from 
RefSeq and other gene prediction data sets. For more information about 
the UCSC Genes track, you can read the description page:

http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=knownGene

Note that the UCSC Genes track includes multiple isoforms for many genes. 
You can use the knownIsoforms table to see which isoforms belong to 
the same transcript cluster and the knownCanonical table to see which 
transcript is designated the canonical splice variant of a gene.
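
As a rough sketch of how the two tables fit together, the Python below 
groups transcripts by cluster and marks the canonical one. It assumes you 
have already reduced each table to (clusterId, transcript) pairs; the 
function name and the transcript IDs in the usage note are made up for 
illustration.

```python
from collections import defaultdict


def group_isoforms(known_isoforms, known_canonical):
    """Map clusterId -> {"canonical": transcript or None, "isoforms": [...]}.

    known_isoforms:  iterable of (clusterId, transcript) pairs
    known_canonical: iterable of (clusterId, transcript) pairs
    """
    clusters = defaultdict(lambda: {"canonical": None, "isoforms": []})
    # Collect every isoform under its cluster, in input order.
    for cluster_id, transcript in known_isoforms:
        clusters[cluster_id]["isoforms"].append(transcript)
    # Record which transcript is flagged as canonical for each cluster.
    for cluster_id, transcript in known_canonical:
        clusters[cluster_id]["canonical"] = transcript
    return dict(clusters)
```

For example, calling group_isoforms([(1, "txA"), (1, "txB")], [(1, "txB")]) 
groups both placeholder transcripts under cluster 1 and marks "txB" as 
the canonical one.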

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

On 04/15/11 12:12, Duke wrote:
> Hi UCSC Staffs,
> 
> On 4/13/11 6:40 PM, Duke wrote:
>> Hi UCSC Staffs,
>>
>> I have a set of sequences for which I want to get the gene expression (read
>> counts). I downloaded two tables (refGene in RefSeq Genes and ensGene in
>> Ensembl Genes for mm9), and counted the overlaps between my
>> data and the two tables. To my surprise, the results are very different:
>> the counts using refGene are way bigger than the counts using ensGene.
>>
> 
> I carefully double checked my counting, and it was my fault: the count 
> for ensGene is bigger than for refGene, which is as expected.
> 
>> Since the ensGene has 93809 genes and refGene has 28422, I thought the
>> ensGene table should contain most (if not all) of the genes in refGene,
>> or at least the intersection of them should be close to refGene table.
>> But my counting result suggests that the intersection is actually very
>> small. My questions are:
>>
>>    * How was the refGene database built? Was it built based on Ensembl Genes?
>>    * If the answer is NO, then how big is the intersection between
>> refGene and ensGene? How do I get this intersection?
>>    * If the answer is NO, then which table/dataset should I download to
>> get the most genes available (for example, one that contains both refGene and
>> ensGene)?
> 
> But I am still very curious about the above questions. Any clarification 
> will be greatly appreciated.
> 
> Thanks,
> 
> D.
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
