Hello,

The RefSeq Genes and Ensembl Genes tracks are two separate tracks built on separate data sets. You can read about each of them on their respective description pages:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=refGene
http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=ensGene

These pages can also be accessed by clicking on the blue/gray bar to the left of the track in the main display or by clicking on the track title above the pulldown menu.

You can see the intersection between any two tables in our Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables) using the intersection tool. For more information about using the Table Browser, see the Open Helix Table Browser training video (http://www.openhelix.com//cgi/tutorialInfo.cgi?id=28) and the "User's Guide" at http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html.

To look at the intersection between RefSeq and Ensembl, for example, after selecting your assembly of interest, select:

group: Genes and Gene Prediction Tracks
track: RefSeq Genes
table: refGene
region: genome

Then click "intersection: create", and in the next menu choose:

group: Genes and Gene Prediction Tracks
track: Ensembl Genes
table: ensGene

and select the parameters you want for intersecting the two tracks.

We do not have a track that encompasses all of the sets you mention, but you may be interested in our UCSC Genes track, which does pull data from RefSeq and other gene prediction data sets. For more information about the UCSC Genes track you can read the description page:

http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=knownGene

Note that the UCSC Genes track includes multiple isoforms for many genes. You can use the knownIsoforms table to see which isoforms belong to the same transcript cluster, and the knownCanonical table to see which transcript is designated the canonical splice variant of a gene.

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

On 04/15/11 12:12, Duke wrote:
> Hi UCSC Staffs,
>
> On 4/13/11 6:40 PM, Duke wrote:
>> Hi UCSC Staffs,
>>
>> I have a set of sequence that I want to get the gene expression (read
>> counts).
>> I downloaded two tables (refGene in RefSeq Genes and ensGene in
>> Ensembl Genes for mm9), and counted the overlaps between my
>> data and the two tables. To my surprise, the results are very different:
>> the counts using refGene are way bigger than the counts using ensGene.
>>
>
> I carefully double checked my counting, and it was my fault: the counting
> for ensGene is bigger than for refGene, which is as expected.
>
>> Since the ensGene has 93809 genes and refGene has 28422, I thought the
>> ensGene table should contain most (if not all) of the genes in refGene,
>> or at least the intersection of them should be close to refGene table.
>> But my counting result suggests that the intersection is actually very
>> small. My questions are:
>>
>> * How was the refGene database built? Was it built based on Ensembl Genes?
>> * If the answer is NO, then how big is the intersection between
>> refGene and ensGene? How do I get this intersection?
>> * If the answer is NO, then which table/dataset should I download to
>> get the most genes available (for example, one that contains both
>> refGene and ensGene)?
>
> But I am still very curious about the above questions. Any clarification
> will be greatly appreciated.
>
> Thanks,
>
> D.
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
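
For readers who have already downloaded the refGene and ensGene tables, the Table Browser intersection described in the reply can be roughly approximated offline. The Python sketch below is only an illustration, not the Table Browser's own method: it assumes tab-separated genePred rows with a leading bin column (bin, name, chrom, strand, txStart, txEnd, ...), treats coordinates as 0-based half-open, and counts a query transcript as a hit if its transcript span overlaps any target transcript on the same chromosome and strand. The function names are hypothetical.

```python
import bisect
from collections import defaultdict


def parse_genepred(lines):
    """Yield (name, chrom, strand, txStart, txEnd) from genePred rows.

    Assumption: each row starts with a 'bin' column, followed by
    name, chrom, strand, txStart, txEnd (tab-separated).
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        yield fields[1], fields[2], fields[3], int(fields[4]), int(fields[5])


def build_index(records):
    """Group spans by (chrom, strand); store sorted starts and running max ends."""
    spans = defaultdict(list)
    for _name, chrom, strand, start, end in records:
        spans[(chrom, strand)].append((start, end))
    index = {}
    for key, intervals in spans.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        max_ends, running = [], 0
        for _, e in intervals:
            running = max(running, e)
            max_ends.append(running)
        index[key] = (starts, max_ends)
    return index


def overlaps_any(index, chrom, strand, start, end):
    """True if [start, end) overlaps any indexed span on the same chrom/strand."""
    entry = index.get((chrom, strand))
    if entry is None:
        return False
    starts, max_ends = entry
    # Spans at positions < i begin before our end; one of them overlaps
    # if the largest end among them lies beyond our start.
    i = bisect.bisect_left(starts, end)
    return i > 0 and max_ends[i - 1] > start


def count_overlapping(query_lines, target_lines):
    """Return (hits, total): query transcripts overlapping any target transcript."""
    index = build_index(parse_genepred(target_lines))
    hits = total = 0
    for _name, chrom, strand, start, end in parse_genepred(query_lines):
        total += 1
        if overlaps_any(index, chrom, strand, start, end):
            hits += 1
    return hits, total
```

To try it on real data, open the two downloaded files (for example, hypothetical local copies named refGene.txt and ensGene.txt) and pass the file objects as count_overlapping(ref_file, ens_file). Because this compares whole transcript spans only, the numbers may differ from a Table Browser intersection that uses other overlap settings.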
