Hi Stefanie,

I double-checked your numbers for the hg19 database via the Table 
Browser, and I got all of the same counts.

For future reference, in case you want to do this yourself from the 
Table Browser, I used the "summary/statistics" button to show the 
various counts, and to find the number of protein-coding genes I added a 
filter on the knownGene table that used the free-form query 
cdsStart!=cdsEnd.
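
If you would rather apply the same filter to a downloaded table dump, here is a minimal Python sketch. It assumes a tab-separated knownGene file with no header row and with cdsStart and cdsEnd as the sixth and seventh columns; please check that against the table schema before relying on it. The three sample rows and the function name are made up for illustration.

```python
import io

# Hypothetical rows in the assumed knownGene layout (trailing columns omitted):
# name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd
ROWS = [
    ("txA.1", "chr1", "+", "100", "900", "150", "800"),  # coding
    ("txA.2", "chr1", "+", "100", "900", "150", "700"),  # coding
    ("txB.1", "chr2", "-", "400", "600", "400", "400"),  # non-coding
]
SAMPLE = "".join("\t".join(row) + "\n" for row in ROWS)


def count_protein_coding(handle):
    """Return (total rows, rows where cdsStart != cdsEnd)."""
    total = 0
    coding = 0
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        total += 1
        cds_start, cds_end = int(fields[5]), int(fields[6])
        if cds_start != cds_end:  # same test as the Table Browser filter
            coding += 1
    return total, coding


total, coding = count_protein_coding(io.StringIO(SAMPLE))
print(f"{coding} of {total} transcripts have cdsStart != cdsEnd")
```

For a real file you would pass an open file handle instead of the io.StringIO sample.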

Also, in case querying the MySQL database directly would be easier for 
you, instructions for accessing it are here:
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download29
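
As a rough illustration of the kind of SQL that would give the four counts, the sketch below runs the queries against a tiny in-memory SQLite database rather than the real server, so it needs no connection details. The rows are invented, the tables are cut down to the columns used, and the knownCanonical column name "transcript" is my assumption about the schema; the queries are plain SQL, so they should carry over to MySQL, but I have only shown them on this toy data.

```python
import sqlite3

# Toy stand-ins for the two tables; real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE knownGene (name TEXT, cdsStart INTEGER, cdsEnd INTEGER);
    CREATE TABLE knownCanonical (transcript TEXT);  -- column name assumed
    INSERT INTO knownGene VALUES
        ('txA.1', 150, 800),
        ('txA.2', 150, 700),
        ('txB.1', 400, 400);
    INSERT INTO knownCanonical VALUES ('txA.1'), ('txB.1');
    """
)

QUERIES = {
    "all": "SELECT COUNT(*) FROM knownGene",
    "canonical": "SELECT COUNT(*) FROM knownCanonical",
    "coding": "SELECT COUNT(*) FROM knownGene WHERE cdsStart != cdsEnd",
    "canonical_coding": (
        "SELECT COUNT(*) FROM knownGene g "
        "JOIN knownCanonical c ON g.name = c.transcript "
        "WHERE g.cdsStart != g.cdsEnd"
    ),
}

# Run each query and collect its single COUNT(*) value.
counts = {label: conn.execute(sql).fetchone()[0] for label, sql in QUERIES.items()}
for label, value in counts.items():
    print(label, value)
```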

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Stefanie Gerstberger wrote on 9/27/10 5:36 PM:
> Hi,
> I would like to cite the current list of genes contained in the UCSC genome 
> browser known as the "UCSC genes". From the downloads from your website I 
> noticed that I got a different number of canonical proteins when I uploaded a 
> file (3000 more protein-coding genes) than when I used the knowncanonical 
> list and parsed through it myself.  I checked the Pumilio proteins and found, 
> in one case, that 2 Pumilio isoforms were listed in the directly downloaded 
> canonical gene file: even though this accession number was not listed in 
> knowncanonical, it was nevertheless found in the fasta file.
> I got the following gene counts:
> 
> total UCSC genes (containing all isoforms) : 77614
> number of "canonical" UCSC genes (one isoform per gene locus): 27297
> total protein coding UCSC genes (containing all isoforms): 62378
> number of "canonical" UCSC protein coding genes (one isoform per gene locus): 21018
> 
> (I generated the protein-coding gene counts by parsing through the files 
> myself.) 
> Could you please let me know whether these numbers are currently correct?
> 
> Thank you very much,
> 
> 
> Stefanie
> 
> 
>  ---------------------------------------------------
> Stefanie Gerstberger
> graduate student in Chemical Biology
> Tri-Institutional Program 
> Cornell University, 
> Rockefeller University, 
> Memorial Sloan Kettering Cancer Center
> 
> 
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome