Hi,
I would like to cite the current list of genes contained in the UCSC Genome
Browser, known as the "UCSC Genes". Working from the downloads on your website,
I noticed that I got a different number of canonical proteins when I uploaded a
file (about 3000 more protein-coding genes) than when I used the knownCanonical
list and parsed through it myself. I checked for the Pumilio proteins and found
in one case that 2 Pumilio isoforms were listed in the directly downloaded
canonical gene file: although the accession number was not listed in
knownCanonical, it was nevertheless found in the FASTA file.
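
A comparison of this kind can be sketched as follows. This is only a minimal
illustration, not the exact script used to produce the numbers in this message.
It assumes that each FASTA header starts with the accession right after ">",
and that the canonical list is tab-separated with the transcript accession in
the fifth column. The function names and the sample accessions are hypothetical.

```python
def canonical_accessions(lines):
    """Collect transcript accessions from tab-separated canonical-list rows.

    Assumption: the accession is in the fifth column (index 4).
    """
    accessions = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment rows
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4:
            accessions.add(fields[4])
    return accessions


def fasta_accessions(lines):
    """Collect accessions from FASTA header lines.

    Assumption: the accession is the first token after '>'.
    """
    accessions = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                accessions.add(tokens[0])
    return accessions


def missing_from_canonical(fasta_lines, canonical_lines):
    """Return accessions present in the FASTA file but not in the canonical list."""
    return sorted(fasta_accessions(fasta_lines) - canonical_accessions(canonical_lines))


# Hypothetical sample data, for illustration only.
sample_fasta = [">ucHYPO1.1 example", "MSEQ", ">ucHYPO2.1 example", "MSEQ"]
sample_canonical = ["chr1\t100\t200\t1\tucHYPO1.1\tucHYPO1.1"]
print(missing_from_canonical(sample_fasta, sample_canonical))
```

Any accession reported this way would be one that appears in the FASTA file
without a matching row in the canonical list, which is the situation described
above for the Pumilio isoform.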
I got the following gene counts:

total UCSC genes (containing all isoforms): 77614
number of "canonical" UCSC genes (one isoform per gene locus): 27297
total protein-coding UCSC genes (containing all isoforms): 62378
number of "canonical" UCSC protein-coding genes (one isoform per gene locus): 21018

(I generated the numbers of protein-coding genes by parsing through the files
myself.)
Could you please let me know whether these numbers are currently correct?

Thank you very much,


Stefanie


 ---------------------------------------------------
Stefanie Gerstberger
graduate student in Chemical Biology
Tri-Institutional Program 
Cornell University, 
Rockefeller University, 
Memorial Sloan Kettering Cancer Center


_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
