Hi Jaaved,

To answer your first question:
The genomes we have are old, so the differences may be due to years of version updates. One of our engineers has this to say:

Go to the current download site for these genomes, fetch the sequence file, and run faCount on it to see what they name the bits. Compare names and genome organization with what we display. I would assume that after 5 or 6 years, these genomes most likely have new assemblies. The genome project sites would most likely explain their update history. You may also find assembly history in the browsers at Ensembl. There may also be information on their trace archive pages, if they have them. For example: http://www.ncbi.nlm.nih.gov/Traces/wgs/?val=AAMC01

To answer your second question:
Unfortunately, our funding covers primarily vertebrate genomes, though we do host a few of the major model organisms.

Hope this helps. If you have further questions, please contact the mailing list: [email protected].

Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group

---------- Forwarded message ----------
From: Jaaved Mohammed <[email protected]>
Date: Thu, Sep 15, 2011 at 8:57 AM
Subject: [Genome] super vs scaffold coordinates & D. willistoni on the browser.
To: [email protected]

Hello,
I have two questions that I would really appreciate your help with answering.

===========
Firstly,
===========
I am trying to understand the origin of the "super*" coordinates for the droPer1 and droSec1 genomes available on the UCSC Genome Browser. For example, in the D. sechellia assembly, I see that all the chromosomes are prefixed by "super" on the Genome Browser: http://genome-mirror.bscb.cornell.edu/cgi-bin/hgTracks?hgsid=36382&chromInfoPage=. However, on Flybase.org, every coordinate, in the GFF files or anywhere else, is prefixed by "scaffold", as can be seen at ftp://flybase.net/genomes/Drosophila_sechellia/current/gff/. Why is this? How was the conversion done from "scaffold" into "super" coordinates?
I'm trying to convert the Flybase genes reported in the GFF files into a file that I can upload to the browser to see the Flybase-annotated genes, non-coding RNAs, etc. However, this clash of coordinate names is causing many problems. I should note that I looked in all the older revisions of the Flybase GFF files and still see no "super"-prefixed coordinates. I hope I'm not looking at the wrong Flybase GFF files. The same observation was made in the droPer1 reference assembly.

=============
Secondly,
I've noticed that the D. willistoni reference assembly is not available on the UCSC Genome Browser. Why is this? I've added this genome to the Cornell mirror using the droWil1.fa file available for download from the UCSC browser. The added genome can be viewed here: http://genome-mirror.bscb.cornell.edu/cgi-bin/hgGateway?hgsid=36387&clade=insect&org=D.+willistoni&db=0

On a similar note to the first point above, I've observed that the coordinates are prefixed with "scaffold" on the browser, but Flybase reports coordinates prefixed with "scf2_": ftp://flybase.net/genomes/Drosophila_willistoni/current/gff/.

Thanks,
Jaaved

--
Jaaved Mohammed, Ph.D. Student of Computational Biology
Tri-Institutional Training Program in Computational Biology and Medicine
(Cornell University - Ithaca, Weill Cornell Medical College, and Memorial Sloan-Kettering Cancer Center)
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
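
The comparison suggested in the reply (fetch the current sequence file, list how its sequences are named, and compare that with what the browser displays) could be sketched roughly as follows. This is a plain-Python stand-in for the name-and-size listing, not faCount itself, and it reports no base composition. The function names are made up for illustration, and it assumes uncompressed FASTA input in which the sequence name is the first token after ">".

```python
import re
from collections import Counter


def fasta_sizes(lines):
    """Return {sequence_name: length} for FASTA text given as an iterable of lines."""
    sizes = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the name is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            sizes[name] = 0
        elif name is not None:
            sizes[name] += len(line)
    return sizes


def prefix_counts(names):
    """Count names by prefix, where the prefix is the name minus its trailing digits."""
    return Counter(re.sub(r"\d+$", "", n) for n in names)


def compare_names(ours, theirs):
    """Return (only_in_ours, only_in_theirs, shared) as sorted lists of names."""
    ours, theirs = set(ours), set(theirs)
    return sorted(ours - theirs), sorted(theirs - ours), sorted(ours & theirs)
```

Calling fasta_sizes on an open file handle for each sequence file, then prefix_counts on the resulting names, gives a quick view of whether one source uses "super", "scaffold", or "scf2_" style names. compare_names then shows which names, if any, the two sources share. Matching sizes under different names would only hint that two sequences correspond; the assembly's own update history is still the place to confirm it.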
