Hi, I asked the Ensembl help desk whether the human genome assembly in Ensembl v47 is equivalent to UCSC hg18 and NCBI Build 36. They gave a clear answer: the genome builds from the three organizations are indeed the same. The original correspondence is attached at the end.
Based on the reply, I think it is safe to generalize that a given Ensembl genome build is the same as the corresponding UCSC and NCBI builds. That's good news, because I only need to hard-code a dictionary in the Python Ensembl API. As long as a user specifies a species name, a database type and a version number, the required genome sequences will be retrieved automatically from pygr.Data, provided that they have been saved there beforehand.

Cheers,
Jenny

Original Correspondence:

I understand, from reading the documentation of the ensembl core databases and APIs, that 'the genome sequence is stored on the sequence-level i.e. in form of BAC clones or whole genome shotgun scaffolds'. The assembly info for generating top-level sequences from sequence-level entities is stored in the assembly table. Is it also correct that we can obtain the whole genome assembly directly from UCSC or NCBI?

For instance, I am interested in the genome assembly for the Ensembl database homo_sapiens_core_47_36i. According to http://oct2007.archive.ensembl.org/Homo_sapiens/index.html, the NCBI version for this genome build is 36. According to the UCSC genome releases webpage http://genome.ucsc.edu/FAQ/FAQreleases, NCBI Build 36 is equivalent to hg18. Therefore, I can get the top-level sequence (i.e. the whole genome) for the Ensembl homo_sapiens_core_47_36i database by downloading UCSC hg18. Is that right?

Thanks,
Jenny Qing Qian

The assembly for human is indeed the same between UCSC, NCBI and Ensembl. You can download it from Ensembl here: ftp://ftp.ensembl.org/pub/release-47/fasta/homo_sapiens_47_36i/dna

Regards,
Giulietta (Ensembl Helpdesk)

> [EMAIL PROTECTED] - Mon Oct 20 01:31:21 2008]:

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
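P.S. A minimal sketch of the hard-coded dictionary idea from my message, just to make the lookup concrete. Everything here is an assumption for illustration: the names GENOME_BUILDS and genome_resource_id are hypothetical, the "Bio.Seq.Genome.HUMAN.hg18" naming is only my guess at how the genome would be labelled in pygr.Data, and the sketch assumes that every database type of one release shares the same assembly. Only the human v47 / hg18 entry is backed by the help desk reply.

```python
# Hypothetical table: (Ensembl species, Ensembl release) ->
# (UCSC assembly name, species label assumed for pygr.Data).
# Only the human v47 entry is confirmed by the correspondence above.
GENOME_BUILDS = {
    ("homo_sapiens", 47): ("hg18", "HUMAN"),
}


def genome_resource_id(species, db_type, version):
    """Return an assumed pygr.Data resource name for an Ensembl database.

    db_type (e.g. "core") is accepted so the call mirrors what a user
    specifies; this sketch assumes it does not change the assembly.
    """
    key = (species.lower(), int(version))
    if key not in GENOME_BUILDS:
        raise KeyError("no genome build recorded for %s %s v%s"
                       % (species, db_type, version))
    ucsc_name, label = GENOME_BUILDS[key]
    return "Bio.Seq.Genome.%s.%s" % (label, ucsc_name)


print(genome_resource_id("homo_sapiens", "core", 47))
# prints: Bio.Seq.Genome.HUMAN.hg18
```

The returned name would then be looked up in pygr.Data, which only works if the genome has already been saved under that name. New species or releases would each need their own verified entry in the table.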
