Hi,

I asked the Ensembl help desk whether the human genome assembly in
Ensembl v47 is equivalent to that of UCSC hg18 and NCBI 36.  I got a clear
answer from them that the genome builds from the three organizations
are indeed the same.  The original correspondence is attached at the end.

Based on the reply, I think it is safe to generalize that a particular
genome build in Ensembl is the same as the corresponding build at UCSC and
NCBI.  That's good news because I only need to hard-code a dictionary in the
Python Ensembl API.  So, as long as a user specifies a species name, a
database type and a version number, the required genome sequences will be
automatically retrieved from pygr.Data, provided that they have been
pre-saved there.
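
A minimal sketch of what such a hard-coded dictionary might look like is
given below.  Only the human v47 -> hg18 entry comes from the help desk
reply; the function name genome_resource_id, the table names, and the
"Bio.Seq.Genome.HUMAN" prefix used to build the pygr.Data resource ID are
assumptions for illustration, not the actual API.

```python
# Hypothetical lookup table: (species, Ensembl release) -> UCSC assembly name.
# Only the human release 47 entry is confirmed (Ensembl v47 = NCBI 36 = hg18);
# any further entries would have to be checked one by one.
ENSEMBL_TO_UCSC = {
    ("homo_sapiens", 47): "hg18",
}

# Assumed pygr.Data naming prefix for each species (illustrative only).
PYGR_DATA_PREFIX = {
    "homo_sapiens": "Bio.Seq.Genome.HUMAN",
}


def genome_resource_id(species, db_type, version):
    """Return the assumed pygr.Data resource ID for an Ensembl database.

    db_type (e.g. 'core') is accepted so the call mirrors what the user
    specifies, but it is not used in the lookup: the genome assembly is
    taken to depend only on the species and the release number.
    """
    key = (species.lower(), int(version))
    if key not in ENSEMBL_TO_UCSC:
        raise KeyError("no known UCSC assembly for %s release %d" % key)
    return "%s.%s" % (PYGR_DATA_PREFIX[key[0]], ENSEMBL_TO_UCSC[key])
```

With a table like this, genome_resource_id("homo_sapiens", "core", 47)
would yield the name to look up in pygr.Data, and an unknown combination
would fail loudly instead of silently returning the wrong genome.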

Cheers,

Jenny

Original Correspondence:

I understand, from reading the documentation of the Ensembl core databases
and APIs, that 'the genome sequence is stored on the sequence-level i.e. in
form of BAC clones or whole genome shotgun scaffolds'.  The assembly info
for generating top-level sequences from sequence-level entities is stored in
the assembly table.  Is it also correct that we can obtain the whole genome
assembly directly from UCSC or NCBI?

For instance, I am interested in the genome assembly for the Ensembl
database homo_sapiens_core_47_36i.  According to
http://oct2007.archive.ensembl.org/Homo_sapiens/index.html, the NCBI version
for this genome build is 36.  According to the UCSC genome releases webpage
http://genome.ucsc.edu/FAQ/FAQreleases, NCBI Build 36 is equivalent to
hg18.  Therefore, I can get the top-level sequence (i.e. the whole genome)
for the Ensembl homo_sapiens_core_47_36i database by downloading UCSC hg18.
Is that right?

Thanks,

Jenny Qing Qian

The assembly for human is indeed the same between UCSC, NCBI and
Ensembl.  You can download it from Ensembl here:

ftp://ftp.ensembl.org/pub/release-47/fasta/homo_sapiens_47_36i/dna

Regards,
Giulietta (Ensembl Helpdesk)

> [EMAIL PROTECTED] - Mon Oct 20 01:31:21 2008]:
--
 The Wellcome Trust Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.
