Hi Nermin,

To complement Guy's reply: you could also use the EMBLCDS database, which contains all CDSs in EMBL-Bank (soon to be called ENA, the European Nucleotide Archive). It is available from the EBI's FTP server under pub/databases/embl/cds. The identifiers in this database correspond to the protein_id qualifier in the EMBL-Bank Feature Table, which maps each CDS to its corresponding protein translation. These translations can in turn be identified in UniProtKB. Please see the README.txt file at:

ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt

for further details.

Further to the above, and depending on the proteome in question, you could have a look at the integr8 directory on the ftp server as well:

ftp.ebi.ac.uk/pub/databases/integr8

In here you will find the proteomes of more than 1600 organisms, mainly bacteria and archaea, but also human, rat, mouse, etc.
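
If you would rather work directly from your own CDS file, here is a minimal Python sketch of the extract-and-translate step. It is not an EMBOSS program and makes several assumptions: it only handles plain forward-strand locations of the form start..end, as in your example (no complement(...) or join(...)), it uses the standard genetic code, and the function names (parse_cds, translate, extract_and_translate) are made up for illustration. Bacterial genomes often need translation table 11 and special handling of alternative start codons, which this sketch ignores.

```python
import re
from itertools import product

# Standard genetic code (translation table 1), built from the usual
# compact TCAG ordering. Assumption: your organism uses this table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Matches only simple feature-table lines such as "FT   CDS   166..231".
CDS_LINE = re.compile(r"^FT\s+CDS\s+(\d+)\.\.(\d+)\s*$")


def parse_cds(ft_text):
    """Return (start, end) pairs (1-based, inclusive) for simple CDS lines."""
    pairs = []
    for line in ft_text.splitlines():
        m = CDS_LINE.match(line)
        if m:
            pairs.append((int(m.group(1)), int(m.group(2))))
    return pairs


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def extract_and_translate(genome, ft_text):
    """Return (start, end, protein) for each simple CDS in the feature table."""
    return [(s, e, translate(genome[s - 1:e])) for s, e in parse_cds(ft_text)]


if __name__ == "__main__":
    # Toy genome: a 9-base CDS at positions 3..11.
    genome = "CC" + "ATGGCCTAA" + "GG"
    ft = 'FT   CDS             3..11\nFT                   /systematic_id="TOY1"\n'
    for start, end, protein in extract_and_translate(genome, ft):
        print(start, end, protein)
```

The genome is assumed to be a single plain sequence string already read from your file. The translated sequences keep the stop codon as a trailing '*'.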

R:)


Nermin Celik wrote:
Hi,

I have the CDS section of a feature table and a genome of an organism.
Which EMBOSS program will allow me to extract the coding regions defined
in the CDS file from the genome and then translate them to protein
sequences?

Example of CDS file:
FT   CDS             166..231
FT                   /systematic_id="ROD00001"
FT   CDS             313..2775
FT                   /systematic_id="ROD00011"
FT   CDS             2778..3707

Thank you.
Nermin

_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
