On Wed, 03 Oct 2007 03:11:41 +0900, Hyunchul Kim
<hyunchul.love...@gmail.com> wrote:

> Hi, all
> 
> How can I extract protein sequence from given pdb files by python API for
> pymol ?
> I want to get a sequence which is extracted from coordinates not the
> annotated residue sequence in remarks.

Assuming the protein has all of its alpha carbons, you can do:

  iterate name ca, print resn

which will print the list of residues, one per line, or you could do:

  iterate name ca, print resn,

(the trailing comma is important there) and this will print the residues
to the terminal window (assuming you are running from one) on a single
line, separated by spaces rather than newlines.

If you want to get fancy and convert to single letter code you could type:

  res3=['ALA','ASN','ASP','ARG','CYS','GLN','GLU','GLY','HIS','ILE','LEU','LYS','MET','PRO','PHE','SER','THR','TRP','TYR','VAL']
  res1='ANDRCQEGHILKMPFSTWYV'
  iterate name ca, print res1[res3.index(resn)],

and you will get just the single letter code version of the sequence.
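
Note that res3.index(resn) raises an error for any residue name that is not
in the list. As a rough sketch of the same lookup outside PyMOL, the
following plain Python reads the ATOM records directly and falls back to 'X'
for unknown names. The names THREE_TO_ONE and sequence_from_pdb_lines are
made up for this illustration, and the slicing assumes the standard
fixed-column PDB layout; this is not the seq_convert.py script.

```python
# Illustrative sketch only: the helper names here are hypothetical.
# Same residue table as above, turned into a dictionary.
THREE_TO_ONE = dict(zip(
    ['ALA', 'ASN', 'ASP', 'ARG', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE',
     'LEU', 'LYS', 'MET', 'PRO', 'PHE', 'SER', 'THR', 'TRP', 'TYR', 'VAL'],
    'ANDRCQEGHILKMPFSTWYV'))


def sequence_from_pdb_lines(lines, unknown='X'):
    """Return the one-letter sequence of residues that have a CA atom.

    Assumes standard fixed-column PDB records. Only ATOM records are read,
    so HETATM entries such as calcium ions (also named CA) are ignored.
    """
    seq = []
    seen = set()
    for line in lines:
        if not line.startswith('ATOM'):
            continue
        # Columns 13-16 hold the atom name.
        if line[12:16].strip() != 'CA':
            continue
        # Chain ID (column 22) plus residue number and insertion code
        # (columns 23-27) identify a residue; skip repeats such as
        # alternate locations or later models.
        key = (line[21], line[22:27])
        if key in seen:
            continue
        seen.add(key)
        # Columns 18-20 hold the three-letter residue name.
        seq.append(THREE_TO_ONE.get(line[17:20].strip(), unknown))
    return ''.join(seq)
```

It could be used along the lines of

  print(sequence_from_pdb_lines(open('model.pdb')))

where model.pdb is a placeholder file name.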

I often want this sort of thing when I'm not using PyMOL, so I have my own
sequence conversion/extraction script (seq_convert.py) written in python
that you can get from my web site under "scripts".

http://pldserver1.biochem.queensu.ca/~rlc/work/scripts/index.shtml#seq_convert.py

This in turn uses a PDB parser that I have called MyPDB.py:

http://pldserver1.biochem.queensu.ca/~rlc/work/scripts/index.shtml#MyPDB.py

I'm sure someone could chime in with a pymol-based script that would do the
same without needing to start up the graphics.

Cheers,
Rob
-- 
Robert L. Campbell, Ph.D.
Senior Research Associate/Adjunct Assistant Professor 
Botterell Hall Rm 644
Department of Biochemistry, Queen's University, 
Kingston, ON K7L 3N6  Canada
Tel: 613-533-6821            Fax: 613-533-2497
<robert.campb...@queensu.ca>    http://pldserver1.biochem.queensu.ca/~rlc
