On Wed, 03 Oct 2007 03:11:41 +0900, Hyunchul Kim <hyunchul.love...@gmail.com> wrote:
> Hi, all
>
> How can I extract protein sequence from given pdb files by python API for
> pymol ?
> I want to get a sequence which is extracted from coordinates not the
> annotated residue sequence in remarks.

Assuming the protein has all of its alpha carbons, you can do:

  iterate name ca, print resn

which will print the list of residues, one per line, or you could do:

  iterate name ca, print resn,

(the last comma is important there) and this will print the residues out to
the terminal window (assuming you are running from one) without a carriage
return between each.

If you want to get fancy and convert to single letter code you could type:

  res3=['ALA','ASN','ASP','ARG','CYS','GLN','GLU','GLY','HIS','ILE','LEU','LYS','MET','PRO','PHE','SER','THR','TRP','TYR','VAL']
  res1='ANDRCQEGHILKMPFSTWYV'
  iterate name ca, print res1[res3.index(resn)],

and you will get just the single letter code version of the sequence.

I often want this sort of thing when I'm not using PyMOL, so I have my own
sequence conversion/extraction script (seq_convert.py) written in python
that you can get from my web site under "scripts".

  http://pldserver1.biochem.queensu.ca/~rlc/work/scripts/index.shtml#seq_convert.py

This in turn uses a PDB parser that I have called MyPDB.py:

  http://pldserver1.biochem.queensu.ca/~rlc/work/scripts/index.shtml#MyPDB.py

I'm sure someone could chime in with a pymol-based script that would do the
same without needing to start up the graphics.

Cheers,
Rob
-- 
Robert L. Campbell, Ph.D.
Senior Research Associate/Adjunct Assistant Professor
Botterell Hall Rm 644
Department of Biochemistry, Queen's University,
Kingston, ON K7L 3N6 Canada
Tel: 613-533-6821 Fax: 613-533-2497
<robert.campb...@queensu.ca> http://pldserver1.biochem.queensu.ca/~rlc
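
For readers who want the coordinate-derived sequence without starting PyMOL at all, the same idea (one residue per alpha carbon, mapped through the three-letter/one-letter tables above) can be sketched in plain Python. This is only an illustration, not the seq_convert.py or MyPDB.py scripts mentioned above: the function names are hypothetical, and it assumes a standard fixed-column PDB file, reads only the first model, keeps one alternate location per residue, and writes 'X' for any residue name outside the 20 listed.

```python
# Minimal sketch: sequence from CA atoms in ATOM records of a PDB file.
# Function names here are hypothetical, chosen for this example only.

# Same 20 residues, in the same order, as the res3/res1 lists in the message.
RES3 = ['ALA', 'ASN', 'ASP', 'ARG', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE',
        'LEU', 'LYS', 'MET', 'PRO', 'PHE', 'SER', 'THR', 'TRP', 'TYR', 'VAL']
RES1 = 'ANDRCQEGHILKMPFSTWYV'
THREE_TO_ONE = dict(zip(RES3, RES1))


def sequence_from_pdb_lines(lines, unknown='X'):
    """Return {chain_id: one_letter_sequence} from CA atoms in ATOM records."""
    chains = {}
    seen = set()
    for line in lines:
        if line.startswith('ENDMDL'):
            break  # assumption: only the first model is wanted
        if not line.startswith('ATOM') or len(line) < 27:
            continue
        if line[12:16].strip() != 'CA':  # atom name, columns 13-16
            continue
        chain = line[21]                 # chain ID, column 22
        res_id = (chain, line[22:27])    # residue number + insertion code
        if res_id in seen:
            continue  # a second alternate location of the same residue
        seen.add(res_id)
        resn = line[17:20].strip()       # residue name, columns 18-20
        chains.setdefault(chain, []).append(THREE_TO_ONE.get(resn, unknown))
    return {chain: ''.join(seq) for chain, seq in chains.items()}


def sequence_from_pdb_file(path):
    """Convenience wrapper: read a PDB file from disk and extract sequences."""
    with open(path) as handle:
        return sequence_from_pdb_lines(handle)
```

Calling sequence_from_pdb_file('model.pdb') (the file name is a placeholder) returns a dictionary keyed by chain ID. As with the iterate commands, a residue whose CA atom is missing from the coordinates is silently skipped.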