Hi Camille and Daniel,

I heard my name mentioned, so I'll join in!

* Daniel Rigden <drig...@liverpool.ac.uk> [2004-05-21 16:43] wrote:
> Hi Camille
>
> The ESPript server is the easiest way I know. Go here
>
> http://prodes.toulouse.inra.fr/ESPript/cgi-bin/ESPript.cgi
>
> choose Execute then Expert mode. Supply an alignment and a
> supplementary pdb file then execute. You get back an annotated
> alignment + the pdb file with a conservation score in the b-factor
> column. The only trouble can be making sure that the top sequence in
> your alignment matches exactly the sequence of the structure (no
> missing loops etc). You can read this new pdb file into Pymol and
> colour easily with Robert Campbell's script color_b.py which you get
> here
>
> http://adelie.biochem.queensu.ca/~rlc/work/pymol/
>
> Good luck
>
> Daniel
>
> On Fri, 2004-05-21 at 15:28, cami...@mrc-lmb.cam.ac.uk wrote:
> > Hello PyMol community!
> >
> > is there any way to display sequence conservation on the surface of a
> > protein?
> > i.e. to use the info I have in a sequence alignment.
> >
> > do I have to do this by hand?
> >
> > Thanks,
> > Camille

I have a little Python routine for calculating the sequence
variability (a measure that has apparently been commonly used in the
immunoglobulin community -- Thanks for the tip Dave!), but it requires
that you already have your sequences in lists (or strings) and already
aligned. My seq_convert.py routines could be used for reading files
with these individual sequences, or you could use Biopython to read,
say, a FASTA alignment file.

Variability is defined as:

  (Number of different residue types at a location) /
  (Frequency of the most common residue type at that location)

OK, so I quickly added the sequence reading capability and added
variability.py to my web site (where you can find seq_convert.py also):

  http://adelie.biochem.queensu.ca/~rlc/work/scripts/

Create three plain text sequence files (really short examples here!)
that look like:

  SGKSGMDVAI
  AKCIGPDDAL
  ARCS-MDVAL

Then run it with:

  variability.py test1.seq test2.seq test3.seq

giving:

   1 S A A 2 2 0.667 3.000
   2 G K R 3 1 0.333 9.000
   3 K C C 2 2 0.667 3.000
   4 S I S 2 2 0.667 3.000
   5 G G - 2 2 0.667 3.000
   6 M P M 2 2 0.667 3.000
   7 D D D 1 3 1.000 1.000
   8 V D V 2 2 0.667 3.000
   9 A A A 1 3 1.000 1.000
  10 I L L 2 2 0.667 3.000

That is, the position number and the sequences written in vertical
columns, followed by:

  the number of different amino acids found at the position
  the count of the most commonly observed amino acid at that position
  the frequency of the most common amino acid
  the variability

You would then need to extract the variability data (e.g. with awk, or
modify the script to write only that) and modify your B-factors within
PyMOL using my data2bfactor.py script, followed by colouring on
B-factor either with the spectrum command or my color_b.py script.

This was kind of a quick hack, so it isn't as user-friendly as it could
be! It would be ideal to get the alignment that the align command
creates out of PyMOL, but I haven't found a way to get at it.

Hope that helps,
Rob

--
Robert L. Campbell, Ph.D.              <r...@post.queensu.ca>
Senior Research Associate              phone: 613-533-6821
Dept. of Biochemistry, Queen's University,  fax: 613-533-2497
Kingston, ON K7L 3N6 Canada
http://adelie.biochem.queensu.ca/~rlc
PGP Fingerprint: 9B49 3D3F A489 05DC B35C 8E33 F238 A8F5 F635 C0E2
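
P.S. For anyone who wants to see the calculation spelled out, here is a minimal sketch of the variability measure as defined above, applied to the three example sequences. This is not the variability.py script itself: the function name `variability` is made up for illustration, and treating a gap ('-') as its own residue type is an assumption inferred from position 5 of the example output.

```python
from collections import Counter


def variability(sequences):
    """Return one (n_types, max_count, max_freq, variability) tuple per column.

    `sequences` is a list of already-aligned strings of equal length.
    Assumption: a gap character ('-') is counted like any other residue type.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n_seqs = len(sequences)
    rows = []
    for column in zip(*sequences):           # walk the alignment column by column
        counts = Counter(column)             # residue type -> number of occurrences
        n_types = len(counts)                # number of different residue types
        max_count = max(counts.values())     # count of the most common type
        max_freq = max_count / n_seqs        # frequency of the most common type
        rows.append((n_types, max_count, max_freq, n_types / max_freq))
    return rows


if __name__ == "__main__":
    seqs = ["SGKSGMDVAI", "AKCIGPDDAL", "ARCS-MDVAL"]
    for pos, (column, row) in enumerate(zip(zip(*seqs), variability(seqs)), start=1):
        print("%2d %s %d %d %.3f %.3f" % ((pos, " ".join(column)) + row))
```

The last value of each tuple is the number you would want to carry over into the B-factor column.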