RE: [ccp4bb]: gap links

Ian Tickle Thu, 10 Aug 2006 05:08:01 -0700

***  For details on how to be removed from this list visit the  ***
***          CCP4 home page http://www.ccp4.ac.uk         ***




I come down strongly on Bernhard's side here, and have to disagree equally 
strongly with Miguel.  This is an issue I've tried to take up on the CootBB 
(sadly so far with limited success!).  There will always be conflicts between 
the 'natural' scientists (i.e. physicists, chemists, biologists etc) and the 
computer scientists over what is feasible in software, but it seems to me that 
a fundamental principle should be that in the first instance the natural 
scientist dictates his/her requirements to the computer scientist, not the 
other way around!  The natural scientist is the 'customer' and the computer 
scientist is the 'service provider' and we all know that the customer is always 
right (even when he's wrong!).  Too many times the programmer produces software 
'features' (or bugs depending on how you look at it!) that are convenient from 
the programming point of view but are not what the scientist actually wants.  
Now clearly there will be situations where the scientist is asking for 
something that's just totally unfeasible in software, and then there 
will have to be some negotiation, but it still behoves the programmer to 
accommodate the scientist's wishes as far as is practical.

It seems to me that 'biological' (i.e. essentially arbitrary) residue numbering 
most definitely falls way short of the class of unreasonable requests.  The 
biologist essentially wants the residue 'number' (actually a name if you 
include the chain ID and insertion code) to be merely a label, nothing more, 
obviously firstly to identify the residue on the graphics, but also to relate 
it to the corresponding residue in homologous structures.  Therefore the 
programmer must not infer anything concerning the sequence (such as the residue 
connectivity) purely from the labels!  It seems to me completely crazy that the 
biologist has to relabel his meaningfully labelled sequence just to make life 
comfortable for the programmer - and to maintain different sets of numbers for 
different purposes!  If the biologist really wants to label his/her contiguous 
sequence '12345  -15X  5  6  -99W  ...' then so be it (anything becomes 
possible if the numbers are treated purely as labels).  It's the 
programmer's job to accommodate that in software, it's not his place to 
question the wisdom of the biologist.
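
To make the 'labels only' point concrete, here is a minimal sketch in Python. Every name in it (ResidueLabel, parse_label, links_from_order, links_from_numbering) is made up for illustration and is not taken from any existing program. It assumes only that the residues arrive in sequence order, and it compares connectivity taken from that order with connectivity guessed from the numbers.

```python
import re
from typing import NamedTuple


class ResidueLabel(NamedTuple):
    """An opaque residue name: chain ID, number and insertion code."""
    chain: str
    number: int
    icode: str


def parse_label(chain: str, text: str) -> ResidueLabel:
    """Split a label such as '-15X' into its number and insertion code."""
    match = re.fullmatch(r"(-?\d+)([A-Za-z]?)", text)
    if match is None:
        raise ValueError(f"not a residue label: {text!r}")
    return ResidueLabel(chain, int(match.group(1)), match.group(2))


def links_from_order(residues):
    """Link each residue to the next one in the list, ignoring the labels."""
    return list(zip(residues, residues[1:]))


def links_from_numbering(residues):
    """The shortcut argued against: link only where the numbers run n, n+1."""
    return [(a, b) for a, b in zip(residues, residues[1:])
            if b.number == a.number + 1]


# The deliberately awkward labelling from the paragraph above.
chain = [parse_label("A", t) for t in "12345 -15X 5 6 -99W".split()]
by_order = links_from_order(chain)
by_number = links_from_numbering(chain)
print(len(by_order), "links from file order")
print(len(by_number), "link(s) from numbering")
```

With the labels treated as plain keys, the five residues are joined into one contiguous chain, whereas the numbering shortcut joins only 5 to 6.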

In the majority of structures each unique chain identified by the chain ID is 
contiguous, so that obviously has to be the default presumption, regardless of 
the labelling.  Since we are assuming that the residue labels provide 
absolutely no information concerning the connectivity, and given the current 
limitations of the PDB format, I think the programmer is entitled to require 
that the ordering of residues in the file is the same as that in the sequence 
(otherwise you would need an additional column to specify the ordinal numbers 
of the residues).  Then there has to be a way of telling the software where the 
breaks in the sequence are.  In most cases this will be obvious (e.g. the C-N 
distance is 10 Ang).  In the few cases that the program is unable to infer a 
break from the distance, the user clearly would be expected to provide that 
information.  In the RESTRAIN program I required that each chain break is 
flagged by a TER record, though strictly that is only used to flag 
end-of-chain (AFAIK other software ignores the TER record).  It seems to me 
that fixing this on-going problem is not beyond the bounds of what we can 
reasonably expect from the software.
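
As a rough illustration of the scheme just described (file order gives the sequence, a long C-N distance or a TER record gives a break), here is a sketch in Python. It is not the RESTRAIN code or any other program's. The function names, the demo coordinates and the 2.0 Ang cutoff are my own assumptions for the example; the only firm number is that a peptide C-N bond is about 1.33 Ang. It reads the standard fixed columns of PDB ATOM records.

```python
import math

MAX_CN = 2.0  # assumed cutoff in Angstrom; a peptide C-N bond is about 1.33


def read_residues(lines):
    """Collect residues in file order; the labels are kept only as keys."""
    residues = []
    pending_ter = False
    for line in lines:
        record = line[:6].strip()
        if record == "TER":
            pending_ter = True
            continue
        if record != "ATOM":
            continue
        # chain ID, residue number (kept as text) and insertion code
        label = (line[21], line[22:26].strip(), line[26].strip())
        if not residues or pending_ter or residues[-1]["label"] != label:
            residues.append({"label": label, "after_ter": pending_ter})
            pending_ter = False
        name = line[12:16].strip()
        if name in ("N", "C"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            residues[-1].setdefault(name, xyz)
    return residues


def find_breaks(residues, max_cn=MAX_CN):
    """Return (previous label, next label, reason) for each chain break."""
    breaks = []
    for prev, cur in zip(residues, residues[1:]):
        if prev["label"][0] != cur["label"][0]:
            continue  # a change of chain ID is not a break within a chain
        if cur["after_ter"]:
            breaks.append((prev["label"], cur["label"], "TER"))
        elif "C" in prev and "N" in cur:
            if math.dist(prev["C"], cur["N"]) > max_cn:
                breaks.append((prev["label"], cur["label"], "distance"))
    return breaks


def atom_line(name, chain, resseq, icode, x, y, z):
    """Format a minimal ATOM record (demo helper with dummy serial/resname)."""
    return (f"ATOM  {1:5d} {name:<4s} ALA {chain}{resseq:>4s}{icode:1s}   "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Invented coordinates: N and C of each residue placed along the x axis.
DEMO = [
    atom_line("N", "A", "5", "", 0.0, 0.0, 0.0),
    atom_line("C", "A", "5", "", 2.4, 0.0, 0.0),
    atom_line("N", "A", "6", "", 3.7, 0.0, 0.0),
    atom_line("C", "A", "6", "", 6.1, 0.0, 0.0),
    atom_line("N", "A", "-15", "X", 7.4, 0.0, 0.0),
    atom_line("C", "A", "-15", "X", 9.8, 0.0, 0.0),
    atom_line("N", "A", "99", "", 19.8, 0.0, 0.0),
    atom_line("C", "A", "99", "", 22.2, 0.0, 0.0),
    "TER",
    atom_line("N", "A", "100", "", 23.5, 0.0, 0.0),
    atom_line("C", "A", "100", "", 25.9, 0.0, 0.0),
]

for prev_label, cur_label, reason in find_breaks(read_residues(DEMO)):
    print(prev_label, "->", cur_label, "break by", reason)
```

In the made-up data, 6 and -15X are treated as bonded because the C-N distance says so, whatever the labels suggest. The 10 Ang gap before 99 is picked up from the distance, and the break before 100 is taken from the TER record even though the distance there looks like a normal peptide bond.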

Cheers

-- Ian

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On 
> Behalf Of Miguel Ortiz Lombardia
> Sent: Thursday, August 10, 2006 8:01 AM
> To: [EMAIL PROTECTED]
> Cc: 'CCP4bb'
> Subject: Re: [ccp4bb]: gap links
> 
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> > 
> >> refmac5 must be assuming that you number your protein according to
> >> your protein sequence, which is continuous. In my opinion, this is
> >> reasonable.
> >
> > Uhhh... this assumption turns perilous quickly, because there
> > are post-translational mods and splicing (see Concanavalin), 
> > and biologists sometimes prefer keeping the
> > key residues in related structures (trypsin, fabs, etc) at 
> > a certain residue number. This causes 
> > sequence insertions (addressed correctly, as you say) 
> > and gaps (not addressed correctly, my situation.) 
> > 
> 
> Sure, but after any modification whatsoever the sequence of the final
> protein is, except for perhaps a few pathological cases, continuous.
> Now, I can understand, though not always agree, that biologists (I am
> one) prefer to give a consistent number to a particular residue in a
> family of proteins, but for a refinement program I still think it is
> reasonable to consider the numbering as continuous by default: this
> would be the most usual situation, I would say.
> 
> In any case, knowing that you can fix the problem using TRANS (perhaps
> even CIS if the thing is really bizarre) is very useful, thanks!
> 
> 
> Miguel
> - --
> Miguel Ortiz Lombardía
> Centro de Investigaciones Oncológicas
> C/ Melchor Fernández Almagro, 3
> 28029 Madrid, Spain
> Tel. +34 912 246 900
> Fax. +34 912 246 976
> email: [EMAIL PROTECTED]
> www: http://www.ysbl.york.ac.uk/~mol/
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~
> Je suis de la mauvaise herbe,
> Braves gens, braves gens,
> Je pousse en liberté
> Dans les jardins mal fréquentés!
>                                                        
> Georges Brassens
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.1 (GNU/Linux)
> 
> iD8DBQFE2tmRF6oOrDvhbQIRAoW9AJoCpyWRpC+R6XGzn6IGxniwwRK2UgCgoyDe
> RRece2CHvTn8P22eekYjbZc=
> =61Z2
> -----END PGP SIGNATURE-----
> 
> 

