Re: [ccp4bb]: gap links

Garib Murshudov Wed, 23 Aug 2006 21:33:30 -0700

***  For details on how to be removed from this list visit the  ***
***          CCP4 home page http://www.ccp4.ac.uk         ***

At the moment mmCIF output is not available. I am working on this and it will be available soon.


Garib

On 11 Aug 2006, at 13:03, Ian Tickle wrote:


Refmac documentation (files/input-script.html) says "XYZIN: Input co-ordinate file. Preferred format PDB. mmCIF can also be used." If we should all be using mmCIF, why is PDB preferred, or is the doc out of date?


-- Ian

-----Original Message-----
From: Eleanor Dodson [mailto:[EMAIL PROTECTED]
Sent: Friday, August 11, 2006 12:13 PM
To: Ian Tickle
Cc: Miguel Ortiz Lombardia; [EMAIL PROTECTED]; CCP4bb
Subject: Re: [ccp4bb]: gap links

Obviously we all should accept using the mmCIF format for
coordinates.
That assigns
a residue NAME, which can be 1 2 3 7 6 8 8A etc., and
a residue NUMBER, which will be 1 2 3 4 5 6 etc. for sequential
residues.
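
A minimal sketch of the NAME/NUMBER split described above, not
taken from any CCP4 program: the Residue class and build_chain
helper are hypothetical names introduced only for illustration.
The name is stored as an opaque string and the number is derived
purely from the order of residues in the chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    chain: str
    name: str    # free-form label such as "8A"; never parsed or compared numerically
    number: int  # sequential position within the chain: 1, 2, 3, ...


def build_chain(chain_id, names):
    """Assign sequential numbers from list order and keep the names untouched."""
    return [Residue(chain_id, name, i) for i, name in enumerate(names, start=1)]


# The residue names from the example above, in sequence order.
chain = build_chain("A", ["1", "2", "3", "7", "6", "8", "8A"])

# Look residues up by their label; connectivity would use .number instead.
by_name = {r.name: r for r in chain}
```

With this layout a lookup such as by_name["8A"] finds the residue
the biologist means, while neighbours are found through the
sequential number, so the two roles never interfere.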

This discussion demonstrates the inadequacy of the PDB 80 char record.
  Eleanor

Ian Tickle wrote:

I come down strongly on Bernhard's side here, and have to
disagree equally strongly with Miguel.  This is an issue I've
tried to take up on the CootBB (sadly so far with limited
success!).  There will always be conflicts between the
'natural' scientists (i.e. physicists, chemists, biologists
etc) and the computer scientists over what is feasible in
software, but it seems to me that a fundamental principle
should be that in the first instance the natural scientist
dictates his/her requirements to the computer scientist, not
the other way around!  The natural scientist is the
'customer' and the computer scientist is the 'service
provider' and we all know that the customer is always right
(even when he's wrong!).  Too many times the programmer
produces software 'features' (or bugs depending on how you
look at it!) that are convenient from the programming point
of view but are not what the scientist actually wants.  Now
clearly there will be situations where the scientist is asking
for something that's just totally unfeasible in software, and
then there will have to be some negotiation, but it still
behoves the programmer to accommodate the scientist's wishes
as far as is practical.


It seems to me that 'biological' (i.e. essentially
arbitrary) residue numbering most definitely falls way short
of the class of unreasonable requests.  The biologist
essentially wants the residue 'number' (actually a name if
you include the chain ID and insertion code) to be merely a
label, nothing more, obviously firstly to identify the
residue on the graphics, but also to relate it to the
corresponding residue in homologous structures.  Therefore
the programmer must not infer anything concerning the
sequence (such as the residue connectivity) purely from the
labels!  It seems to me completely crazy that the biologist
has to relabel his meaningfully labelled sequence just to
make life comfortable for the programmer - and to maintain
different sets of numbers for different purposes!  If the
biologist really wants to label his/her contiguous sequence
'12345  -15X  5  6  -99W  ...' then so be it (anything
becomes possible if the numbers are treated purely as
labels).  It's the programmer's job to accommodate that in
software, it's not his place to question the wisdom of the
biologist.
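
As a rough illustration of treating the residue 'number' purely
as a label, the sketch below reads the chain ID, residue number
field and insertion code from a PDB ATOM record (columns 22,
23-26 and 27 of the fixed-width format) and returns them as
uninterpreted strings.  The function names residue_label and
make_atom_line are hypothetical, and make_atom_line exists only
to build demonstration records with the right column layout.

```python
def residue_label(atom_line):
    """Return (chain ID, residue number field, insertion code) as opaque strings."""
    chain_id = atom_line[21]             # column 22
    res_seq = atom_line[22:26].strip()   # columns 23-26, kept as text
    ins_code = atom_line[26].strip()     # column 27, empty string if blank
    return (chain_id, res_seq, ins_code)


def make_atom_line(serial, atom, res_name, chain_id, res_seq, ins_code, x, y, z):
    """Build a fixed-width ATOM record for demonstration purposes only."""
    return (f"ATOM  {serial:5d} {atom:<4} {res_name:>3} {chain_id}"
            f"{res_seq:4d}{ins_code or ' '}   {x:8.3f}{y:8.3f}{z:8.3f}")


line = make_atom_line(1, " CA ", "SER", "A", -15, "X", 1.0, 2.0, 3.0)
plain = make_atom_line(2, " CA ", "GLY", "A", 5, "", 4.0, 5.0, 6.0)

label = residue_label(line)         # negative number plus insertion code
plain_label = residue_label(plain)  # ordinary residue, no insertion code
```

Because the label is only ever compared for equality, a negative
number or an insertion code needs no special handling, and no
arithmetic on the label can leak into the connectivity.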


In the majority of structures each unique chain identified
by the chain ID is contiguous, so that obviously has to be
the default presumption, regardless of the labelling.  Since
we are assuming that the residue labels provide absolutely no
information concerning the connectivity, and given the
current limitations of the PDB format, I think the programmer
is entitled to require that the ordering of residues in the
file is the same as that in the sequence (otherwise you would
need an additional column to specify the ordinal numbers of
the residues).  Then there has to be a way of telling the
software where the breaks in the sequence are.  In most cases
this will be obvious (e.g. the C-N distance is 10 Ang).  In
the few cases that the program is unable to infer a break
from the distance, the user clearly would be expected to
provide that information.  In the RESTRAIN program I required
that each chain break is flagged by a TER record, though
strictly that is only used to flag end-of-chain (AFAIK other
software ignores the TER record).
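
The scheme above can be sketched as follows.  This is only an
illustration under stated assumptions, not the RESTRAIN or
Refmac implementation: the find_breaks function, the dictionary
layout and the 2.0 Ang cutoff are all hypothetical choices.
Residues are taken in file order, the labels are never
inspected, and a break is reported when the user has flagged one
(the 'ter' key, standing in for a TER record) or when the C-N
distance to the next residue is too long for a peptide bond.

```python
import math

# Assumed cutoff in Angstrom; a real peptide C-N bond is roughly 1.33.
PEPTIDE_BOND_MAX = 2.0


def find_breaks(residues, max_cn=PEPTIDE_BOND_MAX):
    """Return indices i where residue i is not bonded to residue i + 1.

    `residues` is a list of dicts in file order with keys 'label'
    (opaque string), 'N' and 'C' (xyz tuples) and optional 'ter' (bool).
    """
    breaks = []
    for i in range(len(residues) - 1):
        here, nxt = residues[i], residues[i + 1]
        if here.get("ter", False):
            # Break flagged explicitly by the user.
            breaks.append(i)
        elif math.dist(here["C"], nxt["N"]) > max_cn:
            # Break inferred from geometry alone.
            breaks.append(i)
    return breaks


# Arbitrary labels, contiguous file order; the gap lies after the second residue.
demo = [
    {"label": "12345", "N": (0.0, 0.0, 0.0), "C": (2.4, 0.0, 0.0)},
    {"label": "-15X", "N": (3.7, 0.0, 0.0), "C": (6.1, 0.0, 0.0)},
    {"label": "5", "N": (16.1, 0.0, 0.0), "C": (18.5, 0.0, 0.0)},
]
breaks = find_breaks(demo)
```

Here the first two residues are linked despite their unrelated
labels, and the 10 Ang gap before the third is reported as a
break without any reference to the numbering.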

It seems to me that fixing this ongoing problem is not
beyond the bounds of what we can reasonably expect from the software.


Cheers

-- Ian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Miguel Ortiz Lombardia
Sent: Thursday, August 10, 2006 8:01 AM
To: [EMAIL PROTECTED]
Cc: 'CCP4bb'
Subject: Re: [ccp4bb]: gap links

refmac5 must be assuming that you number your protein
according to your protein sequence, which is continuous.
In my opinion, this is reasonable.

Uhhh... this assumption turns perilous quickly, because there
are post-translational mods and splicing (see Concanavalin),
and biologists sometimes prefer keeping the
key residues in related structures (trypsin, fabs, etc) at
a certain residue number. This causes
sequence insertions (addressed correctly, as you say)
and gaps (not addressed correctly, my situation.)

Sure, but after any modification whatsoever the sequence of
the final protein is, except for perhaps a few pathological
cases, continuous.
Now, I can understand, though not always agree, that
biologists (I am one) prefer to give a consistent number to a
particular residue in a family of proteins, but for a
refinement program I still think it is reasonable to consider
the numbering as continuous by default: this would be the most
usual situation, I would say.

In any case, knowing that you can fix the problem using
TRANS (perhaps even CIS if the thing is really bizarre) is
very useful, thanks!


Miguel
--
Miguel Ortiz Lombardía
Centro de Investigaciones Oncológicas
C/ Melchor Fernández Almagro, 3
28029 Madrid, Spain
Tel. +34 912 246 900
Fax. +34 912 246 976
email: [EMAIL PROTECTED]
www: http://www.ysbl.york.ac.uk/~mol/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~
Je suis de la mauvaise herbe,
Braves gens, braves gens,
Je pousse en liberté
Dans les jardins mal fréquentés!

Georges Brassens


Disclaimer

This communication is confidential and may contain
privileged information intended solely for the named
addressee(s). It may not be used or disclosed except for the
purpose for which it has been sent. If you are not the
intended recipient you must not review, use, disclose, copy,
distribute or take any action in reliance upon it. If you
have received this communication in error, please notify
Astex Therapeutics Ltd by emailing
[EMAIL PROTECTED] and destroy all copies of the
message and any attached documents.




Astex Therapeutics Ltd monitors, controls and protects all
its messaging traffic in compliance with its corporate email
policy. The Company accepts no liability or responsibility
for any onward transmission or use of emails and attachments
having left the Astex Therapeutics domain.  Unless expressly
stated, opinions in this message are those of the individual
sender and not of Astex Therapeutics Ltd. The recipient
should check this email and any attachments for the presence
of computer viruses. Astex Therapeutics Ltd accepts no
liability for damage caused by any virus transmitted by this
email. E-mail is susceptible to data corruption,
interception, unauthorized amendment, and tampering, Astex
Therapeutics Ltd only send and receive e-mails on the basis
that the Company is not liable for any such alteration or any
consequences thereof.

