Re: [ccp4bb] what to do with disordered side chains

James Holton Wed, 30 Mar 2011 11:36:20 -0700

How about a converter between the two "file formats"? Perhaps something like this:


if(occ==1 && B < 30) print;
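The one-liner above could be fleshed out roughly as follows. This is only a sketch, assuming standard fixed-column PDB records (occupancy in columns 55-60, B-factor in columns 61-66). The function name to_users_pdb, the max_b parameter, and the choice to pass non-atom records through unchanged are illustrative assumptions, not part of the original suggestion.

```python
def to_users_pdb(pdb_lines, max_b=30.0):
    """Yield a "users" view of a PDB file: atoms with occ == 1 and B < max_b.

    Assumes fixed-column PDB format. Non-atom records (headers, REMARKs,
    TER, END, ...) are passed through unchanged, which is an assumption.
    """
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            occ = float(line[54:60])     # columns 55-60: occupancy
            b_factor = float(line[60:66])  # columns 61-66: B-factor
            if occ == 1.0 and b_factor < max_b:
                yield line
        else:
            yield line
```

For a file on disk, something like `sys.stdout.writelines(to_users_pdb(open("model.pdb")))` would print the filtered model, where model.pdb is a placeholder name.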


-James Holton
MAD Scientist

On 3/30/2011 11:30 AM, Mark J van Raaij wrote:

perhaps then there should be 2 pdb files for each structure:
- a "users" pdb containing "correct" models but tailored for easy use by 
non-crystallographers
- a "depository" pdb containing the "best" model the crystallographers can (or 
have bothered to) come up with, of course conforming to certain quality standards.
I am not saying the standards for the files should be forever the same, they 
should be allowed to evolve with average user understanding and 
crystallographic developments, respectively.
I don't think we can expect all molecular biologists to understand protein 
structure refinement, yet I think we should still encourage them all to use pdb 
models where available. They currently don't, partly due to difficulties with 
and differences between files in the pdb format. A particular case I can 
remember is where someone insisted on referring to a gel experiment to prove a 
protein was trimeric, rather than to the structure that had been solved 
recently (or at least he should have referred to both).
We would be shooting ourselves in the foot if we didn't promote the widest 
possible use of our models.
I don't think we are talking about students or research novices here, but about 
"savvy" end-users to quote Phoebe Rice.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3, Campus Cantoblanco
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1



On 30 Mar 2011, at 20:04, James Holton wrote:

I'm afraid this is not a problem that can be solved by "standardization".

Fundamentally, if you are a scientist who has collected some data (be it diffraction spot intensities, cell
counts, or substrate concentration vs time), and you have built a "model" to explain that data (be
it a constellation of atoms in a unit cell, exponential population growth, or a microscopic reaction
mechanism), I think it is generally expected that your model explain the data "to within experimental
error". Unfortunately, this is never the case in macromolecular crystallography, where the model-data
disagreement (Fobs-Fcalc) is ~4-5x bigger than the "error bars" (sigma(F)).
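One way to put a number on that comparison is to express each model-data residual in units of its own error bar and average. The sketch below assumes the mean of |Fobs-Fcalc|/sigma(F) as the statistic; that choice, the function name, and the numbers in the usage note are illustrative assumptions, not necessarily how the "~4-5x" figure above was obtained.

```python
def mean_misfit_in_sigmas(f_obs, f_calc, sigma_f):
    """Average |Fobs - Fcalc| / sigma(F) over reflections with sigma > 0.

    A value near 1 would mean the model explains the data to within
    experimental error; larger values mean the residual exceeds the
    error bars.
    """
    ratios = [
        abs(fo - fc) / sig
        for fo, fc, sig in zip(f_obs, f_calc, sigma_f)
        if sig > 0  # skip reflections without a usable error estimate
    ]
    if not ratios:
        raise ValueError("no reflections with positive sigma(F)")
    return sum(ratios) / len(ratios)
```

With made-up values such as f_obs=[100, 200], f_calc=[90, 220], sigma_f=[2, 5], the two residuals are 5 and 4 sigma, so the function returns 4.5.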

Now, there is nothing shameful about an incomplete model, especially when thousands of very intelligent people working over half a century
have not been able to come up with a better way to build one. In fact, perhaps a better name for the "disordered side chain
problem" would be "dark density"? This name would place it properly amongst "dark matter", "dark
energy" and other fudge factors introduced to try to explain why our "standard model" is not consistent with observation.
That is, "dark density" is the stuff we can't see, but which nonetheless must be there somewhere.

Whatever it is, I personally do hold a vain belief that perhaps someday soon the problem of "dark density" will be
solved, and that presently instituting a "policy" requiring that all macromolecular models from this day forward remain
at least as incomplete as yesterday's models is not a very good idea. I say: if you think there is "something there"
then you should build it in, especially if it is important to the conclusions you are trying to make. You can defend your model
the same way you would defend any other scientific model: by using established statistics to show that it agrees with the data
better than an "alternative model" (like leaving it out). It is YOUR model, after all! Only you are responsible for
how "right" it is.

I do appreciate that students and other novices may have a harder time defining
"surfaces" and measuring hydrogen bond lengths in these pesky "floppy regions",
but perhaps their education would be better served by learning the truth sooner rather than later?

-James Holton
MAD Scientist

On 3/30/2011 9:26 AM, Filip Van Petegem wrote:

Hello Mark,

I absolutely agree with this.  The worst thing is when everybody is following 
their own personal rules, and there are no major guidelines for end-users to 
figure out how to interpret those parts.  I assume there are no absolute 
guidelines simply because there isn't any consensus among crystallographers... 
(from what we can gather from this set of emails...). On the other hand, this 
discussion has flared up many times in the past, and maybe it's time for a 
powerful dictator at the PDB to create the law...

Filip Van Petegem



On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij <[email protected]> wrote:
perhaps the IUCr and/or PDB (Gerard K?) should issue some guidelines along 
these lines?
And oblige us all to follow them?
Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3, Campus Cantoblanco
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/content/research/macromolecular/mvraaij/index.php?l=1



On 30 Mar 2011, at 17:29, Phoebe Rice wrote:

I've now polled 4 fairly savvy "end users" of crystal structures and there 
seems to be a consensus:

- they all know what B is and how to look for regions of high B (with, say, 
pymol) and they know not to make firm conclusions about H-bonds to flaming red 
side chains.
- None of them would ever think to look at occupancy and they don't know how 
anyway.
- they expect that loops with disordered backbones would not be included in the 
models, and can figure out truncated or fake-ala side chains with some 
additional effort, but that option makes viewing surfaces and e-stats more of a 
pain.
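For end users who do not know how to look at occupancy, a short script can report it alongside B. The sketch below assumes standard fixed-column PDB records; the function name flag_uncertain_residues and the default cutoff of 60 are hypothetical choices for illustration, not an accepted standard.

```python
def flag_uncertain_residues(pdb_lines, b_cutoff=60.0):
    """Map (chain, residue number, residue name) -> list of warnings.

    A residue is flagged if any of its atoms has occupancy below 1
    or a B-factor at or above b_cutoff (an arbitrary illustrative value).
    """
    flagged = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # column 22: chain ID, columns 23-26: residue number,
        # columns 18-20: residue name
        key = (line[21], int(line[22:26]), line[17:20])
        occ = float(line[54:60])       # columns 55-60
        b_factor = float(line[60:66])  # columns 61-66
        reasons = flagged.setdefault(key, set())
        if occ < 1.0:
            reasons.add("partial or zero occupancy")
        if b_factor >= b_cutoff:
            reasons.add("high B")
    # keep only residues that collected at least one warning
    return {k: sorted(v) for k, v in flagged.items() if v}
```

Residues that come back flagged are the ones where conclusions about H-bonds or surfaces deserve extra caution.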

  Phoebe

=====================================
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp


---- Original message ----

Date: Tue, 29 Mar 2011 17:43:49 -0400
From: CCP4 bulletin board <[email protected]> (on behalf of Ed 
Pozharski <[email protected]>)
Subject: [ccp4bb] what to do with disordered side chains
To: [email protected]

The results of the online survey on what to do with disordered side
chains (from a total of 240 responses):

Delete the atoms                                         43%
Let refinement take care of it by inflating B-factors    41%
Set occupancy to zero                                    12%
Other                                                     4%

"Other" suggestions were:

- Place atoms in most likely spot based on rotamer and contacts and
indicate high positional sigmas on SIGATM records
- Invent a refinement that will spread these residues over many rotamers,
as this is what actually happened
- Delete the atoms but retain the original amino acid name
- Choose the most common rotamer (B-factors don't "inflate", they just
rise slightly)
- Depends. If the disordered region is uninteresting, delete atoms.
Otherwise, try to model it in one or more disordered models (and then
state it clearly in the pdb file)
- In case that no density is in the map, model several conformations of
the missing segment and insert them into the PDB file with zero
occupancies. It is equivalent to what the NMR people do.
- Model it in and compare the MD simulations with SAXS
- I would assume Dale Tronrud's suggestion is the best: SIGATM labels.
- Let the refinement inflate B-factors, then set occupancy to zero in
the last round.

Thanks to all for participation,

Ed.

--
"I'd jump in myself, if I weren't so good at whistling."
                              Julian, King of Lemurs



--
Filip Van Petegem, PhD
Assistant Professor
The University of British Columbia
Dept. of Biochemistry and Molecular Biology
2350 Health Sciences Mall - Rm 2.356
Vancouver, V6T 1Z3

phone: +1 604 827 4267
email: [email protected]
http://crg.ubc.ca/VanPetegem/
