M. Schiltz wrote:
> If this approach was pushed to the extreme, it would imply that bulk
> solvent atoms should also be explicitly included in the PDB file,
> because, clearly "the atoms are in the crystal", e.g. refine
> hundreds of bulk solvent atoms with occupancy = 1.0 and let the B
> factor reflect the disorder....
I have to agree for the most part. Someone argued against this by
saying that side chain atoms are different because you know they're
anchored to the protein, so you do have distance restraints. But
if you randomly place a water at position XYZ far away from the
protein you're probably going to be closer to the correct average
real position of a water oxygen than you would be to the correct
average position for the terminal nitrogen of a highly disordered
lysine side chain.
And covalent attachment is not a good criterion to use, because the
object that is covalently attached is sometimes an entire protein
domain. If there is no supporting electron density, hopefully nobody
would try to model an entire missing N-terminal domain of a protein
into their x-ray structure of the entire protein based on (a) the
knowledge that the other domain was there, and (b) someone else's
x-ray structure of the N-terminal domain by itself.
Mischa Machius wrote:
> If C-beta is well defined, we have a pretty good idea about where
> C-gamma is and so on. Contour the 2Fo-Fc map at 0.3 sigma and you
> will likely see some density. Whether this is noise or signal is a
> matter of discussion (ask people in Tom Alber's lab), but at least
> refinement programs have something to base their B values on.
If C-beta is fuzzy and C-gamma is guesstimated based on the position
of C-beta, then what do you do with 0.3 sigma density near your
guesstimated C-gamma? Do you call it C-delta? Even if that 0.3 sigma
density really is signal and not noise, it could be solvent density
if your guesstimated C-gamma position is off.
Frances Bernstein wrote:
> I would suggest that the people on this discussion list,
> who are all basically crystallographers or sophisticated users,
> who are advocating including atoms with occupancy 0.0 should
> talk to a biologist PDB user down the hall from them and see if
> they even understand what a B value is or what an occupancy
> of 0.0 means.
Poor software design contributes to this problem, because even though
you might explain what these things are, the biologists are going to
forget the B factor is there or forget what it is if the viewer they
are using doesn't keep reminding them the data is there.
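
For anyone who wants a feel for the numbers: for an isotropic B factor,
B = 8*pi^2*<u^2>, where <u^2> is the mean-square displacement of the atom
from its average position along any one direction. Below is a minimal
Python sketch of that conversion; the function name is only illustrative
and is not taken from any particular package.

```python
import math

def rms_displacement(b_factor):
    """Return the RMS displacement (in Angstroms) implied by an
    isotropic B factor (in Angstroms squared), using B = 8*pi^2*<u^2>."""
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))
```

With this relation, a B of 20 corresponds to an RMS displacement of about
0.5 A, and a B of 80 to about 1.0 A, which is roughly the difference a
viewer would need to keep in front of the user.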
Crystallographers may not realize the full extent of this problem
because they are using software designed for crystallography.
Biologists, chemists, and even molecular modelers tend not to be using
such software. Having personally made the change from
crystallography to modeling, it's very apparent to me that
crystallographers have had close to zero input into the design of
many molecular modeling packages.
Ideally, as Dirk Kostrewa said, people would deposit their electron
density maps. Tassos wrote, "i admit i have limited sympathy for PDB
users that ignore B values and also cant be bothered to use the EDS".
But biologists are not going to learn "O" or even Coot, nor should
they have to. If you are a molecular modeler using Schrodinger's
suite, you can't easily view density. If you're using Accelrys, are
you going to pay the extra money for extra program modules to see
electron density, when really IMO that is a key function of a
competent PDB viewing module?
I can still drag out "O" when I need to look at density, but none of
the other packages I use can view density, and trying to remember how
to use "O" when I only run it once or twice a year is a real pain
(sure, I could learn Coot, but people like me who only look at maps
infrequently aren't going to want to invest the time to learn another
software package just to see a map). If I didn't have my x-ray
background, I would probably be tempted to conclude it's way too much
trouble and give up, which would be a mistake, because I see an awful
lot of errors in active sites, and once you suspect you're looking at
a sloppy x-ray refinement you really need the density maps. Even so,
I am usually too busy to bother with a manual map calculation if the
EDS server failed for some reason. Yes, that makes me lazy, but no
lazier than crystallographers who don't ensure the EDS server can
calculate their maps.
Should I have to read through the text of a PDB file to identify
residues with missing atoms or multiple conformations? No. The whole
point of a PDB file is to provide input to a program that gives us a
graphical view of a chemical structure. To the maximum extent
possible all the information in the PDB file should be readily
visualized by a good software program. One of the commercial
molecular modeling packages does some helpful color coding when it
imports a PDB file. Gray atoms have nothing funky going on, orange
ones may have incorrect bonding, residues with missing side chains
are red, those with multiple conformations are green, etc. This is
much easier for a biologist/chemist to interpret than trying
to teach them to scan a PDB file for alternate conformations, but
even this program with the color coding can't actually access the
alternate conformations or display a density map.
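
As a rough illustration of how little it takes for software to surface
this information at import time, here is a minimal Python sketch that
flags residues with alternate conformations or zero-occupancy atoms. It
assumes standard fixed-column ATOM/HETATM records (altLoc in column 17,
occupancy in columns 55-60); the function name and flag labels are
hypothetical. Detecting missing side-chain atoms would additionally need
a table of expected atoms per residue type, which is not included here.

```python
from collections import defaultdict

def flag_residues(pdb_lines):
    """Map (chain, residue number, residue name) -> set of flags.

    Flags: "altloc" if any atom has an alternate-location code,
    "zero_occupancy" if any atom has occupancy 0.0.
    Residues with nothing unusual are omitted from the result.
    """
    flags = defaultdict(set)
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        line = line.rstrip("\n").ljust(66)  # pad short records
        # chain ID (col 22), resSeq + insertion code (cols 23-27),
        # residue name (cols 18-20)
        key = (line[21], line[22:27].strip(), line[17:20].strip())
        if line[16] != " ":                 # altLoc, column 17
            flags[key].add("altloc")
        occupancy = line[54:60].strip()     # columns 55-60
        if occupancy and float(occupancy) == 0.0:
            flags[key].add("zero_occupancy")
    return dict(flags)
```

A viewer could map these flags onto colors in the same spirit as the
color coding described above, so the user never has to read the file.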
Suggesting that a biologist/chemist should have to learn the PDB or
CIF file format is the wrong approach: it contributes to keeping
crystallography inaccessible to non-experts. Remember, you want your
biologist and chemist friends to understand and enjoy x-ray
structures. They will be more likely to fund your grant proposals if
they do. :-)
Ethan Merritt wrote:
> We are. At least, those who correctly use available options in
> refmac or shelx are.
> The best results (R, Rfree, geometry) are obtained by explicit inclusion
> of hydrogens via the riding-hydrogen model. This is also a basis for the
> Molprobity validation tools.
But you still don't include them explicitly in the final model, which
is what this discussion is really about.
Has anyone done a large-scale study of whether modeling all
geometrically reasonable common rotamers improves R, Rfree, and
geometry for various possible definitions of a "disordered side
chain"? It sounds from Kevin's comments like no large study has been
done. But in the end, that is probably the only way to conclusively
resolve this question: by looking at whether making educated guesses
for disordered side chains (you'd have to carefully define
"disordered") improves the model's agreement with the experimental
data. All of our theoretical arguments in this thread wouldn't mean
that much in the face of some conclusive evidence one way or the
other.
--
Eric Bennett
Assistant Director
Center for Drug Design
University of Minnesota