Try this:
1) take your favorite PDB file and set all the B factors to ~80 (reduces
series-termination errors)
2) use sfall/fft in CCP4 to calculate structure factors to 4A resolution
3) use sftools to add a "SIGF" column (0.1 will do) to make refmac5 happy
4) refine the "perfect" model against these fake data for ~5 cycles
(with "solvent no")
5) load this up in coot and contour at 1 sigma
6) repeat the refinement with a PDB file containing only main chain.
7) repeat the refinement after putting all the side chains in their most
likely (Ponder-Richards) rotamers.
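Steps 1 and 6 need nothing more than a few lines of script. Here is a minimal Python sketch, assuming a standard fixed-column PDB file (atom name in columns 13-16, B factor in columns 61-66). The function names are made up for illustration and are not part of CCP4. ANISOU records are left untouched, so strip those separately if your file has them.

```python
# Illustrative helpers only; none of these names come from CCP4 or any library.

MAIN_CHAIN_ATOMS = {"N", "CA", "C", "O"}  # assumption: protein backbone atoms


def is_atom_record(line):
    """True for PDB coordinate records (ATOM or HETATM)."""
    return line.startswith(("ATOM  ", "HETATM"))


def set_b_factor(line, b_value=80.0):
    """Return the record with columns 61-66 (the B factor) overwritten."""
    line = line.rstrip("\n")
    if not is_atom_record(line):
        return line
    padded = line.ljust(80)  # make sure columns 61-66 exist
    return padded[:60] + f"{b_value:6.2f}" + padded[66:]


def is_main_chain(line):
    """True for ATOM records whose atom name is N, CA, C or O."""
    return line.startswith("ATOM  ") and line[12:16].strip() in MAIN_CHAIN_ATOMS


def flatten_b_factors(lines, b_value=80.0):
    """Step 1: give every atom the same B factor."""
    return [set_b_factor(line, b_value) for line in lines]


def keep_main_chain(lines):
    """Step 6: drop side-chain atoms and HETATMs, keep all other records."""
    return [ln for ln in lines if not is_atom_record(ln) or is_main_chain(ln)]
```

To use it, read the file with `open(path).read().splitlines()`, apply either function, and write the result back out joined with newlines.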
Ask yourself these questions:
1) can you "see" the side chains?
2) can you "see" the waters?
3) what are the R factors from these refinements?
Answers: 1) no, 2) no, 3) ~3% for "perfect", ~50% for "main chain", and
~36% for "likely rotamer"
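For reference, the R factors in answer 3 are presumably the conventional crystallographic R, i.e. sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|). A minimal Python sketch of that formula follows. It assumes the two amplitude lists are already on the same scale (a refinement program fits a scale factor first), and the numbers in the example call are made up, not taken from the test described above.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum| |Fobs| - |Fcalc| | / sum|Fobs|.

    Assumes both lists are on the same scale and in the same reflection order.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fobs| is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    # Made-up amplitudes, purely to show the call.
    print(r_factor([10.0, 10.0], [5.0, 15.0]))
```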
Now ask yourself: even though there is "no density" for side chains and
waters, is there really "no evidence" that they exist?
The point I am trying to make here is that you EXPECT side chains to
poke out of density at low resolution, even under ideal conditions
(perfect phases). For example, the C-deltas of Leu will "breach" the
1-sigma contour at around 2.8A resolution and worse. You can see this
in my old movie:
http://bl831.als.lbl.gov/~jamesh/movies/index.html#reso
When it comes to building, yes, once an atom dips below the 1-sigma
contour it gets harder and harder to know exactly where it is, but it
does have to be somewhere. Somewhere nearby. Formally, there is "prior
knowledge" of bond lengths, etc. at play. And if you know that there is
one copy of a given atom in every unit cell of the crystal, then
occupancy < 1 is inappropriate. Much better to use B = 999, which
models the atom as a Gaussian with the electrons smeared out to an
r.m.s. displacement of about 3.5 A. This is roughly the range your
average side chain
atom has available to it, given that it is attached to the main chain by
covalent bonds.
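The "about 3.5 A" figure follows from the standard relation B = 8 pi^2 <u^2>, where <u^2> is the mean-square displacement along one direction. A quick Python check, as a sketch (the function name is just for illustration, and the intrinsic width of the atom at rest is ignored):

```python
import math


def rms_displacement_from_b(b_iso):
    """R.m.s. displacement (in A) along one axis for an isotropic B (in A^2)."""
    return math.sqrt(b_iso / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    print(f"{rms_displacement_from_b(999.0):.2f}")  # prints 3.56
```

By the same formula, the B = 80 used in step 1 corresponds to an r.m.s. displacement of about 1 A.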
Of course, a more "Bayesian" model for the "I don't know what the
rotamer is" situation would be to build in ALL possible rotamers, with
occupancies equal to their Ponder-Richards probabilities. Some
improvement to this initial "guess" would no doubt be made by using
constrained occupancy refinement of rigid-body side chains.
Unfortunately, this is impossible with any refinement program I know
about, since refmac, phenix.refine, etc. don't support more than 3 or 4
alternate conformers.
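The bookkeeping for the starting "guess" is simple enough, even if the refinement is not. Here is a sketch in Python that turns a list of rotamer probabilities into occupancies that sum to exactly 1.00 at the two-decimal precision of a PDB file. The function name is hypothetical and the weights in the example are placeholders, not actual Ponder-Richards values.

```python
def occupancies_from_probabilities(probabilities, decimals=2):
    """Scale non-negative weights to occupancies that sum to exactly 1.

    Uses largest-remainder rounding so the rounded values still add up.
    """
    total = sum(probabilities)
    if total <= 0 or any(p < 0 for p in probabilities):
        raise ValueError("need non-negative weights with a positive sum")
    scale = 10 ** decimals
    exact = [p / total * scale for p in probabilities]
    units = [int(x) for x in exact]  # round down first
    leftover = scale - sum(units)    # units still to hand out
    # Give the leftover units to the entries that lost the most in rounding.
    order = sorted(range(len(exact)),
                   key=lambda i: exact[i] - units[i], reverse=True)
    for i in order[:leftover]:
        units[i] += 1
    return [u / scale for u in units]


if __name__ == "__main__":
    # Placeholder weights for three hypothetical rotamers.
    print(occupancies_from_probabilities([1, 1, 1]))
```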
Building in all possible conformers and using the occupancy as a
"p-value" would also help solve the problem of the careless and/or
uneducated over-interpreting PDB files. Which is the "right one"? Good
question! I think it's time we started dispelling the myth of the
single-conformer protein anyway.
-James Holton
MAD Scientist
On 3/26/2012 7:40 AM, Ed Pozharski wrote:
> On Mon, 2012-03-26 at 10:17 -0400, Gregory Bowman wrote:
>> But what about the issue of resolution? As was previously pointed out,
>> at say 3.2 Å resolution, many side chains will fail to fit, but it
>> doesn't seem appropriate to trim them all down.
>
> Why is it inappropriate to trim them down? Sometimes at low resolution
> all one can be confident about is the backbone trace.
>
> Just to be clear, I am talking about atoms whose positions are not
> supported by electron density, i.e. where the difference map in the
> absence of the side chain is featureless. I assume that is the likely
> situation when one would set occupancy to zero.
>
> Cheers,
> Ed.