The principal difference between occ=0 and omitting the atom entirely is
that occ=0 atoms exclude bulk solvent. Or at least they do for typical
operation of contemporary refinement programs. So, by defining occ=0
you are forcing all map voxels within ~0.6A or so of your "invisible"
atom to be vacuum. If you omit it, then the bulk solvent may "flood
in", perhaps enough to pull the fo-fc peak down below 3x rms. How much
the bulk solvent floods in depends on how nearby atoms exclude the bulk
solvent, and this, in turn, depends on which refinement program you are
using. Different bulk solvent implementations use different radii,
"shrink" parameters, etc. In addition, bulk solvent always "bleeds" a
bit into surrounding areas because the solvent B factor is never zero.
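To make the occ=0 versus omitted distinction concrete, here is a toy sketch of a flat, binary solvent mask on a small 2D grid. It is not how any particular refinement program builds its mask: the function names, the grid, and the single 1.5 A exclusion radius are all assumptions chosen for illustration, and real implementations add per-element radii, a "shrink" step and smoothing. The only point it shows is that a mask built from atom positions alone ignores occupancy, so an occ=0 atom still carves out vacuum.

```python
import math

def solvent_mask(atoms, n=20, spacing=0.5, radius=1.5):
    """Toy flat solvent mask: 1.0 = bulk solvent, 0.0 = excluded.

    atoms is a list of (x, y, occupancy) tuples in A. The occupancy is
    deliberately ignored, mimicking a mask built from positions only.
    The radius and grid spacing are illustrative assumptions.
    """
    grid = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            x, y = i * spacing, j * spacing
            # A voxel is excluded if it lies within `radius` of ANY atom
            if any(math.hypot(x - ax, y - ay) <= radius for ax, ay, _occ in atoms):
                grid[i][j] = 0.0
    return grid

def solvent_fraction(grid):
    """Fraction of voxels flagged as bulk solvent."""
    cells = [v for row in grid for v in row]
    return sum(cells) / len(cells)

protein_atom = (3.0, 5.0, 1.0)     # ordinary, fully occupied atom
invisible_atom = (7.0, 5.0, 0.0)   # "invisible" atom kept with occ=0

mask_occ0 = solvent_mask([protein_atom, invisible_atom])  # atom kept, occ=0
mask_omit = solvent_mask([protein_atom])                  # atom omitted

i, j = 14, 10  # voxel sitting on the invisible atom (7.0 A, 5.0 A)
print(mask_occ0[i][j], mask_omit[i][j])
print(solvent_fraction(mask_occ0) < solvent_fraction(mask_omit))
```

With the atom kept at occ=0 the voxel at its position is vacuum; with the atom omitted the same voxel is filled with solvent, which is the "flooding in" described above.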
The real problem, I think, is that for any voxel of the map there is
ALWAYS "something there". The only question is: what is it? Is there a
100% occupied ligand? 100% occupied solvent? Two conformers of the
ligand? Or is it some mixture of all these? If you are asking these
questions I think it is most likely a mixture, and mixtures are hard to
model. What is worse, a mixture of a partially occupied ligand with bulk
solvent taking up the slack is currently impossible to model. We will
have to wait for partial-occupancy bulk solvent to be implemented before
we can build representations of these alternative hypotheses and
test them with competitive occupancy refinement.
The bulk solvent is actually a very good example of something for which
we see "no evidence" in our electron density maps, yet we model it in
because 1) we know it must be there, and 2) it makes our R factors
lower. What more could you want?
-James Holton
MAD Scientist
On 6/13/2014 7:45 PM, Frank von Delft wrote:
Hi all - talking about ligands, a quick question on that old
conundrum, of what to do about invisible atoms -- build them with
occ=0, or omit them?
For bits of protein, I know all the arguments; personally I prefer
omitting atoms because:
* for amino acid sidechains, their presence is implied in the
residue name.
* for whole residues, their presence is implied in the sequence
numbering
However: what about ligands? Nowhere else in the PDB file is their
presence implied - or have I missed something?
(Certainly disorder in a ligand is important information that needs to
be captured!)
Cheers
phx