Hi Frank,
> > I described in the previous e-mail the probabilistic interpretation of
> > B-factors. In the case of very high uncertainty = poorly ordered side
> > chains, I prefer to deposit the conformer representing maximum a
> > posteriori, even if it does not represent all possible conformations.
> > Maximum a posteriori will have significant contribution from the most
> > probable conformation of side chain (prior knowledge) and should not
> > conflict with likelihood (electron density map).
> > Thus, in practice I model the most probable conformation as long as it
> > fits in even very weak electron density, does not overlap significantly
> > with negative difference electron density and does not clash with other
> > residues.
> If it's probability you're after, if there's no density to guide you
> (very common!) you'd have to place all "likely" rotamers that don't
> clash with anything, and set their occupancies to their probability (as
> encoded in the rotamer library).
Which library? The one for all side chains of a specific type, or the one for a
specific type with a given backbone conformation? These are quite different and
change with the content of the PDB.
'Hacking' the occupancies is risky business in general: errors are made quite
easily. I frequently encounter side chains with partial occupancies but no
alternates. How can I relate this to the experimental data? Even worse, I
also see cases where the occupancies of alternates sum up to values > 1.00.
What does that mean? Is that a local increase of dark matter accidentally
encoded in the occupancy?
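
Both kinds of occupancy error are easy to screen for before deposition. Below is
a minimal Python sketch, not a validated tool: the function names, the 0.01
tolerance and the grouping key are my own assumptions, and it expects
fixed-column PDB ATOM/HETATM records from a single model. A lone partially
occupied atom can be legitimate (a ligand or a water, say), so treat the output
as a list of things to look at rather than a list of errors.

```python
from collections import defaultdict


def atom_line(serial, name, altloc, resname, chain, resseq, occ, bfac=30.0):
    """Build a fixed-column PDB ATOM record for testing (dummy coordinates)."""
    return ("ATOM  {:>5d} {:<4s}{:1s}{:>3s} {:1s}{:>4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
        serial, name, altloc, resname, chain, resseq, 0.0, 0.0, 0.0, occ, bfac)


def find_occupancy_problems(pdb_lines, tol=0.01):
    """Flag atoms with suspicious occupancies.

    Assumes a single model; a multi-model file would be double counted.
    Atoms are grouped by (chain, residue number + insertion code, atom name).
    """
    occ_by_atom = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        key = (line[21], line[22:27].strip(), line[12:16].strip())
        altloc = line[16].strip()          # column 17: alternate location
        occupancy = float(line[54:60])     # columns 55-60: occupancy
        occ_by_atom[key].append((altloc, occupancy))

    problems = []
    for key, entries in occ_by_atom.items():
        total = sum(occ for _, occ in entries)
        if len(entries) == 1 and total < 1.0 - tol:
            problems.append((key, "partial occupancy, no alternate"))
        elif total > 1.0 + tol:
            problems.append((key, "alternates sum to > 1.00"))
    return problems


# Made-up records, purely for illustration.
example_lines = [
    atom_line(1, " CG", "", "LYS", "A", 10, 0.50),   # partial, no alternate
    atom_line(2, " CB", "A", "SER", "A", 11, 0.60),  # A + B sum to 1.20
    atom_line(3, " CB", "B", "SER", "A", 11, 0.60),
    atom_line(4, " OG", "A", "SER", "A", 11, 0.50),  # A + B sum to 1.00, fine
    atom_line(5, " OG", "B", "SER", "A", 11, 0.50),
    atom_line(6, " CA", "", "GLY", "A", 12, 1.00),   # fully occupied, fine
]
problems = find_occupancy_problems(example_lines)
for key, issue in problems:
    print(key, issue)
```
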
> This is now veering into data-free protein modeling territory... wasn't
> the idea to present to the downstream user an atomic representation of
> what the electron density shows us?
Yes, but what we see can be deceiving.
> Worse, what we're also doing is encoding multiple different things in
> one place - what database people call "poorly normalised", i.e. to
> understand a data field requires further parsing and if statements. In
> this case: to know whether there was no density, as an end user I'd
> have to second-guess what exactly those
> high-B-factor-variable-occupancy atoms mean.
>
> Until the PDB is expanded, the conventions need to be clear, and I
> thought they were:
> High B-factor ==> means atom is there but density is weak
> Atom missing ==> no density to support it.
Unfortunately, it is not trivial to decide when there is 'no density'. We need
a good metric to do this, but I don't think one exists yet. Removing atoms
is thus very subjective. This explains why I frequently find positive
difference density peaks near missing side chains. Leaving side chains in
sometimes gives negative difference density, but refining them with proper
B-factor restraints reduces the problem a lot. There is still the problem of
radiation damage, but that is relatively small. At least refining the B-factor
is more reproducible and less subjective than making the binary choice to keep
or remove an atom.
Cheers,
Robbie
>
> Oh well...
> phx.