Hi Frank,

> > I described in the previous e-mail the probabilistic interpretation of
> > B-factors. In the case of very high uncertainty = poorly ordered side
> > chains, I prefer to deposit the conformer representing maximum a
> > posteriori, even if it does not represent all possible conformations.
> > Maximum a posteriori will have significant contribution from the most
> > probable conformation of side chain (prior knowledge) and should not
> > conflict with likelihood (electron density map).
> > Thus, in practice I model the most probable conformation as long as it
> > is in even very weak electron density, does not overlap significantly
> > with negative difference electron density and does not clash with other
> > residues.

> If it's probability you're after, if there's no density to guide you
> (very common!) you'd have to place all "likely" rotamers that don't
> clash with anything, and set their occupancies to their probability (as
> encoded in the rotamer library).

Which library? The one for all side chains of a specific type, or the one
for a specific type with a given backbone conformation? These are quite
different and change with the content of the PDB.

'Hacking' the occupancies is risky business in general: errors are made
quite easily. I frequently encounter side chains with partial occupancies
but no alternates; how can I relate this to the experimental data? Even
worse, I also see cases where the occupancies of alternates sum to values
> 1.00. What does that mean? Is that a local increase of DarkMatter
accidentally encoded in the occupancy?
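Both occupancy problems are easy to flag automatically. What follows is only a minimal sketch, not an existing tool: the function name `check_occupancies` and the tolerance `tol` are made up for illustration, and it assumes fixed-column PDB-format ATOM/HETATM records and looks at the first model only. Note that a partial occupancy without an alternate can be legitimate (atoms on special positions, partially occupied ligands or waters), so the output is a list of things to inspect, not a list of errors.

```python
from collections import defaultdict


def check_occupancies(pdb_lines, tol=0.01):
    """Return (atom_id, problem) tuples for suspicious occupancies.

    Hypothetical helper: assumes fixed-column PDB ATOM/HETATM records.
    atom_id is (chain, residue number + insertion code, atom name).
    """
    atoms = defaultdict(list)
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only the first model is examined
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_id = (line[21], line[22:27].strip(), line[12:16].strip())
        altloc = line[16].strip()        # column 17: alternate location
        occupancy = float(line[54:60])   # columns 55-60: occupancy
        atoms[atom_id].append((altloc, occupancy))

    problems = []
    for atom_id, entries in atoms.items():
        total = sum(occ for _, occ in entries)
        if total > 1.0 + tol:
            # Alternates that add up to more than one atom.
            problems.append((atom_id, "occupancies sum to %.2f" % total))
        elif len(entries) == 1 and total < 1.0 - tol:
            # May be legitimate (special position, ligand, water).
            problems.append(
                (atom_id, "partial occupancy %.2f without alternate" % total)
            )
    return problems
```

Calling it as `check_occupancies(open("model.pdb"))` (the file name is a placeholder) gives one entry per suspicious atom, which can then be checked against the map by hand.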
> This is now veering into data-free protein modeling territory... wasn't
> the idea to present to the downstream user an atomic representation of
> what the electron density shows us?

Yes, but what we see can be deceiving.

> Worse, what we're also doing is encoding multiple different things in
> one place - what database people call "poorly normalised", i.e. to
> understand a data field requires further parsing and if statements. In
> this case: to know whether there was no density, as end-user I'd have
> to second-guess what exactly those
> high-B-factor-variable-occupancy atoms mean.
>
> Until the PDB is expanded, the conventions need to be clear, and I
> thought they were:
>   High B-factor ==> means atom is there but density is weak
>   Atom missing ==> no density to support it.

Unfortunately, it is not trivial to decide when there is 'no density'. We
need a good metric for this, but I don't think one exists yet. Removing
atoms is therefore very subjective, which explains why I frequently find
positive difference density peaks near missing side chains. Leaving side
chains in sometimes gives negative difference density, but refining them
with proper B-factor restraints reduces the problem a lot. There is still
the problem of radiation damage, but that is relatively small. At least
refining the B-factor is more reproducible and less subjective than making
the binary choice to keep or remove an atom.

Cheers,
Robbie

>
> Oh well...
> phx.