> (1). Can I say the X-ray weighting is optimal when it yields the smallest
> Rfree, meanwhile RMS-Z(bonds) is smaller than "0.85 - 0.146*resolution"
> (angles also maybe)?

The weighting is optimal when the free likelihood is maximised with
respect to the weights, or equivalently when the negative log of the
free likelihood (-LLfree: the number printed by Refmac) is minimised.
The practical problem is that this requires a lot of refinement runs
with different weights to locate the optimum.  Ideally also the B
weighting factor needs to be optimised by the same method, but this
makes it a 2-parameter optimisation so you would need even more runs
of Refmac to locate the optimum.  The B weighting factor is
resolution-dependent so a single value is really not suitable at all
resolutions.  I was suggesting using the PDB-REDO based
resolution-dependent RMS-Z(bonds) target value as a "quick-and-dirty"
alternative which won't be too far out.
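
To make the two options concrete, here is a minimal Python sketch. The function names and the example numbers are hypothetical, made up for illustration; in practice the -LLfree values would be read by hand or by script from the log of each separate Refmac run. The first function simply evaluates the quick-and-dirty target quoted above, and the second picks the weight giving the lowest -LLfree from a set of runs.

```python
def rmsz_bonds_target(resolution):
    """Quick-and-dirty RMS-Z(bonds) target: 0.85 - 0.146*resolution.

    `resolution` is the high-resolution limit in Angstrom.
    """
    return 0.85 - 0.146 * resolution


def best_weight(runs):
    """Return the X-ray weight whose run gave the smallest -LLfree.

    `runs` is a list of (weight, minus_llfree) pairs, one pair per
    refinement run (hypothetical layout, not a Refmac file format).
    """
    weight, _ = min(runs, key=lambda run: run[1])
    return weight


# Made-up numbers for illustration only, not real refinement output.
example_runs = [(0.01, 5210.0), (0.03, 5187.5), (0.10, 5192.0), (0.30, 5230.4)]
chosen = best_weight(example_runs)
target_at_2A = rmsz_bonds_target(2.0)
```

The scan over weights is a plain 1-parameter search; adding the B weighting factor would turn `runs` into a 2-D grid, which is exactly why it gets expensive.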

> (2). Why RMS-Z(bonds) should be lower than that for low resolution data
> and higher for high resolution? Or why high-resolution can allows more
> outliers?

Bernhard's thought experiment is a good one; I would just say that if
you only have low resolution data you can't hope to estimate small
deviations from the target values accurately: there's a good chance
that half of them will be just random deviations in the wrong
direction and only produce overfitting and an increase in Rfree; hence
you won't achieve the optimal LLfree.  If you have high resolution
data then of course you are justified in claiming that the deviations
from the target values that you observe are meaningful - that's what
'resolution' means.  Your second question about only being able to
detect outliers with high resolution data answers your first question!

This is analogous to the 'D' factor in D*Fcalc for the map
coefficients: the effect of random error is to reduce the expectation.
Incidentally, changing the subject briefly to a previous thread: note
that Refmac writes out D*Fcalc in the 'FC' column, not Fcalc, so if
people deposit this column in the PDB then it cannot be used to
reproduce the R factor, which requires Fcalc.
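
A small Python sketch of why the deposited column matters. All the amplitudes and D values below are made up for illustration, and the single overall scale k = sum(Fobs)/sum(Fcalc) is just one simple choice I am assuming here, not necessarily what any particular program does. The point is only that D varies from reflection to reflection (it falls off with resolution), so no single scale factor can undo it.

```python
def r_factor(fobs, fcalc):
    """R = sum|Fobs - k*Fcalc| / sum(Fobs), with one overall scale k.

    k is taken as sum(Fobs)/sum(Fcalc), a simple assumed scaling.
    """
    k = sum(fobs) / sum(fcalc)
    return sum(abs(o - k * c) for o, c in zip(fobs, fcalc)) / sum(fobs)


# Hypothetical amplitudes, ordered from low to high resolution.
fobs = [100.0, 80.0, 60.0, 40.0]
fcalc = [98.0, 83.0, 58.0, 41.0]
# Hypothetical D values, decreasing towards high resolution.
d = [0.95, 0.90, 0.80, 0.60]

d_fcalc = [di * fi for di, fi in zip(d, fcalc)]

r_from_fcalc = r_factor(fobs, fcalc)      # what the R factor should be
r_from_d_fcalc = r_factor(fobs, d_fcalc)  # what you get from the 'FC' column
```

With these toy numbers the two values come out clearly different, which is the problem with depositing D*Fcalc in place of Fcalc.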

Cheers

-- Ian
