> (1). Can I say the X-ray weighting is optimal when it yields the smallest
> Rfree, meanwhile RMS-Z(bonds) is smaller than "0.85 - 0.146*resolution"
> (angles also maybe)?

The weighting is optimal when the free likelihood is maximised with respect to the weights, or equivalently when the negative log of the free likelihood (-LLfree: the number printed by Refmac) is minimised. The practical problem is that this requires a lot of refinement runs with different weights to locate the optimum. Ideally the B weighting factor also needs to be optimised by the same method, but this makes it a 2-parameter optimisation, so you would need even more runs of Refmac to locate the optimum. The B weighting factor is resolution-dependent, so a single value cannot be suitable for all resolutions. I was suggesting using the PDB-REDO based resolution-dependent RMS-Z(bonds) target value as a "quick-and-dirty" alternative which won't be too far out.

> (2). Why RMS-Z(bonds) should be lower than that for low resolution data
> and higher for high resolution? Or why high-resolution can allows more
> outliers?

Bernhard's thought experiment is a good one. I would just say that if you only have low resolution data you can't hope to estimate small deviations from the target values accurately: there's a good chance that half of them will be just random deviations in the wrong direction, which only produce overfitting and an increase in Rfree; hence you won't achieve the optimal LLfree. If you have high resolution data then of course you are justified in claiming that the deviations from the target values that you observe are meaningful - that's what 'resolution' means. Your second question, about only being able to detect outliers with high resolution data, answers your first question! This is analogous to the 'D' factor in D*Fcalc for the map coefficients: the effect of random error is to reduce the expectation.

Incidentally, changing the subject briefly to a previous thread: note that Refmac writes out D*Fcalc in the 'FC' column, not Fcalc, so if people deposit this column in the PDB then it cannot be used to reproduce the R factor, which requires Fcalc.

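To make the weight selection in (1) concrete, here is a minimal Python sketch of the two approaches: picking the weight that minimises -LLfree over a set of completed refinement runs, and the quick alternative of picking the run whose RMS-Z(bonds) lies closest to the resolution-dependent target quoted in the question. The function names and the (weight, value) tuple layout are hypothetical, choosing the run "closest to" the target is my assumption about how the target would be applied, and the numbers in the usage example are made up purely for illustration; nothing here parses real Refmac output.

```python
def rmsz_bonds_target(resolution):
    """RMS-Z(bonds) target as quoted in question (1): 0.85 - 0.146*resolution.

    `resolution` is the high-resolution limit in Angstroms.
    """
    return 0.85 - 0.146 * resolution


def pick_weight_by_llfree(runs):
    """Return the weight whose run gave the smallest -LLfree.

    `runs` is a list of (weight, minus_ll_free) tuples, one per
    refinement run (hypothetical layout, filled in by hand or by
    your own log parser).
    """
    best_weight, _ = min(runs, key=lambda run: run[1])
    return best_weight


def pick_weight_by_rmsz(runs, resolution):
    """Quick alternative: return the weight whose RMS-Z(bonds) is
    closest to the resolution-dependent target.

    `runs` is a list of (weight, rmsz_bonds) tuples.
    """
    target = rmsz_bonds_target(resolution)
    best_weight, _ = min(runs, key=lambda run: abs(run[1] - target))
    return best_weight


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    llfree_runs = [(0.05, 1000.0), (0.10, 990.0), (0.20, 995.0)]
    rmsz_runs = [(0.05, 0.40), (0.10, 0.55), (0.20, 0.80)]

    print("weight by -LLfree:", pick_weight_by_llfree(llfree_runs))
    print("target at 2.0 A:", rmsz_bonds_target(2.0))
    print("weight by RMS-Z:", pick_weight_by_rmsz(rmsz_runs, 2.0))
```

The first function still needs one refinement run per trial weight (and a grid of runs if the B weighting factor is scanned as well), which is exactly the cost that the RMS-Z target is meant to avoid.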
Cheers -- Ian
