I haven't read the paper, so perhaps I shouldn't say anything yet, but here goes.

For me, Rfree is primarily a tool to help choose the refinement protocol: setting the relative weight of geometry restraints versus crystallographic data, B-value restraints, etc. Trying different parameter settings and picking the one that reduces Rfree the most is what, in my mind, Rfree was designed for. Sure, there is statistical noise in Rfree, and by picking the lowest Rfree you may be selecting for "favourable noise" rather than the best model, but it is still your best indicator of model quality, and the quality differences between models with very similar Rfree values are probably not worth losing (R)sleep over.

The big difference, I think, is that in refinement the main enemy is too low an observation-to-parameter ratio, with Rfree acting as the indicator that guards against overfitting. In selecting appropriate settings for a few global parameters there just isn't the same risk of overfitting. Using multi-start torsion-angle refinement and picking the solution with the lowest Rfree is not that different. Are you really biasing Rfree by picking the run with the lowest value, or are you truly picking the best solution? Even if the solution you picked was not the very best because of statistical noise in Rfree, the statistical benefit is probably not going to carry over into the rest of the refinement.
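
To put a rough number on the "favourable noise" question, here is a toy simulation in Python. It is only a sketch under assumptions that are not from the paper or from any refinement program: every multi-start run is given the same true R value, Rfree and Rsleep are modelled as that value plus independent Gaussian noise, and the noise level (0.005) and all function and parameter names are hypothetical. It picks the run with the lowest Rfree and compares the Rfree of that run with its Rsleep, averaged over many repeats.

```python
import random
import statistics


def simulate_selection_bias(n_trials=10, true_r=0.25, noise_sd=0.005,
                            n_repeats=2000, seed=0):
    """Toy model: return (mean Rfree, mean Rsleep) of the run picked by lowest Rfree.

    Assumptions (illustrative only): all runs share the same true R value,
    and Rfree and Rsleep are that value plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    picked_free = []
    picked_sleep = []
    for _ in range(n_repeats):
        # One (Rfree, Rsleep) pair per multi-start run; only the noise differs.
        runs = [(rng.gauss(true_r, noise_sd), rng.gauss(true_r, noise_sd))
                for _ in range(n_trials)]
        # Select the run with the lowest Rfree, as a script would.
        best_free, best_sleep = min(runs, key=lambda run: run[0])
        picked_free.append(best_free)
        picked_sleep.append(best_sleep)
    return statistics.mean(picked_free), statistics.mean(picked_sleep)


if __name__ == "__main__":
    mean_free, mean_sleep = simulate_selection_bias()
    print(f"mean Rfree of picked run:  {mean_free:.4f}")
    print(f"mean Rsleep of picked run: {mean_sleep:.4f}")
```

In this toy model the Rfree of the picked run averages below the true value, while its Rsleep does not, because the reflections in the vault played no part in the selection. How large the gap is depends entirely on the assumed noise level and the number of runs, so it says nothing about whether the effect matters for a real data set.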

I'm sure there are going to be a lot of different opinions on this one...

Bart

Mark J. van Raaij wrote:
Dear All,

The short paper by Gerard Kleywegt (ActaD 63, 939-940) treats an interesting subject (at least I think so...). I agree that what we are now doing in many cases is effectively refining against Rfree. For example, the standard CNS torsion angle refinement does n refinement trials with randomised starting points. If you then take the one with the lowest Rfree (or let a script do this for you), you are biasing Rfree! Therefore, his proposal to put an extra set of reflections in a dormant "vault" (R-sleep) sounds like a good idea to me. However, how would the "vault" be implemented to be effective? If left to the experimenter, it would be very tempting to check R-sleep once in a while (or often) during refinement, rendering it useless as an unbiased validator.
Or am I being paranoid and too pessimistic?

Mark J. van Raaij
Unidad de Bioquímica Estructural
Dpto de Bioquímica, Facultad de Farmacia
and
Unidad de Rayos X, Edificio CACTUS
Universidad de Santiago
15782 Santiago de Compostela
Spain
http://web.usc.es/~vanraaij/




--

==============================================================================

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:    1-780-492-7521

==============================================================================
