I haven't read the paper, so perhaps shouldn't say anything yet, but
here goes.
For me Rfree is primarily a tool to help choose the refinement protocol,
set the relative weight for geometry restraints versus crystallographic
data, B-value restraints etc. Trying different parameter settings and
picking the one that reduces Rfree the most is what, in my mind, Rfree
was designed for. Sure, there is statistical noise in Rfree, and by
picking the lowest Rfree you may be selecting for "favourable noise"
rather than the best model, but it is still your best indicator of model
quality, and the quality differences between models with very similar
Rfree values are probably not worth losing (R)sleep over.
The big difference, I think, is that in refinement the big enemy is too
low an observation/parameter ratio, with Rfree acting as the indicator
that guards against overfitting. In selecting appropriate settings for a few
global parameters there just isn't the same risk of overfitting. Using
multi-start torsion-angle refinement and picking the solution with the
lowest Rfree is not that different. Are you really biasing Rfree by
picking the run with the lowest value or are you truly picking the best
solution? Even if the solution you picked was not the very best due to
statistical noise in Rfree, the favourable noise that got it picked is
probably not going to carry over into the rest of the refinement.
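To put a rough number on the "favourable noise" question, here is a small
Python sketch. It is a toy model, not anything taken from the paper or
from CNS: it assumes all n runs are equally good (same true R value),
that the measured Rfree and a separate held-back "sleep" R value each
carry independent Gaussian noise, and the values 0.25 and 0.005 are made
up for illustration. The function name and its parameters are my own.

```python
import random
import statistics


def selection_bias(n_runs=10, true_r=0.25, sigma=0.005,
                   n_trials=20000, seed=0):
    """Toy model of picking the run with the lowest Rfree.

    All runs are assumed equally good (same true_r). Each run gets an
    Rfree and an independent "sleep" R value, both with Gaussian noise
    of standard deviation sigma. Returns the average deviation from
    true_r of (Rfree of the picked run, sleep R of the picked run).
    """
    rng = random.Random(seed)
    picked_free = []
    picked_sleep = []
    for _ in range(n_trials):
        # One (Rfree, Rsleep) pair per refinement run.
        runs = [(rng.gauss(true_r, sigma), rng.gauss(true_r, sigma))
                for _ in range(n_runs)]
        # Select on Rfree only; the sleep value is never looked at.
        best_free, best_sleep = min(runs, key=lambda pair: pair[0])
        picked_free.append(best_free)
        picked_sleep.append(best_sleep)
    return (statistics.mean(picked_free) - true_r,
            statistics.mean(picked_sleep) - true_r)


if __name__ == "__main__":
    free_bias, sleep_bias = selection_bias()
    print(f"mean bias in Rfree of picked run:  {free_bias:+.4f}")
    print(f"mean bias in Rsleep of picked run: {sleep_bias:+.4f}")
```

Under these assumptions the Rfree of the picked run comes out below the
true value on average, while the untouched sleep value does not, so the
selection bias is there but it is of the order of the noise itself.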
I'm sure there are going to be a lot of different opinions on this one...
Bart
Mark J. van Raaij wrote:
Dear All,
the short paper by Gerard Kleywegt (ActaD 63, 939-940) treats an
interesting subject (at least I think so...). I agree that what we are
now doing in many cases is effectively refining against Rfree. For
example, the standard CNS torsion angle refinement does n refinement
trials with randomised starting points. If you then take the one with
lowest Rfree (or let a script do this for you), you are biasing Rfree!
Therefore, his proposal to put an extra set of reflections in a dormant
"vault" (R-sleep) sounds like a good idea to me. However, how would the
"vault" be implemented to be effective? If left to the experimenter, it
would be very tempting to check R-sleep once in a while (or often)
during refinement, rendering it useless as an unbiased validator.
Or am I being paranoid and too pessimistic?
Mark J. van Raaij
Unidad de Bioquímica Estructural
Dpto de Bioquímica, Facultad de Farmacia
and
Unidad de Rayos X, Edificio CACTUS
Universidad de Santiago
15782 Santiago de Compostela
Spain
http://web.usc.es/~vanraaij/
--
==============================================================================
Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone: 1-780-492-0042
fax: 1-780-492-7521
==============================================================================