Hi AR

Please define what you mean by 'over-refinement', as it's not a term I use: does it mean 'convergence', or 'over-fitting', or 'over-optimisation' (whatever that means), or something else?
If by "LLG is stabilized" you mean it has converged, then I agree that's a possible stopping criterion - but then so must all the refinement indicators have stabilized, including R and Rfree (& RMSDs etc.), since by definition at convergence there can be no further significant changes in the parameters to cause R and Rfree to change further. You say R & Rfree "are going in opposite directions" when LLG has stabilized. It's not possible for R and Rfree to continue to change if the refinement has converged; if they do change, that clearly implies it hasn't converged.

Cheers

-- Ian

PS 3 copies of your email is 2 too many (or if this is the list server acting up again, my apologies).

On Fri, Aug 26, 2011 at 3:55 PM, protein chemistry <proteinchemistr...@gmail.com> wrote:

> Dear Dr Ian
>
> From your argument I could not understand how many cycles to refine before submitting the coordinates to the PDB. What is the upper limit: 100, or a thousand, or a million? According to my understanding, it's more logical to stop the refinement when over-refinement is taking place (when R and Rfree are going in opposite directions and LLG is stabilized).
>
> On Fri, Aug 26, 2011 at 4:01 PM, Ian Tickle <ianj...@gmail.com> wrote:
>
>> Frank,
>>
>>> Point #1 - fair point; the reason Rfree is popular, though, is because it is a *relative* metric, i.e. by now we have a sense of what "good" is. So I predict an uphill fight for LLfree.
>>
>> Why? I don't see any difference. As you say, Rfree is a relative metric, so your sense of what 'good' is relies on comparisons with other Rfrees (i.e. it can only be 'better' or 'worse', not 'good' or 'bad'), but then the same is true of LLfree (note that both assume that exactly the same data were used and that only the model has changed).
>> So when choosing between alternative model parameterisations in order to minimise over-fitting, we compare their Rfrees and choose the lower one - same with LLfree; or we compare the observed Rfree with the expected Rfree based on Rwork and the obs/param ratio to check for problems with the model - same with LLfree. In fact you can do it better, because the observations in LLfree are weighted in exactly the same way as those in the target function.
>>
>>> Point #2 would hold if we routinely let our refinements run to convergence; it seems common, though, to run "10 cycles" or "50 cycles" instead and draw conclusions from the behaviour of the metrics. Are the conclusions really much different from the comparison-at-convergence you advocate? Which is in practice often less convenient.
>>
>> You might do 10 cycles for a quick optimisation of the coordinates, but then I wouldn't place much faith in the R factors! How can you draw any conclusions from their behaviour? There's no way of predicting how they will change in further cycles; the only way to find out is to do it. I'm not saying that you need to refine exhaustively on every run - that would be silly, since you don't need to know the correct value of the R factors for every run - but certainly on the final run before PDB submission I would regard stopping the refinement early based on Rfree, as implied in Tim's original posting, as something akin to 'cheating'.
>>
>> Cheers
>>
>> -- Ian
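[Editor's note] Ian's convergence argument can be illustrated with a toy numerical sketch. For fixed data, R and Rfree are pure functions of the model parameters, so once refinement has converged (the parameters stop moving), both must stop moving too - and conversely, if Rfree is still drifting, refinement has not converged. Everything below is invented for illustration: the "observed" amplitudes, the single-parameter scale "model", and the damped least-squares update are stand-ins, not what a real maximum-likelihood refinement program does.

```python
import numpy as np

def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    return np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs)

rng = np.random.default_rng(0)
n = 1000
f_obs = rng.uniform(10.0, 100.0, n)        # invented "observed" amplitudes
free = rng.random(n) < 0.05                # ~5% free (cross-validation) set

# Toy model: calculated amplitudes differ from the observations by noise and
# a single refinable scale factor (the only "parameter" refined here).
f_model = f_obs * rng.normal(1.0, 0.1, n)
scale = 1.5                                # deliberately poor starting value

for cycle in range(200):
    f_calc = scale * f_model
    r_work = r_factor(f_obs[~free], f_calc[~free])
    r_free = r_factor(f_obs[free], f_calc[free])
    # Damped least-squares shift for the scale, fitted to the working set only.
    shift = 0.5 * np.sum(f_model[~free] * (f_obs[~free] - f_calc[~free])) \
            / np.sum(f_model[~free] ** 2)
    if abs(shift) < 1e-12:                 # converged: the parameter stops moving
        break
    scale += shift

# At convergence the parameter no longer changes, so neither R can change:
f_calc = scale * f_model
assert abs(r_factor(f_obs[~free], f_calc[~free]) - r_work) < 1e-14
assert abs(r_factor(f_obs[free], f_calc[free]) - r_free) < 1e-14
```

Note that the free set plays no part in the shift calculation, yet at convergence r_free is exactly as frozen as r_work - so "R and Rfree going in opposite directions" is itself evidence that the refinement has not yet converged.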