Re: [ccp4bb] should the final model be refined against full dataset

Thomas C. Terwilliger Fri, 14 Oct 2011 16:35:30 -0700

Dear Gerard,

I'm very happy for the discussion to be on the CCP4 list (or on the IUCr
forums, or both).  I was only trying to not create too much traffic.


All the best,
Tom T

>> Dear Tom,
>>
>>      I am not sure that I feel happy with your invitation that views on
>> such crucial matters as these deposition issues be communicated to you
>> off-list. It would seem much healthier if these views were aired out
>> within the BB. Again!, some will say ... but the difference is that there
>> is now a forum for them, set up by the IUCr, that may eventually turn
>> opinions into some form of action.
>>
>>      I am sure that many subscribers to this BB, and not just you as a
>> member of some committees, would be interested to hear the full variety of
>> views on the desirable and the feasible in these areas, and to express
>> their own for everyone to read and discuss.
>>
>>      Perhaps John Helliwell can elaborate on this and on the newly created
>> forum.
>>
>>
>>      With best wishes,
>>
>>           Gerard.
>>
>> --
>> On Fri, Oct 14, 2011 at 04:56:20PM -0600, Thomas C. Terwilliger wrote:
>>> For those who have strong opinions on what data should be deposited...
>>>
>>> The IUCr is just starting a serious discussion of this subject. Two
>>> committees, the "Data Deposition Working Group", led by John Helliwell,
>>> and the Commission on Biological Macromolecules (chaired by Xiao-Dong
>>> Su), are working on this.
>>>
>>> Two key issues are (1) feasibility and importance of deposition of raw
>>> images and (2) deposition of sufficient information to fully reproduce
>>> the crystallographic analysis.
>>>
>>> I am on both committees and would be happy to hear your ideas
>>> (off-list). I am sure the other members of the committees would welcome
>>> your thoughts as well.
>>>
>>> -Tom T
>>>
>>> Tom Terwilliger
>>> terwilli...@lanl.gov
>>>
>>>
>>> >> This is a follow-up (or a digression) to James comparing test set to
>>> >> missing reflections.  I also heard this issue mentioned before but
>>> >> was always too lazy to actually pursue it.
>>> >>
>>> >> So.
>>> >>
>>> >> The role of the test set is to prevent overfitting.  Let's say I have
>>> >> the final model and I monitored the Rfree every step of the way and
>>> >> can conclude that there is no overfitting.  Should I do the final
>>> >> refinement against the complete dataset?
>>> >>
>>> >> IMCO, I absolutely should.  The test set reflections contain
>>> >> information, and the "final" model is actually biased towards the
>>> >> working set.  Refining using all the data can only improve the
>>> >> accuracy of the model, if only slightly.
>>> >>
>>> >> The second question is practical.  Let's say I want to deposit the
>>> >> results of the refinement against the full dataset as my final model.
>>> >> Should I not report the Rfree and instead insert a remark explaining
>>> >> the situation?  If I report the Rfree prior to the test set removal, it
>>> >> is certain that every validation tool will report a mismatch.  It does
>>> >> not seem that the PDB has a mechanism to deal with this.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Ed.
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
>>> >>                                                 Julian, King of Lemurs
>>> >>
>>
>> --
>>
>>      ===============================================================
>>      *                                                             *
>>      * Gerard Bricogne                     g...@globalphasing.com  *
>>      *                                                             *
>>      * Global Phasing Ltd.                                         *
>>      * Sheraton House, Castle Park         Tel: +44-(0)1223-353033 *
>>      * Cambridge CB3 0AX, UK               Fax: +44-(0)1223-366889 *
>>      *                                                             *
>>      ===============================================================
>>
