Hi Graeme,

 

We have had a similar discussion about PDB_REDO, which is frequently forced to 
assign a new R-free set when the input data don’t have one (this still 
happens with new PDB entries!). The ‘500/1000/1500/2000 reflections is enough’ 
school seems to look only at the variance of R-free across different choices of 
test set, which depends on the absolute number of reflections. You also want 
a representative sample of reciprocal space, which depends on the fraction of 
reflections. In PDB_REDO we make a new test set if:

-          The test set is smaller than 1% of the reflections, or

-          The test set has fewer than 500 reflections AND is smaller than 10% 
of the reflections.
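
The two conditions above can be written as a small check. This is only a 
sketch of the rule as described here, not the actual PDB_REDO code: the 
function name and arguments are hypothetical, and it assumes "the reflections" 
means all reflections in the data set, test set included.

```python
def needs_new_test_set(n_test: int, n_total: int) -> bool:
    """Return True if the existing test set should be replaced.

    n_test  -- number of reflections flagged as the test (R-free) set
    n_total -- total number of reflections (assumed to include the test set)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")

    fraction = n_test / n_total

    # Condition 1: the test set is smaller than 1% of the reflections.
    if fraction < 0.01:
        return True

    # Condition 2: fewer than 500 reflections AND smaller than 10%.
    return n_test < 500 and fraction < 0.10


if __name__ == "__main__":
    for n_test, n_total in [(0, 10000), (400, 10000), (400, 3000), (1000, 20000)]:
        print(n_test, n_total, needs_new_test_set(n_test, n_total))
```

Note that a small test set is kept when it is at least 10% of the data, so 
small data sets with a sensible fraction set aside are left alone.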

 

The new set is chosen as at least 5% of the possible reflections given the cell 
parameters and the resolution. If there are between 10000 and 20000 
reflections, the percentage is increased to get at least 1000 reflections in 
the test set, so the maximum percentage is 10%. 

 

Funny side note: the random number generator in freerflag was set up to always 
pick the same test set for a given resolution and cell parameters, which is 
useful if you misplace your test set. Unfortunately, we also had data sets from 
the PDB where the newly generated test set had no observed reflections. Most of 
these data sets were close to 95% complete ;)

 

Cheers,

Robbie  

 

From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Graeme 
Winter
Sent: Tuesday, June 2, 2015 12:27
To: [email protected]
Subject: [ccp4bb] How many is too many free reflections?

 

Hi Folks

 

Had a vague comment handed my way that "xia2 assigns too many free reflections" 
- I have a feeling that by default it makes a free set of 5% which was OK back 
in the day (like I/sig(I) = 2 was OK) but maybe seems excessive now.

 

This was particularly in the case of high resolution data where you have a lot 
of reflections, so 5% could be several thousand which would be more than you 
need to just check Rfree seems OK.

 

Since I really don't know what is the right # reflections to assign to a free 
set thought I would ask here - what do you think? Essentially I need to assign 
a minimum %age or minimum # - the lower of the two presumably?

 

Any comments welcome!

 

Thanks & best wishes Graeme
