Right about the 1000 in that case, but Rfree from 5% of such a small dataset 
would also be statistically poor. I guess one would be stuck in that case.

JPK

From: Pavel Afonine [mailto:[email protected]]
Sent: Friday, November 21, 2014 11:16 AM
To: Keller, Jacob
Cc: [email protected]
Subject: Re: [ccp4bb] Free Reflections as Percent and not a Number

Oh I see, I thought the answer followed from that. A fraction is better (or 
maybe a fraction with a cap). Hardwiring a number may not always work. For small 
crystals, small datasets or incomplete datasets, 1000 reflections may 
amount to, say, 50% of the dataset.

All the best,
Pavel

On Fri, Nov 21, 2014 at 8:09 AM, Keller, Jacob 
<[email protected]> wrote:
Agree with all of this—but how does it reflect on the original question of 
whether to use a percent or an absolute number?

JPK

From: Pavel Afonine [mailto:[email protected]]
Sent: Friday, November 21, 2014 11:02 AM
To: Keller, Jacob
Cc: [email protected]
Subject: Re: [ccp4bb] Free Reflections as Percent and not a Number

Hello,

The choice of the size of the free (or test, whatever you like to call it) 
reflection set is important for three different purposes:

- estimation of parameters for the ML target for refinement;
- map calculation (the coefficients m and D in 2mFo-DFc or mFo-DFc maps are 
calculated using test reflections);
- validation (calculation of Rfree).

It is important that free reflections are evenly distributed across the whole 
resolution range, and that each sufficiently thin resolution bin contains at 
least 50 test reflections, so that the estimation of ML parameters is robust and 
reliable. A "sufficiently thin resolution bin" is one in which the ML parameters 
can be assumed constant.
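
The per-bin requirement can be sketched as follows. This is only an 
illustration under stated assumptions: the function names, the equal-count 
binning, the 5% fraction and the choice of 10 bins are hypothetical and do not 
describe how any refinement program actually assigns free flags. Only the 
threshold of 50 test reflections per bin comes from the text above.

```python
import random


def pick_free_flags(resolutions, fraction=0.05, n_bins=10, seed=0):
    """Choose test reflections separately in each resolution bin.

    Returns (set of flagged indices, list of test counts per bin).
    """
    rng = random.Random(seed)
    # Sort reflection indices by resolution so each slice is a thin shell.
    order = sorted(range(len(resolutions)), key=lambda i: resolutions[i])
    free, per_bin_counts = set(), []
    for b in range(n_bins):
        # Equal-count bins: consecutive slices of the sorted index list.
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        shell = order[start:stop]
        k = round(fraction * len(shell))
        free.update(rng.sample(shell, k))
        per_bin_counts.append(k)
    return free, per_bin_counts


def bins_are_robust(per_bin_counts, min_per_bin=50):
    """True if every bin holds at least min_per_bin test reflections."""
    return all(c >= min_per_bin for c in per_bin_counts)


# Made-up resolution values for 20000 reflections, for illustration only.
resolutions = [1.5 + 0.0001 * i for i in range(20000)]
free, counts = pick_free_flags(resolutions)
print(len(free), counts, bins_are_robust(counts))
```

With the same settings, a dataset of only 2000 reflections gives 10 test 
reflections per bin, so the check fails and one would need fewer, wider bins 
or a larger fraction.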

Smaller test sets will result in less stable refinements (refinement outcome 
will strongly depend on the choice of test set).

Larger test sets will damage map quality (unless all reflections are used in 
map calculation).

The size of the free set needs to be sufficiently large that Rfree is 
statistically meaningful.
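
As a rough illustration of "statistically meaningful", assume the relative 
uncertainty of Rfree falls off as 1/sqrt(n) with the number n of test 
reflections, as simple counting statistics would suggest. That scaling is an 
assumption made for this sketch, not a result stated in this thread, and the 
function name is hypothetical:

```python
import math


def relative_rfree_uncertainty(n_test):
    """Assumed relative uncertainty of Rfree, taken as 1/sqrt(n_test)."""
    if n_test <= 0:
        raise ValueError("n_test must be positive")
    return 1.0 / math.sqrt(n_test)


# Compare a few test-set sizes under the assumed 1/sqrt(n) scaling.
for n in (50, 500, 1000, 2000):
    print(n, f"{relative_rfree_uncertainty(n):.3f}")
```

Under that assumption, doubling the test set from 1000 to 2000 reflections only 
improves the relative uncertainty from about 3.2% to about 2.2%, whereas a test 
set of 50 reflections leaves it at about 14%.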

Nothing new is said above; it's all documented in the literature!

Pavel


On Thu, Nov 20, 2014 at 2:43 PM, Keller, Jacob 
<[email protected]> wrote:
Dear Crystallographers,

I thought that for reliable values of Rfree, one needs only to satisfy 
counting statistics, and therefore using at most a couple thousand reflections 
should always be sufficient. Almost always, however, some seemingly arbitrary 
percentage of reflections is used, say 5%. Is there any rationale for using a 
percentage rather than some absolute number like 1000?

All the best,

Jacob
