To add a little to James's excellent summary.
As reviewers, I think we should always question results where the I/SigI
is > 2-3 in the outer shell. Authors should at least be asked to justify
why they have not collected the best available experimental data.
Ditto if Rfree is too low for the resolution (e.g. differing by < 5% at
2.8A): the authors should be challenged. There are many ways of
underestimating your Rfree, all of which compromise the maximum
likelihood refinement, and all of which should be deprecated!
To finish with a question that always puzzles me: why do structures
which generate very similar quality maps at similar resolutions have
such different R-factor profiles? I have seen lovely final maps at 2A
with R < 18%, and also lovely final maps at 2A with R-factors ~
24%... It might be a radiation damage phenomenon, I guess.
Eleanor
James Holton wrote:
*** For details on how to be removed from this list visit the ***
*** CCP4 home page http://www.ccp4.ac.uk ***
Well, since I was mentioned by name, I suppose I should put my two
cents in:
Rmerge is NOT a good way to judge your last resolution shell!
My advice, if you are faced with a reviewer who complains that your
Rmerge is too high, is to change the name to Rsym. This is actually the
appropriate name for the statistic you are quoting. Rmerge
(traditionally) refers to the R factor from combining data from two
crystals. Rsym refers to the agreement between symmetry mates after
scaling.
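For concreteness, the symmetry agreement James describes is usually written Rsym = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i, where <I> is the mean over the symmetry-equivalent measurements of each reflection. A minimal sketch of that formula (the function name and toy observations are invented for illustration, not from any CCP4 program):

```python
from collections import defaultdict

def r_sym(observations):
    """Rsym = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i, where
    observations is a list of (hkl, intensity) pairs and <I> is the mean
    over all symmetry-equivalent measurements of that hkl."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)
    num = 0.0
    den = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        num += sum(abs(i - mean_i) for i in intensities)
        den += sum(intensities)
    return num / den

# Toy data: two reflections, each measured three times.
obs = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0), ((1, 0, 0), 90.0),
       ((2, 1, 0), 10.0), ((2, 1, 0), 14.0), ((2, 1, 0), 6.0)]
print(round(r_sym(obs), 3))  # -> 0.085
```

Note how the weak reflection contributes a much larger relative residual, which is exactly why this statistic climbs in the outer shell.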
Rsym (and Rmerge) used to be useful things to quote back when people
applied a 3-sigma cutoff to their raw observation data. Seems like a
borderline criminal thing to do nowadays (and it is), but in the dark
ages before maximum likelihood the only way to keep a least-squares
refinement package from chasing noise was to make sure you didn't
confuse it with a ton of weak (noisy) data.
All "R" statistics are supposed to be measuring one type of error (R
is for residual). Rmerge is supposed to measure non-isomorphism.
Rsym is supposed to measure deviation from true symmetry. Rcryst and
Rfree measure the "incorrectness" of your model.
The absolute value of "R" statistics is only meaningful if you can
normalize out the contribution of other sources of error. Weak data
have more random noise than strong data, and the more high-resolution
data you include, the more weak data you will have. Applying a
3-sigma cutoff eliminates any spots measured with more than ~33% error
(if you believe your sigmas). The remaining strong spots have
relatively little random error (from counting statistics), so the
3-sigma cutoff tends to "normalize" data collected from one crystal or
another. However, if you apply a 3-sigma cutoff, you will have fewer
and fewer spots as you go out to high resolution. This is why
"completeness" became a criterion for the high-resolution limit.
Anyway, in summary: I say don't worry about your Rmerge in the high
resolution shell. I/sd is much more meaningful. Just be careful to
optimize your error model (SDCORR in scala, error_scale_factor and
estimated_error in scalepack) so that your scatter/sigma values in the
scala log are close to one (or the final "Chi^2" in scalepack). As
for where you should cut off your data: I use I/sd of 1.5, mainly
because it is a "compromise" between 1.0 (signal = noise) and 2.0
(signal = 2x noise).
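One way to apply the I/sd = 1.5 rule of thumb is to walk the resolution shells from low to high and stop at the last shell whose mean I/sd still clears the threshold. A rough sketch with invented per-shell statistics (real programs report these in the scala/scalepack logs):

```python
def pick_resolution_cutoff(shells, threshold=1.5):
    """shells: (d_min_in_A, mean_I_over_sigma) tuples ordered from low
    to high resolution.  Returns the d_min of the last shell whose mean
    I/sd still meets the threshold, or None if even the first fails."""
    best = None
    for d_min, i_over_sig in shells:
        if i_over_sig < threshold:
            break
        best = d_min
    return best

# Invented per-shell statistics for illustration.
shells = [(3.0, 20.1), (2.5, 9.4), (2.2, 4.0), (2.0, 1.8), (1.9, 1.2)]
print(pick_resolution_cutoff(shells))  # -> 2.0
```

With threshold=2.0 the same data would be cut at 2.2 A, which shows how sensitive the quoted "resolution" is to the chosen criterion.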
As a comment: I fear that the recent rash of structures with I/sd of 6
or 8 in the outer resolution shell is happening because Rfree is also
subject to the unfortunate feature of "R" statistics mentioned above:
you get a lower Rcryst and Rfree if you are willing to sacrifice a
little "resolution". I guess it is just too tempting to play with
your resolution limit when you run out of model building ideas and
your Rfree is still too high. This is a BAD BAD thing to do. BAD!!
Better to calculate an Rfree using only data with F/sd > 3 (note it as
such!), and have the decency to deposit all your structure factors.
-James Holton
MAD Scientist
Bart Hazes wrote:
Hi Ashima,
With these statistics you shouldn't have to worry about reviewers, it
looks perfectly sensible. Actually I'm much more concerned about the
recent epidemic of overly pessimistic resolution cutoffs. In our
journal club, at least half the papers have I/SigI in the highest
resolution bin in the 3-6 range, which means they could have gotten
significantly higher resolution. There are situations where data
quality is more important than resolution, for instance (anomalous)
phasing, but I see the same with many native data sets.
It is not clear to me whether people are placing the detector too far
from the crystal, and thus not even measuring the highest resolution
data, or whether they just elect not to process those data. Why??? To
get nicer-looking statistics???? That would be VERY bad practice!!!
A kinder view is that the detector distance is set based on the
apparent resolution of the first image(s), which underestimates the
true resolution of a high-redundancy data set. If you don't need a
long detector distance to resolve spots, I prefer to select a distance
where my visible diffraction uses the central 80-90% of the detector,
allowing mosflm to try to extract some sensible information from
beyond what the eye can see.
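Bart's 80-90% rule can be turned into a number with Bragg's law: a spot at resolution d diffracts at 2*theta = 2*asin(lambda/(2d)) and lands at radius r = D*tan(2*theta) on a flat detector normal to the beam. A small sketch (the 165 mm detector radius and 1.0 A wavelength are assumptions for illustration):

```python
import math

def detector_distance(d_target_A, wavelength_A, radius_mm):
    """Crystal-to-detector distance (mm) that places resolution
    d_target at the given radius on a flat detector normal to the beam:
    2*theta = 2*asin(lambda / (2*d)),  r = D * tan(2*theta)."""
    two_theta = 2.0 * math.asin(wavelength_A / (2.0 * d_target_A))
    return radius_mm / math.tan(two_theta)

# Place 2.0 A spots at 85% of a hypothetical 165 mm detector radius,
# with a 1.0 A beam.
print(round(detector_distance(2.0, 1.0, 0.85 * 165.0), 1))  # -> 253.5
```

Moving in closer than this distance pushes the visible resolution limit inside the detector edge, leaving room for the beyond-the-eye data Bart mentions.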
This looks like something James Holton may have looked at. If so I'd
be interested to hear if he or the elves have come up with a magic rule.
Bart
Ashima Bagaria wrote:
Hi all,
In regards to my CCP4 question about acceptable Rmerge values in the
last resolution shell... various other parameters pertaining to the
protein data at 3.5 A are:
I/sigmaI = 13.1 (2.3)
%completeness = 95.7 (96.8)
multiplicity = 3.8
All suggestions are welcome.
Regards,
ashima