> i thought this was clear long ago.. so i am a bit amazed that this
> discussion is still alive and kicking..

Even though it does not relate to me, the explanation may come from the fact that new people keep taking up crystallography, and they do not like reading old papers. So there is nothing wrong with bringing such fundamental discussions back to life periodically.

I had a misconception about the accuracy of sigI, which I explained earlier. It is obvious that "I/σ is the signal to noise of your measurements" at a local point of reciprocal space, but it is not obvious that it would work as well for the merged data. Thanks to Ian and others, I now understand that there is no problem with it.
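To make the two statistics being compared in this thread concrete, here is a minimal Python sketch (the function names, the simple unweighted merge, and the simulated numbers are all illustrative assumptions of mine, not anyone's production code). It merges simulated symmetry-equivalent observations and reports Rmerge alongside <I/sigma(I)> of the merged intensities:

```python
import numpy as np

rng = np.random.default_rng(0)

def rmerge(groups):
    """Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i,
    taken over groups of symmetry-equivalent observations."""
    num = sum(np.abs(g - g.mean()).sum() for g in groups)
    den = sum(g.sum() for g in groups)
    return num / den

def mean_i_over_sig(groups, sig_obs):
    """<I/sigma> of the *merged* intensities, assuming every raw
    observation carries the same sigma and errors are independent,
    so sigma(merged) = sigma / sqrt(n)."""
    vals = [g.mean() / (sig_obs / np.sqrt(len(g))) for g in groups]
    return float(np.mean(vals))

# Simulate 500 unique reflections, each observed 4 times, with true
# intensity 10 and Gaussian noise of sigma = 5 (a weakish shell).
true_i, sig, mult = 10.0, 5.0, 4
groups = [true_i + sig * rng.standard_normal(mult) for _ in range(500)]

print(f"Rmerge     = {rmerge(groups):.2f}")
print(f"<I/sig(I)> = {mean_i_over_sig(groups, sig):.2f}")
```

Note how the merged <I/sigma(I)> improves with multiplicity (sigma shrinks as 1/sqrt(n)) while Rmerge, which compares raw observations to their mean, does not: this is one reason the thread treats them as answering different questions.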
I am also glad to have stumbled on the referenced papers about resolution and data quality (Ed Pozharski's post). I missed them because at the time they were published I was learning molecular biology, enzymology, virology... A modern crystallographer needs to be a good biologist, and this imposes some limitations on how well we know each technique that we use.

Alex

On Jun 4, 2012, at 1:48 AM, Tommi Kajander wrote:

> well, actually i recommend having a look at the old but good scalepack
> manual for why Rmerge is inferior..
>
> (i thought this was clear long ago.. so i am a bit amazed that this
> discussion is still alive and kicking..)
>
> the question of where to cut is a different one, and that's where the
> recent papers and developments start to come in.
>
> short quote... (scalepack manual):
>
> "From a statistical point of view, I/σ is a superior criterion, for two
> reasons. First, it defines a resolution "limit" since by definition I/σ
> is the signal to noise of your measurements. In contrast, Rmerge is not
> directly related to signal to noise.
> Second, the σ assigned to each intensity derives its validity from the
> χ2's, which represent the weighted ratio of the difference between the
> observed and average value of I, 〈I〉, squared, divided by the square of
> the error model, the whole thing times a factor correcting for the
> correlation between I and 〈I〉. Since it depends on an explicit
> declaration of the expected error in the measurement, the user of the
> program is part of the Bayesian reasoning process behind the error
> estimation.
> ...... In short, I/σ is the preferred way of assessing the quality of
> diffraction data because it derives its validity from the χ2
> (likelihood) analysis."
>
> credits to Otwinowski et al.
>
> end of story, i believe. so R-merge died long back.
>
> -tommi
>
> On Jun 4, 2012, at 9:00 AM, aaleshin wrote:
>
>> Wow, it is quite a lecture here! It is very appreciated.
>>
>> I admit some (most?)
>> of my statements were questionable. Thus, I did not know how sigI would
>> be calculated in the case of multiple observations, and, indeed, its
>> proper handling should make <sigI/I> similar to Rmerge. Consequently,
>> <I/sigI> substitutes for Rmerge fairly well.
>>
>> Now, where did the metric Rmerge = 0.5 come from? If I remember
>> correctly, it was proposed here at ccp4bb. Also, one reviewer suggested
>> using it. I admit that this is quite an arbitrary value, but when
>> everyone follows it, structures become comparable by this metric. If
>> there is a better approach to estimating the resolution, let's use it,
>> but the common rule should be enforced; otherwise the resolution
>> becomes another venue for cheating.
>>
>> Once again, I was talking about a metric for the resolution; it does
>> not need to be equal to the metric for the data cutoff.
>>
>> Alex
>>
>> On Jun 3, 2012, at 2:55 PM, Ian Tickle wrote:
>>
>>> Hi Alex
>>>
>>> On 3 June 2012 07:00, aaleshin <[email protected]> wrote:
>>>> I was also taught that under "normal conditions" this would occur
>>>> when the data are collected up to the shell in which Rmerge = 0.5.
>>>
>>> Do you have a reference for that? I have not seen a demonstration of
>>> such an exact relationship between Rmerge and resolution, even for
>>> 'normal' data, and I don't think everyone uses 0.5 as the cut-off
>>> anyway (e.g. some people use 0.4, some 0.8 etc. - though I agree with
>>> Phil that we shouldn't get too hung up about the exact number!).
>>> Certainly having used the other suggested criteria for resolution
>>> cut-off (I/sigma(I) & CC(1/2)), the corresponding Rmerge (and Rpim
>>> etc.) seems to vary a lot (or maybe my data weren't 'normal').
>>>
>>>> One can collect more data (up to Rmerge = 1.0 or even 100) but the
>>>> resolution of the electron density map will not change significantly.
>>>
>>> I think we are all at least agreed that beyond some resolution
>>> cut-off, adding further higher-resolution 'data' will not result in
>>> any further improvement in the map (because the weights will become
>>> negligible). So it would appear prudent at least to err on the high
>>> resolution side!
>>>
>>>> I solved several structures of my own, and this simple rule worked
>>>> every time.
>>>
>>> In what sense do you mean it 'worked'? Do you mean you tried
>>> different cut-offs in Rmerge (e.g. 0.25, 0.50, 0.75, 1.00 ...), used
>>> some metric to judge when there was no further significant change in
>>> the map, and noted that the optimal value of your chosen metric
>>> always occurs around Rmerge = 0.5? If so, how did you judge a
>>> 'significant change'? Personally I go along with Dale's suggestion to
>>> use the optical resolution of the map to judge when no further
>>> improvement occurs. This would need to be done with the completely
>>> refined structure, because presumably optical resolution will be
>>> reduced by phase errors. Note that it wouldn't be necessary to
>>> actually quote the optical resolution in place of the X-ray
>>> resolution (that would confuse everyone!); you just need to know the
>>> value of the X-ray resolution cut-off where the optical resolution no
>>> longer changes (it should be clear from a plot of X-ray vs. optical
>>> resolution).
>>>
>>>> I is measured as the number of detector counts in the reflection
>>>> minus the background counts.
>>>> sigI is measured as the square root of I, plus the standard
>>>> deviation (SD) of the background, plus various deviations from the
>>>> ideal experiment (like noise from satellite crystals).
>>>
>>> The most important contribution to the sigma(I)'s, except maybe for
>>> the weak reflections, actually comes from differences between the
>>> intensities of equivalent reflections, due to variations in
>>> absorption and illuminated volume, and other errors in the image
>>> scale factors (though these are all highly correlated). These are of
>>> course exactly the same differences that contribute to Rmerge. E.g.
>>> in Scala the SDFAC & SDADD parameters are automatically adjusted to
>>> fit the observed Q-Q plot to the expected one, in order to account
>>> for such differences.
>>>
>>>> Obviously, sigI cannot be measured accurately. Moreover, the
>>>> 'resolution' is related to errors in the structure factors, which
>>>> are averaged from several measurements. Errors in their scaling
>>>> would affect the 'resolution', and <I/sigI> does not detect them,
>>>> but Rmerge does!
>>>
>>> Sorry, you've lost me here; I don't see why <I/sigI> should not
>>> detect scaling errors: as indicated above, if there are errors in the
>>> scale factors this will inflate the sigma(I) values via increased
>>> SDFAC and/or SDADD, which will in turn reduce the <I/sigma(I)> values
>>> exactly as expected. I see no difference in the behaviour of Rmerge
>>> and <I/sigma(I)> (or indeed of CC(1/2)) in this respect, since they
>>> all depend on the differences between equivalents.
>>>
>>>> Rmerge, it means that the symmetry related reflections did not merge
>>>> well. Under those conditions, Rmerge becomes a much better criterion
>>>> for estimation of the 'resolution' than <sigI/I>.
>>>
>>> As indicated above, if the symmetry equivalents don't merge well it
>>> will increase the sigma(I)'s and reduce <I/sigma(I)>, so in this
>>> respect I don't see why Rmerge should be any better than
>>> <I/sigma(I)>.
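Ian's point about SDFAC & SDADD can be sketched numerically. The parameterisation below is one common form of the merging error model (different programs use slightly different terms, so treat the exact formula and parameter values as illustrative assumptions):

```python
import numpy as np

def corrected_sigma(i_obs, sig_counting, sdfac=1.0, sdadd=0.02):
    """One common form of the merging error model:
        sigma'(I) = SDFAC * sqrt(sigma_count^2 + (SDADD * I)^2)
    SDFAC rescales the counting-statistics sigma; the SDADD term,
    proportional to I, models scaling/absorption-type errors that
    dominate for strong reflections (exact forms vary by program)."""
    return sdfac * np.sqrt(sig_counting**2 + (sdadd * i_obs) ** 2)

# Weak reflection: counting statistics dominate, correction is small.
weak = corrected_sigma(10.0, 4.0, sdfac=1.1, sdadd=0.03)
# Strong reflection: the SDADD term takes over, inflating sigma and
# hence reducing I/sigma(I) - the behaviour Ian describes, whereby
# poor merging shows up in <I/sigma(I)> just as it does in Rmerge.
strong = corrected_sigma(1e4, 100.0, sdfac=1.1, sdadd=0.03)
print(weak, strong)
```

For the strong reflection here, the corrected sigma is several times the raw counting sigma, so scaling errors do feed directly into <I/sigma(I)>.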
>>> My biggest objection to Rmerge (and this applies also to CC(1/2)) is
>>> that it involves throwing away valuable information, namely the
>>> measured sigma(I) values from counting stats. This is not usually a
>>> good idea (in statistical parlance it reduces the 'power' of the
>>> test) - and it's not as though one can argue the sigmas are so small
>>> that they can be neglected (at least not for the weak reflections).
>>> Even though, as you say, the estimates of sigma(I) may not be very
>>> accurate, it seems to me that any estimate is better than no
>>> estimate. In any case the estimates of sigma(I) are probably quite
>>> accurate for the weak reflections; it's just for the strong ones that
>>> the assumptions tend to break down. However, if we're estimating
>>> resolution from <I/sigma(I)> it's only the weak reflections in the
>>> outer shell that are relevant, so I don't think the accuracy of
>>> sigma(I) is an issue.
>>>
>>>> If someone decides to use <I/sigI> instead of Rmerge, fine, let it
>>>> be 2.0.
>>>
>>> As I indicated previously, I think 2 is too high; it should be much
>>> closer to 1 (and again it would appear prudent to err on the side of
>>> the lower value), because in the outer shell the majority of
>>> I/sigma(I) values will be < 1 (just from the normal distribution of
>>> errors). This means that in order to get an average value of
>>> I/sigma(I) = 2 you need a lot of very significant intensities >> 3.
>>> The fallacy here lies in comparing the average I/sigma(I) with the
>>> standard '3 sigma' criterion, which is actually appropriate only for
>>> a single intensity. Of course data anisotropy may well "throw a
>>> spanner in the works".
>>>
>>>> Alternatively, the resolution could be estimated from the electron
>>>> density maps.
>>>
>>> I agree, using the optical resolution in the manner indicated above,
>>> but still quoting the corresponding X-ray resolution for backwards
>>> compatibility!
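Ian's argument about the shape of the I/sigma(I) distribution in the outer shell can be checked with a toy simulation. The model below (Wilson-type exponential true intensities plus unit Gaussian noise, tuned so the *average* observed I/sigma(I) is about 2) is entirely my own assumption, but it illustrates his point: a shell averaging 2 still contains a substantial fraction of observations below 1, with the mean propped up by a tail of much stronger intensities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy outer shell: true intensities follow Wilson (exponential)
# statistics with mean 2; each is measured with Gaussian noise of
# sigma = 1, so the average observed I/sigma(I) comes out near 2.
sigma = 1.0
i_true = rng.exponential(scale=2.0, size=100_000)
i_obs = i_true + sigma * rng.standard_normal(i_true.size)
snr = i_obs / sigma

print(f"mean I/sig(I)       = {snr.mean():.2f}")
print(f"fraction with < 1   = {(snr < 1).mean():.2f}")
print(f"fraction with > 3   = {(snr > 3).mean():.2f}")
```

Under these assumptions, roughly a third of the observations fall below I/sigma(I) = 1 even though the shell average is 2, consistent with Ian's warning against reading the shell average as a per-reflection significance test.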
>>>
>>>> I hope everyone agrees that the resolution should not be dead..
>>>
>>> I completely agree: I say "Long live the resolution!" (sorry, I
>>> couldn't resist it).
>>>
>>> -- Ian
