Re: [ccp4bb] PAIREF, Anisotropy and STARANISO

Kay Diederichs Thu, 06 Oct 2022 03:34:14 -0700

Dear Gerard,

I'm not going to comment on what others said in this (new) thread; I'm just 
trying to make a few remarks about what you write below:

On Tue, 4 Oct 2022 17:01:10 +0100, Gerard Bricogne <[email protected]> 
wrote:

>Dear all,
>
>     First of all, apologies for breaking the threads entitled "PAIREF -
>Warning - not enough free reflections in resolution bin" and "Anisotropy" by
>merging them into a new one, but it somehow felt rather against nature to
>keep them separate.
>
>     Since the early days of the availability of STARANISO [1] (the actual
>starting year for the Web server [2] was 2016), we had a hunch that much of
>what was happening in the PAIREF procedure might simply be the detection of
>the existence of significant data beyond an initially chosen resolution
>cut-off not only as a result of an excessively conservative criterion having
>been applied in that initial choice, but as a consequence of anisotropy in
>the data. 

Why "much of what was happening ... as a consequence of anisotropy"? These 
words imply that datasets where PAIREF indicates "existence of significant data 
beyond an initially chosen resolution cut-off" (EOSDBAICRC) are anisotropic, 
but a) that is not the case, because PAIREF - or paired refinement in general - 
in my experience, and that of others, often indicates EOSDBAICRC for isotropic 
data as well, and b) whether it does so depends on the initial cutoff. So your 
general statement (or hunch?) cannot be correct.

> The latter would give rise to different diffraction limits in
>different directions, and the choice of a single value for "the resolution"
>at which the data were cut off would necessarily yield a compromise value
>between the best and the worst diffraction limits. This would imply that
>significant data would be excluded in the best diffracting directions, that
>would subsequently drive PAIREF towards increasing the estimated resolution
>compared to its compromise value.
>

In its current implementation, PAIREF tries to determine the isotropic 
resolution cutoff that gives the best model based on valid comparisons of 
(mainly) Rfree values (it also gives other information to the user).
This is the correct thing to do for isotropic data, and still useful for 
moderately anisotropic data, but clearly there is room for improvement, e.g. by 
using an anisotropic high-resolution cutoff, or by using data from STARANISO, 
or ... 
We (the authors of the PAIREF paper) have discussed the treatment of 
anisotropy in the past, but our impression was that there is no obvious 
single best way to deal with it.
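
As a minimal sketch of the decision logic behind such an isotropic paired 
refinement (not PAIREF's actual code): each step adds one resolution shell, 
and the two models are compared by Rfree computed on the same reflections, 
i.e. those up to the lower-resolution cutoff. The class name, function name, 
tolerance parameter and all Rfree numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PairedStep:
    d_from: float         # current high-resolution cutoff (Angstrom)
    d_to: float           # trial cutoff after adding one shell (Angstrom)
    rfree_without: float  # model refined to d_from, Rfree on data to d_from
    rfree_with: float     # model refined to d_to, Rfree on the SAME data to d_from


def choose_cutoff(start, steps, tol=0.0):
    """Extend the cutoff shell by shell while Rfree does not get worse."""
    cutoff = start
    for step in steps:
        # Fair comparison: both Rfree values use the same reflections.
        if step.rfree_with - step.rfree_without > tol:
            break  # the added shell made the model worse; stop here
        cutoff = step.d_to
    return cutoff


# Hypothetical numbers, for illustration only.
steps = [
    PairedStep(2.0, 1.9, 0.250, 0.247),  # improvement: accept 1.9
    PairedStep(1.9, 1.8, 0.251, 0.250),  # improvement: accept 1.8
    PairedStep(1.8, 1.7, 0.252, 0.255),  # worse: reject 1.7
]
chosen = choose_cutoff(2.0, steps)
print(chosen)
```

The sketch stops at the first step that makes Rfree worse; a real run would 
also look at the other indicators reported to the user.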

>     This "hunch" was validated by a detailed comparison carried out on the
>exact same examples that are considered in the 2020 paper by Maly et al.,
>that is summarised in the attached PDF. In other words, whenever anisotropy
>is present in the data, PAIREF will tend to indicate a higher value for an
>isotropic cut-off than would have been estimated for the initial dataset.

Based on what? Different people employ different initial resolution cut-offs, 
based on their prior experience.
Your general statement above assumes a particular way of choosing the initial 
cutoff that I'd say is not universally valid.

>The problem with taking the PAIREF result as the final answer is that the
>higher cut-off it indicates is applied *isotropically*. The inclusion of the
>significant data thus reclaimed is therefore unavoidably accompanied by that
>of noisy data in the worst diffracting direction(s), resulting in alarmingly
>poor statistics in the outermost shell (as pointed out in Eleanor's message)
>that may cast doubts on the usefulness of the procedure. 

To my understanding, Eleanor's message was not about PAIREF, but you cite it as 
if it were. I don't like this.

> This consideration
>was the basis of the rationale for implementing an *anisotropic* cut-off
>surface in STARANISO, so that one could thus reclaim the significant data in
>the best-diffracting direction(s) while avoiding the simultaneous inclusion
>of the pure-noise measurements in the worse one(s). While this is clearly
>and extensively explained in the documentation provided on the STARANISO
>server [2], it seems to be far from having been assimilated. Of course this
>would be perfect material for a publication, but life is somehow too short,
>and our to-do list has remained too long, to leave us room for spending the
>necessary time to go through the process of putting a paper together. The
>truly important matter is to get our picture in front of the user community.

It would actually be good to have a proper paper!

STARANISO is a very valuable program. I do use it a lot, and have seen great 
improvements in maps. But there are open questions.
First, there is always a danger associated with modifying experimental data, so 
I'm not sure I like the default of STARANISO that leads to an up-scaling of 
data along the weak direction(s). I'd rather see this up-scaling implemented in 
the refinement program(s) which write out the coefficients for map calculation.
Second, (from the POV of Randy Read not an open question IIUC) STARANISO data 
should not be used for MR in Phaser.
Third, I'd like to know if substructure solution works better with data from 
STARANISO than with the original data.
Fourth, to me a (STARANISO default) cutoff of I/sigI at 1.2 is arbitrary. Yes, 
I know I can modify it, but given that the STARANISO calculation is not 
instantaneous, I'd rather have a cutoff that is variable, and is optimized for 
the given data and model - exactly what PAIREF does. Also, the sigI values are 
not very reproducible across different data processing programs.

>
>     Now that the combined topics of PAIREF and anisotropy are being brought
>to the foreground of the community's attention, this seems like the perfect
>opportunity to present our analysis and position: what PAIREF achieves in
>terms of an upward revision of an initial isotropic resolution cut-off is
>likely to be achieved more straightforwardly by submitting the same data to
>the STARANISO server (or using it within autoPROC [3]); and the STARANISO
>output will have the advantage of being devoid of the large extra amount of
>purely noisy, uninformative data that are retained in the output from PAIREF
>according to its revised isotropic cut-off.

By saying so, you imply that the default cutoff that STARANISO uses gives the 
best results. I don't agree, for the same reasons that apply to the choice of 
high-resolution cutoffs for isotropic data. Any fixed cutoff based on some 
indicator is arbitrary (why is an I/sigI cutoff of 1.2 better than 1.1 or 1.3 
or ...? is there a proof?). The cutoff must depend on the model (a bad model 
does not benefit from weak data). The cutoff must also depend on the 
refinement program - e.g. phenix.refine does not take the sigI into account. 
Paired refinement would be a better way because it informs the user about the 
consequences (on the model and its R-values in a fair comparison) of a certain 
cutoff - and the cutoff does not have to be based on resolution, but could be 
based on local I/sigI or the like.
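
To illustrate what a cutoff "based on local I/sigI" with an adjustable 
threshold could look like, here is a toy sketch. It is not STARANISO's 
algorithm: the neighbourhood average over a fixed radius, the function names 
and all data values are assumptions made for illustration. Each threshold 
gives a different set of retained reflections, and each set could in 
principle be fed to a paired-refinement comparison.

```python
import math


def local_mean_i_over_sig(points, i_over_sig, radius):
    """Average I/sigI over all reflections within `radius` of each one."""
    means = []
    for p in points:
        near = [v for q, v in zip(points, i_over_sig)
                if math.dist(p, q) <= radius]  # always includes p itself
        means.append(sum(near) / len(near))
    return means


def keep_mask(points, i_over_sig, radius, threshold):
    """True for reflections whose local mean I/sigI reaches the threshold."""
    return [m >= threshold
            for m in local_mean_i_over_sig(points, i_over_sig, radius)]


# Hypothetical reciprocal-space positions (1/Angstrom) along two directions:
# a strongly diffracting one (x) and a weakly diffracting one (z).
points = [(0.1, 0, 0), (0.2, 0, 0), (0.3, 0, 0), (0.4, 0, 0),
          (0, 0, 0.1), (0, 0, 0.2), (0, 0, 0.3), (0, 0, 0.4)]
i_over_sig = [20.0, 10.0, 5.0, 2.0,
              20.0, 4.0, 0.8, 0.2]

# Scan a few thresholds instead of fixing one value in advance.
kept = {t: sum(keep_mask(points, i_over_sig, radius=0.11, threshold=t))
        for t in (1.2, 2.0)}
print(kept)
```

In this toy data the strong direction is kept in full at both thresholds, 
while the weak direction loses one or two of its outermost reflections 
depending on the threshold chosen.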

>
>     We would very much welcome feedback on this position: indeed we would
>like to *crowd-source* the validation (or refutation) of this conclusion. In
>our view, continuing to use the PAIREF procedure to revise an isotropic
>resolution cut off misses the point about the consequences of anisotropy.

Here too you imply that all datasets are anisotropic.

>The only sensible use of a PAIREF-like procedure would be to adjust the
>cut-off threshold for the local average of I/sig(I) in STARANISO, whose
>default value is currently 1.2 but can be reset by the user through the Web
>server's GUI. We occasionally see datasets of very high quality for which
>the CC_1/2 value in the outermost shell stays above 0.6 or even 0.7, and it
>is quite plausible that further useful data could be rescued if the local
>I/sig(I) cut-off threshold were lowered below 1.2.

The way you phrase it appears to diminish the value of a PAIREF-like procedure. 
On the contrary, I'd think it would be valuable and I'd like to see exactly 
such a procedure.

>
>     Concerning Eleanor's view that noisy data can't hurt refinement because
>they are properly down-weighted by the consideration of e.g. Rfree values in
>resolution shells, we would point out that any criterion based on statistics
>in resolution shells will be polluted if the data are anisotropic and if the
>noisy data that STARANISO would reject are retained. That will result in
>excessive down-weighting of the significant data that STARANISO retains,
>hence in losing the information they contain. Perhaps this is a matter for
>later discussion, but the main idea is that retaining pure-noise data is not
>neutral in refinement, and that every "isotropic thinking habit" on which
>many views are based needs to be revisited.

My view here is that the existence of a "best" resolution cutoff (e.g. as a 
minimum in Rfree) that we often see in paired refinement appears to prove that 
the inclusion of data beyond that limit is somewhat detrimental to the model. 
Meaning that inclusion of noise is not recommended - and emphasizing the value 
of cutting the data in a smart(er) way.

To summarize what I want to say: a) I don't find your assessment of the merits 
of PAIREF to be balanced. b) I think it would be worthwhile to optimize the 
data cutoff based on local I/sigI or similar - so I'd wish there were a 
combination of STARANISO and PAIREF (which you seem to see as non-equitable 
alternatives).

One more word: sorry, I don't have the time currently to continue this thread 
from my side.

Best wishes,
Kay

>
>
>     With best wishes,
>
>Clemens, Claus, Ian and Gerard.
>
>
>[1] Tickle, I.J., Flensburg, C., Keller, P., Paciorek, W., Sharff, A.,
>    Vonrhein, C., Bricogne, G. (2018). STARANISO. Cambridge, United
>    Kingdom: Global Phasing Ltd.
>    
> https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?A2=ind1806&L=CCP4BB&O=D&P=3971
>
>[2] https://staraniso.globalphasing.org/
>
>[3] https://doi.org/10.1107/s0907444911007773
>    https://www.globalphasing.com/autoproc/
>
>
>########################################################################
>
>To unsubscribe from the CCP4BB list, click the following link:
>https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>
>This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing 
>list hosted by www.jiscmail.ac.uk, terms & conditions are available at 
>https://www.jiscmail.ac.uk/policyandsecurity/
>
