Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread David Waterman
Hi all,

Thanks for the useful advice. Clemens, I was indeed reminded of the
<I/σ(I)> vs <I>/<σ(I)> discussion when I started this thread. The
reason I brought it up here is because the ratio-of-means form is the way
it is described in "Estimation of anomalous signal in diffraction data
<https://journals.iucr.org/d/issues/2006/08/00/dz5079/#SEC2>"
(Dauter 2006), and similarly in "Can I solve my structure by SAD phasing?
Anomalous signal in SAD phasing
<https://journals.iucr.org/d/issues/2016/03/00/ba5234/index.html#SEC1>"
(Terwilliger et al. 2015), where it appears as <|Δano|>/<σano>. Therefore,
I'm not sure it is true that everyone really means <|ΔF|/σ(ΔF)> when
discussing the anomalous signal to noise, but I am happy to adopt this
version. The problem is, we have different formulations and different
suggested cutoffs, and given that we don't really have independent and
identically distributed random variables, these criteria surely produce
(slightly?) different results depending on which form is adopted.
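
To make the last point concrete, here is a minimal sketch of the large-sample expectation of both forms. It is not taken from any of the cited papers or programs. It assumes a deliberately simple model in which each ΔF is Gaussian with zero mean and variance signal_sd^2 + sigma_i^2; the function names, `signal_sd` and the sigma values are illustrative assumptions only.

```python
import math

HALF_NORMAL_MEAN = math.sqrt(2.0 / math.pi)  # E|x| for x ~ N(0, 1)


def expected_average_of_ratios(signal_sd, sigmas):
    """Large-sample <|dF|/sigma> when dF_i ~ N(0, signal_sd**2 + sigma_i**2)."""
    ratios = [math.sqrt(signal_sd**2 + s**2) / s for s in sigmas]
    return HALF_NORMAL_MEAN * sum(ratios) / len(ratios)


def expected_ratio_of_averages(signal_sd, sigmas):
    """Large-sample <|dF|>/<sigma> under the same assumed model."""
    mean_abs = HALF_NORMAL_MEAN * sum(
        math.sqrt(signal_sd**2 + s**2) for s in sigmas
    ) / len(sigmas)
    mean_sigma = sum(sigmas) / len(sigmas)
    return mean_abs / mean_sigma


sigmas = [1.0, 4.0]  # hypothetical: one well-measured and one poorly measured reflection
for signal_sd in (0.0, 2.0):
    a = expected_average_of_ratios(signal_sd, sigmas)
    b = expected_ratio_of_averages(signal_sd, sigmas)
    print(f"signal_sd={signal_sd}: <|dF|/sigma>={a:.3f}  <|dF|>/<sigma>={b:.3f}")
```

Under these assumptions both forms give sqrt(2/pi), about 0.80, when there is no signal. With signal_sd = 2 the average of ratios comes out near 1.34 and the ratio of averages near 1.07, so a single cutoff value would not mean the same thing for the two forms.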

There is a practical motive here: we want to report this metric in DIALS
and we want to provide some indicative guiding lines on a plot to help the
user make an interpretive judgement. It sounds like a plot of <|ΔF|/σ(ΔF)>
with guidelines at 0.8 and 1.2 would be a good start, but it does seem
there is scope for confusion, not immediately resolved by the literature.
At least we will write the formula out on the plot explicitly, rather than
hiding it behind an ambiguous name. Bernhard, unfortunately the copy of BMC
I use for reference is currently locked up at the office. One of the side
effects of working from home is that the group library of hardcopy
textbooks is sat gathering dust.
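
As a rough illustration of the per-shell quantity such a plot would show, the sketch below computes <|ΔF|/σ(ΔF)> in resolution shells and reports where it first drops below a guideline value. This is not DIALS code: `AnomDiff`, `shell_means` and `suggested_cutoff` are hypothetical names, shells of roughly equal reflection count are an assumption, and the data are synthetic.

```python
from dataclasses import dataclass


@dataclass
class AnomDiff:
    d: float        # resolution of the reflection in angstrom
    delta_f: float  # anomalous difference F(+) - F(-)
    sigma: float    # estimated standard deviation of delta_f


def shell_means(reflections, n_shells=5):
    """Return (d_min, <|dF|/sigma(dF)>) per shell, low resolution first."""
    ordered = sorted(reflections, key=lambda r: r.d, reverse=True)
    size = max(1, len(ordered) // n_shells)
    shells = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        mean_ratio = sum(abs(r.delta_f) / r.sigma for r in chunk) / len(chunk)
        shells.append((chunk[-1].d, mean_ratio))
    return shells


def suggested_cutoff(shells, threshold=1.2):
    """d_min of the last shell before the mean ratio first drops below threshold."""
    cutoff = None
    for d_min, mean_ratio in shells:
        if mean_ratio < threshold:
            break
        cutoff = d_min
    return cutoff


# Synthetic data: the ratio decays linearly from 2.0 towards high resolution.
reflections = [
    AnomDiff(d=4.0 - 0.025 * i, delta_f=(-1) ** i * (2.0 - 0.015 * i), sigma=1.0)
    for i in range(100)
]
shells = shell_means(reflections)
for d_min, mean_ratio in shells:
    print(f"d_min = {d_min:.2f} A   <|dF|/sigma(dF)> = {mean_ratio:.2f}")
print("suggested cutoff (A):", suggested_cutoff(shells))
```

The threshold is a parameter so that the 0.8 and 1.2 guidelines, or any other value, can be tried without changing the shell calculation.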

Cheers

-- David


On Fri, 18 Dec 2020 at 22:41, Petr Kolenko wrote:

> Dear David,
>
> Although this is not exactly a topic of your question, an alternative
> approach is to use the resolution screening and compare the results. I have
> implemented this approach to my program SHELIXIR (because it uses SHELX
> C/D/E), which can be found here:
>
> http://kmlinux.fjfi.cvut.cz/~kolenpe1/shelixir/
>
> It also has its GUI here:
>
> http://kmlinux.fjfi.cvut.cz/~kolenpe1/shelixir/gui/
>
> Once you have enough computational power, you can easily perform such
> testing (and no longer need to understand everything that is written in the
> manuscript :-) ).
>
> I hope that the program will soon be published and I would welcome if you
> (or someone else) used it and potentially gave me some feedback or
> suggestion. The program has other functions like parallelized solvent
> content screening, etc. ;-) Feel free to ask for more.
>
> Best regards,
>
> Petr
>
>
>
> PS: Although tested on a number of cases, the command line is more stable
> than the GUI.
>
>
>
> *From:* CCP4 bulletin board  *On Behalf Of *David
> Waterman
> *Sent:* Friday, December 18, 2020 12:53 PM
> *To:* CCP4BB@JISCMAIL.AC.UK
> *Subject:* [ccp4bb] Anomalous signal to noise details
>
>
>
> Hi folks
>
>
>
> The paper "Substructure solution with SHELXD
> <https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>"
> (Schneider & Sheldrick, 2002) describes how
>
>
>
> data can be truncated at the resolution at which [ΔF to its estimated
> standard deviation as a function of the resolution] drops to below about 1.3
>
>
>
> Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution
> shells, or the quantity <|ΔF|/σ(ΔF)> ?
>
>
>
> This entry
> <https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
> on the ccp4wiki gives a cutoff
>
>
>
> where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8
> would indicate pure noise)
>
>
>
> this version sounds to me like <|ΔF|/σ(ΔF)>
>
>
>
> which is the "better" metric, and what do people mean when they say
> DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?
>
>
>
> Thanks!
>
> -- David
>
>



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Petr Kolenko
Dear David,
Although this is not exactly the topic of your question, an alternative approach 
is to use resolution screening and compare the results. I have implemented 
this approach in my program SHELIXIR (because it uses SHELX C/D/E), which can 
be found here:
http://kmlinux.fjfi.cvut.cz/~kolenpe1/shelixir/
It also has its GUI here:
http://kmlinux.fjfi.cvut.cz/~kolenpe1/shelixir/gui/
Once you have enough computational power, you can easily perform such testing 
(and no longer need to understand everything that is written in the manuscript 
:-) ).
I hope that the program will soon be published, and I would welcome it if you (or 
someone else) used it and gave me some feedback or suggestions. The 
program has other functions such as parallelized solvent content screening, etc. 
;-) Feel free to ask for more.
Best regards,
Petr

PS: Although tested on a number of cases, the command line is more stable than 
the GUI.

From: CCP4 bulletin board  On Behalf Of David Waterman
Sent: Friday, December 18, 2020 12:53 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Anomalous signal to noise details

Hi folks

The paper "Substructure solution with 
SHELXD<https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>" 
(Schneider & Sheldrick, 2002) describes how

data can be truncated at the resolution at which [ΔF to its estimated standard 
deviation as a function of the resolution] drops to below about 1.3

Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution 
shells, or the quantity <|ΔF|/σ(ΔF)> ?

This 
entry<https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
 on the ccp4wiki gives a cutoff

where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8 would 
indicate pure noise)

this version sounds to me like <|ΔF|/σ(ΔF)>

which is the "better" metric, and what do people mean when they say 
DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?

Thanks!
-- David





Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Ian Tickle
Conventionally (e.g. in cryo-EM) the SNR is taken as a ratio of averages,
either the ratio of the variance of the signal (average of signal squared)
to the variance of the noise, i.e. var(signal) / var(noise), or the square
root of that, i.e. sd(signal) / sd(noise).  See here:
https://en.wikipedia.org/wiki/Signal-to-noise_ratio .

Alternatively, it may be defined as the ratio of the mean value of the
signal to its standard error, i.e. mean(signal) / sd(noise):
https://en.wikipedia.org/wiki/Signal-to-noise_ratio_(imaging) , so
again a ratio of averages.  Whenever you read about SNR you have to check
carefully which of the three definitions is in use (and often it's not
stated!).

I think the confusion of ratio-of-averages vs. average-of-ratios has arisen
because in crystallography we're not actually talking about the SNR, rather
we're talking about the _average_ SNR, where the average is taken over
samples of a population that have _different_ distributions in reciprocal
space, whereas in the previous case the samples are taken from a population
whose samples are assumed to have the _same_ distribution (and therefore
same s.d.).  Note that the ratio-of-averages and average-of-ratios cases
coincide if all samples are drawn from the same distribution and hence
share the same s.d.

So then in our situation it is indeed correct to say that average SNR =
average( I / sd(I) ), i.e. an average of the ratios of averages!  When we
measure an _individual_ intensity with its s.d., we are already taking
averages, but they are averages over time where the standard error is
constant.  Then, when we take the spatial average SNR over different
intensities the sampling distribution and its s.d. naturally vary so we
must take the average of ratios.
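
The distinction can be checked on a toy set of measurements. The sketch below uses made-up intensities and s.d. values and two hypothetical helper functions; it only illustrates that the two forms agree when every measurement has the same s.d. and disagree when the s.d. varies.

```python
import math


def average_of_ratios(values, sigmas):
    """Mean of value/sigma over all measurements, i.e. <I/sd(I)>."""
    return sum(v / s for v, s in zip(values, sigmas)) / len(values)


def ratio_of_averages(values, sigmas):
    """Mean value divided by mean sigma, i.e. <I>/<sd(I)>."""
    return (sum(values) / len(values)) / (sum(sigmas) / len(sigmas))


intensities = [120.0, 45.0, 8.0, 2.0]   # illustrative numbers only
same_sd = [4.0, 4.0, 4.0, 4.0]
varying_sd = [10.0, 6.0, 3.0, 2.0]

for label, sds in (("same s.d.", same_sd), ("varying s.d.", varying_sd)):
    print(
        f"{label:13s} average of ratios = {average_of_ratios(intensities, sds):.3f}, "
        f"ratio of averages = {ratio_of_averages(intensities, sds):.3f}"
    )
```

With the constant s.d. both functions return the same number. With the varying s.d. the strong, less precisely measured intensities dominate the ratio of averages, which is why it comes out larger here.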

Cheers

-- Ian


On Fri, 18 Dec 2020 at 20:14, Bernhard Rupp wrote:

> > I don't know the justification; maybe just experience? Surely the higher
> the better.  I've seen George Sheldrick deriving the value of ~0.8 when
> there is _no_ anom signal but I forgot the details, sorry ...
>
> It is derived from the mean absolute error (cf. p414 in Chapter 8 of BMC,
> with help of Ian Tickle), and holds for unmerged data. A reasonably good
> indication where to set the cutoff in practice is to look at the site vs
> occupancy plot. A distinct drop after a few good sites is usually a good
> sign, and that tends to cluster around the ~1.3 ratio.
>
>
> http://www.ruppweb.org/Garland/gallery/Ch10/pages/Biomolecular_Crystallography_Fig_10-30_PART2.htm
>
> The noise level is actually observable in data without anomalous signal
>
>
> http://www.ruppweb.org/Garland/gallery/Ch10/pages/Biomolecular_Crystallography_Fig_10-29.htm
>
> Best BR
>
>
>
>
>
> best wishes,
>
> Kay
>
> >
> >Thanks!
> >-- David
> >


Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Gergely Katona
Hi,

I am not sure about the 1.2-1.3 limit, but 0.8 probably comes from sqrt(2/pi), 
which is the ratio of the mean of a half-normal and its sigma when the mean 
parameter is 0.
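
A quick numerical check of this value, as a sketch under the assumption that, without anomalous signal, each ΔF is Gaussian noise with zero mean and standard deviation σ(ΔF). The range of sigma values and the function name are arbitrary illustrative choices.

```python
import math
import random


def mean_abs_ratio_pure_noise(n=200_000, seed=0):
    """Simulate <|dF|/sigma(dF)> for data with no anomalous signal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        sigma = rng.uniform(0.5, 5.0)    # arbitrary per-reflection sigma
        delta_f = rng.gauss(0.0, sigma)  # pure noise: true difference is zero
        total += abs(delta_f) / sigma
    return total / n


expected = math.sqrt(2.0 / math.pi)
simulated = mean_abs_ratio_pure_noise()
print(f"sqrt(2/pi) = {expected:.4f}, simulated = {simulated:.4f}")
```

Because each difference is divided by its own sigma, the result does not depend on how the sigma values are distributed, which is consistent with 0.8 being quoted as a general pure-noise level.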

Best wishes,

Gergely


From: CCP4 bulletin board  On Behalf Of David Waterman
Sent: 18 December, 2020 12:53
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Anomalous signal to noise details

Hi folks

The paper "Substructure solution with 
SHELXD<https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>" 
(Schneider & Sheldrick, 2002) describes how

data can be truncated at the resolution at which [ΔF to its estimated standard 
deviation as a function of the resolution] drops to below about 1.3

Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution 
shells, or the quantity <|ΔF|/σ(ΔF)> ?

This 
entry<https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
 on the ccp4wiki gives a cutoff

where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8 would 
indicate pure noise)

this version sounds to me like <|ΔF|/σ(ΔF)>

which is the "better" metric, and what do people mean when they say 
DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?

Thanks!
-- David





Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Bernhard Rupp
> I don't know the justification; maybe just experience? Surely the higher the 
> better.  I've seen George Sheldrick deriving the value of ~0.8 when there is 
> _no_ anom signal but I forgot the details, sorry ...

It is derived from the mean absolute error (cf. p414 in Chapter 8 of BMC, with 
help of Ian Tickle), and holds for unmerged data. A reasonably good indication 
where to set the cutoff in practice is to look at the site vs occupancy plot. A 
distinct drop after a few good sites is usually a good sign, and that tends to 
cluster around the ~1.3 ratio.

http://www.ruppweb.org/Garland/gallery/Ch10/pages/Biomolecular_Crystallography_Fig_10-30_PART2.htm

The noise level is actually observable in data without anomalous signal

http://www.ruppweb.org/Garland/gallery/Ch10/pages/Biomolecular_Crystallography_Fig_10-29.htm

Best BR





best wishes,

Kay

>
>Thanks!
>-- David
>


Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Kay Diederichs
Hi David,

On Fri, 18 Dec 2020 11:53:08 +, David Waterman  wrote:

>Hi folks
>
>The paper "Substructure solution with SHELXD
><https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>"
>(Schneider & Sheldrick, 2002) describes how
>
>data can be truncated at the resolution at which [ΔF to its estimated
>standard deviation as a function of the resolution] drops to below about 1.3
>
>
>Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution
>shells, or the quantity <|ΔF|/σ(ΔF)> ?

the latter

the only scaling program that I know of that calculates ratios of averages is 
SCALEPACK; 
the others calculate averages of ratios.

>
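>This entry
><https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
>on the ccp4wiki gives a cutoff
>
>where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8
>would indicate pure noise)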
>
>
>this version sounds to me like <|ΔF|/σ(ΔF)>

yes

>
>which is the "better" metric, and what do people mean when they say
>DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?

I don't know the justification; maybe just experience? Surely the higher the 
better.  I've seen George Sheldrick deriving the value of ~0.8
when there is _no_ anom signal but I forgot the details, sorry ...

best wishes,

Kay

>
>Thanks!
>-- David
>


Re: [ccp4bb] Anomalous signal to noise details

2020-12-18 Thread Clemens Vonrhein
Dear David,

On Fri, Dec 18, 2020 at 11:53:08AM +, David Waterman wrote:
> The paper "Substructure solution with SHELXD
> <https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>"
> (Schneider & Sheldrick, 2002) describes how
> 
> data can be truncated at the resolution at which [ΔF to its estimated
> standard deviation as a function of the resolution] drops to below about 1.3
> 
> 
> Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution
> shells, or the quantity <|ΔF|/σ(ΔF)> ?

I'm nearly 100% sure this refers to the latter - or at least: the
latter is the only one making sense to me. This sounds very much like
the confusion when it comes to

  <I/σ(I)>   (1)

==> PDBx/mmCIF: _reflns.pdbx_netI_over_sigmaI             73.6  % of entries
                _reflns_shell.pdbx_netI_over_sigmaI_all    0.001% of entries
                _reflns_shell.pdbx_netI_over_sigmaI_obs    2.6  % of entries

versus

  <I>/<σ(I)>   (2)

==> PDBx/mmCIF: _reflns.pdbx_netI_over_av_sigmaI           2.6  % of entries
                _reflns_shell.meanI_over_sigI_all          0.2  % of entries
                _reflns_shell.meanI_over_sigI_obs         53.0  % of entries

As far as I can remember, we always computed and reported (1) and
never (2) - at least when it comes to the scaling/merging programs I'm
familiar with (SCALE, XDS/XSCALE, AIMLESS, d*TREK). What useful
information would (2) or <|ΔF|>/<σ(ΔF)> convey anyway ... ?

If we were to believe these definitions, then we are storing the
"right/useful" value <I/σ(I)> in the overall statistics, but a very
different value of <I>/<σ(I)> in the per-shell statistics. All those
_reflns_shell.meanI_over_sigI_obs are most likely mis-labeled (1)
quantities.

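> This entry
> <https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
> on the ccp4wiki gives a cutoff
> 
> where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8
> would indicate pure noise)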
> 
> 
> this version sounds to me like <|ΔF|/σ(ΔF)>
> 
> which is the "better" metric, and what do people mean when they say
> DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?

I think everyone always refers to <|ΔF|/σ(ΔF)> no matter what it is
called (sometimes programmers shorten the notation to avoid unwieldy
wide columns).

I tend to look for values above 1 (and the higher, the better) - but
maybe even more importantly: check the trend with resolution (higher
at low resolution), maybe in comparison with expectations (type of
scatterer, fluorescence scan, anomalous signal, number of sites,
potential B-factors of scatterers etc).

Cheers

Clemens





[ccp4bb] Anomalous signal to noise details

2020-12-18 Thread David Waterman
Hi folks

The paper "Substructure solution with SHELXD
<https://journals.iucr.org/d/issues/2002/10/02/gr2280/index.html>"
(Schneider & Sheldrick, 2002) describes how

data can be truncated at the resolution at which [ΔF to its estimated
standard deviation as a function of the resolution] drops to below about 1.3


Is this referring to the quantity <|ΔF|>/<σ(ΔF)> calculated in resolution
shells, or the quantity <|ΔF|/σ(ΔF)> ?

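This entry
<https://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php?title=SHELX_C/D/E#Resolution_cutoff_.28SHEL.29>
on the ccp4wiki gives a cutoff

where the mean value of |ΔF|/σ(ΔF) falls below about 1.2 (a value of 0.8
would indicate pure noise)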


this version sounds to me like <|ΔF|/σ(ΔF)>

which is the "better" metric, and what do people mean when they say
DANO/SIGDANO? What is the justification for the 1.3 (or 1.2) value?

Thanks!
-- David


