Re: [COOT] Calculating sigma value

Edward A. Berry Fri, 20 Apr 2018 08:27:36 -0700

Agreed!
Best,
Ed

On 04/20/2018 10:26 AM, Ian Tickle wrote:


Hi Edward

You are perfectly correct that the use of the term 'standard deviation' is not limited 
to error distributions.  However I believe that my preferred terminology, namely 
'standard uncertainty', i.e. the experimental estimate of the standard deviation in the 
error, related in the same way as the experimental estimate of an intensity to its true 
value, is.  The use of the Greek letter σ (sigma), or indeed any symbol, in equations is 
purely a notational convention, since there are obviously not enough symbols in a font 
set to go round for every possible entity - 'sigma' is used as an algebraic symbol for 
probably at least 20 different entities in maths & the sciences, as I would guess 
are most of the other Latin & Greek letters.   There cannot therefore be any 
permanent connection between a symbol and its meaning, so that typically its meaning in 
a manuscript is given in a table of notation: this is used locally in equations only 
in the context of that manuscript (which therefore cannot necessarily be taken to apply 
in any other context).

This means that if one is to avoid ambiguity, one cannot use 'sigma' to mean both 
'standard deviation of the error' (or 'standard deviation' / 'RMSD') and 'standard 
uncertainty' in the same context, and therefore that we are forced to define symbols to 
mean whatever we want them to mean (with a suitable explanation of the notation of 
course).  My thinking in the context of the current thread was that sigma indeed meant 
'standard uncertainty', and I thought that that was implicit from what I wrote, but if 
anyone misunderstood my meaning then I should certainly have been more explicit.  I 
should perhaps have properly defined my notation and said something like: "... it 
shouldn't be called sigma (where here I define sigma as 'standard uncertainty'), because 
it's not an uncertainty ...".

My main argument, which I should perhaps have expanded further, is that we need 
to avoid a clash of symbols when RMS deviation and standard uncertainty appear 
in the same set of equations (I take it that there is no argument that they are 
distinct quantities that require different symbols).  Now since RMS deviation 
is in general a sample standard deviation (one can and often does take only a 
sample of the map and calculate the RMSD of that sample), the usual symbol for 
that is 's'.  In contrast the standard uncertainty is an estimate of the 
population standard deviation, for which I think we have agreed that the symbol 
is 'sigma'.  As Steven Sheriff pointed out to me, this situation does arise 
with my own program EDSTATS, which attempts to calculate the standard 
uncertainty of the 2mFo-DFc map, based on the RMSD of the 2(mFo-DFc) map, so 
this is a genuine issue.
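
To make the two divisor conventions concrete, here is a minimal Python sketch. The density values are hypothetical, and which divisor any particular program uses is not stated in this thread; the variable names are illustrative only.

```python
import statistics

# Hypothetical density values sampled from part of a map (arbitrary units).
sample = [0.12, -0.35, 0.48, -0.07, 0.91, -0.52, 0.03, -0.26]
n = len(sample)

# Divide by N: the RMS deviation of exactly these values from their own mean.
sd_n = statistics.pstdev(sample)

# Divide by N - 1 (Bessel's correction): the usual estimator when the
# values are treated as a sample drawn from a larger population.
sd_n_minus_1 = statistics.stdev(sample)

print(f"divide by N:     {sd_n:.4f}")
print(f"divide by N - 1: {sd_n_minus_1:.4f}")
```

The two results differ by a factor of sqrt(N / (N - 1)), which is negligible for a full map but not for a small sample.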

You are right that the FFT in Coot is most likely performed by the FFTW 
('Fastest Fourier Transform in the West') package and not by the CCP4 FFT 
program as I originally stated.

Thanks for the correction!

Cheers

-- Ian


On 19 April 2018 at 17:44, Edward A. Berry wrote:



    On 04/19/2018 08:57 AM, Ian Tickle wrote:


        Hi, first maps are produced by Refmac, not Coot, and second it 
shouldn't be called sigma because it's not an uncertainty, it's a 
root-mean-square deviation from the mean.  The equation for the RMSD can be 
found in any basic text on statistics, e.g. just type 'RMSD' in Wikipedia.

        Cheers

        -- Ian



    With all due respect, and I may be misunderstanding something here, but I think that that is an 
unnecessarily restrictive definition of sigma! I'm assuming sigma stands for the standard 
deviation. Although standard deviation is often associated with a probability distribution, it is 
defined for (any?) kind of distribution. From the Wikipedia page on standard deviation, "the 
standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a 
measure that is used to quantify the amount of variation or dispersion of a set of data 
values", and "There are also other measures of deviation from the norm . . .".  That 
together with the formula for population standard deviation suggests standard deviation is exactly 
the RMS Deviation from the mean.
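
    The equivalence can be checked numerically. This is a minimal sketch with hypothetical data values: it computes the RMS deviation from the mean by hand and compares it with the population standard deviation from Python's standard library.

```python
import math
import statistics

# Hypothetical data values, chosen so the arithmetic is easy to follow.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(values) / len(values)

# RMS deviation from the mean: sqrt( (1/N) * sum (x_i - mean)^2 )
rms_dev = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Population standard deviation (divides by N, not N - 1).
pop_sd = statistics.pstdev(values)

print(rms_dev, pop_sd)
```

    Here the mean is 5, the squared deviations sum to 32, and both quantities come out as 2.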

    For an analogy, suppose a dietician weighs a dozen mice that have undergone 
the same regimen, and calculates a certain mean value with a standard deviation 
of 1.2 g. Now he weighed the mice on a scale reading to the tenth of 
a gram, so the standard deviation of the measurement is around 0.1 g or less. 
Nonetheless he is going to report the deviation of his population, which is 1.2 
g.  Likewise even if we knew precisely the electron density at every point in 
the unit cell of a crystal, that density would still have a distribution, and 
that distribution would have a standard deviation. The important thing, and I 
think this was the main point of Ian's remark, is that that standard deviation 
would have nothing to do with the uncertainty of our estimate of the density.
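
    A rough arithmetic sketch of the mouse analogy, using the 1.2 g and 0.1 g figures above. It assumes (this is not stated in the analogy) that the weighing error is independent of the true weights, so that the variances add.

```python
import math

observed_sd_g = 1.2     # spread of the measured weights, from the analogy
measurement_sd_g = 0.1  # upper bound on the weighing error, from the analogy

# Assumption: independent contributions, so
#   observed_var = true_var + measurement_var
true_sd_g = math.sqrt(observed_sd_g**2 - measurement_sd_g**2)

# Share of the observed variance that comes from the weighing error.
measurement_share = measurement_sd_g**2 / observed_sd_g**2

print(f"spread after removing weighing error: {true_sd_g:.3f} g")
print(f"share of variance from weighing:      {measurement_share:.2%}")
```

    Under that assumption the weighing error accounts for well under 1% of the observed variance, so the reported 1.2 g describes the mice, not the scale.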

    You could make a probability distribution out of the weight distribution of 
the mice. Say if I pick a random mouse and weigh it, or if I repeat the 
experiment with only a single mouse, that standard deviation tells me something 
about how likely my result is to be close to the population mean. In the latter 
case, this could also be viewed as a measure of the error in the experiment. 
But in the same way, you could say if I pick a random point in the asymmetric 
unit and sample the density there, the RMSD tells me something about the 
probability that my result will be close to the mean value for the map.

    However, in keeping with the main point mentioned above, it may be a good 
convention to use sigma only for standard deviation of a probability function 
such as normally (or otherwise) distributed error of a measurement, and RMSD 
for standard deviation in all other cases.

    I think the way most people use coot nowadays, refmac (or other) is 
producing map coefficients, and coot is calculating the map (presumably using 
the FFT algorithm as mentioned) and contouring it for us to see.
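
    As a sketch of what contouring "at n sigma" amounts to once the map values exist: the grid values below are hypothetical, `absolute_level` is an illustrative helper rather than a Coot function, and whether a given program measures the level from zero or from the map mean is an assumption left as a parameter.

```python
import statistics

# Hypothetical density values on a tiny grid (e/A^3); a real map has
# many thousands of grid points.
grid = [0.10, -0.20, 0.55, -0.05, 0.80, -0.45, 0.00, -0.30, 1.20, -0.15]

map_mean = statistics.fmean(grid)
map_rmsd = statistics.pstdev(grid)  # RMS deviation from the map mean


def absolute_level(n_rmsd: float, rmsd: float, offset: float = 0.0) -> float:
    """Convert a contour level in RMSD multiples to absolute density units.

    offset=0.0 measures the level from zero; pass the map mean instead to
    measure it from the mean. Which one applies is an assumption here.
    """
    return offset + n_rmsd * rmsd


print(f"map mean = {map_mean:.3f}, map RMSD = {map_rmsd:.3f}")
print(f"1.5 x RMSD contour = {absolute_level(1.5, map_rmsd):.3f} e/A^3")
```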

    eab




        On 19 April 2018 at 13:20, Mohamed Ibrahim wrote:

             Dear COOT users,

             Do you know how to extract the equations that COOT uses for 
generating the maps and calculating the sigma values?

             Best regards,
             Mohamed

             --
             Mohamed Ibrahim
             Humboldt University
             Berlin, Germany
             Tel: +49 30 209347931
