Hi James

I don't think it's meaningful to ask whether the deviation of a single bond
length (or any other single quantity) from its expected value is
significant, since as you say there's always some finite probability that
it occurred purely by chance.  Statistics can only meaningfully be applied
to samples of a 'reasonable' size.  I know there are statistics designed
for small samples, but not for samples of size 1!  It's more meaningful to
talk about distributions.  For example, if 1% of the sample contained
deviations > 3 sigma when you expected only 0.3%, that is probably
significant (though it still has a finite probability of occurring by
chance), as would be finding no deviations > 3 sigma at all (given a
reasonably large sample, to avoid sampling errors).
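
To put rough numbers on this, here is a minimal Python sketch.  It is only
an illustration, not taken from any refinement package: it assumes the
deviations are independent and Gaussian, counts outliers on both sides
(|z| > 3), and borrows the sample size of 1000 from James's example quoted
below.  The function names are made up for this message.

```python
import math


def two_sided_tail(k_sigma):
    """P(|z| > k_sigma) for a standard normal deviate z."""
    return math.erfc(k_sigma / math.sqrt(2.0))


def prob_at_least_one(n, p):
    """P(at least one outlier among n independent deviates)."""
    return 1.0 - (1.0 - p) ** n


def prob_none(n, p):
    """P(no outliers at all among n independent deviates)."""
    return (1.0 - p) ** n


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    below = sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - below


n = 1000                    # assumed sample size (from the quoted example)
p3 = two_sided_tail(3.0)    # expected fraction of 3-sigma outliers, about 0.27%

print("expected 3-sigma fraction:  %.4f" % p3)
print("P(at least one in %d):    %.3f" % (n, prob_at_least_one(n, p3)))
print("P(none in %d):            %.3f" % (n, prob_none(n, p3)))
print("P(>= 1%% outliers in %d):  %.2g" % (n, binom_upper_tail(n, n // 100, p3)))
```

Under these assumptions, seeing 1% outliers (10 or more out of 1000) should
happen by chance well under 1% of the time, whereas seeing none at all has
a chance of roughly 7%, so the 'no outliers' case only becomes convincing
for samples rather larger than 1000.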

Cheers

-- Ian


On Tue, Nov 8, 2022, 22:22 James Holton <[email protected]> wrote:

> OK, so let's suppose there is this bond in your structure that is
> stretched a bit.  Is that for real? Or just a random fluke?  Let's say
> for example it's a CA-CB bond that is supposed to be 1.529 A long, but in
> your model it's 1.579 A.  This is 0.05 A too long. Doesn't seem like
> much, right? But the "sigma" given to such a bond in our geometry
> libraries is 0.016 A.  These sigmas are typically derived from a
> database of observed bonds of similar type found in highly accurate
> structures, like small molecules. So, that makes this a 3-sigma outlier.
> Assuming the distribution of deviations is Gaussian, that's a pretty
> unlikely thing to happen. You expect 3-sigma deviates to appear less
> than 0.3% of the time.  So, is that significant?
>
> But, then again, there are lots of other bonds in the structure. Let's
> say there are 1000. With that many samplings from a Gaussian
> distribution you generally expect to see a 3-sigma deviate at least
> once.  That is, do an "experiment" where you pick 1000 Gaussian-random
> numbers from a distribution with a standard deviation of 1.0. Then, look
> for the maximum over all 1000 trials. Is that one > 3 sigma? It probably
> is. If you do this "experiment" millions of times it turns out seeing at
> least one 3-sigma deviate in 1000 tries is very common. Specifically,
> about 93% of the time. It is rare indeed to have every member of a
> 1000-deviate set all lie within 3 sigmas.  So, we have gone from one
> 3-sigma deviate being highly unlikely to being a virtual certainty if
> you look at enough samples.
>
> So, my question is: is a 3-sigma deviate significant?  Is it significant
> only if you have one bond in the structure?  What about angles? What if
> you have 500 bonds and 500 angles?  Do they count as 1000 deviates
> together? Or separately?
>
> I'm sure the more mathematically inclined out there will have some
> intelligent answers for the rest of us, however, if you are not a
> mathematician, how about a vote?  Is a 3-sigma bond length deviation
> significant? Or not?
>
> Looking forward to both kinds of responses,
>
> -James Holton
> MAD Scientist
>
> ########################################################################
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>
> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a
> mailing list hosted by www.jiscmail.ac.uk, terms & conditions are
> available at https://www.jiscmail.ac.uk/policyandsecurity/
>

