On Tue, 8 Nov 2022 15:25:03 -0800, James Holton <jmhol...@lbl.gov> wrote:

>Thank you Ian for your quick response!
>
>I suppose what I'm really trying to do is put a p-value on the
>"geometry" of a given PDB file.  As in: what are the odds the deviations
>from ideality of this model are due to chance?
>
>I am leaning toward the need to take all the deviations in the structure
>together as a set, but, as Joao just noted, it just "feels wrong"
>to tolerate a 3-sigma deviate.  Even more wrong to tolerate 4 sigma, 5
>sigma. And 6-sigma deviates are really difficult to swallow unless you
>have trillions of data points.
>
>To put it down in equations, is the p-value of a structure with 1000
>bonds in it with one 3-sigma deviate given by:
>
>a)  p = 1-erf(3/sqrt(2))
>or
>b)  p = 1-erf(3/sqrt(2))**1000
>or
>c) something else?
>

p = 1-erf(3/sqrt(2))**1000 (= 0.933, thus quite likely to happen) is the p-value 
of a structure with 1000 bonds in it, with one or more deviations > 3-sigma, 
if the deviations are independent and normally distributed. 
(Note that "one or more" differs from how you express it, "one 3-sigma deviate".)
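
For anyone who wants to reproduce the number locally, here is a minimal sketch in Python. It assumes the 1000 bond deviations are independent and normally distributed; the function name and its arguments are made up for illustration and do not come from any crystallographic package.

```python
from math import erf, sqrt


def p_at_least_one_outlier(n_bonds: int, n_sigma: float) -> float:
    """Chance that at least one of n_bonds deviations exceeds n_sigma.

    Assumption (for illustration): deviations are independent and Gaussian.
    """
    # Probability that a single deviation stays within +/- n_sigma.
    p_within = erf(n_sigma / sqrt(2.0))
    # All n_bonds stay within the limit with probability p_within**n_bonds,
    # so the complement is the chance of one or more outliers.
    return 1.0 - p_within ** n_bonds


print(p_at_least_one_outlier(1, 3))     # option (a): about 0.0027
print(p_at_least_one_outlier(1000, 3))  # option (b): about 0.933
```

With n_bonds = 1 this reduces to option (a), the chance that one particular bond deviates by more than 3 sigma; with n_bonds = 1000 it gives option (b).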

But keep in mind: one person's outlier may be another person's Nobel Prize!

Best,
Kay

P.S. I used 1-erf(3/sqrt(2))^1000 at wolframalpha.com for the numerical 
calculation
