Hi James,
My initial thought is that a 3-sigma bond length deviation is significant.
When we run an experiment and observe something that doesn’t match what we
expect, it could be an error, or it could be the thread that leads to a new
discovery.

If the experimental data support that deviation, I would look at the data
carefully, because they can show you a surprise. Maybe it is a different bond
than you were expecting? Maybe a different geometry?
I assume you are talking about a protein crystal structure collected at a
synchrotron. Either way, you should analyze the data and ask: What is the
resolution? What is the completeness? What is the B-factor? How did you refine
those bonds?
If you have low resolution you might expect more variability, but we typically
impose tighter restraints during refinement and so might not observe those
variations.

When you represent your crystal structure with a model, you have to consider
that a crystal gives you an average over the structures in the lattice and an
average over time, during which radiation damage is also occurring.

If the experiment is well done, the data might point you to something relevant.
Maybe the atoms you are modeling as a C-C bond are not what you thought, and
therefore the bond is different.
One example is the interaction of proteins with metals, where we initially
refine the model thinking we have one metal and the data then show a mixture of
metals, or a completely different one. I know you will do a fluorescence scan
at your MAD beamline to sort it out 😊

Best wishes,
Joao

-----Original Message-----
From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of James Holton
Sent: Tuesday, November 8, 2022 5:22 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [EXTERNAL] [ccp4bb] outliers

OK, so let's suppose there is this bond in your structure that is stretched a
bit.  Is that for real? Or just a random fluke?  Let's say for example it's a
CA-CB bond that is supposed to be 1.529 A long, but in your model it's 1.579 A.
This is 0.05 A too long. Doesn't seem like much, right? But the "sigma" given
to such a bond in our geometry libraries is 0.016 A.  These sigmas are
typically derived from a database of observed bonds of similar type found in
highly accurate structures, like small molecules. So, 0.05/0.016 is about 3.1,
which makes this a 3-sigma outlier.
Assuming the distribution of deviations is Gaussian, that's a pretty unlikely 
thing to happen. You expect 3-sigma deviates to appear less than 0.3% of the 
time.  So, is that significant?
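
The arithmetic in the two paragraphs above can be checked with a short sketch.
It uses only the numbers quoted in the message and assumes, as the message
does, that the deviations are Gaussian. The function names are illustrative
and do not come from any refinement package.

```python
import math

def z_score(observed, ideal, sigma):
    """Deviation from the library target, in units of sigma."""
    return (observed - ideal) / sigma

def two_sided_tail(k):
    """P(|Z| > k) for a standard Gaussian deviate Z."""
    return math.erfc(k / math.sqrt(2))

# Numbers quoted in the message (all in Angstrom).
z = z_score(observed=1.579, ideal=1.529, sigma=0.016)
p3 = two_sided_tail(3.0)

print(f"z = {z:.3f} sigma")
print(f"P(|Z| > 3) = {100 * p3:.2f}%")
```

The two-sided tail counts bonds that are too long as well as bonds that are
too short, which is the sense in which a 3-sigma deviate turns up less than
0.3% of the time.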

But, then again, there are lots of other bonds in the structure. Let's say there
are 1000. With that many samplings from a Gaussian distribution you generally
expect to see a 3-sigma deviate at least once.  That is, do an "experiment"
where you pick 1000 Gaussian-random numbers from a distribution with a standard
deviation of 1.0. Then, look for the largest absolute value over all 1000
trials. Is that one > 3 sigma? It probably is. If you do this "experiment"
millions of times it turns out seeing at least one 3-sigma deviate in 1000
tries is very common. Specifically, about 93% of the time. It is rare indeed to
have every member of a 1000-deviate set all lie within 3 sigmas.  So, we have
gone from one 3-sigma deviate being highly unlikely to being a virtual
certainty if you look at enough samples.

So, my question is: is a 3-sigma deviate significant?  Is it significant only 
if you have one bond in the structure?  What about angles? What if you have 500 
bonds and 500 angles?  Do they count as 1000 deviates together? Or separately?

I'm sure the more mathematically inclined out there will have some intelligent
answers for the rest of us. However, if you are not a mathematician, how about
a vote?  Is a 3-sigma bond length deviation significant? Or not?

Looking forward to both kinds of responses,

-James Holton
MAD Scientist

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/
