Re: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] (R)MS

Gergely Katona Sat, 29 May 2021 02:13:55 -0700

Hi,

It is enough to have Å² as the unit to express uncertainty in 3D, but one can 
express it with a single number only in a very specific case when the atom is 
isotropic. Few atoms have a naturally isotropic distribution around their mean 
position in very high resolution protein crystal structures. The anisotropic 
atoms can be described by a 3x3 matrix, where each row and column is associated 
with the uncertainty in a specific spatial direction. Each matrix element is 
the averaged product of the displacements along its row and column directions. 
The diagonal elements are the squared uncertainty in a single direction and 
should always be positive; the off-diagonal elements, which combine two 
different directions, are covariances (positive, zero or negative). In the end, 
every element has units of distance*distance and the matrix is symmetric. We 
cannot just take the square root of the matrix elements and expect something 
meaningful, if for no other reason than the problem of negative covariances. To 
calculate the square root of the matrix itself, one has to diagonalize it 
first. The height of a person in your example sounds easy to 
define, but the mathematical formalism will not decide that for me. I can also 
define height as the longest chord of a person or the maximum elevation of a car 
mechanic under a car.  Through diagonalization one can at least extract some 
interesting, intuitive, principal directions. The final product, the 
sqrt(matrix), is not more intuitive to me. To convert it to something intuitive 
I would have to diagonalize the square-rooted matrix again. So shall we make an 
exception for the special, isotropic description? Or use general principles for 
isotropic and anisotropic treatments?
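
To make the diagonalization argument concrete, here is a minimal NumPy sketch. 
The tensor values are invented for illustration (they are not from any real 
structure), and the variable names are my own. It contrasts the element-wise 
square root, which fails on a negative covariance, with the matrix square root 
obtained through the principal axes.

```python
import numpy as np

# Hypothetical anisotropic displacement tensor in Å² (illustrative values only).
U = np.array([[ 0.040, 0.012, -0.006],
              [ 0.012, 0.025,  0.004],
              [-0.006, 0.004,  0.060]])

# Diagonalize the symmetric tensor: the eigenvectors are the principal
# directions, the eigenvalues the mean-square displacements along them.
eigvals, eigvecs = np.linalg.eigh(U)

# r.m.s. displacement along each principal direction, in Å.
rms_principal = np.sqrt(eigvals)

# Matrix square root: take the root in the principal frame, then rotate back.
sqrt_U = eigvecs @ np.diag(rms_principal) @ eigvecs.T

# Element-wise square root for comparison: the negative covariance becomes NaN.
with np.errstate(invalid="ignore"):
    elementwise = np.sqrt(U)

print("principal r.m.s. displacements (Å):", rms_principal)
print("sqrt(U) squared reproduces U:", np.allclose(sqrt_U @ sqrt_U, U))
print("element-wise root contains NaN:", bool(np.isnan(elementwise).any()))
```

The principal r.m.s. values are in Å and are probably the most intuitive 
output; sqrt_U itself still has off-diagonal terms, so it is no easier to read 
than U.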


About what B-factors are, I like to think about them as necessary model 
parameters. Computational biologists also use them for benchmarking their 
molecular dynamics models. They are also reproducible to the extent that one 
can identify specific atoms just based on their anisotropic tensor from 
independent structure determinations in the same crystal form. They are of 
course not immune to errors and variation.

I also wonder how we can represent model parameter variation in the best way. I 
admire NMR spectroscopists’ approach of depositing multiple samples from a 
structural distribution. One could reproduce their conclusions without assuming 
any sort of error model from these samples. In crystallography, we have more 
and more distributions to deal with because we are swimming in data. It is easy 
to sample/resample data sets from the same or different crystals (SFX, for 
example), which can lead to many replicates of structural models. I cannot 
really justify creating multiple PDB entries for these replicates; it is not 
good for the reader to have to work out which PDB codes belong to which group 
of samples. Maybe it works for up to 10 structures, but how about 100? Is it 
possible to deposit crystal structures as a chain of model/data pairs under the 
same entry? It is possible to just make a tarball and deposit it in alternative 
services such as Zenodo, but it would be a pity to completely bypass the PDB. I 
can think of more compact descriptions of structural distributions, for example 
mean positions and mean B-factors of atoms with their associated covariance 
matrices, analogously to how MD trajectories can be described as average 
structures and covariance matrices. I think the assumption of independent 
variation per atom is too strong in many cases and does not give an accurate 
picture of uncertainty.
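
As a rough sketch of such a compact description, assuming the replicate models 
are already superposed and stored as an array of coordinates: the data below 
are synthetic stand-ins generated with a shared shift plus per-atom noise, and 
all names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_atoms = 100, 4

# Synthetic replicates: every atom shares a common shift per model (correlated
# motion) on top of a smaller independent per-atom scatter. Units are Å.
mean_xyz = rng.uniform(0.0, 10.0, size=(n_atoms, 3))
shared = rng.normal(0.0, 0.3, size=(n_models, 1, 3))
private = rng.normal(0.0, 0.1, size=(n_models, n_atoms, 3))
models = mean_xyz + shared + private          # shape (n_models, n_atoms, 3)

# Flatten each model into one parameter vector (x1, y1, z1, x2, ...).
X = models.reshape(n_models, n_atoms * 3)

# Compact description: mean parameter vector plus full covariance matrix.
mean_model = X.mean(axis=0)                   # shape (12,)
cov = np.cov(X, rowvar=False)                 # shape (12, 12), units Å²

# Correlation between the x coordinates of atom 0 and atom 1. Treating the
# atoms as independent would amount to setting this to zero.
corr_x01 = cov[0, 3] / np.sqrt(cov[0, 0] * cov[3, 3])
print("correlation of x(atom 0) with x(atom 1):", round(float(corr_x01), 2))
```

With real replicates, the off-diagonal blocks of cov are exactly the 
information that is lost when each atom is given an independent error estimate.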

Best wishes,

Gergely

Gergely Katona, Professor, Chairman of the Chemistry Program Council
Department of Chemistry and Molecular Biology, University of Gothenburg
Box 462, 40530 Göteborg, Sweden
Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910
Web: http://katonalab.eu, Email: [email protected]

From: CCP4 bulletin board <[email protected]> On Behalf Of Hughes, Jonathan
Sent: 28 May, 2021 14:49
To: [email protected]
Subject: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] (R)MS

hi ian,
yes, that aspect was in my mind, a bit, but i wanted to keep it simple. my 
point wasn't really how the "uncertainty" parameter is derived but rather its 
units. i can imagine that uncertainty in 3D could be expressed in Å³ (without 
helping the naïve user much) or in Å (which to me at least seems useful), but 
Å² (i.e. the B factor) seems neither logical nor helpful in this context, 
irrespective of its utility elsewhere. if you just see the B factor as a 
number, ok, you can do the √ in your head, but if it's visualized as in 
pymol/putty larger uncertainties become exaggerated – which is another word for 
"misrepresented".
cheers
j

From: Ian Tickle <[email protected]>
Sent: Friday, 28 May 2021 12:10
To: Hughes, Jonathan <[email protected]>
Cc: [email protected]
Subject: Re: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] (R)MS


Hi Jonathan

On Thu, 27 May 2021 at 18:34, Hughes, Jonathan <[email protected]> wrote:

 "B = 8π²<u²>  where u is the r.m.s. displacement of a scattering center, and 
<...> denotes time averaging"

Neither of those statements is necessarily correct: u is the _instantaneous_ 
displacement which of course is constantly changing (on a timescale of the 
order of femtoseconds) and cannot be measured.  So u² is the squared 
instantaneous displacement, <u²> is the mean-squared displacement, and so the 
root-mean-squared displacement (which of course is amenable to measurement) is 
sqrt(<u²>), not the same thing at all as u.
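
As a small worked example of the relation, assuming the isotropic case and 
using B = 8π²<u²> as quoted above (the function name is my own, not from any 
crystallographic package):

```python
import math

def rms_displacement(b_factor):
    """Return sqrt(<u²>) in Å for an isotropic B factor given in Å²."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

for b in (20.0, 80.0):
    print(f"B = {b:5.1f} Å²  ->  sqrt(<u²>) = {rms_displacement(b):.2f} Å")
```

This gives roughly 0.50 Å for B = 20 Å² and 1.01 Å for B = 80 Å², so 
quadrupling B only doubles the r.m.s. displacement.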

Incidentally, the 8π² constant factor comes from Fourier-transforming the 
Debye-Waller factor expression I mentioned earlier.

Also for crystals at least, the averaging is not only over time, it's over all 
unit cells, i.e. the displacements are not only thermal in origin but also due 
to spatial static disorder (instantaneous differences between unit cells).


it would seem to me that we would be able to interpret things MUCH more easily 
with u rather than anything derived from u².

So then I think what you mean is sqrt(<u²>) rather than <u²>, which seems not 
unreasonable.

Cheers

-- Ian





________________________________

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/
