On Saturday, 29 May 2021 02:12:16 PDT Gergely Katona wrote:
[...snip...]
I think the assumption of independent variations per atoms is too strong
in many cases and does not give an accurate picture of uncertainty.
[...snip...]
Gergely, you are revisiting a line of thought that historically led
to the introduction of more global treatments of atomic displacement.
These have distinct statistical and interpretational advantages.
Several approaches have been tried over the past 40 years or so.
The one that has proved most successful is the use of TLS
(Translation/Libration/Screw) models of bulk displacement to supplement
or replace per-atom descriptions. As you say, treating each atom's
displacement as independent is often too strong an assumption and is not
statistically justified by the experimental data. I explored this with
specific examples in
"To B or not to B?" [Acta Cryst. 2012, D68, 468-477]
http://skuld.bmsc.washington.edu/~tlsmd/references.html
An NMR-style approach that constructs and refines multiple discrete
models has been re-invented several times. These treatments are
generally called "ensemble models". IMHO they are statistically
unjustified and strictly worse than treatments based on higher level
descriptions such as TLS or normal-mode analysis.
X-ray data is qualitatively different from NMR data, and optimal
treatment of uncertainty must take this into account.
best regards
Ethan
> Hi,
>
> It is enough to have Ų as a unit to express uncertainty in 3D, but one can
> express it with a single number only in a very specific case when the atom is
> isotropic. Few atoms have a naturally isotropic distribution around their
> mean position in very high resolution protein crystal structures. The
> anisotropic atoms can be described by a 3x3 matrix, where each row and column
> is associated with the uncertainty in a specific spatial direction. The
> matrix elements are the product of the uncertainty in these directions. The
> diagonal elements will be the square of uncertainty in the same direction and
> they should always be positive; the off-diagonal combinations of directions
> are covariances (+, 0 or -). In the end, every element will have a unit of
> distance*distance and the matrix will be symmetric. We cannot just take the
> square root of the matrix elements and expect something meaningful, if for no
> other reason than the problem with negative covariances. To calculate the square
> root on the matrix itself one has to diagonalize it first. The height of a
> person in your example sounds easy to define, but the mathematical formalism
> will not decide that for me. I can also define height as the longest chord of
> a person or the maximum elevation of a car mechanic under a car. Through
> diagonalization one can at least extract some interesting, intuitive,
> principal directions. The final product, the sqrt(matrix), is not more
> intuitive to me. To convert it to something intuitive I would have to
> diagonalize the square-rooted matrix again. So shall we make an exception for the
> special, isotropic description? Or use general principles for isotropic and
> anisotropic treatments?
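
The diagonalization route described above can be sketched in a few lines of
NumPy. This is only an illustration: the tensor values are made up, and the
names (U, sqrt_U, rms_principal) are my own, not taken from any refinement
program.

```python
import numpy as np

# Hypothetical anisotropic displacement tensor in Å^2 (symmetric, positive
# definite). The numbers are invented for illustration only.
U = np.array([[ 0.30, 0.05, -0.02],
              [ 0.05, 0.20,  0.04],
              [-0.02, 0.04,  0.10]])

# Diagonalize. For a symmetric matrix, eigh returns real eigenvalues
# (mean-square displacements along the principal axes, in Å^2) and
# orthonormal eigenvectors (the principal directions).
eigvals, eigvecs = np.linalg.eigh(U)

# R.m.s. displacements along the principal axes, in Å.
rms_principal = np.sqrt(eigvals)

# Matrix square root: same principal axes, square-rooted eigenvalues.
sqrt_U = eigvecs @ np.diag(rms_principal) @ eigvecs.T

# Element-wise square root, for contrast: the negative covariance gives NaN.
with np.errstate(invalid="ignore"):
    elementwise = np.sqrt(U)

print("principal r.m.s. displacements (Å):", rms_principal)
print("sqrt(U) squared recovers U:", np.allclose(sqrt_U @ sqrt_U, U))
print("element-wise sqrt contains NaN:", np.isnan(elementwise).any())
```

The directly interpretable quantities here are the eigenvalues and
eigenvectors; sqrt_U itself has to be diagonalized again to read off the
same principal directions, which is the point made in the paragraph above.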
>
> About what B-factors are, I like to think about them as necessary model
> parameters. Computational biologists also use them for benchmarking their
> molecular dynamics models. They are also reproducible to the extent that one
> can identify specific atoms just based on their anisotropic tensor from
> independent structure determinations in the same crystal form. They are of
> course not immune to errors and variation.
>
> I also wonder how we can represent model parameter variation in the best way.
> I admire NMR spectroscopists’ approach to deposit multiple samples from a
> structural distribution. One could reproduce their conclusions without
> assuming any sort of error model from these samples. In crystallography, we
> have more and more distributions to deal with because we are swimming in
> data. It is easy to sample/resample data sets from the same or different
> crystals (SFX for example), which can lead to many replicates of structural
> models. I cannot really justify creating multiple PDB entries for these
> replicates; it is not good for the reader to have to work out which PDB codes
> belong to which group of samples. Maybe it works for up to 10 structures, but
> how about 100? Is it possible to deposit crystal structures as a chain of
> model/data pairs under the same entry? It is possible to just make a tarball
> and deposit it in alternative services such as Zenodo, but it would be a pity to
> completely bypass the PDB. I can think of a more compact description of
> structural distributions, for example mean positions and mean B-factors of
> atoms with their associated covariance matrices, analogous to how MD
> trajectories can be described as average structures and covariance matrices.
> I think the assumption of independent variations per atoms is too strong in
> many cases and does not give an accurate picture of uncertainty.
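
A rough sketch of the compact description suggested above (mean structure
plus a full covariance matrix over replicate models) might look like this.
Everything here is assumed for illustration: the replicate coordinates are
synthetic, with a shared shift added on purpose so that the atoms are
correlated, and real replicates would first need to be placed in a common
frame.

```python
import numpy as np

rng = np.random.default_rng(0)
n_replicates, n_atoms = 100, 4

# Synthetic replicate models, coordinates in Å, shape (replicates, atoms, 3).
# A displacement shared by all atoms is added so that atoms do not vary
# independently; the amplitudes are invented.
true_mean = rng.uniform(0.0, 10.0, size=(n_atoms, 3))
shared = rng.normal(0.0, 0.20, size=(n_replicates, 1, 3))
local = rng.normal(0.0, 0.05, size=(n_replicates, n_atoms, 3))
coords = true_mean + shared + local

# Flatten each model to a 3N vector, then summarize the distribution.
flat = coords.reshape(n_replicates, n_atoms * 3)
mean_structure = flat.mean(axis=0).reshape(n_atoms, 3)
cov = np.cov(flat, rowvar=False)            # shape (3N, 3N), units Å^2

# Correlation matrix; off-diagonal blocks show coupling between atoms.
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

print("covariance matrix shape:", cov.shape)
print("correlation of x(atom 0) with x(atom 1):", corr[0, 3])
```

A per-atom description keeps only the 3x3 diagonal blocks of cov. The
off-diagonal blocks are exactly the information lost when atoms are assumed
to vary independently.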
>
> Best wishes,
>
> Gergely
>
> Gergely Katona, Professor, Chairman of the Chemistry Program Council
> Department of Chemistry and Molecular Biology, University of Gothenburg
> Box 462, 40530 Göteborg, Sweden
> Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910
> Web: http://katonalab.eu, Email: [email protected]
>
> From: CCP4 bulletin board <[email protected]> On Behalf Of Hughes,
> Jonathan
> Sent: 28 May, 2021 14:49
> To: [email protected]
> Subject: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] (R)MS
>
> hi ian,
> yes, that aspect was in my mind, a bit, but i wanted to keep it simple. my
> point wasn't really how the "uncertainty" parameter is derived but rather its
> units. i can imagine that uncertainty in 3D could be expressed in Å³ (without
> helping the naïve user much) or in Å (which to me at least seems useful), but
> Ų (i.e. the B factor) seems neither logical nor helpful in this context,
> irrespective of its utility elsewhere. if you just see the B factor as a
> number, ok, you can do the √ in your head, but if it's visualized as in
> pymol/putty larger uncertainties become exaggerated – which is another word
> for "misrepresented".
> cheers
> j
>
> From: Ian Tickle <[email protected]>
> Sent: Friday, 28 May 2021 12:10
> To: Hughes, Jonathan <[email protected]>
> Cc: [email protected]
> Subject: Re: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] (R)MS
>
>
> Hi Jonathan
>
> On Thu, 27 May 2021 at 18:34, Hughes, Jonathan <[email protected]> wrote:
>
> "B = 8π²<u²> where u is the r.m.s. displacement of a scattering center, and
> <...> denotes time averaging"
>
> Neither of those statements is necessarily correct: u is the _instantaneous_
> displacement which of course is constantly changing (on a timescale of the
> order of femtoseconds) and cannot be measured. So u² is the squared
> instantaneous displacement, <u²> is the mean-squared displacement, and so
> the root-mean-squared displacement (which of course is amenable to
> measurement) is sqrt(<u²>), not the same thing at all as u.
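
For concreteness, the conversion from an isotropic B factor to the r.m.s.
displacement sqrt(<u²>) follows directly from B = 8π²<u²>. A minimal sketch;
the function name and the example B values are my own choices.

```python
import math

def rms_displacement(b_factor):
    """Return sqrt(<u^2>) in Å for an isotropic B factor in Å^2.

    Uses B = 8 * pi^2 * <u^2>, so <u^2> = B / (8 * pi^2).
    """
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

# Example B values in Å^2, chosen only for illustration.
for b in (20.0, 80.0):
    print(f"B = {b:5.1f} Å^2  ->  r.m.s. displacement = {rms_displacement(b):.3f} Å")
```

Because of the square root, a fourfold difference in B is only a twofold
difference in r.m.s. displacement.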
>
> Incidentally, the 8π² constant factor comes from Fourier-transforming the
> Debye-Waller factor expression I mentioned earlier.
>
> Also for crystals at least, the averaging is not only over time, it's over
> all unit cells, i.e. the displacements are not only thermal in origin but
> also due to spatial static disorder (instantaneous differences between unit
> cells).
>
>
> it would seem to me that we would be able to interpret things MUCH more
> easily with u rather than anything derived from u².
> So then I think what you mean is sqrt(<u²>) rather than <u²>, which seems not
> unreasonable.
>
> Cheers
>
> -- Ian
>
>
>
>
>
> ________________________________
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>
> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
> list hosted by www.jiscmail.ac.uk, terms & conditions are available at
> https://www.jiscmail.ac.uk/policyandsecurity/
>
--
Ethan A Merritt
Biomolecular Structure Center, K-428 Health Sciences Bldg
MS 357742, University of Washington, Seattle 98195-7742