Dear Ethan,

thanks for your comments. I do remember that we talked about this last year at the CCP4 school in Argonne, and indeed we did not seem to agree. Let me lay out my arguments as you have stated yours.

Refinement is an optimization problem that involves a model, data and fitting tools (such as a target function and the means to minimize or maximize it). This makes refinement a mathematical problem, and when it comes to math, loose definitions are the last thing one wants.

TLS is one of many models used to describe crystal structures. Obviously, any model is an approximation to reality, and as you rightly point out TLS is no exception. TLS approximates the motion of a group of atoms as that of a rigid body. It is indeed a crude and poor approximation if, and only if, the TLS model is used alone to account for all the motions that the molecule happens to undergo. However, if one considers the total motion of the molecule to be a superposition of several different motions arising at different structural levels (vibration of the molecule as a whole, motions of individual chains, libration of side chains around chi angles, individual atomic vibrations, etc.), then it is not too unnatural to assume that there will always be a rigid-body component in this hierarchy of motions, and this is the component that TLS is supposed to describe. By the way, this is exactly why we always recommend using TLS and individual isotropic ADP refinement together.

Most of the time a model is expected to have a physical meaning, a mathematical description, and computer instructions that bring all this to production. In the case of TLS, the physical meaning mandates that the parameters of the TLS model describe rigid-body motions of a group of atoms. For instance, I hope it is not too uncomfortable to assume that an atom group is unlikely to vibrate or librate with amplitudes that would throw it out of the unit cell, bump it into neighboring molecules, or push it into adjacent TLS groups and tear apart the covalent bond between them, to name but a few. Mathematically, all this means that the T, L and S matrices have to possess a number of very well defined properties.
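To make "well defined properties" a bit more concrete, here is a minimal sketch of two necessary conditions only, not the full set of checks from Urzhumtsev et al: T and L are covariance-type matrices, so each must be symmetric and positive semidefinite. The function names, the tolerance and the example numbers are all made up for illustration, and nothing here says anything about S.

```python
from itertools import combinations

def det(m):
    """Determinant of a 1x1, 2x2 or 3x3 matrix given as nested lists."""
    n = len(m)
    if n == 1:
        return m[0][0]
    if n == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_symmetric_psd(m, tol=1e-9):
    """True if the 3x3 matrix m is symmetric and positive semidefinite.

    Uses the fact that a symmetric matrix is positive semidefinite
    exactly when all of its principal minors are non-negative.
    """
    idx = range(3)
    if any(abs(m[i][j] - m[j][i]) > tol for i in idx for j in idx):
        return False
    for size in (1, 2, 3):
        for keep in combinations(idx, size):
            sub = [[m[i][j] for j in keep] for i in keep]
            if det(sub) < -tol:
                return False
    return True

# Hypothetical matrices, for illustration only (not taken from any PDB entry)
T_example = [[0.20, 0.01, 0.00],
             [0.01, 0.15, 0.02],
             [0.00, 0.02, 0.10]]
L_example = [[0.50, 0.00, 0.00],
             [0.00, 0.30, 0.00],
             [0.00, 0.00, -0.20]]  # negative mean-square libration: not physical

t_ok = is_symmetric_psd(T_example)
l_ok = is_symmetric_psd(L_example)
print("T acceptable:", t_ok)
print("L acceptable:", l_ok)
```

A matrix that fails this test cannot describe any real libration or vibration, whatever R factor it came with. Passing it is of course not sufficient.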
Based on these properties one can tell something about the meaningfulness of TLS parameters, and this is what Urzhumtsev et al discuss. If we analyze the TLS matrices for all PDB entries (where available), we find that about 85% of them make no physical sense.

Now, should we care? Well, we do care that covalent bonds have certain lengths, that peptide phi/psi angles stay out of the forbidden areas of the Ramachandran plot, that molecular packing in the unit cell is meaningful, etc. So why would we not care about TLS being meaningful? Why would one fight to the death for great R factors, nice model geometry and such, but let nonsense into the TLS records? After all, someone may look at these records in an attempt to extract some biological relevance. It just does not seem logical or consistent to be strict about the validity of some model parameters and let other parameters be junk.

So why may TLS refinement results (the refined elements of the T, L and S matrices!) be bad? There are three fundamental reasons that I can think of. The first is related to the choice of the TLS model, such as the assumptions about rigid groups. The second is related to how the TLS model is used, such as whether it is used alone to account for all motions or in combination with other motion descriptors. The third arises from a technicality in how the parameters of the TLS model are optimized. While the first two are more or less obvious, the third is more subtle and I will expand on it.

Elements of the TLS matrices represent parameters of the corresponding motions (such as three amplitudes of libration about three orthogonal axes and three amplitudes of translation along a possibly different set of three orthogonal axes). However, the elements of the TLS matrices are not the motion parameters themselves but functions of them.
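A small illustration of "functions of motion parameters", for the L matrix only and in the easy (forward) direction. This is a sketch under simple assumptions: the libration axes are mutually orthogonal, as stated above, the amplitudes are rms values in radians, and L is built as the sum of d_k^2 * a_k a_k^T over the axes. The function name and the numbers are hypothetical.

```python
import math

def libration_matrix(axes, amplitudes):
    """Build L = sum_k d_k^2 * a_k a_k^T.

    axes       -- libration axis directions a_k (normalised here)
    amplitudes -- rms libration amplitudes d_k in radians
    """
    L = [[0.0] * 3 for _ in range(3)]
    for axis, d in zip(axes, amplitudes):
        norm = math.sqrt(sum(c * c for c in axis))
        unit = [c / norm for c in axis]
        for i in range(3):
            for j in range(3):
                L[i][j] += d * d * unit[i] * unit[j]
    return L

# Hypothetical example: a single libration of 0.1 rad (rms)
# about the diagonal of the xy plane
L = libration_matrix(axes=[(1.0, 1.0, 0.0)], amplitudes=[0.1])
for row in L:
    print(row)
```

One amplitude and one axis already spread over four elements of L (each about 0.005 rad^2 here), and the amplitude 0.1 does not appear anywhere as such. Going back from a refined matrix to amplitudes and axes is the hard direction.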
In order to extract motion parameters (amplitudes, axes, etc.) from TLS matrices one needs to perform a rather complex protocol, shown in figure 1 of our paper:

http://journals.iucr.org/d/issues/2015/08/00/rr5096/rr5096.pdf

In other words, using TLS matrices makes it mathematically very easy to encode information about motions into the structure factor formula, but the price for this convenience is that it becomes difficult to extract the motion parameters back from the TLS matrices.

The problem is that all refinement programs that I'm aware of (and that use TLS) refine elements of the TLS matrices and not the actual motion parameters. For someone who has written a refinement program this makes it crystal clear why there are so many bad TLS refinement outcomes: there is simply no control over the refinement of the motion parameters. Basically, these parameters are refined without any restraints! To see what that means, just try to refine a model without any stereochemistry/geometry restraints: you will get a model with atoms flying all over the unit cell! The same sort of damage happens to TLS parameters, so it is no wonder that there are so many failures!

The fundamental solution to this problem is to redesign the implementation of TLS refinement so that the new procedure would a) refine and report parameters of motions and not elements of TLS matrices, and b) implement proper restraints on the refinable parameters. This is a lot of work, though. I'd say it would take me a few months to do it properly. Needless to say, it's next to impossible to get any funding for this kind of exercise!

Reading your remark

"""
The Urzhumtsev et al classification of "nonsensical" TLS matrices includes many that make lots of sense but do not happen to describe a perfectly rigid body. That's OK, because proteins are not perfectly rigid bodies. The TLS models are useful approximations that capture essential features of a messy ensemble of protein atoms.
"""

I guess I see the source of the confusion. Once again, we all agree that any model (and TLS is no exception) is only an approximation to reality. However, there are two different key questions here that people often either confuse or inappropriately mix up:

a) How well does the TLS approximation explain the real motions of atomic groups? Your remark above concerns this point, which is the validation of the whole TLS approach, and clearly it is a valid point.

b) Do the particular descriptors of a TLS model comply with the "basic axioms" of the TLS theory? This is the question we address in Urzhumtsev et al.

A can't-be-simpler example of "a)" and "b)" above is:

- Everybody agrees that using isotropic B-factors is a fine modeling tool at certain resolutions. This is point (a) above.
- Everybody also agrees that negative B values in PDB models are not a very good thing to have. This is point (b) above.

Atomic coordinates and ADPs are two of the several kinds of parameters used to describe a crystal structure. Crystallographers put a lot of effort into obtaining stereochemically correct models. We argue that the same should be done for the other model parameters, such as TLS. Hirshfeld (1976) pioneered this, and you brought ADP validation to the macromolecular world. Urzhumtsev et al extend this deeper and further.

The TLS theory (Schomaker & Trueblood, 1968) is based on the assumption that uncertainties in the atomic positions of a given group are described by correlated vibrations and librations (which may be supplemented by individual atomic corrections). The same theory shows that mathematically it is possible to represent the motions of many random models (around their mean) as an averaged motion that can be described by a few matrices (T, L and S). Now, if the TLS matrices do not represent such an average, then what is their meaning?
Clearly, arguments like "they may decrease R-factors", "it is better than nothing" or "it's ok since TLS is just an approximation" do not appear very strong scientifically.

You say

"""
Complaining that in practice the refined TLS values deviate from those that would hypothetically be obtained from fitting perfectly rigid groups is beside the point
"""

Here you again refer to point (a) above and not (b). The U matrices calculated from TLS may differ from the individual U values, regardless of how much we want them to coincide. However, independently of the quality of this fit, the TLS matrices always have to comply with the basic TLS axioms.

You say

"""
But a validation criterion that is so strict that it labels 85% of all protein refinements as "nonsensical" is not a very useful test.
"""

Here we are talking about the validity of the TLS refinement result, that is the T, L and S matrices, and not of the atomic model as a whole. Bad TLS does not necessarily mean bad atomic coordinates, for example. In fact, atomic models with bad TLS may still be fine; it's just that their TLS parameters should not be taken seriously, and one should also realize that any lower R factors obtained by using TLS are then simply due to TLS acting as a fudge factor and not as a physically meaningful model.

Summarizing, I would say:

1) Using a TLS model is a great addition to (and not a replacement for) the simple individual isotropic or anisotropic model of Atomic Displacement Parameters (ADP). If used correctly, TLS is expected to improve the model. However, this is not the case most of the time, for the reasons alluded to above, which is unfortunate.

2) Needless to say, validation is important. This applies to all model parameters, not just coordinates!

3) Someone would be better off investing some effort into redoing TLS refinement protocols... just to stop adding nonsense to the database!

Happy holidays and all the best,
Pavel

On Tue, Dec 20, 2016 at 11:30 PM, Ethan Merritt <merr...@u.washington.edu> wrote:

> On Tuesday, 20 December 2016 10:28:44 PM Pavel Afonine wrote:
> > Hi Dirk,
> >
> > > I want to check the validity of the refinement of anisotropic B-factors vs.
> > > TLS + isototropic B-factors using the Hamilton R-value ratio test as
> > > described in Ethan Merritt's paper "To B or not to B", Acta Cryst. D, Vol
> > > 68, pp 468. This test uses the generalised R-factors (assuming unit
> > > weights), RG=(Sum(Fo-Fc)^2/Sum(Fo)^2)^1/2. Although Hamilton wrote that
> > > at the end of refinement, one could also use the similar ratio of the usual
> > > R-factors, I really would like to check the ratio of the RG-values after
> > > refinement. As far as I can see, this value is not reported by the usual
> > > refinement programs.
> >
> > R factor is a global metric that, if considered alone, is not going to
> > answer your question. Best is to consider all three:
> >
> > 1) Rfree;
> > 2) Rfree-Rwork;
> > 3) Meaningfulness of refined TLS matrices. Note, as we discovered and
> > documented recently, results of TLS refinements (TLS matrices) are
> > nonsensical in 85% of PDB entries (yes, eighty-five are bad, believe it
> > or not!):
> >
> > From deep TLS validation to ensembles of atomic models built from elemental
> > motions. A. Urzhumtsev, P. V. Afonine, A. H. Van Benschoten, J. S. Fraser
> > and P. D. Adams. Acta Cryst. (2015). D71, 1668-1683.
>
> As you know, I disagree on this point.
>
> The Urzhumtsev et al classification of "nonsensical" TLS matrices includes
> many that make lots of sense but do not happen to describe a perfectly
> rigid body.
> That's OK, because proteins are not perfectly rigid bodies.
> The TLS models are useful approximations that capture
> essential features of a messy ensemble of protein atoms.
> Complaining that in practice the refined TLS values deviate from those that
> would hypothetically be obtained from fitting perfectly rigid groups is
> beside the point.
>
> Of course some refinements really are bad and some models really are
> unreasonable. Validation tests can help you catch these and fix your
> model or refinement. But a validation criterion that is so strict that
> it labels 85% of all protein refinements as "nonsensical" is not a very
> useful test.
>
> > I'd say if you pass "1-3)" you are more than good. If still in doubt, you
> > can make an extra effort and do what's described in
> >
> > Validation of crystallographic models containing TLS or other descriptions
> > of anisotropy
> > F. Zucker, P. C. Champ and E. A. Merritt
> > Acta Cryst. (2010). D66, 889-900
> >
> > which may reveal extra troubles.
>
> Note that the primary validation test described in the Zucker paper
> (we called it SKITTLS) is a check for the pairwise consistency of
> adjacent TLS groups. It might flag as inconsistent two adjacent
> groups that both pass the criteria in Urzhumtsev et al, or conversely
> it might rate two groups that fail the Urzhumtsev criteria as being
> nevertheless consistent in their description of atoms they jointly
> apply to.
>
> Ethan
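
For reference, the generalised R-factor from Dirk's question, RG=(Sum(Fo-Fc)^2/Sum(Fo)^2)^1/2 with unit weights, is easy to compute yourself if the refinement program does not report it. A minimal sketch, assuming that Fo and Fc are available as plain lists of amplitudes and that Fc has already been scaled to Fo. The function name and all numbers are made up for illustration.

```python
import math

def generalised_r(f_obs, f_calc):
    """RG = sqrt( sum((Fo - Fc)^2) / sum(Fo^2) ), assuming unit weights."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    num = sum((fo - fc) ** 2 for fo, fc in zip(f_obs, f_calc))
    den = sum(fo ** 2 for fo in f_obs)
    return math.sqrt(num / den)

# Made-up amplitudes; Fc is assumed to be on the same scale as Fo
f_obs = [100.0, 50.0, 25.0]
f_calc_aniso = [98.0, 52.0, 24.0]  # hypothetical anisotropic-ADP model
f_calc_tls = [96.0, 53.0, 27.0]    # hypothetical TLS + isotropic-B model

rg_aniso = generalised_r(f_obs, f_calc_aniso)
rg_tls = generalised_r(f_obs, f_calc_tls)
ratio = rg_tls / rg_aniso
print("RG (aniso):", rg_aniso)
print("RG (TLS + iso):", rg_tls)
print("ratio:", ratio)
```

The ratio would then be compared with the tabulated significance point of the Hamilton test for the relevant numbers of parameters and observations, which this sketch does not attempt to compute.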