Thank you Herbert,
Yes, I call tunneling the "conformer swap trick", and I provide a jiffy
script for doing this. Swapping conformer letter assignments is
equivalent to changing the color of the ropes at a certain distance down
their length. When you give this to the refinement program all the
bonds are created between same-color bits of rope only, so you
effectively tunnel.
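As an illustration only (this is not the jiffy script itself), a conformer swap can be sketched in Python as below. It assumes fixed-column PDB records, where the altLoc letter sits in column 17, the chain ID in column 22 and the residue number in columns 23-26. The function name swap_altlocs and its arguments are hypothetical, chosen just for this sketch.

```python
def swap_altlocs(pdb_lines, chain, first_res, last_res, a="A", b="B"):
    """Exchange altLoc labels a and b within one residue range of one chain.

    Hypothetical helper: pdb_lines is a list of fixed-column PDB records.
    Coordinates are untouched; only the conformer letter (the "rope color")
    changes, so bonds get drawn between different atoms after the swap.
    """
    swap = {a: b, b: a}
    out = []
    for line in pdb_lines:
        # Only atom records carry an altLoc; pass everything else through.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            altloc = line[16]            # column 17: altLoc
            if line[21] == chain and altloc in swap:  # column 22: chain ID
                try:
                    resseq = int(line[22:26])         # columns 23-26: resSeq
                except ValueError:
                    resseq = None        # unparsable residue number: skip
                if resseq is not None and first_res <= resseq <= last_res:
                    line = line[:16] + swap[altloc] + line[17:]
        out.append(line)
    return out
```

For example, swap_altlocs(lines, "A", 45, 52) would exchange the A and B labels for residues 45 to 52 of chain A (residue numbers picked arbitrarily here) and leave every other record alone. Atoms with a blank altLoc are never touched, since they belong to both conformers.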
The hard part is deciding which bits need to change color. In my
simple diagram it is easy, as the point of maximum stress is also the
point where corrective action needs to be taken. This is equivalent to
"Level 1" of the UNTANGLE Challenge. At Level 2 the atoms that need to
be swapped are not the most strained, but nearby. At Level 3 there are
several groups of atoms that need to be swapped, but I can't do it
myself without cheating because I already know what they are.
At Level 9 the atoms that need swapping are in large, connected groups.
This type of correlated motion is probably the most biologically
interesting.
Good news is: every "wrong interpretation" of the correlated motions
that I have been able to contrive has markedly more strain than the
ground truth. This implies the "right interpretation" of correlated
motion is recognizable and provable. I find that motivating.
-James Holton
MAD Scientist
On 1/21/2024 4:07 AM, Herbert J. Bernstein wrote:
Have you considered the impact of tunneling? Your rope crossings are
not perfect barriers.
On Sat, Jan 20, 2024 at 6:09 PM James Holton <jmhol...@lbl.gov> wrote:
Update:
I've gotten some feedback asking for clarity on what I mean by
"tangled". I paste here a visual aid:
The protein chains in an ensemble model are like these ropes. If
these ropes are the same length as the distance from floor to
ceiling, then straight up-and-down is the global minimum in energy
(left). The anchor points are analogous to the rest of the protein
structure, which is the same in both diagrams. Imagine for a
moment, however, after anchoring the dangling rope ends to the
floor you look up and see the ropes are actually crossed (right).
You got the end points right, but no amount of pulling on the
ropes (energy minimization) is going to get you from the tangled
structure to the global minimum. The tangled ropes are also
strained, because they are being forced to be a little longer than
they want to be. This strain in protein models manifests as
geometry outliers. The automatic weighting in your refinement
program responds to bad geometry by relaxing the X-ray weight,
which alleviates some of the strain but increases your Rfree.
The goal of this challenge is to eliminate these tangles, and do
it efficiently. What we need is a topoisomerase! Something that
can find the source of strain and let the ropes pass through each
other at the appropriate place. I've always wanted one of those
for the wires behind my desk...
More details on the origins of tangling in ensemble models can be
found here:
https://bl831.als.lbl.gov/~jamesh/challenge/twoconf/#tangle
-James Holton
MAD Scientist
On 1/18/2024 4:33 PM, James Holton wrote:
Greetings Everybody,
I present to you a Challenge.
Structural biology would be far more powerful if we could get our
models out of local minima, and together, I believe we can find a
way to escape them.
tldr: I dare any one of you to build a model that scores better
than my "best.pdb" model below. That is probably impossible, so I
also dare you to approach or even match "best.pdb" by doing
something more clever than just copying it. Difficulty levels
range from 0 to 11. First one to match the best.pdb energy score
and Rfree wins the challenge, and I'd like you to be on my paper.
You have nine months.
Details of the challenge, scoring system, test data, and
available starting points can be found here:
https://bl831.als.lbl.gov/~jamesh/challenge/twoconf/
Why am I doing this?
We all know that macromolecules adopt multiple conformations.
That is how they function. And yet, ensemble refinement still has
a hard time competing with conventional
single-conformer-with-a-few-split-side-chain models when it comes
to revealing correlated motions, or even just simultaneously
satisfying density data and chemical restraints. That is,
ensembles still suffer from the battle between R factors and
geometry restraints. This is because the ensemble member chains
cannot pass through each other, and get tangled. The tangling
comes from the density, not the chemistry. Refinement in refmac,
shelxl, phenix, simulated annealing, qFit, and even coot cannot
untangle them.
The good news is: knowledge of chemistry, combined with R
factors, appears to be a powerful indicator of how near a model
is to being untangled. What is really exciting is that the
genuine, underlying ensemble cannot be tangled. The true ensemble
_defines_ the density; it is not being fit to it. The more
untangled a model gets the closer it comes to the true ensemble,
with deviations from reasonable chemistry becoming easier and
easier to detect. In the end, when all alternative hypotheses
have been eliminated, the model must match the truth.
Why can't we do this with real data? Because all ensemble models
are tangled. Let's get to untangling them, shall we?
To demonstrate, I have created a series of examples that are
progressively more difficult to solve, but the ground truth model
and density are the same in all cases. Build the right model, and
it will not only explain the data to within experimental error,
and have the best possible validation stats, but it will reveal
the true, underlying cooperative motion of the protein as well.
Unless, of course, you can prove me wrong?
-James Holton
MAD Scientist
------------------------------------------------------------------------
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/