Hi,
Dr Nicholls brings up many interesting points, but doesn't touch on
the major point I had hoped to make in my letter. Whenever you start
making multiple tests of your hypothesis you have to evaluate each of
those tests with a higher standard than you would if you only applied
one. If you take a survey of the amount of fat people eat along with
their history of heart disease you can calculate a correlation and find
it significant with a p value of 0.05. If, instead, you perform a
survey asking for twenty different dietary behaviors and twenty health
outcomes and find a correlation between eating fat and heart disease you
need a much higher "signal" to determine its significance. You just
made 400 comparisons, and at a p of 0.05 you should expect about 20
spurious correlations to appear significant by chance alone.
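To make the arithmetic concrete, here is a minimal sketch. The variable
names are only illustrative, the "at least one" figure assumes the 400
tests are independent (which real survey questions are not), and the
Bonferroni correction is just one common, conservative way of raising
the bar, not the only one.

```python
# Illustrative arithmetic for the survey example; all names are hypothetical.
n_behaviors = 20
n_outcomes = 20
alpha = 0.05  # per-test significance threshold

# Every behavior is compared against every outcome.
n_comparisons = n_behaviors * n_outcomes

# Number of "significant" correlations expected from noise alone.
expected_spurious = n_comparisons * alpha

# Probability of at least one false positive, ASSUMING independent tests.
p_at_least_one = 1 - (1 - alpha) ** n_comparisons

# Bonferroni-corrected per-test threshold (one conservative choice).
bonferroni_alpha = alpha / n_comparisons

print(f"comparisons:            {n_comparisons}")
print(f"expected spurious hits: {expected_spurious:.1f}")
print(f"P(at least one):        {p_at_least_one:.6f}")
print(f"Bonferroni threshold:   {bonferroni_alpha:.6f}")
```

With a threshold that strict, a fat/heart-disease correlation has to be
far stronger to count than it would in the single-question survey.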
If you are exploring your data set to decide if a compound has
bound, and you try several different refinement programs and calculate
several different map types based on the results of those refinements,
and then adjust the blur of each map, and pick the map with the
strongest peak in the putative binding site, you have to consider the
significance of that peak height to be less than if you had just
calculated one map and got that same height.
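The selection effect can be sketched with a toy simulation. This is not
a model of any real map calculation: it assumes the peak height at the
binding site is pure Gaussian noise in sigma units and that each map is
independent of the others. Maps computed from the same data are strongly
correlated, so the real inflation will be smaller, but it will not be
zero. The function names and the choice of twelve maps are made up for
illustration.

```python
import random
import statistics

def best_peak(n_maps, rng):
    """Highest noise peak (in sigma units) among n_maps independent maps."""
    return max(rng.gauss(0.0, 1.0) for _ in range(n_maps))

def mean_best_peak(n_maps, n_trials=10000, seed=0):
    """Average, over many trials, of the best peak picked from n_maps maps."""
    rng = random.Random(seed)
    return statistics.fmean(best_peak(n_maps, rng) for _ in range(n_trials))

# One map, taken as it comes, versus the best of twelve candidates.
one_map = mean_best_peak(1)
best_of_twelve = mean_best_peak(12)

print(f"one map:        {one_map:+.2f} sigma on average")
print(f"best of twelve: {best_of_twelve:+.2f} sigma on average")
```

Even with nothing bound at all, the map you pick as "best" shows a
noticeably higher peak than a single map calculated once would.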
Ignoring this counterintuitive fact has led to the publication of a
huge number of studies, in many fields, that ultimately turned out not
to be reproducible. It has likely also resulted in the deposition of a
lot of "complex" models in the PDB that aren't correct.
Yes, I am arguing for an ideal, hoping to pull some of you over
toward my side a bit. I certainly understand that one has to be
flexible when solving a difficult problem, but you can't ignore that
this "flexibility" has significant consequences for understanding the
results of your work.
Dr Nicholls' letter brings up a related topic which I'd like to
explore. His letter repeatedly mentions the importance of "intuition"
when interpreting a map. Yes, the power of human intuition, and our
inability to replicate it in silico is the reason we are still staring
at maps in Coot. Intuition is a remarkable tool which, by its nature,
is difficult to describe.
Yet, no one is born with an innate intuition for interpreting
electron density maps. Intuition is acquired through practice. Practice
is not simple repetition, however. You can't become proficient in
shooting basketball hoops by simply repeatedly throwing a basketball on
the roof of your garage. You have to have a proper backboard and a
hoop. Now, after repeatedly throwing the ball and "feeling" the
difference between it going through the hoop and not, you will develop
the ability to make a basket w/o really thinking about it. You will
have developed an intuition for achieving that task.
There are two caveats. First, you have to actually watch the ball
go through the hoop. If you close your eyes right after your throw you
will never develop a useful skill. It is the feedback from the success
or failure of each attempt that makes it practice. Second, no matter
how much time you spend shooting baskets, you will never get better at
dribbling the ball. Good practice allows you to develop intuition, but
only intuition about that task.
Let's say you are working on a project, but having difficulty
interpreting your map at some critical location. You ask around and
learn of some spiffy new map calculation and you want to try it. While
you certainly can calculate the map, you have no intuition on how to
interpret it. You have not practiced with that type of map.
It may look similar to the maps you've looked at before, but that
similarity can be a trap. By now a large number of us here on the BB
have had the experience of looking at a high resolution electrostatic
potential (ESP) map and "feeling" that something is wrong with it. The
carbonyl oxygen bumps are too small and the acid groups are oddly weak.
Wow, those magnesium ions really stand out -- Maybe they're potassium
instead? No, there is nothing wrong with the ESP map. The fault is
with our intuition which was based on many, many hours of looking at ED
maps. To interpret ESP maps you have to practice with a bunch of ESP
maps first.
You cannot develop intuition for the spiffy map calculated from your
project's data since you don't know its correct interpretation -- It
cannot give you feedback. Before you calculate this map for your data
you should calculate versions for many other *completed* projects and
get a "feel" for what that kind of map shows under different
circumstances. Practice, practice, practice, then you will be ready to
return to your little mystery and be able to apply your newly acquired
intuition.
Yes, I try new refinement programs - But first I run refinement with
them on familiar proteins. Yes, I try new styles of map calculations -
But first I calculate those maps for cases where I know the answer.
I've refined a fair number of structures, probably not as many as most
of you, but at the end of a refinement I take the answer and go back to
the original maps. Looking at those maps in light of the answer is what
improves my map interpretation skills, such as they are, the most.
All of my practice has been with ED (and some ESP) maps of better
than 3 A resolution. Despite all the intuition I can bring to bear on
them, when it comes to a 4 A resolution map I'm no better than an
undergrad.
Your first experience with a new technique should never be with your
current project's data. You should work to add that technique to your
tool box, and then move back to your data. Practice, and more practice
will build that squishy neural network in your head.
Descending from soapbox,
Dale Tronrud
On 12/1/2020 8:31 AM, Robert Nicholls wrote:
Dear all,
I feel the need to respond following last week’s critique of the use of
Coot’s map blurring tool for providing diagnostic insight and aiding
ligand identification…
On 24 Nov 2020, at 16:02, Dale Tronrud wrote:
To me, this sounds like a very dangerous way to use this tool to decide
if a ligand has bound. I would be very reluctant to modify my map
with a range of arbitrary parameters until it looked like what I
wanted to see. The sharpening and blurring of this tool is not guided
or limited by theory or data.
I disagree with this, subject to the important qualification that care
is needed with interpretation. Blurring isn't a crime - it merely
involves adjusting the weighting given to lower versus higher resolution
reflections, and thus allows relaxation of the choice of high-resolution
limit, and facilitates local investigation of regions that exhibit a
poor signal-to-noise ratio. This is particularly pertinent to liganded
compounds, which are typically present with sub-unitary occupancies.
Coot's blurring merely involves convolution of the whole map with an
isotropic 3D Gaussian, with a parameter (B-factor) to control the
standard deviation of the Gaussian. This corresponds to reweighting the
structure factors in order to give higher weight to lower-resolution
reflections. This approach is guided by a very simple theory: higher
resolution structure factors (SFs) are typically noisier, with a
worse signal-to-noise ratio than lower resolution SFs (due to increased
errors in both observed higher-resolution reflections and calculated
phases). Consequently, increasing the blurring B-factor reduces the
effect of the noisier higher-resolution SFs. This results in a map that
should be more reliable, but at the expense of reduced structural detail
due to artificially reducing the effective resolution.
It should be noted that this does assume that lower resolution
reflections are more reliable than higher resolution ones. So, good
low-resolution data quality and completeness is important.
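The reweighting described above can be sketched numerically. This
assumes the usual crystallographic convention that adding an isotropic
B-factor multiplies a structure factor at resolution d by
exp(-B / (4 d^2)); whether Coot's slider uses exactly this convention
internally is an assumption here, and the B value of 50 is arbitrary.

```python
import math

def blur_weight(d_spacing, b_blur):
    """Scale factor applied to a structure factor at resolution d_spacing
    (in Angstrom) by an added isotropic B-factor b_blur (in Angstrom^2),
    using the Debye-Waller form exp(-B / (4 d^2))."""
    return math.exp(-b_blur / (4.0 * d_spacing ** 2))

b_blur = 50.0  # hypothetical blurring B-factor, chosen for illustration

# Low-resolution terms are barely touched; high-resolution terms are damped.
for d in (8.0, 4.0, 3.0, 2.0, 1.5):
    print(f"d = {d:4.1f} A   weight = {blur_weight(d, b_blur):.3f}")
```

The weights fall off smoothly with resolution, so blurring acts as a
soft high-resolution cutoff rather than a hard one.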
Unfortunately, determination of an optimal B-factor parameter is not
presently automated. Consequently, users are currently expected to trial
different values in the Coot slider tool in order to maximise
information and gain, for want of a better word, intuition.
Furthermore, due to the spatially heterogeneous nature of atomic
positional uncertainty in macromolecular complexes, it can be that
different B-factor parameters are of optimal usefulness in
different local regions of the map that exhibit
different signal-to-noise ratios. Such issues are on-going areas of
research.
The main problem is that interpretation is subjective. In difficult
cases, it is necessary to obtain as much information and insight as
possible in order to gain a good intuition. If you can't see a ligand in
the "standard" maps, but you can see evidence for a ligand in
blurred density (or difference density) maps of the various types, then
it means that careful exploration of those avenues is required.
Any "evidence" from viewing such maps and map types should serve to
guide intuition, and should be digested along with all other
available information. Such complementary maps should be seen as
diagnostics to gain intuition, rather than something that can be used as
an unequivocal argument for ligand binding.
Ultimately, the presence of significant density in a blurred map means
that there is something substantial present. Or in a blurred difference
density that there is something missing from the current model. This
could be a missing ligand, or it could be a mismodelled region of
the macromolecule, or it could be mismodelled solvent (in which
case re-evaluating any solvent mask may be worthwhile). Ultimately it is
down to the practitioner to explore all potential explanations for any
such behaviour, in order to maximise intuition and convince
themselves of the crystal's structural composition.
In some cases the presence of density in a blurred map might be
sufficient to convince the practitioner that it is worth pursuing
investigation of binding. This may take various forms: hypothesising an
approximate pose for the ligand; the nature of interactions in the
structural environment of the macromolecule; re-evaluation after
modelling and refinement; or simply stating that there may be evidence
of binding. In many cases, the latter is the appropriate action, and, as
Robbie quite rightly pointed out: "in a scientific setting this digging
is not to come to a strong conclusion, but only to see if you should
pursue the project and do additional experiments".
On 24 Nov 2020, at 16:02, Dale Tronrud wrote:
[...] to avoid bias in the interpretation of the results, all of the
statistical procedures are decided upon BEFORE the study is even
begun. This protocol is written down and peer reviewed at the start.
Then the study is performed and the protocol is followed exactly.
[...] I would recommend that you decide what sort of map you think is
the best at showing features of your active site, based on the
resolution of your data set and other qualities of your project,
before you calculate your first Fourier transform. If you think a
Polder map is the bee's knees then calculate a Polder map and live
with it. If you are convinced of the value of a FEM, or a Buster map,
or a SA omit map, or whatever, calculate that map instead and live
with it.
I agree that such an approach would be more scientific, and I certainly
find this idea very appealing. Whilst I hesitate to speak against such a
philosophy, I feel it is necessary to temper/balance this view by
pitching a counterargument in the interests of pragmatism - in general
it's just not that practical. And perhaps propositions for revolution of
best-practice policies within the field should be distinct from current
practical recommendation, in the interests of avoiding
potential confusion for the student/user who simply wants a solution
that they can apply to today's problems.
Whilst it sounds like a nice ideal, in general it is difficult to know
which pathologies might be encountered (e.g. ambiguous density in the
binding site; twinning; modelling difficulties around a symmetry axis;
multiple conformations; semi-disorder; post-translational
chemical modifications; radiation damage… the list goes on). It's
completely acceptable for someone encountering a problem for the
first time to explore what tools are available to guide
any decision-making, in the hope of achieving the best model possible. A
typical user cannot be expected to outline a strategy for every
eventuality a priori - that sounds more like the design of an automated
pipeline, not advice that users should be expected to follow.
In summary, it's unadvisable to put all eggs in one basket (of one type
of map, Polder or otherwise). If an experienced user likes a particular
tool because it's worked well for them in the past, it doesn't mean that
they shouldn't try other tools (in this case: view other types of
maps) the next time they encounter a problem. Especially given that
tools in our field are still very much evolving over time. Different
approaches may have more value and provide more insight in different
circumstances.
Best regards,
Rob
------------------------------------------------------------------------
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/