Re: [ccp4bb] [3dem] Which resolution?

Andreas Förster Sun, 23 Feb 2020 13:24:52 -0800

A very good point, Gerard, but maybe too late. It seems to me that alot of microscopists have already given up this abundantly discussedquestion. They just call everything atomic resolution irrespective ofwhatever numerical value they arrive at by whatever means.


All best.



Andreas



On 23/02/2020 8:41, Gerard Bricogne wrote:

Gentlemen,

      Please consider for a moment that by such intemperate language and
tone, you are making a topic of fundamental importance to both the MX and
the EM communities into a no-go area. This cannot be good for anyone's
reputation nor for the two fields in general. It has to be possible to
discuss the topic of "resolution" in a dispassionate way, so as to jointly
gain an improved and shared understanding of the matter, without feeling
implicitly under pressure to support one side or the other. An acrimonious
dispute like this one can only be putting people off getting involved in the
discussion, which is exactly the opposite of what a thread on a scientific
bulletin board should be doing.


      With best wishes,

           Gerard.

--
On Sun, Feb 23, 2020 at 08:15:34AM -0300, Marin van Heel wrote:

Hi Carlos Oscar and Jose-Maria,

I choose to answer you guys first, because it will take little of my time
to counter your criticism and because I have long since been less than
amused by your published, ill-conceived criticism:

“*Marin, I always suffer with your reference to sloppy statistics. If we
take your paper of 2005 where the 1/2 bit criterion was proposed, Eqs. 4 to
15 have completely ignored the fact that you are dealing with Fourier
components, that are complex numbers, and consequently you have to deal
with random variables that have TWO components, which moreover the real and
imaginary part are not independent and, in their turn, they are not
independent of the nearby Fourier coefficients so that for computing radial
averages you would need to account for the correlation among coefficients*”

I had seen this argumentation against our (2005) paper in your
manuscript/paper years back. I was so stunned by the level of
misunderstanding expressed in your manuscript that I chose not to spend any
time reacting to those statements. Now that you choose to so openly display
your thoughts on the matter, I have no other choice than to spell out your
errors in public.



All complex arrays in our 2005 paper are Hermitian (since they are the FTs
of real data), and so are all their inner products. In all the integrals
over rings one always averages a complex Fourier-space voxel with its
Hermitian conjugate yielding *ONE* real value (times two)!  Without that
Hermitian property, FRCs and FSCs, which are real normalised correlation
functions would not even have been possible. I was - and still am - stunned
by this level of misunderstanding!



This is a blatant blunder that you are propagating over years, a blunder
that does not do any good to your reputation, yet also a blunder that has
probably damaged to our research income. The fact that you can divulgate
such rubbish and leave it out there for years for referees to read (who are
possibly not as well educated in physics and mathematics) will do – and may
already have done – damage to our research.  An apology is appropriate but
an apology is not enough.



Maybe you should ask your granting agencies how to transfer 25% of your
grant income to our research, in compensation of damages created by your
blunder!



Success with your request!



Marin



PS. You have also missed that our 2005 paper explicitly includes the
influence of the size of the object within the sampling box (your: “*they
are not independent of the nearby Fourier coefficients*”). I remain
flabbergasted.

On Fri, Feb 21, 2020 at 3:15 PM Carlos Oscar Sorzano <c...@cnb.csic.es>
wrote:

Dear all,

I always try to refrain myself from getting into these discussions, but I
cannot resist more the temptation. Here are some more ideas that I hope
bring more light than confusion:

- There must be some functional relationship between the FSC and the SNR,
but the exact analytical form of this relationship is unknown (I suspect
that it must be at least monotonic, the worse the SNR, the worse FSC; but
even this is difficult to prove). The relationship we normally use
FSC=SNR/(1+SNR) was derived in a context that does not apply to CryoEM (1D
stationary signals in real space; our molecules are not stationary), and
consequently any reasoning of any threshold based on this relationship is
incorrect (see our review).

- Still, as long as we all use the same threshold, the reported
resolutions are comparable to each other. In that regard, I am happy that
we have set 0.143 (although any other number would have served the purpose)
as the standard.

- I totally agree with Steve that the full FSC is much more informative
than its crossing with the threshold. Specially, because we should be much
more worried about its behavior when it has high values than when it has
low values. Before crossing the threshold it should be as high as possible,
and that is the "true measure" of goodness of the map. When it crosses the
threshold of 0.143, it has too low SNR, and by definition, that is a very
unstable part of the FSC, resulting in relatively unstable reports of
resolution. We made some tests about the variability of the FSC (refining
random splits of the dataset), trying to put the error bars that Steve was
asking for, and it turned out to be pretty reproducible (rather low
variance except in the region when it crosses the threshold) as long as the
dataset was large enough (which is the current state).

- @Marin, I always suffer with your reference to sloppy statistics. If we
take your paper of 2005 where the 1/2 bit criterion was proposed (
https://www.sciencedirect.com/science/article/pii/S1047847705001292),
Eqs. 4 to 15 have completely ignored the fact that you are dealing with
Fourier components, that are complex numbers, and consequently you have to
deal with random variables that have two components, which moreover the
real and imaginary part are not independent and, in their turn, they are
not independent of the nearby Fourier coefficients so that for computing
radial averages you would need to account for the correlation among
coefficients (
https://www.aimspress.com/fileOther/PDF/biophysics/20150102.pdf). For
properly dealing the statistics, at least one needs to carry out a
two-dimensional reasoning, including the complex conjugate multiplication
which is all missing in your derivation, rather than treating everything as
one-dimensional, real valued random variables. Additionally, embedded in
your whole reasoning is the idea that the expected value of a ratio is the
ratio of the expected values, that is a 0-th order Taylor approximation of
the mean of the distribution of a ratio between two random variables.
Finally, I always find an extreme difficulty to understand the 1 bit or 1/2
bit criteria, that is, what is the relationship between the channel's
capacity formula of Shannon (
https://en.wikipedia.org/wiki/Shannon%E2%80%93Hartley_theorem) and our
FSC (we do not have any channel through which we are "transmitting" our
volume, although it is true we have a model y=x+n that is the same as in
signal transmission, it is not true that the average information of a
signal is log2(1+SNR); for me, the only relationship is that the SNR
appears in both formulas, FSC and channel capacity, but that does not
automatically make them comparable and interchangeble). This is not a
criticism on your work. I think the FSC is a very useful tool to measure
some properties of the reconstruction process and the quality of the
dataset (not everything is measured by the FSC) and it also has its
drawbacks (for instance, systematic errors are rewarded by the FSC as they
are reproducible in both halves). Moreover, I think you are an extremely
intelligent person, who I consider a good friend, with a very good
intuition about image processing and who has brought very interesting ideas
and methodologies into the field. Only that we cannot become crazy about
the FSC threshold and the reported resolution, as the most interesting part
of the FSC is not when it is low, but when it is high.

I hope I can keep refraining myself in the future :-)

Cheers, Carlos Oscar

On 2/21/20 6:19 PM, Ludtke, Steven J. wrote:

I've been steadfastly refusing to get myself dragged in this time, but
with this very sensible statement (which I am largely in agreement with), I
thought I'd throw in one thought, just to stir the pot a little more.

This is not a new idea, but I think it is the most sensible strategy I've
heard proposed, and addresses Marin's concerns in a more conventional way.
What we are talking about here is the statistical noise present in the FSC
curves themselves. Viewed from the framework of traditional error analysis
and propagation of uncertainties, which pretty much every scientist should
be familiar with since high-school, (and thus would not be confusing to the
non statisticians) the 'correct' solution to this issue is not to adjust
the threshold, but to present FSC curves with error bars.

One can then use a fixed threshold at a level based on expectation values,
and simply produce a resolution value which also has an associated
uncertainty. This is much better than using a variable threshold and still
producing a single number with no uncertainty estimate! Not only does this
approach account for the statistical noise in the FSC curve, but it also
should stop people from reporting resolutions as 2.3397 Å, as it would be
silly to say 2.3397 +- 0.2.

The cross terms are not ignored, but are used in the production of the
error bars. This is a very simple approach, which is certainly closer to
being correct than the fixed threshold without error-bars approach, and it
solves many of the issues we have with resolution reporting people do. Of
course we still have people who will insist that 3.2+-0.2 is better than
3.3+-0.2, but there isn't much you can do about them... (other than beat
them over the head with a statistics textbook).

The caveat, of course, is that like all propagation of uncertainty that it
is a linear approximation, and the correlation axis isn't linear, so the
typical Normal distributions with linear propagation used to justify
propagation of uncertainty aren't _strictly_ true. However, the
approximation is fine as long as the error bars are reasonably small
compared to the -1 to 1 range of the correlation axis. Each individual
error bar is computed around its expectation value, so the overall
nonlinearity of the correlation isn't a concern.

--------------------------------------------------------------------------------------
Steven Ludtke, Ph.D. <slud...@bcm.edu> Baylor
College of Medicine
Charles C. Bell Jr., Professor of Structural Biology
Dept. of Biochemistry and Molecular Biology (
www.bcm.edu/biochem)
Academic Director, CryoEM Core (
cryoem.bcm.edu)
Co-Director CIBR Center (
www.bcm.edu/research/cibr)

On Feb 21, 2020, at 10:34 AM, Alexis Rohou <a.ro...@gmail.com> wrote:

****CAUTION:*** This email is not from a BCM Source. Only click links or
open attachments you know are safe.*
------------------------------
Hi all,

For those bewildered by Marin's insistence that everyone's been messing up
their stats since the bronze age, I'd like to offer what my understanding
of the situation. More details in this thread from a few years ago on the
exact same topic:
https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003939.html
<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.ncmir.ucsd.edu_pipermail_3dem_2015-2DAugust_003939.html&d=DwMFaQ&c=ZQs-KZ8oxEw0p81sqgiaRA&r=Dk5VoQQ-wINYVssLMZihyC5Dj_sWYKxCyKz9E4Lp3gc&m=UWn2RUCMENrXjn3JLSwlIU6Zmp_JYnRrXesjtsM1u2E&s=CZ3YcAV1LVKXsLT0KjCIRby6j3XPA6GqZcOVP3nMyK0&e=>
https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003944.html
<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.ncmir.ucsd.edu_pipermail_3dem_2015-2DAugust_003944.html&d=DwMFaQ&c=ZQs-KZ8oxEw0p81sqgiaRA&r=Dk5VoQQ-wINYVssLMZihyC5Dj_sWYKxCyKz9E4Lp3gc&m=UWn2RUCMENrXjn3JLSwlIU6Zmp_JYnRrXesjtsM1u2E&s=oG6lGnei74jC5VVGsfFAdiTpIxrZhs_IH2mH0re5QRM&e=>

Notwithstanding notational problems (e.g. strict equations as opposed to
approximation symbols, or omission of symbols to denote estimation), I
believe Frank & Al-Ali and "descendent" papers (e.g. appendix of Rosenthal
& Henderson 2003) are fine. The cross terms that Marin is agitated about
indeed do in fact have an expectation value of 0.0 (in the ensemble; if the
experiment were performed an infinite number of times with different
realizations of noise). I don't believe Pawel or Jose Maria or any of the
other authors really believe that the cross-terms are orthogonal.

When N (the number of independent Fouier voxels in a shell) is large
enough, mean(Signal x Noise) ~ 0.0 is only an approximation, but a pretty
good one, even for a single FSC experiment. This is why, in my book,
derivations that depend on Frank & Al-Ali are OK, under the strict
assumption that N is large. Numerically, this becomes apparent when Marin's
half-bit criterion is plotted - asymptotically it has the same behavior as
a constant threshold.

So, is Marin wrong to worry about this? No, I don't think so. There are
indeed cases where the assumption of large N is broken. And under those
circumstances, any fixed threshold (0.143, 0.5, whatever) is dangerous.
This is illustrated in figures of van Heel & Schatz (2005). Small boxes,
high-symmetry, small objects in large boxes, and a number of other
conditions can make fixed thresholds dangerous.

It would indeed be better to use a non-fixed threshold. So why am I not
using the 1/2-bit criterion in my own work? While numerically it behaves
well at most resolution ranges, I was not convinced by Marin's derivation
in 2005. Philosophically though, I think he's right - we should aim for FSC
thresholds that are more robust to the kinds of edge cases mentioned above.
It would be the right thing to do.

Hope this helps,
Alexis

On Sun, Feb 16, 2020 at 9:00 AM Penczek, Pawel A <
pawel.a.penc...@uth.tmc.edu> wrote:

Marin,

The statistics in 2010 review is fine. You may disagree with assumptions,
but I can assure you the “statistics” (as you call it) is fine. Careful
reading of the paper would reveal to you this much.

Regards,
Pawel

On Feb 16, 2020, at 10:38 AM, Marin van Heel <
marin.vanh...@googlemail.com> wrote:



***** EXTERNAL EMAIL *****
Dear Pawel and All others ....

This 2010 review is - unfortunately - largely based on the flawed
statistics I mentioned before, namely on the a priori assumption that the
inner product of a signal vector and a noise vector are ZERO (an
orthogonality assumption).  The (Frank & Al-Ali 1975) paper we have refuted
on a number of occasions (for example in 2005, and most recently in our
BioRxiv paper) but you still take that as the correct relation between SNR
and FRC (and you never cite the criticism...).
Sorry
Marin

On Thu, Feb 13, 2020 at 10:42 AM Penczek, Pawel A <
pawel.a.penc...@uth.tmc.edu> wrote:

Dear Teige,

I am wondering whether you are familiar with

Resolution measures in molecular electron microscopy.
Penczek PA. Methods Enzymol. 2010.
Citation

Methods Enzymol. 2010;482:73-100. doi: 10.1016/S0076-6879(10)82003-8.

You will find there answers to all questions you asked and much more.

Regards,
Pawel Penczek


Regards,
Pawel
_______________________________________________
3dem mailing list
3...@ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.ncmir.ucsd.edu_mailman_listinfo_3dem&d=DwMFaQ&c=bKRySV-ouEg_AT-w2QWsTdd9X__KYh9Eq2fdmQDVZgw&r=yEYHb4SF2vvMq3W-iluu41LlHcFadz4Ekzr3_bT4-qI&m=3-TZcohYbZGHCQ7azF9_fgEJmssbBksaI7ESb0VIk1Y&s=XHMq9Q6Zwa69NL8kzFbmaLmZA9M33U01tBE6iAtQ140&e=>

_______________________________________________
3dem mailing list
3...@ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.ncmir.ucsd.edu_mailman_listinfo_3dem&d=DwMFaQ&c=ZQs-KZ8oxEw0p81sqgiaRA&r=Dk5VoQQ-wINYVssLMZihyC5Dj_sWYKxCyKz9E4Lp3gc&m=UWn2RUCMENrXjn3JLSwlIU6Zmp_JYnRrXesjtsM1u2E&s=TeEhUNYC5v59HGWMrPQCMaGK5opuX-NIG2mJvGLuiKA&e=>

_______________________________________________
3dem mailing list
3...@ncmir.ucsd.edu

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.ncmir.ucsd.edu_mailman_listinfo_3dem&d=DwICAg&c=ZQs-KZ8oxEw0p81sqgiaRA&r=Dk5VoQQ-wINYVssLMZihyC5Dj_sWYKxCyKz9E4Lp3gc&m=UWn2RUCMENrXjn3JLSwlIU6Zmp_JYnRrXesjtsM1u2E&s=TeEhUNYC5v59HGWMrPQCMaGK5opuX-NIG2mJvGLuiKA&e=



_______________________________________________
3dem mailing 
list3...@ncmir.ucsd.eduhttps://mail.ncmir.ucsd.edu/mailman/listinfo/3dem


########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1


########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1


########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1

Re: [ccp4bb] [3dem] Which resolution?

Reply via email to