Re: [ccp4bb] [3dem] Which resolution?

Petrus Zwart Thu, 12 Mar 2020 09:51:30 -0700

Hi Jacob,

On Thu, Mar 12, 2020 at 9:13 AM Keller, Jacob <[email protected]>
wrote:


> I would think the most information-reflecting representation for
> systematic absences (or maybe for all reflections) would be not I/sig but
> the reflection's (|log|) ratio to the expected intensity in that shell
> (median intensity, say).


Xtriage does something like this as part of its space group assignment
algorithm. A choice of space group implies labeling each reflection as
acentric, centric or absent. Each of these classes has its own prior
distribution, which can be convolved with a Gaussian to compute a
likelihood for that specific space group hypothesis. It provides a decent
way of assigning space groups in an automated manner.
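
To make the idea concrete, here is a minimal sketch in Python. It is not
Xtriage's actual code: the function names, the use of normalized intensities
z = I/<I>, the plain Wilson priors (epsilon factors ignored) and the example
numbers are all assumptions made for illustration. Each class prior is
convolved numerically with a Gaussian error model, and the per-reflection
likelihoods are summed in log space for each hypothesis.

```python
import math

def gaussian(x, mean, sigma):
    """Normal density, used here as the measurement-error model."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def amplitude_prior(e, label):
    """Wilson density of the normalized amplitude E, with z = E**2 (assumed priors)."""
    if label == "acentric":
        return 2.0 * e * math.exp(-e * e)
    if label == "centric":
        return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * e * e)
    raise ValueError("unknown label: " + label)

def likelihood(z_obs, sigma, label, e_max=8.0, n_steps=4000):
    """p(z_obs | label): class prior convolved with a Gaussian of width sigma."""
    if label == "absent":
        # The prior is a spike at zero, so the convolution is just the Gaussian.
        return gaussian(z_obs, 0.0, sigma)
    # Midpoint rule over E; integrating in E avoids the 1/sqrt(z) centric singularity.
    h = e_max / n_steps
    total = 0.0
    for i in range(n_steps):
        e = (i + 0.5) * h
        total += amplitude_prior(e, label) * gaussian(z_obs, e * e, sigma)
    return total * h

def log_likelihood(observations, labels):
    """Sum of log-likelihoods for (z_obs, sigma) pairs under one labeling."""
    return sum(
        math.log(max(likelihood(z, s, lab), 1e-300))  # floor guards against log(0)
        for (z, s), lab in zip(observations, labels)
    )

# Hypothetical weak reflections along an axis: (z_obs, sigma) pairs.
weak = [(0.05, 0.1), (-0.10, 0.1), (0.12, 0.1), (0.02, 0.1)]
screw = log_likelihood(weak, ["absent"] * 4)       # hypothesis with a screw axis
no_screw = log_likelihood(weak, ["acentric"] * 4)  # hypothesis without one
print("screw axis favoured:", screw > no_screw)
```

Comparing the two totals is the whole decision rule in this sketch; a real
implementation would also need priors over the hypotheses and proper
handling of symmetry-enhanced reflections.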


> (...)
>


> Maybe more generally, should refinement incorporate weighting for these
> deviant spots? Or maybe it already does, but my understanding was that
> I/sig was the most salient for weighting.
>

The best option is to have a decent likelihood function that takes the
(almost) full uncertainty of the observation into account, as described by
Read & Pannu (https://bit.ly/2W6qmVR), including various
numerical/mathematical approaches to compute this (Read & McCoy
https://bit.ly/2Qa6b5I; Perpendicular Pronoun & Perryman
https://bit.ly/2TKjJXH).
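
Schematically, and only as an illustration rather than the specific
formulation used in any of the papers cited above, such a likelihood
marginalizes over the unknown true intensity I. The symbols I_obs, sigma and
Sigma (the expected intensity) are my notation here; the second line is the
closed form for the special case of an acentric Wilson prior.

```latex
% I_obs: measured intensity; \sigma: its standard deviation;
% I: unknown true intensity; p(I | model): distribution implied by the model.
\begin{align}
p(I_\mathrm{obs} \mid \mathrm{model}, \sigma)
  &= \int_0^\infty \frac{1}{\sigma\sqrt{2\pi}}
     \exp\!\left[-\frac{(I_\mathrm{obs}-I)^2}{2\sigma^2}\right]
     p(I \mid \mathrm{model})\, dI \\
% Special case: acentric Wilson prior p(I) = \exp(-I/\Sigma)/\Sigma
p(I_\mathrm{obs} \mid \Sigma, \sigma)
  &= \frac{1}{2\Sigma}
     \exp\!\left(\frac{\sigma^2}{2\Sigma^2}-\frac{I_\mathrm{obs}}{\Sigma}\right)
     \operatorname{erfc}\!\left(
       \frac{\sigma^2/\Sigma - I_\mathrm{obs}}{\sqrt{2}\,\sigma}\right)
\end{align}
```

Note that negative I_obs values are handled without any special treatment,
which is part of why weak and "zero" observations can stay in the data.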

P

> JPK
>
> +++++++++++++++++++++++++++++++++++++++++++++++++
> Jacob Pearson Keller
> Research Scientist / Looger Lab
> HHMI Janelia Research Campus
> 19700 Helix Dr, Ashburn, VA 20147
> Desk: (571)209-4000 x3159
> Cell: (301)592-7004
> +++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: CCP4 bulletin board <[email protected]> On Behalf Of Kay
> Diederichs
> Sent: Tuesday, March 10, 2020 2:48 AM
> To: [email protected]
> Subject: Re: [ccp4bb] [3dem] Which resolution?
>
> I'd say that it depends on your state of knowledge, and on their I and
> sigma.
>
> - if you know the space group for sure before you do the measurement of
> the systematic absences, their I and sigma don't matter to you (because
> they don't influence your mental model of the experiment), so their
> information content is (close to) zero.
> - if the space group is completely unknown, some groups of reflections
> (e.g. h,k,l = 0,0,2n+1) can only be considered "potentially systematic
> absences". Then both I and sigma matter. "small" or "high" I/sigma for each
> member of such a group of reflections would indeed add quite some
> information in this situation, so an information content of up to 1 bit
> would be justified. "intermediate" I/sigma (say, 0.5 to 2) would be closer
> to zero bits, since it does not let you safely decide between "yes" or "no"
> (the recent paper by Randy Read and coworkers relates I and sigma to bits
> of information, but not in the context of decision making from potentially
> systematically absent reflections).
>
> So it is not quite straightforward, I think.
>
> best wishes,
> Kay
>
> On Tue, 10 Mar 2020 01:26:03 +0100, James Holton <[email protected]> wrote:
>
> >I'd say they are 1 bit each, since they are the answer to a yes-or-no
> >question.
> >
> >-James Holton
> >MAD Scientist
> >
> >On 2/27/2020 6:32 PM, Keller, Jacob wrote:
> >> How would one evaluate the information content of systematic absences?
> >>
> >> JPK
> >>
> >> On Feb 26, 2020 8:14 PM, James Holton <[email protected]> wrote:
> >> In my opinion the threshold should be zero bits.  Yes, this is where
> >> CC1/2 = 0 (or FSC = 0).  If there is correlation then there is
> >> information, and why throw out information if there is information to
> >> be had?  Yes, this information comes with noise attached, but that is
> >> why we have weights.
> >>
> >> It is also important to remember that zero intensity is still useful
> >> information.  Systematic absences are an excellent example.  They
> >> have no intensity at all, but they speak volumes about the structure.
> >> In a similar way, high-angle zero-intensity observations also tell us
> >> something.  Ever tried unrestrained B factor refinement at poor
> >> resolution?  It is hard to do nowadays because of all the safety
> >> catches in modern software, but you can get great R factors this way.
> >> A telltale sign of this kind of "over fitting" is remarkably large
> >> Fcalc values beyond the resolution cutoff.  These don't contribute to
> >> the R factor, however, because Fobs is missing for these hkls. So,
> >> including zero-intensity data suppresses at least some types of
> >> over-fitting.
> >>
> >> The thing I like most about the zero-information resolution cutoff is
> >> that it forces us to address the real problem: what do you mean by
> >> "resolution" ?  Not long ago, claiming your resolution was 3.0 A
> >> meant that after discarding all spots with individual I/sigI < 3 you
> >> still have 80% completeness in the 3.0 A bin.  Now we are saying we
> >> have a 3.0 A data set when we can prove statistically that a few
> >> non-background counts fell into the sum of all spot areas at 3.0 A.
> >> These are not the same thing.
> >>
> >> Don't get me wrong, including the weak high-resolution information
> >> makes the model better, and indeed I am even advocating including all
> >> the noisy zeroes.  However, weak data at 3.0 A is never going to be
> >> as good as having strong data at 3.0 A.  So, how do we decide?  I
> >> personally think that the resolution assigned to the PDB deposition
> >> should remain the classical I/sigI > 3 at 80% rule.  This is really
> >> the only way to have meaningful comparison of resolution between very
> >> old and very new structures.  One should, of course, deposit all the
> >> data, but don't claim that cut-off as your "resolution".  That is
> >> just plain unfair to those who came before.
> >>
> >> Oh yeah, and I also have a session on "interpreting low-resolution
> >> maps" at the GRC this year.
> >> https://www.grc.org/diffraction-methods-in-structural-biology-conference/2020/
> >>
> >> So, please, let the discussion continue!
> >>
> >> -James Holton
> >> MAD Scientist
> >>
> >> On 2/22/2020 11:06 AM, Nave, Colin (DLSLtd,RAL,LSCI) wrote:
> >>>
> >>> Alexis
> >>>
> >>> This is a very useful summary.
> >>>
> >>> You say you were not convinced by Marin's derivation in 2005. Are
> >>> you convinced now and, if not, why?
> >>>
> >>> My interest in this is that FSC half-bit thresholds are in danger
> >>> of being adopted elsewhere because they are becoming
> >>> standard for protein structure determination (by EM or MX). If it is
> >>> used for these mature techniques it must be right!
> >>>
> >>> It is the adoption of the ½ bit threshold I worry about. I gave a
> >>> rather weak example for MX which consisted of partial occupancy of
> >>> side chains, substrates etc. For x-ray imaging a wide range of
> >>> contrasts can occur and, if you want to see features with only a
> >>> small contrast above the surroundings then I think the half bit
> >>> threshold would be inappropriate.
> >>>
> >>> It would be good to see a clear message from the MX and EM
> >>> communities as to why an information content threshold of ½ a bit is
> >>> generally appropriate for these techniques and an acknowledgement
> >>> that this threshold is technique/problem dependent.
> >>>
> >>> We might then progress from the bronze age to the iron age.
> >>>
> >>> Regards
> >>>
> >>> Colin
> >>>
> >>> *From:*CCP4 bulletin board <[email protected]> *On Behalf Of
> >>> *Alexis Rohou
> >>> *Sent:* 21 February 2020 16:35
> >>> *To:* [email protected]
> >>> *Subject:* Re: [ccp4bb] [3dem] Which resolution?
> >>>
> >>> Hi all,
> >>>
> >>> For those bewildered by Marin's insistence that everyone's been
> >>> messing up their stats since the bronze age, I'd like to offer my
> >>> understanding of the situation. More details in this thread from
> >>> a few years ago on the exact same topic:
> >>>
> >>> https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003939.html
> >>>
> >>> https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003944.html
> >>>
> >>> Notwithstanding notational problems (e.g. strict equations as
> >>> opposed to approximation symbols, or omission of symbols to denote
> >>> estimation), I believe Frank & Al-Ali and "descendant" papers (e.g.
> >>> appendix of Rosenthal & Henderson 2003) are fine. The cross terms
> >>> that Marin is agitated about do indeed have an expectation
> >>> value of 0.0 (in the ensemble; if the experiment were performed an
> >>> infinite number of times with different realizations of noise). I
> >>> don't believe Pawel or Jose Maria or any of the other authors really
> >>> believe that the cross-terms are orthogonal.
> >>>
> >>> When N (the number of independent Fourier voxels in a shell) is large
> >>> enough, mean(Signal x Noise) ~ 0.0 is only an approximation, but a
> >>> pretty good one, even for a single FSC experiment. This is why, in
> >>> my book, derivations that depend on Frank & Al-Ali are OK, under the
> >>> strict assumption that N is large. Numerically, this becomes
> >>> apparent when Marin's half-bit criterion is plotted - asymptotically
> >>> it has the same behavior as a constant threshold.
> >>>
> >>> So, is Marin wrong to worry about this? No, I don't think so. There
> >>> are indeed cases where the assumption of large N is broken. And
> >>> under those circumstances, any fixed threshold (0.143, 0.5,
> >>> whatever) is dangerous. This is illustrated in figures of van Heel &
> >>> Schatz (2005). Small boxes, high-symmetry, small objects in large
> >>> boxes, and a number of other conditions can make fixed thresholds
> >>> dangerous.
> >>>
> >>> It would indeed be better to use a non-fixed threshold. So why am I
> >>> not using the 1/2-bit criterion in my own work? While numerically it
> >>> behaves well at most resolution ranges, I was not convinced by
> >>> Marin's derivation in 2005. Philosophically though, I think he's
> >>> right - we should aim for FSC thresholds that are more robust to the
> >>> kinds of edge cases mentioned above. It would be the right thing to do.
> >>>
> >>> Hope this helps,
> >>>
> >>> Alexis
> >>>
> >>> On Sun, Feb 16, 2020 at 9:00 AM Penczek, Pawel A
> >>> <[email protected] <mailto:[email protected]>> wrote:
> >>>
> >>>     Marin,
> >>>
> >>>     The statistics in the 2010 review is fine. You may disagree with
> >>>     the assumptions, but I can assure you the “statistics” (as you call
> >>>     it) is fine. A careful reading of the paper would reveal this much
> >>>     to you.
> >>>
> >>>     Regards,
> >>>
> >>>     Pawel
> >>>
> >>>
> >>>
> >>>         On Feb 16, 2020, at 10:38 AM, Marin van Heel
> >>>         <[email protected]
> >>>         <mailto:[email protected]>> wrote:
> >>>
> >>>
> >>>         Dear Pawel and All others ....
> >>>
> >>>         This 2010 review is - unfortunately - largely based on the
> >>>         flawed statistics I mentioned before, namely on the a priori
> >>>         assumption that the inner product of a signal vector and a
> >>>         noise vector is ZERO (an orthogonality assumption).  The
> >>>         (Frank & Al-Ali 1975) paper we have refuted on a number of
> >>>         occasions (for example in 2005, and most recently in our
> >>>         BioRxiv paper) but you still take that as the correct
> >>>         relation between SNR and FRC (and you never cite the
> >>>         criticism...).
> >>>
> >>>         Sorry
> >>>
> >>>         Marin
> >>>
> >>>         On Thu, Feb 13, 2020 at 10:42 AM Penczek, Pawel A
> >>>         <[email protected]
> >>>         <mailto:[email protected]>> wrote:
> >>>
> >>>             Dear Teige,
> >>>
> >>>             I am wondering whether you are familiar with
> >>>
> >>>             Penczek PA. Resolution measures in molecular electron
> >>>             microscopy. Methods Enzymol. 2010;482:73-100. doi:
> >>>             10.1016/S0076-6879(10)82003-8.
> >>>
> >>>             You will find there answers to all questions you asked
> >>>             and much more.
> >>>
> >>>             Regards,
> >>>
> >>>             Pawel Penczek
> >>>
> >>>             _______________________________________________
> >>>             3dem mailing list
> >>>             [email protected] <mailto:[email protected]>
> >>>
> >>>             https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
> >>>
> >>> --------------------------------------------------------------------
> >>> ----
> >>>
> >>> To unsubscribe from the CCP4BB list, click the following link:
> >>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
> >>>
> >>>
> >>
> >>
> >
> >
>


-- 
------------------------------------------------------------------------
P.H. Zwart
Staff Scientist
Molecular Biophysics and Integrated Bioimaging &
Center for Advanced Mathematics for Energy Research Applications
Lawrence Berkeley National Laboratories
1 Cyclotron Road, Berkeley, CA-94703, USA
Cell: 510 289 9246

PHENIX:   http://www.phenix-online.org
CAMERA: http://camera.lbl.gov/
-------------------------------------------------------------------------

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
