Re: [ccp4bb] criteria to set resolution limit

Alexandre Ourjoumtsev Mon, 13 Sep 2021 02:40:53 -0700

Dear Tom, 

there are a couple of papers (essentially the first one) with a relevant 
discussion.


Urzhumtseva_2013_ActaCryst_D69_1921-1934 
Urzhumtseva_2015_J.Applied Cryst_48_589-597 

If you wish I may send you the files (off list). 

With best regards, 

Sacha Urzhumtsev 

----- On 13 Sep 21, at 4:42, Peat, Tom (Manufacturing, Clayton) 
<[email protected]> wrote: 

> Thanks to Petr and Ian for their thoughtful replies.

> One worry I have is that as a community we continue to debate what is the
> 'proper' or 'correct' way to measure resolution, which I think is quite
> confusing to those who are early in their crystallographic (or general
> structural biology) careers. It is the scientific method to argue (sometimes
> ad nauseam) about what constitutes the best data, the best methods, etc., so
> this isn't really a surprise to those who have been around a while. But it
> might be nice to have a set of criteria that most people in the field agree
> to, and that are updated on a regular basis as we move forward as a field.
> Not that we will ever get 100% agreement on anything, as that is just
> unrealistic (and there are many posts to this BB already to show that). What
> I was thinking of was a set of standards that are reasonable and that, when
> broken (not all at once, but one standard at a time), one needs to explain
> why, instead of just hoping that a reviewer (or other scientist looking at
> the data) misses it. From various posts, it seems that people generally agree
> that CC1/2 is a good criterion, that Rpim and Rfree are pretty good criteria,
> that I/sigI is reasonable at some level, and that completeness and
> multiplicity (or redundancy) are important as well. These are not all
> independent (Rpim clearly depends on the multiplicity/redundancy, etc.), but
> having some kind of standard set of numbers to judge one's own data by as a
> first pass might be helpful (and I believe the original question was
> basically, what do I report as the resolution?).
> Just to throw some numbers out as an example: CC1/2 at 0.3 (or 30%, depending
> on your reporting style), I/sigI at 1.0, completeness at 75% in the last
> resolution bin, and multiplicity/redundancy at least 3.0 throughout (and in
> the last shell). Nothing magical in these numbers, but if you feel that your
> data are really good but the completeness isn't there, you just explain why,
> or something to that effect. I believe this is one of the reasons we always
> have a Table 1 in our publications and that there is no one number that
> really gives us that sense of assurance that the model and data are good (or,
> in my own experience, good enough).
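
As a rough illustration of the first-pass check described above, the following
Python sketch tests one resolution shell against the example thresholds from
the message (CC1/2 of 0.3, I/sigI of 1.0, completeness of 75%, multiplicity of
3.0). The ShellStats container, the function name and the sample values are
hypothetical and are not the output of any program; the thresholds are only
the illustrative numbers quoted above, not an agreed standard.

```python
from dataclasses import dataclass


@dataclass
class ShellStats:
    """Merging statistics for one resolution shell (hypothetical container)."""
    d_min: float         # high-resolution edge of the shell, in angstroms
    cc_half: float       # CC1/2 as a fraction (0.3, not 30)
    i_over_sigma: float  # mean I/sigI in the shell
    completeness: float  # percent
    multiplicity: float


# Minimum values taken from the example numbers in the message above.
THRESHOLDS = {
    "cc_half": 0.3,
    "i_over_sigma": 1.0,
    "completeness": 75.0,
    "multiplicity": 3.0,
}


def failed_criteria(shell):
    """Return the names of the criteria that this shell does not meet."""
    return [name for name, minimum in THRESHOLDS.items()
            if getattr(shell, name) < minimum]


# Illustrative values only; they do not come from a real data set.
outer = ShellStats(d_min=1.62, cc_half=0.45, i_over_sigma=2.0,
                   completeness=22.0, multiplicity=3.4)
print(failed_criteria(outer))
```

In the spirit of the proposal, a non-empty result would not reject the shell
outright; it lists what the authors would be expected to explain.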

> I guess what I am trying to 'solve' is the issue I come across regularly in
> reviewing papers: the authors are very interested in the biology of their
> system and spend a lot of time explaining what the system is, why it is
> important, etc. (all great stuff), and then fill in the table with a set of
> numbers that makes me wonder why they believe their own models. Often there
> is very low completeness, low redundancy/multiplicity, and CC1/2 that varies
> from 0.99+ to almost zero, all in order to make the reported resolution sound
> good (and crazy numbers of decimal places: a resolution of 1.39623 Å with 15%
> completeness could more realistically be reported as 1.40 Å or 1.50 Å with
> 50% completeness, and I don't think the actual interpretation/electron
> density would change significantly). If it were then stated explicitly in the
> manuscript, for example, that paired refinement was done or that difference
> maps were calculated (or FEM or Polder or ?) at various resolutions which then
> showed the area of interest more clearly, the readers and reviewers might be
> more assured that the authors weren't just reporting a semi-random number as
> 'the resolution'. Numbers in the table that are clearly (?) a bit relaxed, if
> actually explained in the paper, would then make more sense. We as a community
> have gone somewhat in this direction with the validation criteria given for
> deposited structures, which is a start, but it hasn't really tackled the
> thorny question of 'what is my resolution?'

> As Ian mentioned, some programs and some criteria depend on relatively high
> completeness in the data in the way they are calculated (CC1/2 is perfect when
> all data are set to zero). If a program 'fills in' data that are missing, then
> that one will also be subject to issues when the data are very incomplete. One
> can always call on people to 'get better data', and of course it would always
> be fantastic if each data set was complete and had high CC1/2 and
> multiplicity/redundancy, but then this isn't very realistic either.
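
Since CC1/2 comes up repeatedly in this thread, here is a minimal Python
sketch of how such a statistic can be computed: the observations of each
unique reflection are split at random into two halves, each half is averaged,
and the Pearson correlation between the two sets of means is taken. The
function names and the input layout (a dict from Miller index to a list of
intensities) are assumptions made for illustration; data-processing programs
have their own implementations and may differ in detail, including in how they
report degenerate input.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Zero variance in either half: this sketch reports "undefined".
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def cc_half(observations, seed=0):
    """Estimate CC1/2 from a dict mapping hkl -> list of measured intensities."""
    rng = random.Random(seed)
    half1, half2 = [], []
    for values in observations.values():
        if len(values) < 2:
            continue  # a reflection measured once cannot be split into halves
        shuffled = list(values)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half1.append(sum(shuffled[:mid]) / mid)
        half2.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half1, half2)


# Made-up intensities for four unique reflections.
example = {
    (1, 0, 0): [100.0, 104.0, 98.0, 101.0],
    (2, 0, 0): [52.0, 47.0, 50.0],
    (0, 1, 1): [10.0, 12.0],
    (3, 1, 0): [7.5],  # measured once, so it is skipped
}
print(cc_half(example))
```

Note that reflections measured only once drop out of the calculation, which is
one way in which low multiplicity limits what the statistic can tell you.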

> Thanks again for the considered replies to the previous post, and if this
> sounds like a rant, it probably is.

> cheers, tom

> Tom Peat, PhD
> Proteins Group
> Biomedical Program, CSIRO
> 343 Royal Parade
> Parkville, VIC, 3052
> +613 9662 7304
> +614 57 539 419
> [email protected]

> From: Petr Kolenko <[email protected]>
> Sent: Sunday, September 12, 2021 10:07 PM
> To: Peat, Tom (Manufacturing, Clayton) <[email protected]>;
> [email protected] <[email protected]>
> Subject: Re: [ccp4bb] criteria to set resolution limit
> Dear Tom,
> You are absolutely right with your points. But let me explain my opinion a
> bit more. And be aware that it is my opinion, not necessarily the truth.
> There might be other opinions in the community.
> In paired refinement, you always have the reference data. In case of
> significantly decreasing completeness, you can always select a starting
> resolution that is complete enough (e.g. more than 90%?). This is your
> reference data. As an increase in resolution improves your model (a drop in
> R-values, mainly R-free), you always compare your models using the reference
> data. Should we use as many observables as possible? I would do so, even if
> the completeness was very low.
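
The comparison described above can be sketched in a few lines of Python. This
is a simplified decision rule written under stated assumptions, not the logic
that PAIREF itself uses: each step adds one higher-resolution shell, both
R-free values of a step are calculated against the same reference data, and a
step is accepted only if R-free does not rise. The function name, the
tolerance parameter and the R-free numbers are all hypothetical.

```python
def choose_cutoff(start_resolution, steps, tolerance=0.0):
    """Walk towards higher resolution while R-free keeps improving.

    steps: list of (new_cutoff, rfree_before, rfree_after) tuples, ordered
    from low to high resolution. Both R-free values in a tuple are assumed
    to be calculated on the same reference data (the previous cutoff).
    """
    accepted = start_resolution
    for new_cutoff, rfree_before, rfree_after in steps:
        if rfree_after - rfree_before > tolerance:
            break  # adding this shell made the model worse; stop here
        accepted = new_cutoff
    return accepted


# Hypothetical numbers for illustration only.
steps = [
    (1.75, 0.2210, 0.2195),  # R-free drops: accept 1.75
    (1.70, 0.2195, 0.2190),  # R-free drops: accept 1.70
    (1.65, 0.2190, 0.2204),  # R-free rises: stop
]
print(choose_cutoff(1.80, steps))
```

With these made-up values the walk stops at 1.70, because the next shell
raises R-free on the reference data.
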
> Another thing is the statement that your data are processed up to 1.1 Å when
> the completeness is as low as 2%. Of course. But this is why we have more
> cells in the so-called "Table 1". When judging the structure, one should go
> carefully through the whole table. And maybe more resolution shells should be
> reported in extreme cases. There is a possibility to do so during the
> structure deposition. Here, I agree with you.
> Low data completeness is usually a big problem, whether random or systematic.
> My personal experience is that it causes severe instability in structure
> refinement. But this is frequently reflected in the R-values, and when any
> instability appears, paired refinement does not suggest using the higher
> resolution. As long as it follows the right trends, you should be fine.
> Thanks for pointing out the high data completeness in our paper. We should run
> more analyses to get ready for such comments. ;-)
> Best regards,
> Petr
> ________________________________________
> From: Peat, Tom (Manufacturing, Clayton) <[email protected]>
> Sent: Sunday, September 12, 2021 5:02:04 AM
> To: [email protected]; Petr Kolenko
> Subject: Re: [ccp4bb] criteria to set resolution limit

> Hello Petr,

> I would like to understand more completely your assertion in the last email
> regarding completeness: "I would not care about low data completeness in case
> when PAIREF shows improvement of your model."
> In the papers you gave links to, the data completeness was always 90+% even in
> the outer shells. In cases where this is not true, I'm not clear why
> completeness would not be important. In the ultimate thought experiment, or
> extreme case, where one has very few reflections at the resolution limit,
> just getting a 'better model' doesn't show me that the structure is now at
> 1.3 Å (or whatever limit one wants to set). Models with no data are perfect,
> in the physical sense of not having clashes, Ramachandran outliers, etc.
> As an example, I am aware of a deposition in the PDB where the outer
> resolution shell was approximately 2% complete, and I don't believe that the
> structure is really at the resolution stated: the features 'seen' in the
> electron density don't measure up to what I would expect, and the density
> looks a lot more like about 0.5 Å lower resolution, where the completeness is
> a bit better than 50%.
> So my 'bias' is that completeness of the data is still an important feature
> that needs to be taken into account when forming the basis of the 'resolution
> limit', but I'm absolutely willing to be shown that my bias is incorrect.

> Best regards, tom

> Tom Peat, PhD
> Proteins Group
> Biomedical Program, CSIRO
> 343 Royal Parade
> Parkville, VIC, 3052
> +613 9662 7304
> +614 57 539 419
> [email protected]

> ________________________________
> From: CCP4 bulletin board <[email protected]> on behalf of Petr Kolenko
> <[email protected]>
> Sent: Sunday, September 12, 2021 5:43 AM
> To: [email protected] <[email protected]>
> Subject: Re: [ccp4bb] criteria to set resolution limit

> Dear Farhan,
> Your dataset does not seem to be that critically anisotropic to me. But of
> course, try the STARANISO server and make your own decision.
> To me, the dataset seems to be collected with a suboptimal data strategy.
> Although I do not know your setup, I would make the crystal-to-detector
> distance shorter next time. Or maybe rotate the crystal a bit more? I do
> not know the details.
> And now, to the point of the resolution. The optimal approach is to try paired
> refinement or, even better, paired refinement with the complete
> cross-validation protocol. This can be done using the program PAIREF, which is
> easy to install into your CCP4 installation with the following commands:

> ccp4-python -m ensurepip --user
> ccp4-python -m pip install pairef --no-deps --upgrade --user

> The easiest way to use PAIREF is via the GUI. Use the following command:

> ccp4-python -m pairef --gui

> To know more about the program and about the protocol, please read further.
> The original work:
> https://journals.iucr.org/m/issues/2020/04/00/mf5044/index.html
> Upgrade for PHENIX users:
> https://scripts.iucr.org/cgi-bin/paper?S2053230X21006129

> We organized a webinar about PAIREF about half a year ago. We even made a
> video from it. The video covers a short introduction to paired refinement,
> installation of PAIREF, and running a test case.

> The link for the webinar is here:
> https://pairef.fjfi.cvut.cz/dokuwiki/doku.php?id=webinar_2021-03
> Direct link to the video:
> https://pairef.fjfi.cvut.cz/docs/pairef_poli_webinar/PAIREF_webinar_23Mar2021_.mp4

> I would not care about low data completeness in case when PAIREF shows
> improvement of your model. From my point of view, you have the ideal starting
> point. Start with the resolution of 1.8 Å and verify whether the higher
> shells improve your model. I hope you will be able to make the best decision,
> good luck! ;-) And do not hesitate to ask me for more details about PAIREF.
> Best regards,
> Petr

> ________________________________________
> From: CCP4 bulletin board <[email protected]> on behalf of Tushar R.
> <[email protected]>
> Sent: Saturday, September 11, 2021 6:46:32 PM
> To: [email protected]
> Subject: Re: [ccp4bb] criteria to set resolution limit

> Along with the paper mentioned by Rajiv, you could look at this paper as well
> which discusses a major shift in the understanding of data quality from
> I/sig(I) based to CC1/2 based indicators.

> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4684713/

> Hope this helps.

> All the best.

> Best,
> Tushar.

> On Sat, 11 Sep 2021, 09:36 Rajiv gandhi.s,
> <[email protected]> wrote:
> Dear Chang,
> One needs to set a resolution cutoff that keeps the data meaningful, without
> losing high-resolution data and while maintaining data integrity. Some key
> quality indicators in the outermost shell, such as I/sigma(I), CC1/2 and
> Rpim, need to be considered. What was the CC1/2 value in the outer shell?

> Please refer to the papers below.
> How good are my data and what is the resolution
> Assessing and maximizing data quality in macromolecular crystallograph

> On Sat, 11 Sep 2021, 9:52 pm Tao-Hsin Chang,
> <[email protected]> wrote:
> Hi Farhan,

> It looks like your diffraction data are anisotropic, which leads to the issues
> with resolution limit, intensity, and completeness. Check the STARANISO
> Server (https://staraniso.globalphasing.org/cgi-bin/staraniso.cgi). It may be
> useful for your case.

> Best wishes,
> Tao-Hsin

> On Sep 11, 2021, at 11:55 AM, Syed Farhan Ali
> <[email protected]> wrote:

> Dear All,

> I have a query regarding one of my datasets. I am running Aimless with the
> highest resolution set to 1.62 Å and getting I/SigI = 2, but data completeness
> is around 22% in the outermost shell. If I change the resolution cutoff to
> 1.8 Å, then I/SigI is 6.2 and completeness is 82.4%.
> I have attached the screenshot of the result.
> What should be the criteria to set the resolution limit? Should I stick to
> I/SigI, or do I have to consider the completeness of the data?
> And if completeness is also a guiding factor, then what is the minimum
> completeness I can keep in the highest-resolution shell?

> Regards,
> Farhan

> ________________________________

> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

> <Screenshot 2021-09-11 at 8.43.25 PM.png><screenshot1.6.tiff>


> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
> hosted by www.jiscmail.ac.uk, terms & conditions are available at
> https://www.jiscmail.ac.uk/policyandsecurity/


