Thanks to Petr and Ian for their thoughtful replies.

One worry I have is that as a community we continue to debate what is the 
'proper' or 'correct' way to measure resolution which I think is quite 
confusing to those that are early in their crystallographic (or general 
structural biology) careers. It is the scientific method to argue (sometimes ad 
nauseam) about what constitutes the best data, the best methods, etc., so this 
isn't really a surprise to those that have been around a while. But it might be 
nice to have a set of criteria which most people in the field agree to and then 
these are updated on a regular basis as we move forward as a field. Not that we 
will ever get 100% agreement to anything, as that is just unrealistic (and 
there are many posts to this BB already to show that). What I was thinking was 
a set of standards that are reasonable and that when broken (not all at once, 
but one standard at a time), one needs to explain why instead of just hoping 
that a reviewer (or other scientist looking at the data) just misses it. From 
various posts, it seems that people generally agree that CC1/2 is a good 
criterion, that Rpim and Rfree are pretty good criteria, that I/sigI is 
reasonable at some level and that completeness and multiplicity (or redundancy) 
are important as well. These are not all independent (Rpim clearly depends on 
the multiplicity/redundancy, etc) but having some kind of standard set of 
numbers to judge one's own data by as a first pass might be helpful (and I 
believe the original question was basically: what do I report as the 
resolution?).
Just to throw some numbers out as an example: CC1/2 at 0.3 (or 30% depending on 
your reporting style), I/sigI at 1.0, completeness at 75% in the last 
resolution bin and multiplicity/redundancy at least 3.0 throughout (and in the 
last shell). Nothing magical in these numbers, but if you feel that your data 
are really good but the completeness isn't there, you just explain why, or 
something to that effect. I believe this is one of the reasons we always have a 
table 1 in our publications and that there is no one number that really gives 
us that sense of assurance that the model and data are good (or in my own 
experience, good enough).
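
To make the "explain whichever standard you break" idea concrete, here is a 
minimal Python sketch. It is not an agreed standard: the thresholds are simply 
the example numbers above, and the names ShellStats and broken_criteria are 
made up for illustration. The two example shells use only the I/sigI and 
completeness values quoted in the original question of this thread; statistics 
that were not reported are left as None and skipped.

```python
from dataclasses import dataclass
from typing import Optional

# Example first-pass minimums for the outer shell (illustrative only,
# taken from the numbers floated in this post).
THRESHOLDS = {
    "cc_half": 0.3,        # CC1/2 as a fraction
    "i_over_sigma": 1.0,   # mean I/sigI
    "completeness": 75.0,  # percent
    "multiplicity": 3.0,
}


@dataclass
class ShellStats:
    """Outer-shell statistics; None means 'not reported' (hypothetical container)."""
    cc_half: Optional[float] = None
    i_over_sigma: Optional[float] = None
    completeness: Optional[float] = None
    multiplicity: Optional[float] = None


def broken_criteria(shell, thresholds=THRESHOLDS):
    """Return the names of reported statistics that fall below their minimum."""
    broken = []
    for name, minimum in thresholds.items():
        value = getattr(shell, name)
        if value is not None and value < minimum:
            broken.append(name)
    return broken


# Outer-shell values from the original question (CC1/2 and multiplicity not given).
cut_162 = ShellStats(i_over_sigma=2.0, completeness=22.0)
cut_180 = ShellStats(i_over_sigma=6.2, completeness=82.4)

print(broken_criteria(cut_162))  # ['completeness']
print(broken_criteria(cut_180))  # []
```

Anything returned by broken_criteria is not a reason to reject the data, only 
a prompt to explain that number in the paper.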

I guess what I am trying to 'solve' is the issue I come across regularly in 
reviewing papers: the authors are very interested in the biology of their 
system and spend a lot of time explaining what the system is, why it is 
important, etc (all great stuff) and then fill in the table with a set of 
numbers that makes me wonder why they believe their own models. Often I see 
very low completeness, low redundancy/multiplicity, and CC1/2 that varies from 
0.99+ to almost zero, all in order to make the reported resolution sound good 
(and crazy numbers of decimal places: a resolution reported as 1.39623 Å with 
15% completeness could more realistically be reported as 1.40 Å, or as 1.50 Å 
with 50% completeness, and I don't think the actual interpretation/electron 
density would change significantly). If it was then stated explicitly in the 
manuscript, for example, that paired refinement was done or that difference 
maps were calculated (or FEM or Polder or ?) at various resolutions which then 
showed the area of interest more clearly, the readers and reviewers might be 
more assured that the authors weren't just reporting a semi-random number as 
'the resolution'.  Numbers in the table that are clearly (?) a bit relaxed, if 
actually explained in the paper, would then make more sense. We as a community 
have gone somewhat in this direction with the validation criteria given for 
deposited structures, which is a start, but it hasn't really tackled the thorny 
question of 'what is my resolution?'

As Ian mentioned, some programs and some criteria depend on relatively high 
completeness in the data in the way they are calculated (CC1/2 is perfect when 
all data are set to zero). If a program 'fills in' data that are missing, then 
that one will also be subject to issues when the data are very incomplete. One 
can always call on people to 'get better data' and of course it would always be 
fantastic if each data set was complete, had high CC1/2 and multiplicity/ 
redundancy, but then this isn't very realistic either.

Thanks again for the considered replies to the previous post, and if this 
sounds like a rant, it probably is.

cheers, tom

Tom Peat, PhD
Proteins Group
Biomedical Program, CSIRO
343 Royal Parade
Parkville, VIC, 3052
+613 9662 7304
+614 57 539 419
tom.p...@csiro.au

________________________________
From: Petr Kolenko <petr.kole...@fjfi.cvut.cz>
Sent: Sunday, September 12, 2021 10:07 PM
To: Peat, Tom (Manufacturing, Clayton) <tom.p...@csiro.au>; 
CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
Subject: Re: [ccp4bb] criteria to set resolution limit

Dear Tom,
You are absolutely right in your points. But let me explain my opinion a bit 
more. And be aware that it is my opinion, not necessarily the truth! There may 
be other opinions in the community.
In paired refinement, you always have the reference data. In case of 
significantly decreasing completeness, you can always select your starting 
resolution that is complete enough (e.g. more than 90% ?). And this is your 
reference data. As an increase in resolution improves your model (drop in 
R-values, mainly R-free), you always compare your models using the reference 
data. Should we use as many observables as possible? I would do so. Even if the 
completeness was very low.
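
The comparison described above can be sketched roughly as follows. This is a 
simplified illustration of the decision logic only, not PAIREF's actual 
implementation: the function name choose_cutoff and the tolerance parameter are 
hypothetical, and the R-free values in the example are invented purely to show 
the behaviour. Each entry is a high-resolution cutoff together with the R-free 
of the model refined to that cutoff, always evaluated against the same 
reference data.

```python
def choose_cutoff(steps, tolerance=0.0):
    """Pick a high-resolution cutoff from paired-refinement style results.

    steps: list of (cutoff_in_angstrom, rfree_on_reference_data), ordered from
           the reference (lowest) resolution to the highest resolution tried.
    Returns the cutoff of the last step whose R-free on the reference data did
    not rise by more than `tolerance` relative to the previously accepted step.
    """
    best_cutoff, best_rfree = steps[0]
    for cutoff, rfree in steps[1:]:
        if rfree > best_rfree + tolerance:
            break  # adding this shell made the model worse; stop here
        best_cutoff, best_rfree = cutoff, rfree
    return best_cutoff


# Invented numbers for illustration only (not from any real dataset).
example = [(1.80, 0.2150), (1.74, 0.2141), (1.68, 0.2138), (1.62, 0.2149)]
print(choose_cutoff(example))  # 1.68
```

In this toy case the shells down to 1.68 lower R-free on the reference data, 
while the step to 1.62 raises it, so 1.68 would be the suggested cutoff.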
Another thing is the statement that your data are processed up to 1.1 Å when 
the completeness is as low as 2%. Of course. But this is why we have more 
cells in the so-called "Table 1". When judging the structure, one should go 
carefully through the whole table. And maybe, more resolution shells should be 
reported in extreme cases. There is a possibility to do so during the structure 
deposition. Here, I agree with you.
Low data completeness, whether random or systematic, is usually a big problem. 
My personal experience is that it causes severe instability in structure 
refinement. But this is frequently reflected in the R-values, and once any 
instability appears, paired refinement does not suggest using the higher 
resolution. As long as the refinement follows the right trends, you should be 
fine.
Thanks for pointing out the high data completeness in our paper. We should run 
more analyses to get ready for such comments. ;-)
Best regards,
Petr
________________________________________
From: Peat, Tom (Manufacturing, Clayton) <tom.p...@csiro.au>
Sent: Sunday, September 12, 2021 5:02:04 AM
To: CCP4BB@JISCMAIL.AC.UK; Petr Kolenko
Subject: Re: [ccp4bb] criteria to set resolution limit

Hello Petr,

I would like to understand more completely your assertion in the last email 
regarding completeness: "I would not care about low data completeness in case 
when PAIREF shows improvement of your model."
In the papers you gave links to, the data completeness was always 90+% even in 
the outer shells. In cases where this is not true, I'm not clear why 
completeness would not be important. In the ultimate thought experiment, or 
extreme case, where one has very few reflections at the resolution limit, just 
getting a 'better model' doesn't show me that the structure is now at 1.3 Å (or 
whatever limit one wants to set). Models with no data are perfect, in the 
physical sense of not having clashes, Ramachandran outliers, etc.
As an example, I am aware of a deposition in the PDB where the outer resolution 
shell was approximately 2% complete. I don't believe that the structure is 
really at the resolution stated: the features 'seen' in the electron density 
don't measure up to what I would expect, and the density looks more like about 
0.5 Å lower resolution, where the completeness is a bit better than 50%.
So my 'bias' is that completeness of the data is still an important feature 
that needs to be taken into account when forming the basis of 'resolution 
limit', but I'm absolutely willing to be shown that my bias is incorrect.

Best regards, tom


________________________________
From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Petr Kolenko 
<petr.kole...@fjfi.cvut.cz>
Sent: Sunday, September 12, 2021 5:43 AM
To: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
Subject: Re: [ccp4bb] criteria to set resolution limit

Dear Farhan,
Your dataset does not seem to be that critically anisotropic to me. But of 
course, try the STARANISO server and make your own decision.
To me, the dataset seems to have been collected with a suboptimal data 
strategy. Although I do not know your setup, I would make the 
crystal-to-detector distance shorter next time. Or maybe rotate the crystal a 
bit more? I do not know the details.
And now, to the point of the resolution. The optimal approach is to try paired 
refinement, or even better, paired refinement with the complete 
cross-validation protocol. This can be done using the program PAIREF, which is 
easy to install into your CCP4 installation with the following commands:

ccp4-python -m ensurepip --user
ccp4-python -m pip install pairef --no-deps --upgrade --user

The easiest way to use PAIREF is via the GUI. Use the following command:

ccp4-python -m pairef --gui

To know more about the program and about the protocol, please read further.
The original work: 
https://journals.iucr.org/m/issues/2020/04/00/mf5044/index.html
Upgrade for PHENIX users: 
https://scripts.iucr.org/cgi-bin/paper?S2053230X21006129

We organized a webinar about PAIREF about half a year ago, and we even made a 
video of it. The video covers a short introduction to paired refinement, 
installation of PAIREF, and running a test case.

The link for the webinar is here: 
https://pairef.fjfi.cvut.cz/dokuwiki/doku.php?id=webinar_2021-03
Direct link to the video: 
https://pairef.fjfi.cvut.cz/docs/pairef_poli_webinar/PAIREF_webinar_23Mar2021_.mp4

I would not care about low data completeness in case when PAIREF shows 
improvement of your model. From my point of view, you have the ideal starting 
point. Start with a resolution of 1.8 Å and verify whether the higher shells 
improve your model. I hope you will be able to make the best decision. Good 
luck! ;-) And do not hesitate to ask me for more details about PAIREF.
Best regards,
Petr


________________________________________
From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Tushar R. 
<rtusha...@gmail.com>
Sent: Saturday, September 11, 2021 6:46:32 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] criteria to set resolution limit

Along with the paper mentioned by Rajiv, you could look at this paper as well 
which discusses a major shift in the understanding of data quality from 
I/sig(I) based to CC1/2 based indicators.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4684713/

Hope this helps.

All the best.

Best,
Tushar.



On Sat, 11 Sep 2021, 09:36 Rajiv gandhi.s <raji....@gmail.com> wrote:
Dear Chang,
One needs to set a resolution cutoff that gives meaningful data, without losing 
high-resolution data, while keeping data integrity. Some key quality indicators, 
such as I/sigma(I), CC1/2 and Rpim, in the outermost shell need to be 
considered. What was the CC1/2 value in the outer shell?

Please refer to the paper below.
How good are my data and what is the resolution?
Assessing and maximizing data quality in macromolecular crystallography

On Sat, 11 Sep 2021, 9:52 pm Tao-Hsin Chang <taohsin.ch...@gmail.com> wrote:
Hi Farhan,

It looks like your diffraction data are anisotropic, which leads to the issues 
with resolution limit, intensity, and completeness. Check the STARANISO server 
(https://staraniso.globalphasing.org/cgi-bin/staraniso.cgi). 
It may be useful for your case.

Best wishes,
Tao-Hsin

On Sep 11, 2021, at 11:55 AM, Syed Farhan Ali <alifarhan...@gmail.com> wrote:

Dear All,

I have a query regarding one of my datasets. When I run AIMLESS with the 
high-resolution limit at 1.62 Å, I get I/SigI = 2, but the data completeness is 
only around 22% in the outermost shell. If I move the resolution cutoff to 
1.8 Å, then I/SigI is 6.2 and completeness is 82.4%.
I have attached the screenshot of the result.
What should be the criteria to set the resolution limit? Should I stick to 
I/SigI, or do I have to consider the completeness of the data?
And if completeness is also a guiding factor, then what minimum completeness 
can I accept in the highest-resolution shell?





Regards,
Farhan




________________________________

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

<Screenshot 2021-09-11 at 8.43.25 PM.png><screenshot1.6.tiff>


This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/
