I too like the idea of reporting the table 1 stats vs resolution rather than just the overall values and highest resolution shell.
I also wanted to point out an earlier thread from April about the limitations of the PDB's defining the resolution as being that of the highest resolution reflection (even if data is incomplete or weak).

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1204&L=ccp4bb&D=0&1=ccp4bb&9=A&I=-3&J=on&d=No+Match%3BMatch%3BMatches&z=4&P=376289
https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1204&L=ccp4bb&D=0&1=ccp4bb&9=A&I=-3&J=on&d=No+Match%3BMatch%3BMatches&z=4&P=377673

What we have done in the past for cases of low completeness in the outer shell is to define the nominal resolution à la Bart Hazes' method (the resolution at which a complete data set would have the same number of reflections), use this in the PDB title, and describe it in the REMARK 3 "other refinement remarks". There is also the possibility of adding a comment to the PDB REMARK 2, which we have not used.

http://www.wwpdb.org/documentation/format33/remarks1.html#REMARK%202

This should help convince reviewers that you are not trying to misrepresent the resolution of the structure.

Regards,
Mitch

-----Original Message-----
From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Edward A. Berry
Sent: Friday, December 07, 2012 8:43 AM
To: [email protected]
Subject: Re: [ccp4bb] refining against weak data and Table I stats

Yes, well, actually I'm only a middle author on that paper for a good reason, but I did encourage Rebecca and Stephan to use all the data. But on a later, much more modest submission, where the outer shell was not only weak but very incomplete (edges of the detector), the reviewers found it difficult to evaluate the quality of the data (we had also excluded a zone with bad ice-ring problems). So we provided a second table, cutting off above the ice ring in the good strong data, which convinced them that at least it is a decent 2 A structure. In the PDB it is a 1.6 A structure, but there was a lot of good data between the ice ring and 1.6 A.
Bart Hazes (I think) suggested a statistic called "effective resolution", which is the resolution to which a complete dataset would have the number of reflections in your dataset, and we reported this, which came out to something like 1.75. I do like the idea of reporting in multiple shells, not just overall and highest shell, and the PDB accommodates this; it even has a GUI to enter it in the ADIT 2.0 software. It could also be used to report two different overall ranges, such as completeness for 25 to 1.6 A, which would be shocking in my case, and for 25 to 2.0 A, which would be more reassuring.

eab

Douglas Theobald wrote:
> Hi Ed,
>
> Thanks for the comments. So what do you recommend? Refine against weak
> data, and report all stats in a single Table I?
>
> Looking at your latest V-ATPase structure paper, it appears you favor
> something like that, since you report a high res shell with I/sigI=1.34 and
> Rsym=1.65.
>
>
> On Dec 6, 2012, at 7:24 PM, Edward A. Berry<[email protected]> wrote:
>
>> Another consideration here is your PDB deposition. If the reason for using
>> weak data is to get a better structure, presumably you are going to deposit
>> the structure using all the data. Then the statistics in the PDB file must
>> reflect the high resolution refinement.
>>
>> There are I think three places in the PDB file where the resolution is stated,
>> but I believe they are all required to be the same and to be equal to the
>> highest resolution data used (even if there were only two reflections in
>> that shell).
>> Rmerge or Rsymm must be reported, and until recently I think they were not
>> allowed to exceed 1.00 (100% error?).
>>
>> What are your reviewers going to think if the title of your paper is
>> "structure of protein A at 2.1 A resolution" but they check the PDB file
>> and the resolution was really 1.9 A? And Rsymm in the PDB is 0.99 but
>> in your table 1* says 1.3?
>>
>> Douglas Theobald wrote:
>>> Hello all,
>>>
>>> I've followed with interest the discussions here about how we should be
>>> refining against weak data, e.g. data with I/sigI << 2 (perhaps using all
>>> bins that have a "significant" CC1/2 per Karplus and Diederichs 2012).
>>> This all makes statistical sense to me, but now I am wondering how I should
>>> report data and model stats in Table I.
>>>
>>> Here's what I've come up with: report two Table I's. For comparability to
>>> legacy structure stats, report a "classic" Table I, where I call the
>>> resolution whatever bin I/sigI=2. Use that as my "high res" bin, with high
>>> res bin stats reported in parentheses after global stats. Then have
>>> another Table (maybe Table I* in supplementary material?) where I report
>>> stats for the whole dataset, including the weak data I used in refinement.
>>> In both tables report CC1/2 and Rmeas.
>>>
>>> This way, I don't redefine the (mostly) conventional usage of "resolution",
>>> my Table I can be compared to precedent, I report stats for all the data
>>> and for the model against all data, and I take advantage of the information
>>> in the weak data during refinement.
>>>
>>> Thoughts?
>>>
>>> Douglas
>>>
>>>
>>> ^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`
>>> Douglas L. Theobald
>>> Assistant Professor
>>> Department of Biochemistry
>>> Brandeis University
>>> Waltham, MA 02454-9110
>>>
>>> [email protected]
>>> http://theobald.brandeis.edu/
>>>
>>
>
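
For reference, the "effective resolution" discussed earlier in this thread (the resolution at which a 100% complete dataset would contain as many reflections as were actually measured) can be estimated with a short calculation. The sketch below is only an illustration, not the procedure Bart Hazes or the posters actually used. It assumes that the number of unique reflections out to resolution d scales as 1/d^3 (the volume of the reciprocal-space sphere) and it ignores the low-resolution cutoff. The function name `effective_resolution` and its parameters are made up for this example, and the numbers in the usage line are hypothetical.

```python
def effective_resolution(d_min, n_observed, n_complete):
    """Estimate the effective resolution in Angstrom.

    d_min      -- nominal high-resolution limit of the dataset (Angstrom)
    n_observed -- number of unique reflections actually measured to d_min
    n_complete -- number of unique reflections a complete dataset would
                  have to d_min

    Assumption (not from the thread): the reflection count to resolution d
    is proportional to 1/d**3, and the low-resolution cutoff is neglected.
    """
    if d_min <= 0:
        raise ValueError("d_min must be positive")
    if n_observed <= 0 or n_complete < n_observed:
        raise ValueError("need 0 < n_observed <= n_complete")

    completeness = n_observed / n_complete
    # N(d) ~ 1/d**3, so solving N_complete(d_eff) = n_observed gives
    # d_eff = d_min * completeness**(-1/3)
    return d_min * completeness ** (-1.0 / 3.0)


# Hypothetical example: a 2.0 A dataset that is only 50% complete overall
print(round(effective_resolution(2.0, 5000, 10000), 2))  # prints 2.52
```

Under the stated assumption the estimate depends only on the nominal limit and the overall completeness, so a dataset that is complete to its nominal limit returns that limit unchanged. A program that counts the reflections in a truly complete set for the actual cell and space group would give a more careful answer.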
