PS. A completely unimportant correction to my comment on the MolProbity output for 2HR0: every residue is indeed an outlier in at least one test, but in three cases it is only the CB-deviation test, not the other three tests that I mentioned.
George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582

On Sat, 18 Aug 2007, George M. Sheldrick wrote:

> There are good reasons for preserving frames, but most of all for the
> crystals that appeared to diffract but did not lead to a successful
> structure solution, publication, and PDB deposition. Maybe in the future
> there will be improved data processing software (for example to integrate
> non-merohedral twins) that will enable good structures to be obtained from
> such data. At the moment most such data is thrown away. However, forcing
> everyone to deposit their frames each time they deposit a structure with
> the PDB would be a thorough nuisance and major logistic hassle.
>
> It is also a complete illusion to believe that the reviewers for Nature
> etc. would process or even look at frames, even if they could download
> them with the manuscript.
>
> For small molecules, many journals require an 'ORTEP plot' to be submitted
> with the paper. As older readers who have experienced Dick Harlow's 'ORTEP
> of the year' competition at ACA Meetings will remember, even a viewer
> with little experience of small-molecule crystallography can see from the
> ORTEP plot within seconds if something is seriously wrong, and many
> non-crystallographic referees for e.g. the journal Inorganic Chemistry
> can even make a good guess as to what is wrong (e.g. wrong element assigned
> to an atom). It would be nice if we could find something similar for
> macromolecules that the author would have to submit with the paper. One
> immediate bonus is that the authors would look at it carefully
> themselves before submitting, which could lead to an improvement of the
> quality of structures being submitted. My suggestion is that the wwPDB
> might provide, say, a one-page diagnostic summary when they allocate each
> PDB ID that could be used for this purpose.
>
> A good first pass at this would be the output that the MolProbity server
> http://molprobity.biochem.duke.edu/ sends when it is given a PDB file. It
> starts with a few lines of summary in which bad things are marked red
> and the structure is assigned to a percentile: a percentile of 6% means
> that 93% of the structures in the PDB with a similar resolution are
> 'better' and 5% are 'worse'. This summary can be understood with very
> little crystallographic background and a similar summary can
> of course be produced for NMR structures. The summary is followed by
> diagnostics for each residue; normally, if the summary looks good, it
> would not be necessary for the editor or referee to look at the rest.
>
> Although this server was intended to help us to improve our structures
> rather than detect manipulated or fabricated data, I asked it for a
> report on 2HR0 to see what it would do (probably many other people were
> trying to do exactly the same; the server was slower than usual).
> Although the structure got poor marks on most tests, MolProbity
> generously assigned it overall to the 6th percentile; I suppose that
> this is about par for structures submitted to Nature (!). However there
> was one feature that was unlike anything I have ever seen before,
> although I have fed the MolProbity server with some pretty ropey PDB
> files in the past: EVERY residue, including EVERY WATER molecule, either
> made at least one bad contact or was a Ramachandran outlier or was a
> rotamer outlier (or more than one of these). This surely would ring
> all the alarm bells!
>
> So I would suggest that the wwPDB could coordinate, with the help of the
> validation experts, software to produce a short summary report that
> would be automatically provided in the same email that allocates the PDB
> ID.
> This email could make the strong recommendation that the report file
> be submitted with the publication, and maybe in the fullness of time
> even the Editors of high profile journals would require this report for
> the referees (or even read it themselves!). To gain acceptance for such
> a procedure the report would have to be short and comprehensible to
> non-crystallographers; the MolProbity summary is an excellent first
> pass in this respect, but (partially with a view to detecting
> manipulation of the data) a couple of tests could be added based on the
> data statistics as reported in the PDB file (or even better the
> reflection data, if submitted). Most of the necessary software already
> exists, much of it produced by regular readers of this bb; it just needs
> to be adapted so that the results can be digested by referees and
> editors with little or no crystallographic experience. And most important,
> a PDB ID should always be released only in combination with such a
> summary.
>
> George
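
For what it is worth, the "is every residue flagged by at least one test?" check discussed above is easy to automate once the per-residue diagnostics are available in tabular form. The following is only a minimal Python sketch under stated assumptions: the dictionary layout, the test names ("clash", "ramachandran", "rotamer", "cbeta"), the function name flagged_summary and the example residues are all hypothetical and chosen for illustration. They are not MolProbity's actual output format, and the example data are not taken from 2HR0.

```python
from typing import Dict, List, Set


def flagged_summary(failed: Dict[str, Set[str]]) -> dict:
    """Summarise hypothetical per-residue validation results.

    `failed` maps a residue label to the set of tests that residue failed.
    The labels and test names are illustrative assumptions only.
    """
    # Residues that passed every test (empty set of failures).
    clean: List[str] = [res for res, tests in failed.items() if not tests]
    # Residues whose only failure is the C-beta deviation test.
    cbeta_only: List[str] = [res for res, tests in failed.items() if tests == {"cbeta"}]
    return {
        "n_residues": len(failed),
        "n_clean": len(clean),
        # True only if there is at least one residue and none of them is clean.
        "all_flagged": len(failed) > 0 and not clean,
        "cbeta_only": sorted(cbeta_only),
    }


# Made-up example data, including one water molecule.
example = {
    "A 10 ALA": {"clash"},
    "A 11 SER": {"rotamer", "clash"},
    "A 12 LEU": {"cbeta"},
    "W 501 HOH": {"clash"},
}

summary = flagged_summary(example)
print(summary)
```

A one-page report could print "all_flagged" as a single red line, so that a referee with no crystallographic background would not need to read the per-residue listing at all.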