Since the beginning of this thread, I've been looking for a suitable example. Apologies to the creators of 1LDK, but I think it is easy to be misled by this PDB entry (published in Nature).
I opened this file with Coot, and it immediately flagged some residues (including main chain, almost a whole chain) as having an occupancy of zero. Well done Coot: any possible problem was immediately obvious. However, other display programs I had to hand, such as RasMol and PyMOL, showed no problem. If you colour by B-factor (default settings), you can see that said domain is slightly red, but as this is a low-resolution structure (3.1 Å) you kind of expect that. But were I to try to infer biological information from the position of this chain, I would be misled: as the majority of it has been refined with occ=0, it just isn't there.

As the reflection file was not supplied, and no note was made either in the PDB remarks or the paper, we cannot make any judgements about why this was done. One needs to look at occupancy (and indeed symmetry: how many non-xtallers check packing?) to decide that there is clearly something awry with this structure.

We can come up with a 'right way' of doing things, and champion the use of EDS etc., but ultimately the responsibility lies with the crystallographer who deposits the structure (and the other authors on the paper). We are putting our data into the public domain and we must be as transparent and accountable as possible.

D

On 11/01/07, Eric Bennett <[EMAIL PROTECTED]> wrote:
Ethan Merritt wrote:

>Because it is not necessary to do so. Storing every H coordinate generated by the
>riding model adds no information to the PDB file that is not already present in the
>parsimonious description provided by the header record:
>REMARK 3 HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS

Hopefully that is tongue in cheek. If this is your reasoning, why build a PDB file at all? It contains no information that isn't already present in your original diffraction images plus a basic book on chemical bonding.

The point of building the PDB file is not that it adds information beyond what is contained in the raw diffraction data and an organic chemistry textbook. The point is to present that data in an easily interpreted format. Most non-crystallographer scientists will know what it means if they see hydrogens connected to carbons when they pull up a structure. If they see:

REMARK 3 HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS

they are not going to know what it means, even assuming they know to look for the REMARK 3 line in the PDB file in the first place. Does it mean that because the crystallographer was confident enough to use hydrogens during refinement, all oxygens and nitrogens have been conclusively assigned in asn and gln side chains? Does it mean that if you knew the definition of "riding positions" you could deduce the location of a serine hydroxyl hydrogen, the correct tautomer of a histidine, or the protonation state of a lysine?

Data has to be provided in a representation suitable for the target audience, not left in an obscure format just because converting it to a more easily digested form "adds no information". The target audience for protein structures should be larger than other protein crystallographers.

--
Eric Bennett
Assistant Director
Center for Drug Design
University of Minnesota
--
---------------------------------------
David Briggs, PhD.
Father & Crystallographer
www.dbriggs.talktalk.net
iChat AIM ID: DBassophile
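
For readers whose viewer does not flag it, the zero-occupancy check discussed in this message can be sketched in a few lines of Python. This is only a minimal sketch, not what Coot does internally: it assumes the standard fixed-column PDB layout (occupancy in columns 55-60 of ATOM/HETATM records), and the function names `zero_occupancy_atoms` and `zero_occupancy_by_chain` are made up for illustration.

```python
from collections import Counter


def zero_occupancy_atoms(pdb_lines):
    """Return (chain, res_name, res_seq, atom_name) for atoms with occupancy 0.

    Assumes fixed-column PDB records; other lines are ignored.
    """
    hits = []
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        try:
            occupancy = float(line[54:60])  # columns 55-60
            res_seq = int(line[22:26])      # columns 23-26
        except ValueError:
            continue  # truncated or malformed record: skip it
        if occupancy == 0.0:
            chain = line[21]                 # column 22
            res_name = line[17:20].strip()   # columns 18-20
            atom_name = line[12:16].strip()  # columns 13-16
            hits.append((chain, res_name, res_seq, atom_name))
    return hits


def zero_occupancy_by_chain(pdb_lines):
    """Count zero-occupancy atoms per chain ID."""
    return Counter(chain for chain, *_ in zero_occupancy_atoms(pdb_lines))
```

Passing an open coordinate file to `zero_occupancy_by_chain` gives a per-chain count, so a chain that is mostly or wholly at zero occupancy stands out before anyone reads biology into its position. It does not say why the occupancies were set that way; that still has to come from the depositors.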
