Clearly there are strong feelings held by the advocates of the several solutions to the problem of what to do about atoms that cannot be reliably placed based on the electron density map. I certainly understand since I passionately support my own favorite solution.
Why is it that a community of generally reasonable people keeps coming back to this same issue and yet fails to find a solution that can reach some kind of consensus? My 2 cents on this, more fundamental, issue:

A model created by someone who believes that all atoms (for a residue with any atoms) must be built will contain two kinds of atoms: those placed based on the appearance of the electron density, and those placed in some convenient location simply to fill out the atom count. I think most everyone agrees that a full residue is a convenience for some users of our models. What those of us who favor partial models want is an absolutely clear distinction between the two classes of atoms. All this needs is a bit. Literally, one bit of data that flags those atoms added to the model simply to complete the set.

Why can't we come to a solution that satisfies? Because we continue to use a non-extensible file format that does not allow us a place to put this bit. Some people want to put the bit in the occupancy column by defining a special value (occ=0) that would be the flag. Some people want to put it in the B factor column by defining a special value there (a couple of possibilities here: B=1000.00, B=500.00, or B varying but larger than that of any atom built into density).

The B factor and occupancy columns in the PDB file were precisely defined back when the mmCIF dictionary was created, and to change their definitions now would require opening that process again. I am pretty sure that the committee in charge will never allow a definition for these items that includes the phrase "... except when the value is equal to ...". You can't run a database that way. Each piece of information has to have its own tag and definition. Once it is defined we can embrace the task of educating software developers, and our collaborators who use our models, in its meaning.

There is just no place to put this bit in a PDB format file. mmCIF - it's trivial. PDB format - no way.
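To make the "one bit" argument concrete, here is a minimal Python sketch. It is not taken from any PDB or refinement software. It assumes the standard fixed-column ATOM layout (x, y, z in columns 31-54, occupancy in 55-60, B factor in 61-66) and a deliberately simplified mmCIF loop reader that only handles whitespace-separated, unquoted values. The tag `_atom_site.hypothetical_not_in_density` is invented for illustration and is not part of any dictionary; the coordinates are made-up example values.

```python
def format_atom_record(serial, name, res_name, chain, res_seq,
                       x, y, z, occ, b, element):
    """Write one ATOM record using the standard fixed-column fields.
    Each value's meaning comes only from its column position."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}          {element:>2}"
    )


def parse_atom_record(line):
    """Read the standard fields back by column position (0-based slices)."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),   # columns 55-60
        "b_factor": float(line[60:66]),    # columns 61-66
    }


def parse_atom_site_loop(text):
    """Simplified mmCIF loop reader (assumption: no quoted or multi-line
    values). Columns are named by tags, so a new tag adds a new column."""
    tags, rows = [], []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_atom_site."):
            tags.append(line.split(".", 1)[1])
        else:
            rows.append(dict(zip(tags, line.split())))
    return rows


# The last tag is HYPOTHETICAL: it stands in for the "one bit".
MMCIF_EXAMPLE = """
loop_
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.occupancy
_atom_site.B_iso_or_equiv
_atom_site.hypothetical_not_in_density
CB LYS 11.104 6.134 -6.504 1.00 35.20 no
NZ LYS 14.902 7.311 -9.870 1.00 35.20 yes
"""

pdb_line = format_atom_record(1, " CB ", "LYS", "A", 42,
                              11.104, 6.134, -6.504, 1.00, 35.20, "C")
pdb_atom = parse_atom_record(pdb_line)
cif_atoms = parse_atom_site_loop(MMCIF_EXAMPLE)
```

In the PDB record the only way to carry the flag is to overload `occupancy` or `b_factor` with a special value, which is exactly the redefinition argued against above. In the tagged loop the flag rides along in its own column and leaves the occupancy and B factor values untouched.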
As long as we insist that this format is the preferred means of distributing our models we will continue to return to this argument again and again with no possibility of coming to a solution.

Dale Tronrud

P.S. I've even thought about using the model of the "REMARK" statement, where all sorts of information have been added by the hack of "standardized remarks". I thought that one could create a "standardized footnote" that would mark the atoms as "imaginary". I found that, unfortunately, footnotes were removed from the PDB format many years ago.

On 4/3/2011 11:01 AM, Boaz Shaanan wrote:
The original posting that started this thread referred to side-chains, as the subject still suggests. Do you propose to omit only side-chain atoms, in which case you end up with different residues, as pointed out by quite a few people, or do you suggest also omitting the main-chain atoms of the problematic residues?

Besides, as mentioned by Phoebe and others, many users (non-crystallographers) of PDBs already know the meaning of the B-factor and will know how to interpret a very high B. It is our task (the crystallographers) to enlighten those who don't know what the B column in a PDB entry stands for. I certainly do and I'm sure many of us do so too. I voted for high B and would vote for it again, if asked.

Cheers,
Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710

________________________________________
From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron density, they should NOT be included, and let someone else try to model them in, but they should be aware that they are modeling them.

Joel L. Sussman

Concur. BMC p 680 ‘How to handle missing parts’

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both for graphics and for other analysis of the structure, could solve the problem for most users. But nobody knows what other software is out there being used by individuals or small groups.
And the more remote the authors of that software are from protein structure solution the more likely it is that they have not/will not properly handle atoms with zero occupancy or high B values, for example. I am absolutely positive that there is software that does its voodoo on ATOM/HETATM records and pays absolutely no attention to anything beyond the x, y, z coordinates (i.e. beyond column 54).

Frances Bernstein

=====================================================
Bernstein + Sons
Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
Frances C. Bernstein
f...@bernstein-plus-sons.com
1-631-286-1339 FAX: 1-631-286-1999
=====================================================

On Sat, 2 Apr 2011, Jacob Keller wrote:

I guess I missed it in the flurry of replies to this thread over the last few days, but what exactly is so terrible about keeping the atoms (since you have chemical evidence from protein sequence that they are there, and even if there is X-ray damage they were originally there and are likely still there in a subset of the molecules), but changing occupancy to zero as an acknowledgment that your data does not provide evidence to support a specific atomic position for these atoms?

Some users might pull up the structure, see those atoms, and think their positions were based on data, which they were not, and then draw conclusions based on them. I agree that occ=0 is tantamount to the suggestion you queried, however.

A somewhat key question might be: across the various molecular visualization programs, what is the default way to handle atoms with occ=0? Perhaps those programs might be the best place to fix the problem...
JPK

*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: j-kell...@northwestern.edu
*******************************************
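The two software behaviours discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats occ=0 as the marker for atoms placed without density (one proposal in the thread, not an agreed convention), the helper names are invented here, and the three LYS atoms are made-up test data rather than anything from a deposited entry.

```python
def make_atom_line(serial, name, res_seq, x, y, z, occ, b):
    """Build a fixed-column ATOM record for a LYS in chain A
    (invented example data; columns follow the standard layout)."""
    return (
        f"ATOM  {serial:>5} {name:<4} LYS A{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}"
    )


def is_atom(line):
    """True for coordinate records."""
    return line.startswith(("ATOM", "HETATM"))


def xyz(line):
    """x, y, z live in columns 31-54 (0-based slices 30:54)."""
    return (float(line[30:38]), float(line[38:46]), float(line[46:54]))


def naive_coords(lines):
    """A tool that stops reading at column 54: every atom looks
    equally real, whatever its occupancy or B factor says."""
    return [xyz(l) for l in lines if is_atom(l)]


def observed_coords(lines):
    """A tool that honours the occ=0 convention (an assumption here):
    atoms with zero occupancy are left out of the analysis."""
    return [xyz(l) for l in lines
            if is_atom(l) and float(l[54:60]) > 0.0]


MODEL = [
    make_atom_line(1, " CA ", 42, 10.000, 5.000, -5.000, 1.00, 30.00),
    make_atom_line(2, " CB ", 42, 11.104, 6.134, -6.504, 1.00, 35.20),
    # Side-chain tip placed only to complete the residue, flagged occ=0.
    make_atom_line(3, " NZ ", 42, 14.902, 7.311, -9.870, 0.00, 35.20),
]

all_atoms = naive_coords(MODEL)
seen_atoms = observed_coords(MODEL)
```

Here `all_atoms` includes the NZ position while `seen_atoms` does not, so any distance or contact computed from the two lists can differ. Which behaviour a given program has is exactly what cannot be known for the software outside the major packages.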