At the risk of derailing the discussion, could it be that the blob is actually an accumulation of many Fourier ripples (on top of bulk solvent, I guess)? The “chloride” seems to be about 3.5 Å away from a lot of atoms, with nothing closer. This suspicion is mostly based on intuition and on my experience that any nearly spherical cavity or nearly cylindrical crevice has a blob of difference density inside, which often proves very difficult to model. I have no hard data to back this up.
Cheers,
Jose.

================================
Jose Antonio Cuesta-Seijo, PhD
Carlsberg Laboratory
Gamle Carlsberg Vej 10
DK-1799 Copenhagen V
Denmark
Tlf +45 3327 5332
Email [email protected]
================================

From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.)
Sent: Thursday, January 22, 2015 10:26 AM
To: [email protected]
Subject: Re: [ccp4bb] chloride or water

After reading this exchange, I think at the core of the dispute is the question of what a structure model is really supposed to represent (a), and how to annotate/describe it (b).

ad (a) In general, and forgive me for not disclosing all caveats and fine-tuning (I leave this to GB), we are interested in the posterior likelihood (model likelihood). There are two terms to consider (yes, I know, I am omitting any normalization necessary for hypothesis testing etc.): this model likelihood would be proportional to the product of an evidence term (data likelihood) and an independent prior knowledge term. Imho the expressed opinions diverge primarily in the relative significance of the terms or normalization of the probabilities.

The evidence purists (and it seems that computationalists often mistake this for arrogance of the crystallographers) argue that if I can’t see/recognize it in ED or support it otherwise by direct experimental evidence, leave it out of the model (after all, X-ray structure models are supposed to be based on experimental evidence).

On the other hand, from prior knowledge (admittedly extracted from polluted databases like the PDB, and that is not meant as an insult but as a statement of fact) we do know something about what reasonably could be expected and could use it to the full extent of its statistical support. Both extremes are of course justifiable, but in practice not separable. E.g.
we use riding hydrogens without giving a second thought to the fact that we do not see them in (macro X-ray) ED, and they do improve models. On the other hand, we still put side chain atoms we do not ‘see’ in specific positions and hope that the B-factors increase to a point where the absence of any meaningful scattering contributions does not ruin our Holy R. That specific position is perhaps closer to ‘wild speculation’ than the probability that a chloride ion exists in that specific case. (I do argue that in the above case a set of conformations with occupancies of rotamers corresponding to their population in the torsion angle landscape (or in the polluted databases) – the prior – under consideration of where they cannot be – the rest of the model as obtained from evidence – would be a possible description.)

The final weighting one could apply might be a less tangible factor: how badly does it matter? If a ligand in a specific pose is modelled and intended for use in drug discovery, I’d say the claim is extraordinarily strong, and the model likelihood (both terms) had better be convincing. In the less earth-shaking blob case, considering priors and the mentioned restrictions of low resolution etc., I can accept a low but not unreasonable probability (<- such apparent evasiveness being a dead giveaway of a mental Bayes factor calculation instead of adherence to an artificial significance level; frequentists please feel free to flame me) for Cl as the most probable in the Cl/water/empty model competition (not that any of the models is overly convincing, which is, however, compatible with the low drama factor of that decision).

ad (b) Having said this, how to express such probabilistic considerations in the current atomic PDB model format is an unresolved issue. I think the whole idea of the single static atomic model sooner or later will fall.
It is already a mess because much information about the model is hidden, for example in remarks like TLS groups (btw, one of the most abused and ad-hoc applied means in the hope of reducing Holy R instead of reflecting what these groups actually mean). But this is beside the original point and becoming free floating…

I am not calling for making peace here, but rather arguing that the seemingly insignificant issue of a single Cl ion in one of 100k structure models can lead to productive reflection about the meaning and improvement of model description.

Sorry for offending those in need of cozy comfort closing quotes. The answer is, as always, 42.

HTC, BR
(Happy To Confuse)

From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Keller, Jacob
Sent: Wednesday, January 21, 2015 19:18
To: [email protected]
Subject: Re: [ccp4bb] chloride or water

I reiterate that assigning a chloride is not “wild speculation” or “just making something up” in light of what we know about the situation. I see your point about not knowing that it’s a chloride, but I think you would agree that it is certainly more likely a chloride than map noise, and perhaps more likely than water as well. Would you agree that chloride is the best guess, at least? What are the options for that blob, and what is the probability of each?

I think you want to make sure people don’t get misled by it, which is a good point and a noble aspiration. I would argue that “not choosing” is here, as everywhere else, indeed choosing. And if you choose nothing here, you are almost certainly wrong, given the data. Some might be surprised or misled that a cavity like that would be totally empty, and the map density is unequivocal evidence that something is indeed there. So what now? Maybe a solution would be dummy atoms, maybe call them agnostons (agn) or something?
Perhaps this is a basic disagreement about what a protein structure model represents, with one opinion being “that which we can rely upon confidently” and the other being “that which is most likely considering the data.” Each has advantages depending on one’s goals. Since the PDB is certainly tainted by structures modeled in accordance with the “most likely” outlook, one now has to be cautious about all structures. I prefer the latter since I would try to be responsible/skeptical enough to go back to the original crystal data before drawing important conclusions from a structure. Maybe there should be a disclaimer printed in each PDB file…

JPK

From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Nat Echols
Sent: Wednesday, January 21, 2015 12:33 PM
To: [email protected]
Subject: Re: [ccp4bb] chloride or water

On Wed, Jan 21, 2015 at 9:05 AM, Keller, Jacob <[email protected]> wrote:

Not sure why there is this level of suspicion about the poor halide when waters generally get assigned so haphazardly. I would say that there are probably more “wrong” waters in the PDB than wrong chlorides, but there’s not much fuss about that.

Great, so leave it empty instead of just making something up. Perhaps future generations will figure out a more rigorous and quantitative method for handling such features than guessing based on screenshots posted to a mailing list.

At this resolution water placement is difficult to justify anyway - and since neither the scattering properties nor the coordination distances are especially accurate, trying to assign chemical identity in the absence of any supporting information (for example anomalous data) is especially futile. (Although at least in this case the resolution is an obvious red flag - to a crystallographer, anyway - indicating that any lighter ions shouldn't be taken very seriously. Other biologists, of course, may be more trusting.)

-Nat
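
The weighting described in Bernhard Rupp's reply, a model likelihood proportional to an evidence term (data likelihood) times an independent prior term, can be sketched for the Cl/water/empty competition roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical and every number is a made-up placeholder, not a value derived from the map, the PDB, or anything said in the thread.

```python
# Sketch of "posterior ∝ data likelihood × prior" for three competing models.
# All numbers below are hypothetical placeholders chosen for illustration.

def posterior(likelihood, prior):
    """Return normalized posterior weights for each competing model."""
    unnormalized = {m: likelihood[m] * prior[m] for m in likelihood}
    total = sum(unnormalized.values())
    return {m: w / total for m, w in unnormalized.items()}


def bayes_factor(likelihood, model_a, model_b):
    """Ratio of evidence terms: how much the data alone favor a over b."""
    return likelihood[model_a] / likelihood[model_b]


# Evidence term: how well each model explains the density (assumed values).
likelihood = {"Cl": 0.5, "water": 0.25, "empty": 0.02}

# Prior term: how plausible each model is before looking at the map
# (assumed values; in practice this would come from database statistics).
prior = {"Cl": 0.3, "water": 0.5, "empty": 0.2}

post = posterior(likelihood, prior)
best = max(post, key=post.get)

for model, weight in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{model:6s} posterior weight {weight:.3f}")
print("most probable model:", best)
print("Bayes factor Cl vs water:", bayes_factor(likelihood, "Cl", "water"))
```

With these placeholder inputs the chloride comes out as the most probable of the three without being overwhelmingly so, which is the kind of "low but not unreasonable" outcome the reply describes. Shifting weight between the evidence and prior terms changes the ranking, which is where the opinions in the thread diverge.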
