Re: [ccp4bb] chloride or water
After reading this exchange, I think at the core of the dispute is the question of what a structure model really is supposed to represent (a), and how to annotate/describe it (b).

ad (a) In general, and forgive me for not disclosing all caveats and fine-tuning (I leave this to GB), we are interested in the posterior likelihood (model likelihood). Considering the two terms (yes, I know, I am omitting any normalization necessary for hypothesis testing etc.), this model likelihood would be proportional to the product of an evidence term (data likelihood) and an independent prior knowledge term. Imho the expressed opinions diverge primarily in the relative significance of the terms or normalization of the probabilities.

The evidence purists (and it seems that computationalists often mistake this for arrogance of the crystallographers) argue that if I can’t see/recognize it in ED or support it otherwise by direct experimental evidence, leave it out of the model (after all, X-ray structure models are supposed to be based on experimental evidence).

On the other hand, from prior knowledge (admittedly extracted from polluted databases like the PDB, and that is not meant as an insult but as a statement of fact) we do know something about what reasonably could be expected and could use it to the full extent of its statistical support.

Both extremes are of course justifiable, but in practice not separable. E.g. we use riding hydrogens without giving a second thought to the fact that we do not see them in (macro X-ray) ED, and they do improve models. On the other hand, we still put side chain atoms we do not ‘see’ in specific positions and hope that the B-factors increase to a point where the absence of any meaningful scattering contributions does not ruin our Holy R. That specific position is perhaps closer to ‘wild speculation’ than the probability that a chloride ion exists in that specific case.
(I do argue that in the above case a set of conformations with occupancies of rotamers corresponding to their population in the torsion angle landscape (or in the polluted databases) – the prior – under consideration of where they cannot be – the rest of the model as obtained from evidence – would be a possible description.)

The final weighting one could apply might be a less tangible factor – how badly does it matter? If a ligand in a specific pose is modelled and intended for use in drug discovery, I’d say the claim is extraordinarily strong, and the model likelihood (both terms) had better be convincing. In the less earth-shaking blob case, considering priors and the mentioned restrictions of low resolution etc., I can accept a low but not unreasonable probability (such apparent evasiveness being a dead giveaway of a mental Bayes factor calculation instead of adherence to an artificial significance level; frequentists please feel free to flame me) for Cl as the most probable in the Cl/water/empty model competition (not that any of the models are overly convincing, which is, however, compatible with the low drama factor of that decision).

ad (b) Having said this, how to express such probabilistic considerations in the current atomic PDB model format is an unresolved issue. I think the whole idea of the single static atomic model will sooner or later fall. It is already a mess because much information about the model is hidden, for example in remarks like TLS groups (btw, one of the most abused and ad-hoc applied means in the hope of reducing Holy R instead of reflecting what these groups actually mean). But this is beside the original point and becoming free floating…

I am not calling for making peace here, but rather argue that the seemingly insignificant issue of a single Cl ion in one of 100k structure models can lead to productive reflection about the meaning and improvement of model description. Sorry for offending those in need of cozy comfort closing quotes.
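The proportionality mentioned under (a), model likelihood ∝ data likelihood × prior, can be made concrete for the Cl/water/empty competition with a few lines of Python. This is only a sketch: the function name `posterior` and every number below are invented for illustration and are not derived from any real map, refinement, or database.

```python
def posterior(likelihoods, priors):
    """Normalise likelihood * prior over a finite set of competing models."""
    unnormalised = {m: likelihoods[m] * priors[m] for m in likelihoods}
    total = sum(unnormalised.values())
    return {m: v / total for m, v in unnormalised.items()}


# HYPOTHETICAL numbers, chosen only to illustrate the arithmetic.
# Evidence term: how well each model explains the density at the blob.
likelihoods = {"Cl": 0.6, "water": 0.3, "empty": 0.1}
# Prior term: how often each case would be expected before looking at the map.
priors = {"Cl": 0.3, "water": 0.5, "empty": 0.2}

post = posterior(likelihoods, priors)

# Bayes factor Cl vs. water: ratio of the evidence terms alone.
bayes_factor_cl_vs_water = likelihoods["Cl"] / likelihoods["water"]

for model, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{model:6s} posterior = {p:.3f}")
print(f"Bayes factor Cl/water = {bayes_factor_cl_vs_water:.2f}")
```

With these made-up inputs Cl comes out as the most probable of the three without being overwhelmingly so, and changing either the evidence term or the prior can reverse the ranking, which is the point of contention in this thread.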
The answer is, as always, 42.

HTC, BR (Happy To Confuse)

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Keller, Jacob
Sent: Wednesday, 21 January 2015 19:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] chloride or water

I reiterate that assigning a chloride is not “wild speculation” or “just making something up” in light of what we know about the situation. I see your point about not knowing that it’s a chloride, but I think you would agree that it is certainly more likely a chloride than map-noise, and perhaps more likely than water as well. Would you agree that chloride is the best guess, at least? What are the options for that blob, and what is the probability of each?

I think you want to make sure people don’t get misled by it, which is a good point and a noble aspiration. I would argue that “not choosing” is here, as everywhere else, indeed choosing. And if you choose nothing here, you are almost certainly wrong, given
Re: [ccp4bb] chloride or water
At the risk of derailing the discussion, can it be that the blob is actually an accumulation of many Fourier ripples? (on top of bulk solvent, I guess). The “chloride” seems to be about 3.5Å away from a lot of atoms, with nothing closer.

This is mostly based on intuition and the fact that in my experience any almost spherical cavity or any almost cylindrical crevice has a blob of difference density inside, which often proves to be very difficult to model. I have no hard data to back this up.

Cheers,
Jose.

Jose Antonio Cuesta-Seijo, PhD
Carlsberg Laboratory
Gamle Carlsberg Vej 10
DK-1799 Copenhagen V
Denmark
Tlf +45 3327 5332
Email josea.cuesta.se...@carlsberglab.dk

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.)
Sent: Thursday, January 22, 2015 10:26 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] chloride or water
[ccp4bb] Postdoc position for long-wavelength MX at Diamond Light Source
Dear all,

Just before Christmas we were able to collect first protein crystal data at the long-wavelength MX beamline at Diamond (as you might have seen in my presentation at the CCP4 Study Weekend). The combination of the in-vacuum sample environment and the single-photon counting detector (in-vacuum P12M) proves to work and produces data with impressive signal-to-noise ratios. We are now in the process of optimising the beamline towards user operation.

If you want to join our team and take part in exploring unknown territory using this unique instrument, please get in touch or follow the links below. We intend to shortlist for a first round of interviews in about two weeks (9th of Feb).

Best regards,
Armin

From: Armin Wagner [mailto:armin.wag...@diamond.ac.uk]
Sent: 30 October 2014 20:23
To: ccp4bb
Subject: [ccp4bb] Postdoc position for long-wavelength MX at Diamond Light Source

Postdoctoral Research Associate - Beamline I23

Beamline I23 at Diamond Light Source will be a unique facility for in-vacuum long-wavelength macromolecular crystallography (MX). After more than five years in design, construction and commissioning, we have recently collected first diffraction data and are expecting first user experiments before the end of this year.

We are looking for a highly motivated Post Doc to take part in the very first experiments at long wavelengths, helping to establish and test new data collection strategies for anomalous data collections from native proteins or around the M edges for large macromolecular complexes.

For more details, please see:
http://www.diamond.ac.uk/Careers/Vacancies/All/DIA0968_CH.html
and
http://www.diamond.ac.uk/Beamlines/Mx/I23.html

If you have any questions concerning the position, please do not hesitate to get in touch.

Best regards,
Armin
[ccp4bb] CCP4-6.5 Update 002
Dear CCP4 Users

An update for the CCP4-6.5 series has just been released, consisting of the following changes:

* crank2 (all)
  - Several bug fixes, mainly related to the graphical interface and Crunch2
* clipper-progs (all)
  - ctruncate: fix to issue with estimation of alpha in Murray-Rust plot
* phaser (all)
  - version correction, 2.5.7
* blend (all)
  - Fixed a bug in keywords management; now accepting both uppercase and lowercase. Fixed a bug for the calculation of aLCV.
* privateer (all)
  - Minor bugfix
* monomer (all)
  - Several bug-fixes and addition of new compounds
* ccp4-progs (all)
  - Zanuda: adjustments to changes in refmac output pdb-file; bypass for P1 refinement failure
* ccp4 (all)
  - MrSHELXE: temporarily switched off pdb- and mtz-files crosscheck
  - Fix for handling Protein/DNA complex sequences in Matthews interface
* documentation (all)
  - minor corrections and updates
* mrbump (all)
  - Fix for sftools path length problem and Mac fix for MAFFT

Please report any bugs to c...@stfc.ac.uk.

Many thanks for using CCP4.

The CCP4 Core Team
[ccp4bb] Continuous-Single Versus Coarse-Multiple Sampling
Dear Crystallographers,

This is more general than crystallography, but has applications therein, particularly in understanding fine phi-slicing.

The general question is: Given that one needs to collect data to fit parameters for a known function, and given a limited total number of measurements, is it generally better to measure a small group of points multiple times or to distribute each individual measurement over the measurable extent of the function?

I have a strong intuition that it is the latter, but all errors being equal, it would seem prima facie that both are equivalent. For example, a line (y = mx + b) can be fit from two points. One could either measure the line at two points A and B five times each for a total of 10 independent measurements, or measure ten points evenly spaced from A to B. Are these equivalent in terms of fitting and information content or not? Which is better? Again, conjecture and intuition suggest the evenly spaced experiment is better, but I cannot yet formulate or prove to myself why.

The application of this to crystallography might be another reason that fine phi-slicing (0.1 degrees * 3600 frames) is better than coarse (1 degree * 3600 frames), even though the number of times one measures reflections is tenfold higher in the second case (assuming no radiation damage). In the first case, one never measures the same phi angle twice, but one does have multiple measurements in a sense, i.e., of different parts of the same reflection. Yes, 3D profile-fitting may be a big reason fine phi-slicing works, but beyond that, perhaps this sampling choice plays a role as well. Or maybe the profile-fitting works so well precisely because of this diffuse-single type of sampling rather than coarse-multiple sampling?

This general math/science concept must have been discussed somewhere--can anyone point to where?
JPK

***
Jacob Pearson Keller, PhD
Looger Lab/HHMI Janelia Research Campus
19700 Helix Dr, Ashburn, VA 20147
email: kell...@janelia.hhmi.org
***
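For the straight-line example in the message above, the two designs can be compared directly, because for ordinary least squares with independent, equal-variance errors the variance of the fitted slope is sigma^2 / sum((x_i - mean(x))^2). The sketch below is a minimal illustration under exactly those assumptions (the model really is a line, noise is iid); the helper name `slope_variance` and the choice A = 0, B = 1 are made up for the example.

```python
import numpy as np


def slope_variance(x, sigma=1.0):
    """Variance of the OLS slope for y = m*x + b with iid noise of std sigma."""
    x = np.asarray(x, dtype=float)
    return sigma**2 / np.sum((x - x.mean()) ** 2)


A, B = 0.0, 1.0  # arbitrary interval, assumed for illustration

# Design 1: five measurements at A and five at B (10 in total).
endpoints = np.repeat([A, B], 5)
# Design 2: ten measurements evenly spaced from A to B.
even = np.linspace(A, B, 10)

var_end = slope_variance(endpoints)
var_even = slope_variance(even)

print(f"slope variance, 5x at A and 5x at B : {var_end:.4f}")
print(f"slope variance, 10 evenly spaced    : {var_even:.4f}")
print(f"ratio even / endpoints              : {var_even / var_end:.4f}")
```

Under these assumptions the repeated-endpoint design gives the smaller slope variance, so the two designs are not equivalent, and for a model known to be exactly linear the intuition favouring even spacing does not hold. The evenly spaced design has a different advantage: two x-values cannot reveal any departure from linearity, whereas ten can. A reflection profile is not a straight line, so this toy result should not be read as a statement about fine versus coarse phi-slicing.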