Re: [ccp4bb] chloride or water

2015-01-22 Thread Bernhard Rupp (Hofkristallrat a.D.)
After reading this exchange, I think the core of the dispute is the question
of what a structure model is really supposed to represent (a), and how to
annotate/describe it (b).

ad (a) In general, and forgive me for not disclosing all caveats and
fine-tuning (I leave this to GB), we are interested in the posterior
likelihood (model likelihood). Considering just two terms (yes, I know, I am
omitting any normalization necessary for hypothesis testing etc.), this model
likelihood would be proportional to the product of an evidence term (data
likelihood) and an independent prior knowledge term.
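
To make the proportionality concrete, here is a minimal sketch of how the two terms combine for a closed set of competing models such as Cl/water/empty. The function name `posterior` and all numbers are placeholders chosen purely for illustration; they are not derived from any map, refinement, or database, and real prior and evidence terms would have to be estimated separately.

```python
def posterior(priors, likelihoods):
    """Normalise prior * data likelihood over a closed set of competing models."""
    unnormalised = {m: priors[m] * likelihoods[m] for m in priors}
    total = sum(unnormalised.values())
    return {m: value / total for m, value in unnormalised.items()}


# Placeholder values only (assumptions for illustration, not measured data).
priors = {"Cl": 0.3, "water": 0.6, "empty": 0.1}        # prior knowledge term
likelihoods = {"Cl": 0.5, "water": 0.2, "empty": 0.05}  # evidence term

post = posterior(priors, likelihoods)

# The Bayes factor compares the evidence terms of two models directly;
# posterior odds = Bayes factor * prior odds.
bayes_factor_cl_vs_water = likelihoods["Cl"] / likelihoods["water"]

for model, p in sorted(post.items(), key=lambda item: -item[1]):
    print(f"{model:6s} posterior = {p:.3f}")
print(f"Bayes factor Cl vs water = {bayes_factor_cl_vs_water:.2f}")
```

With these made-up inputs the prior favours water while the evidence term favours Cl, which is exactly the kind of weighting the opinions in this thread disagree about.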

 

Imho the expressed opinions diverge primarily in the relative significance of
the terms or the normalization of the probabilities. The evidence purists (and
it seems that computationalists often mistake this for arrogance of the
crystallographers) argue that if I can’t see/recognize it in ED or support it
otherwise by direct experimental evidence, I should leave it out of the model
(after all, X-ray structure models are supposed to be based on experimental
evidence).

On the other hand, from prior knowledge (admittedly extracted from polluted
databases like the PDB, and that is not meant as an insult but as a statement
of fact) we do know something about what could reasonably be expected, and
could use it to the full extent of its statistical support.

Both extremes are of course justifiable, but in practice not separable. E.g. we
use riding hydrogens without giving a second thought to the fact that we do not
see them in (macro X-ray) ED, and they do improve models. On the other hand, we
still put side chain atoms we do not ‘see’ in specific positions and hope that
the B-factors increase to a point where the absence of any meaningful
scattering contributions does not ruin our Holy R. That specific position is
perhaps closer to ‘wild speculation’ than the claim that a chloride ion exists
in that specific case.

(I do argue that in the above case a possible description would be a set of
rotamer conformations with occupancies corresponding to their population in
the torsion angle landscape (or in the polluted databases) – the prior –
taking into consideration where they cannot be – the rest of the model as
obtained from evidence.)

The final weighting one could apply might be a less tangible factor – how badly
does it matter? If a ligand in a specific pose is modelled and intended for use
in drug discovery, I’d say the claim is extraordinarily strong, and the model
likelihood (both terms) had better be convincing. In the less earth-shaking
blob case, considering priors and the mentioned restrictions of low resolution
etc., I can accept a low but not unreasonable probability (such apparent
evasiveness being a dead giveaway of a mental Bayes factor calculation instead
of adherence to an artificial significance level; frequentists please feel free
to flame me) for Cl as the most probable in the Cl/water/empty model
competition (not that any of the models is overly convincing, which is,
however, compatible with the low drama factor of that decision).

ad (b) Having said this, how to express such probabilistic considerations in
the current atomic PDB model format is an unresolved issue. I think the whole
idea of the single static atomic model will fall sooner or later. It is already
a mess because much information about the model is hidden, for example in
remarks like TLS groups (btw, one of the most abused and ad hoc applied means,
used in the hope of reducing Holy R instead of reflecting what these groups
actually mean). But this is beside the original point and becoming free
floating…

I am not calling for making peace here, but rather arguing that the seemingly
insignificant issue of a single Cl ion in one of 100k structure models can lead
to productive reflection about the meaning and improvement of model
description.
Sorry for offending those in need of cozy comfort closing quotes.

The answer is, as always, 42.

HTC, BR

(Happy To Confuse)

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Keller, Jacob
Sent: Wednesday, 21 January 2015 19:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] chloride or water

I reiterate that assigning a chloride is not “wild speculation” or “just making 
something up” in light of what we know about the situation.

 

I see your point about not knowing that it’s a chloride, but I think you would 
agree that it is certainly more likely a chloride than map-noise, and perhaps 
more likely than water as well. Would you agree that chloride is the best 
guess, at least? What are the options for that blob, and what is the 
probability of each? I think you want to make sure people don’t get misled by 
it, which is a good point and a noble aspiration. I would argue that “not 
choosing” is here, as everywhere else, indeed choosing. And if you choose 
nothing here, you are almost certainly wrong, given 

Re: [ccp4bb] chloride or water

2015-01-22 Thread Seijo, Jose A. Cuesta
At the risk of derailing the discussion, could the blob actually be an
accumulation of many Fourier ripples (on top of bulk solvent, I guess)? The
“chloride” seems to be about 3.5 Å away from a lot of atoms, with nothing closer.
This is mostly based on intuition and on my experience that any nearly
spherical cavity or nearly cylindrical crevice has a blob of difference
density inside, which often proves very difficult to model. I have no hard
data to back this up.

Cheers,

Jose.


Jose Antonio Cuesta-Seijo, PhD
Carlsberg Laboratory
Gamle Carlsberg Vej 10
DK-1799 Copenhagen V
Denmark

Tlf +45 3327 5332
Email josea.cuesta.se...@carlsberglab.dk


From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard 
Rupp (Hofkristallrat a.D.)
Sent: Thursday, January 22, 2015 10:26 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] chloride or water





[ccp4bb] Postdoc position for long-wavelength MX at Diamond Light Source

2015-01-22 Thread Armin Wagner

Dear all,

Just before Christmas we were able to collect the first protein crystal data at
the long-wavelength MX beamline at Diamond (as you might have seen in my
presentation at the CCP4 Study Weekend). The combination of the in-vacuum
sample environment and the single-photon counting detector (in-vacuum P12M)
has proven to work and produces data with impressive signal-to-noise ratios. We
are now in the process of optimising the beamline towards user operation. If
you want to join our team and take part in exploring unknown territory using
this unique instrument, please get in touch or follow the links below. We
intend to shortlist for a first round of interviews in about two weeks (9th of
Feb).

Best regards,

   Armin



From: Armin Wagner [mailto:armin.wag...@diamond.ac.uk]
Sent: 30 October 2014 20:23
To: ccp4bb
Subject: [ccp4bb] Postdoc position for long-wavelength MX at Diamond Light 
Source

Postdoctoral Research Associate - Beamline I23

Beamline I23 at Diamond Light Source will be a unique facility for in-vacuum 
long-wavelength macromolecular crystallography (MX). After more than five years 
in design, construction and commissioning, we have recently collected first 
diffraction data and are expecting first user experiments before the end of 
this year.

We are looking for a highly motivated Post Doc to take part in the very first 
experiments at long wavelengths, helping to establish and test new data 
collection strategies for anomalous data collections from native proteins or 
around the M edges for large macromolecular complexes.

For more details, please see:

http://www.diamond.ac.uk/Careers/Vacancies/All/DIA0968_CH.html

and

http://www.diamond.ac.uk/Beamlines/Mx/I23.html

If you have any questions concerning the position, please do not hesitate to 
get in touch.

Best regards,

  Armin










[ccp4bb] CCP4-6.5 Update 002

2015-01-22 Thread Charles Ballard
Dear CCP4 Users

An update for the CCP4-6.5 series has just been released, consisting
of the following changes:

 * crank2 (all)
   - Several bug fixes, mainly related to the graphical interface and
     Crunch2/fix

 * clipper-progs (all)
   - ctruncate: fix to issue with estimation of alpha in Murray-Rust plot/fix

 * phaser (all)
   - version correction, 2.5.7

 * blend (all)
   - Fixed a bug in keywords management; now accepting both uppercase and
     lowercase. Fixed a bug for the calculation of aLCV.

 * privateer (all)
   - Minor bugfix

 * monomer (all)
   - Several bug-fixes and addition of new compounds

 * ccp4-progs (all)
   - Zanuda: adjustments to changes in refmac output pdb-file; bypass for P1
     refinement failure

 * ccp4 (all)
   - MrSHELXE: temporarily switched off pdb- and mtz-files crosscheck
   - Fix for handling Protein/DNA complex sequences in Matthews
     interface/component

 * documentation (all)
   - minor corrections and updates

 * mrbump (all)
   - Fix for sftools path length problem and Mac fix for MAFFT


Please report any bugs to c...@stfc.ac.uk.

Many thanks for using CCP4.

The CCP4 Core Team

[ccp4bb] Continuous-Single Versus Coarse-Multiple Sampling

2015-01-22 Thread Keller, Jacob
Dear Crystallographers,

This is more general than crystallography, but has applications therein, 
particularly in understanding fine phi-slicing.

The general question is:

Given that one needs to collect data to fit the parameters of a known function,
and given a limited total number of measurements, is it generally better to
measure a small group of points multiple times or to distribute the individual
measurements over the measurable extent of the function? I have a strong
intuition that it is the latter, but all errors being equal, it would seem
prima facie that both are equivalent. For example, a line (y = mx + b) can be
fit from two points. One could either measure the line at two points A and B
five times each for a total of 10 independent measurements, or measure ten
points evenly spaced from A to B. Are these equivalent in terms of fitting and
information content or not? Which is better? Again, conjecture and intuition
suggest the evenly spaced experiment is better, but I cannot yet formulate or
prove to myself why.
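
For the straight-line example specifically, the two designs can be compared with the textbook least-squares result var(m) = sigma^2 / sum((x_i - mean(x))^2). The sketch below assumes independent errors of equal standard deviation sigma and that the line is exactly the right model; the interval A = 0, B = 1 and the function names are arbitrary choices for illustration.

```python
def sum_sq_dev(xs):
    """Sum of squared deviations of the sampling positions from their mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def line_fit_variances(xs, sigma=1.0):
    """Variances of the least-squares slope m and intercept b for y = m*x + b,
    assuming independent errors with equal standard deviation sigma."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum_sq_dev(xs)
    var_slope = sigma ** 2 / sxx
    var_intercept = sigma ** 2 * (1.0 / n + mean ** 2 / sxx)
    return var_slope, var_intercept


# Hypothetical interval A = 0, B = 1 (arbitrary units), 10 measurements each.
endpoints = [0.0] * 5 + [1.0] * 5            # A and B measured five times each
evenly_spaced = [i / 9 for i in range(10)]   # ten points from A to B

var_m_end, var_b_end = line_fit_variances(endpoints)
var_m_even, var_b_even = line_fit_variances(evenly_spaced)

print(f"endpoints:     var(m) = {var_m_end:.3f}, var(b) = {var_b_end:.3f}")
print(f"evenly spaced: var(m) = {var_m_even:.3f}, var(b) = {var_b_even:.3f}")
```

Under these assumptions the two designs are not equivalent: repeating the end points gives a slope variance of 0.4 sigma^2 against roughly 0.98 sigma^2 for even spacing, because the end points maximise the spread of x. What the evenly spaced design offers instead is a check on whether the assumed function is right at all, since two positions cannot reveal curvature. That may be closer to the situation in profile fitting, where the shape itself is among the things being determined.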

The application of this to crystallography might be another reason that fine 
phi-slicing (0.1 degrees * 3600 frames) is better than coarse (1 degree * 3600 
frames), even though the number of times one measures reflections is tenfold 
higher in the second case (assuming no radiation damage). In the first case, 
one never measures the same phi angle twice, but one does have multiple 
measurements in a sense, i.e., of different parts of the same reflection.

Yes, 3D profile-fitting may be a big reason fine phi-slicing works, but beyond 
that, perhaps this sampling choice plays a role as well. Or maybe the 
profile-fitting works so well precisely because of this diffuse-single type of 
sampling rather than coarse-multiple sampling?

This general math/science concept must have been discussed somewhere--can 
anyone point to where?

JPK

***
Jacob Pearson Keller, PhD
Looger Lab/HHMI Janelia Research Campus
19700 Helix Dr, Ashburn, VA 20147
email: kell...@janelia.hhmi.org
***