Dear all,
regarding the "remaining strong differences" between measured data and
calculated SFs from a finished (high-resolution) structure: I once looked
into this a bit, going back to the images and examining some extreme
outliers. I found the same thing - those were clearly strong diffraction
spots, not ice, not small molecule, but genuine protein diffraction, so I
had no explanation for them. Some were even "forbidden" intensities from
screw axes that were correct. The structure refined perfectly, no problems
at all.
I then found some literature on the possibility of multiple reflection -
I guess this is possible, but I wonder whether you could easily get, say,
a 25-sigma intensity that way.

And as we often conclude our beer discussions - maybe all protein space
groups are actually true P1, just close enough to satisfy the
higher-symmetry rules .. but this is getting a bit philosophical, I know ..

Jan Dohnalek




On Wed, Oct 16, 2019 at 6:24 PM Randy Read <rj...@cam.ac.uk> wrote:

> James,
>
> Where we diverge is with your interpretation that big differences lead to
> small FOMs.  The size of the FOM depends on the product of Fo and Fc, not
> their difference.  The FOM for a reflection where Fo=1000 and Fc=10 is very
> different from the FOM for a reflection with Fo=5000 and Fc=4010, even
> though the difference is the same.
>
> Expanding on this:
>
> 1. The FOM actually depends more on the E values, i.e. reflections smaller
> than average get lower FOM values than ones bigger than average.  In the
> resolution bin from 5.12 to 5.64Å of 2vb1, the mean observed intensity is
> 20687 and the mean calculated intensity is 20022, which means that
> Eobs=Sqrt(145.83/20687)=0.084 and Ecalc=Sqrt(7264/20022)=0.602.  This
> reflection gets a low FOM because the product (0.050) is such a small
> number, not because the difference is big.
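[Editorial aside: the normalization above can be reproduced in a few lines, a sketch using only the numbers quoted in this paragraph.]

```python
import math

# Numbers from the message: reflection (-5,2,2) of 2vb1,
# in the 5.12-5.64 A resolution bin.
Iobs, Icalc = 145.83, 7264.0              # observed and calculated intensities
mean_Iobs, mean_Icalc = 20687.0, 20022.0  # mean intensities in the bin

# Normalized amplitudes (E values): E = sqrt(I / <I>)
Eobs = math.sqrt(Iobs / mean_Iobs)        # ~0.084
Ecalc = math.sqrt(Icalc / mean_Icalc)     # ~0.602

# The FOM tracks the product of the E values, not their difference
product = Eobs * Ecalc                    # ~0.05, a small number -> low FOM
```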
>
> 2. You have to consider the role of the model error in the difference,
> because for precisely-measured data most of the difference comes from model
> error.  In this resolution shell, the correlation coefficient between Iobs
> and Fcalc^2 is about 0.88, which means that sigmaA is about Sqrt(0.88) =
> 0.94.  The variance of both the real and imaginary components of Ec (as an
> estimate of the phased true E) will be (1-0.94^2)/2 = 0.058, so the
> standard deviations of the real and imaginary components of Ec will be
> about 0.24.  In that context, the difference between Eobs and Ecalc is
> nothing like a 2000-sigma outlier.
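[Editorial aside: the sigmaA arithmetic above can be sketched the same way; the "about two standard deviations" framing at the end is an illustrative reading of the point, using the E values from the previous paragraph.]

```python
import math

cc = 0.88                            # correlation between Iobs and Fcalc^2
sigmaA = math.sqrt(cc)               # ~0.94

# Variance of the real and imaginary components of the true E about sigmaA*Ec
var_comp = (1.0 - sigmaA**2) / 2.0   # ~0.06
sd_comp = math.sqrt(var_comp)        # ~0.24

# With Eobs ~0.084 and Ecalc ~0.602, the discrepancy on this normalized
# scale is only about two of these standard deviations:
z = (sigmaA * 0.602 - 0.084) / sd_comp   # ~2, nothing like 2000
```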
>
> Looking at this another way, the reason why the FOM is low for this
> reflection is that the conditional probability distribution of Eo given Ec
> has significant values on the other side of the origin of the complex
> plane. That means that the *phase* of the complex Eo is very uncertain.
> The figures in this web page (
> https://www-structmed.cimr.cam.ac.uk/Course/Statistics/statistics.html)
> should help to explain that idea.
>
> Best wishes,
>
> Randy
>
> On 16 Oct 2019, at 16:02, James Holton <jmhol...@lbl.gov> wrote:
>
>
> All very true Randy,
>
> But nevertheless every hkl has an FOM assigned to it, and that is used to
> calculate the map.  Statistical distribution or not, the trend is that hkls
> with big amplitude differences get smaller FOMs, so that means large
> model-to-data discrepancies are down-weighted.  I wonder sometimes at what
> point this becomes a self-fulfilling prophecy?  If you look in detail at
> the Fo-Fc differences in pretty much any refined structure in the PDB you
> will find huge outliers.  Some are hundreds of sigmas, and they can go in
> either direction.
>
> Take for example reflection -5,2,2 in the highest-resolution lysozyme
> structure in the PDB: 2vb1.  Iobs(-5,2,2) was recorded as 145.83 ± 3.62 (at
> 5.4 Ang) with Fcalc^2(-5,2,2) = 7264.  A 2000-sigma outlier!  What are the
> odds?   On the other hand, Iobs(4,-6,2) = 1611.21 ± 30.67 vs
> Fcalc^2(4,-6,2) = 73, which is in the opposite direction.  One can always
> suppose "experimental errors", but ZD sent me these images and I have
> looked at all the spots involved in these hkls.  I don't see anything wrong
> with any of them.  The average multiplicity of this data set was 7.1 and
> involved 3 different kappa angles, so I don't think these are "zingers" or
> other weird measurement problems.
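[Editorial aside: for readers following along, the sigma counts quoted here come out as follows, computed from the numbers in this paragraph.]

```python
# Discrepancy in units of sigma(Iobs) for the two reflections mentioned:
z1 = (7264.0 - 145.83) / 3.62    # (-5,2,2): ~1966, the "2000-sigma" outlier
z2 = (1611.21 - 73.0) / 30.67    # (4,-6,2): ~50, with Iobs >> Fcalc^2
```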
>
> I'm not just picking on 2vb1 here.  EVERY PDB entry has this problem.  Not
> sure where it comes from, but the FOM assigned to these huge differences is
> always small, so whatever is causing them won't show up in an FOM-weighted
> map.
>
> Is there a way to "change up" the statistical distribution that assigns
> FOMs to hkls?  Or are we stuck with this systematic error?
>
> -James Holton
> MAD Scientist
>
> On 10/4/2019 9:31 AM, Randy Read wrote:
>
> Hi James,
>
> I'm sure you realise this, but it's important for other readers to
> remember that the FOM is a statistical quantity: we have a probability
> distribution for the true phase, we pick one phase (the "centroid" phase
> that should minimise the RMS error in the density map), and then the FOM is
> the expected value of the cosine of the phase error, obtained by taking the
> cosine of each possible phase difference and weighting it by the probability
> of that phase difference.  Because it's a statistical quantity from a random
> distribution, you really can't expect this to agree reflection by
> reflection!  It's a good start to see that the overall values are good, but
> if you want to look more closely you have to look at groups of reflections,
> e.g. bins of resolution, bins of observed amplitude, bins of calculated
> amplitude.  However, each bin has to have enough members that the average
> will generally be close to the expected value.
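[Editorial aside: the definition above can be made concrete with a small sketch. This is an illustration only, not code from any refinement program; the distribution sampled is invented.]

```python
import math

def centroid_and_fom(phases, probs):
    """Centroid phase and FOM from a sampled phase probability distribution.

    phases: candidate phases in radians; probs: their (unnormalized) weights.
    The FOM is |<exp(i*phi)>|, i.e. the expected cosine of the phase error
    about the centroid phase.
    """
    total = sum(probs)
    re = sum(p * math.cos(ph) for ph, p in zip(phases, probs)) / total
    im = sum(p * math.sin(ph) for ph, p in zip(phases, probs)) / total
    return math.atan2(im, re), math.hypot(re, im)

# A sharply peaked distribution gives a FOM near 1 ...
phases = [2 * math.pi * i / 360 for i in range(360)]
peaked = [math.exp(5.0 * math.cos(ph - 1.0)) for ph in phases]
phi_best, fom = centroid_and_fom(phases, peaked)  # centroid ~1.0 rad

# ... while a flat (uninformative) distribution gives a FOM near 0.
_, fom_flat = centroid_and_fom(phases, [1.0] * 360)
```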
>
> Best wishes,
>
> Randy Read
>
> On 4 Oct 2019, at 16:38, James Holton <jmhol...@lbl.gov> wrote:
>
> I've done a few little experiments over the years using simulated data
> where I know the "correct" phase, trying to see just how accurate FOMs
> are.  What I have found in general is that overall FOM values are fairly
> well correlated to overall phase error, but if you go
> reflection-by-reflection they are terrible.  I suppose this is because FOM
> estimates are rooted in amplitudes.  Good agreement in amplitude gives you
> more confidence in the model (and therefore the phases), but if your R
> factor is 55% then your phases probably aren't very good either.  However,
> if you look at any given h,k,l those assumptions become less and less
> applicable.  Still, it's the only thing we've got.
>
> At the end of the day, the phase you get out of a refinement program is
> the phase of the model.  All those fancy "FWT" coefficients with "m" and
> "D" or "FOM" weights are modifications to the amplitudes, not the phases.
> The phases in your 2mFo-DFc map are identical to those of just an Fc map.
> Seriously, have a look!  Sometimes you will get a 180 flip to keep the sign
> of the amplitude positive, but that's it.  Nevertheless, the electron
> density of a 2mFo-DFc map is closer to the "correct" electron density than
> any other map.  This is quite remarkable considering that the "phase error"
> is the same.
>
> This realization is what led my colleagues and me to forget about "phase
> error" and start looking at the error in the electron density itself
> (10.1073/pnas.1302823110).  We did this rather pedagogically.  Basically,
> pretend you did the whole experiment again, but "change up" the source of
> error of interest.  For example if you want to see the effect of sigma(F)
> then you add random noise with the same magnitude as sigma(F) to the Fs,
> and then re-refine the structure.  This gives you your new phases, and a
> new map. Do this 50 or so times and you get a pretty good idea of how any
> source of error of interest propagates into your map.  There is even a
> little feature in coot for animating these maps, which gives a much more
> intuitive view of the "noise".  You can also look at variation of model
> parameters like the refined occupancy of a ligand, which is a good way to
> put an "error bar" on it.  The trick is finding the right source of error
> to propagate.
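[Editorial aside: the recipe can be sketched in miniature. All numbers below are invented, and the "map" is just a 1-D Fourier synthesis with fixed phases, standing in for the real perturb-and-re-refine loop of the paper.]

```python
import math, random

F    = [100.0, 40.0, 25.0]   # amplitudes of three made-up reflections
sigF = [5.0, 4.0, 3.0]       # their measurement errors, sigma(F)
phi  = [0.0, 1.2, 2.5]       # phases in radians (held fixed here)

def density(amps, x):
    """Toy 1-D electron density at fractional coordinate x."""
    return sum(a * math.cos(2 * math.pi * (h + 1) * x - phi[h])
               for h, a in enumerate(amps))

# Repeat the "experiment": perturb each F by sigma(F), rebuild the map,
# and collect the spread of the density at one point.
random.seed(0)
x = 0.1
samples = [density([f + random.gauss(0.0, s) for f, s in zip(F, sigF)], x)
           for _ in range(500)]
mean = sum(samples) / len(samples)
sd = math.sqrt(sum((v - mean) ** 2 for v in samples) / len(samples))
# sd is the "error bar" on the map value at x due to sigma(F) alone
```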
>
> -James Holton
> MAD Scientist
>
>
> On 10/2/2019 2:47 PM, Andre LB Ambrosio wrote:
>
> Dear all,
>
> How is the phase error estimated for any given reflection, specifically in
> the context of model refinement? In terms of math I mean.
>
> How useful is FOM in assessing the phase quality, when not for initial
> experimental phases?
>
> Many thanks in advance,
>
> Andre.
>
> ------------------------------
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
>
>
> ------
> Randy J. Read
> Department of Haematology, University of Cambridge
> Cambridge Institute for Medical Research
> The Keith Peters Building, Hills Road
> Tel: +44 1223 336500     Fax: +44 1223 336827
> E-mail: rj...@cam.ac.uk
>


-- 
Jan Dohnalek, Ph.D
Institute of Biotechnology
Academy of Sciences of the Czech Republic
Biocev
Prumyslova 595
252 50 Vestec near Prague
Czech Republic

Tel. +420 325 873 758
