Dear Boaz,

On 06/25/13 14:09, Boaz Shaanan wrote:
Dear Loes,

Thanks for the message. To the best of my recollection (I actually come from
small-molecule crystallography), the problems of small-molecule
crystallographers when it comes to studying accurate e.d.'s (e.g. bond
densities and such) have mostly to do with separating the effect of atomic
thermal motion from true residual bond densities, i.e. mostly issues of
modelling the thermal motion. TDS is a pain for small-molecule and protein
crystallographers alike. It's reminiscent of the British weather - everybody
complains about it but nobody does anything about it. Do small-molecule
crystallographers model TDS properly and correct the data for it nowadays in
studies of accurate e.d.?
I agree that in small-molecule crystallography thermal motion has to be modelled accurately, certainly for accurate electron-density studies. The thermal motion leads to TDS, which can be found at/near the Bragg positions; this is mostly ignored.
Modelling the thermal motion in proteins by B-factors is known to be a gross
over-simplification for many reasons, some of which you mentioned. TDS is
another issue. There have been attempts in the past by several groups to deal
with TDS in protein crystals, but I'm not sure the community was convinced
that it led to an improvement of the data. Whether TDS is the main culprit for
the relatively high R factor of protein structures (that is, relative to small
molecules) is not clear. Modelling TDS (both the part that arises from protein
dynamics and the part from crystal disorder) in protein data, in order to
improve our data and the resulting atomic models, would be a good thing.
TDS also occurs in proteins, but more importantly, large (correlated) domain motions lead to scattering in between the Bragg peaks, in the shape of large diffuse clouds, streaks, and the like.

Why that should logically lead to refinement against frames once the TDS has
been modelled properly and the data corrected accordingly (future tense should
be used here, actually) is not clear to me. I would think that working on one
(or a few) data sets that suffer from severe TDS, correcting the data, and
re-refining the models to see what difference it makes would be a good
starting point.
The fact that these can be observed tells us that protein crystals show much more dynamics (frozen in as static disorder) than we tend to assume, and that our description of protein structures is thus a simplification.

Best wishes,
Loes.
    Cheers,

                Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: [email protected]
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710





________________________________________
From: CCP4 bulletin board [[email protected]] on behalf of Loes 
Kroon-Batenburg [[email protected]]
Sent: Tuesday, June 25, 2013 1:09 PM
To: [email protected]
Subject: Re: [ccp4bb] Refinement against frames

Dear Boaz,

Indeed, small-molecule crystallographers routinely convert pixels
into I's and can refine structures to very low R-values, but only to a
limited resolution. The Bragg intensities are very strong, and the
background scattering goes almost unnoticed. Once they start studying
accurate electron densities, the flaws in the models (Icalc) become
apparent.
However, protein crystals are different: they have large disordered
solvent regions, disorder in the protein conformations, and background
scattering from the mother liquor/air/crystal mount that may be even
stronger than the many weak intensities. The disorder of the protein
leads to incoherent scattering that also produces significant
background scattering, which at moderate B-factors may make up half of
the total scattering. Converting pixel intensities into I_bragg (after
subtracting some background) and refining against those (or F's) is
clearly a simplification, and only gives us the average structure, not
the true structure. The disorder may also lead to more structured
non-Bragg scattering, which we call diffuse scattering, indicating that
our crystal is in fact not periodic. Understanding what is really going
on in our crystal, and trying to model the observed raw diffraction
patterns, is in fact very interesting: it may solve the problems of
trying to convert I's to F's, may give a better estimate of the
'average' structures, and may tell us how the protein molecules are
really behaving (in the crystal).
Trying to model diffraction images comes with lots of additional
problems, because instrumental characteristics also have to be modelled.
However, it is a very interesting route to go down.
There may be a moment in the future when we think we can do this. It
would be good if by then we had raw images available of all those
weirdly diffracting crystals from which we managed, in some way or
another, to extract I_bragg (or Ispot-Iback).
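
As an aside, to make concrete what extracting Ispot-Iback involves: below is
a minimal sketch (in Python, with hypothetical array and function names, not
any particular integration program's API) of the conventional
background-subtracted integration of a single reflection.

import numpy as np

def integrate_spot(image, spot_mask, background_mask):
    # image           -- 2D array of detector counts
    # spot_mask       -- boolean array selecting the spot pixels
    # background_mask -- boolean array selecting nearby background pixels
    n_spot = spot_mask.sum()
    n_back = background_mask.sum()
    i_spot = image[spot_mask].sum()
    # Mean background per pixel, scaled up to the spot area.
    i_back = image[background_mask].sum() * n_spot / n_back
    i_bragg = i_spot - i_back   # may come out negative for weak reflections
    # Poisson counting error, propagated through the subtraction.
    sigma = np.sqrt(i_spot + i_back * n_spot / n_back)
    return i_bragg, sigma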

Greetings,
Loes.

On 06/24/13 14:21, Boaz Shaanan wrote:
Hi Tim,

I agree with you.  Another point to remember about this issue of pixel->F's
(or I's) conversion is that small-molecule crystallographers take the same
route and produce structures with 1-2% R-factors, so this conversion is hardly
our problem. The main culprit in the issues that have been discussed so
lucidly on the BB recently is the vast number of weak reflections in
diffraction patterns of macromolecules (and how to decide on resolution in
such situations). Digging into the peak/background pixels and the signal/noise
ratio there is just going to open another Pandora's box.

My 2p thoughts.

           Cheers,

                   Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: [email protected]
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710





________________________________________
From: CCP4 bulletin board [[email protected]] on behalf of Tim Gruene 
[[email protected]]
Sent: Monday, June 24, 2013 2:59 PM
To: [email protected]
Subject: Re: [ccp4bb] Refinement against frames

Dear John,

actually I am not a friend of this idea. Processing software does an
excellent job of removing the instrumental part from our data. If we
start to integrate against frames, the next structure title might be
something like "Crystal structure of ABC at x Å resolution, measured at
beamline xyz with a frame width of f degrees and a total rotation
range of phi degrees...". The point I am trying to make is that once
one integrates against frames, one may have to take a lot of issues
into account when interpreting the structure.
And do you think that refining against frames will actually give
greater chemical or biological insight into the sample, or will it
only give a more accurate description of the crystal contents? These
are two different things, and the latter is - in my opinion - not what
structures are about.

Best, Tim

P.S.: I changed the subject line because the thread-based sorting of
my emails is soon going to exceed the width of my screen for the
original one.

On 06/24/2013 08:13 AM, Jrh wrote:
Dear Tom, I find this suggestion of using the full images an
excellent and visionary one. So, how to implement it? We are part
way along the path with James Holton's reverse Mosflm. The computer
memory challenge could be ameliorated by simple pixel averaging at
least initially. The diffuse scattering would be the ultimate gold
at the end of the rainbow. Peter Moore's new book, inter alia,
carries many splendid insights into the diffuse scattering in our
diffraction patterns. Fullprof analyses have become a firm trend in
other fields, admittedly with simpler computing overheads.
Greetings, John

Prof John R Helliwell DSc FInstP



On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"
<[email protected]>   wrote:

I hope I am not duplicating too much of this fascinating
discussion with these comments:  perhaps the main reason there is
confusion about what to do is that neither F nor I is really the
most suitable thing to use in refinement.  As pointed out several
times in different ways, we don't measure F or I, we only measure
counts on a detector.  As a convenience, we "process" our
diffraction images to estimate I or F and their uncertainties and
model these uncertainties as simple functions (e.g., a Gaussian).
There is no need in principle to do that, and if we were to
refine instead against the raw image data these issues about
positivity would disappear and our structures might even be a
little better.

Our standard procedure is to estimate F or I from counts on the
detector, then to use these estimates of F or I in refinement.
This is not so easy to do right because F or I contain many terms
coming from many pixels and it is hard to model their statistics
in detail.  Further, attempts we make to estimate either F or I
as physically plausible values (e.g., using the fact that they
are not negative) will generally be biased (the values after
correction will generally be systematically low or systematically
high, as is true for the French and Wilson correction and as
would be true for the truncation of I at zero or above).
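
This bias is easy to see numerically. The following small simulation
(assuming, for illustration only, Gaussian errors on a weak reflection) shows
that truncating background-subtracted intensities at zero makes the estimates
systematically high:

import numpy as np

rng = np.random.default_rng(0)
true_intensity = 0.5    # a weak reflection, in units of sigma
sigma = 1.0
iobs = rng.normal(true_intensity, sigma, size=100_000)

print("mean of raw Iobs:          ", iobs.mean())                 # ~0.5, unbiased
print("mean after truncation at 0:", np.maximum(iobs, 0).mean())  # ~0.7, biased high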

Randy's method for intensity refinement is an improvement because
the statistics are treated more fully than just using an estimate
of F or I and assuming its uncertainty has a simple distribution.
So why not avoid all the problems with modeling the statistics of
processed data and instead refine against the raw data?  From the
structural model you calculate F, from F and a detailed model of
the experiment (the same model that is currently used in data
processing) you calculate the counts expected on each pixel. Then
you calculate the likelihood of the data given your models of the
structure and of the experiment.  This would have lots of
benefits because it would allow improved descriptions of the
experiment (decay, absorption, detector sensitivity, diffuse
scattering and other "background" on the images,....on and on)
that could lead to more accurate structures in the end.  Of
course there are some minor issues about putting all this in
computer memory for refinement....
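
In code, the likelihood target described here might look like the following
conceptual sketch (hypothetical function names, not any existing program's
API): the expected counts per pixel are predicted from the structural model
plus a model of the experiment, and compared with the raw counts under
Poisson statistics.

import numpy as np

def poisson_log_likelihood(observed_counts, expected_counts):
    # Sum over pixels of log P(n | lambda) for Poisson-distributed counts;
    # the log(n!) term is dropped because it does not depend on the model.
    lam = np.clip(expected_counts, 1e-10, None)   # guard against log(0)
    return np.sum(observed_counts * np.log(lam) - lam)

# The expected counts would come from |F_calc|^2 spread out by the experiment
# model (beam, mosaicity, absorption, detector response), plus background:
#   expected = scale * spot_profile(hkl, geometry) + background_model(pixels)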

-Tom T

________________________________________
From: CCP4 bulletin board [[email protected]] on behalf of Phil [[email protected]]
Sent: Friday, June 21, 2013 2:50 PM
To: [email protected]
Subject: Re: [ccp4bb] ctruncate bug?

However you decide to argue the point, you must consider _all_
the observations of a reflection (replicates and symmetry
related) together when you infer Itrue or F etc., otherwise you
will bias the result even more. Thus you cannot (easily) do it
during integration.
Phil

Sent from my iPad
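
A minimal sketch of the merging step Phil describes, using the usual
inverse-variance weighted mean over all observations of one reflection (the
function name is made up for illustration):

import numpy as np

def merge_observations(intensities, sigmas):
    # Inverse-variance weighted mean and its standard error.
    w = 1.0 / np.asarray(sigmas) ** 2
    i_merged = np.sum(w * np.asarray(intensities)) / np.sum(w)
    sigma_merged = np.sqrt(1.0 / np.sum(w))
    return i_merged, sigma_merged

# Example: three symmetry-related observations, one of them negative.
print(merge_observations([120.0, 95.0, -10.0], [15.0, 12.0, 40.0]))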

On 21 Jun 2013, at 20:30, Douglas Theobald
<[email protected]>   wrote:

On Jun 21, 2013, at 2:48 PM, Ed Pozharski
<[email protected]>   wrote:

Douglas,
Observed intensities are the best estimates that we can
come up with in an experiment.
I also agree with this, and this is the clincher.  You are
arguing that Ispot-Iback=Iobs is the best estimate we can
come up with.  I claim that is absurd.  How are you
quantifying "best"?  Usually we have some sort of
discrepancy measure between the true value and the estimate,
like RMSD, mean absolute distance, log distance, or some such.
Here is the important point --- by any measure of discrepancy
you care to use, the person who estimates Iobs as 0 when
Iback>Ispot will *always*, in *every case*, beat the person
who estimates Iobs with a negative value.  This is an
indisputable fact.
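
The arithmetic behind this claim: since the true intensity J is
non-negative, |0 - J| = J while |x - J| = J - x > J for any negative
estimate x, so the zero estimate is strictly closer for every reflection.
A quick check (the distributions here are illustrative assumptions only):

import numpy as np

rng = np.random.default_rng(1)
j_true = rng.gamma(shape=2.0, scale=1.0, size=100_000)  # any non-negative truth
x_neg = -rng.exponential(scale=1.0, size=j_true.size)   # any negative estimates

# |0 - J| < |x - J| holds in every single case.
assert np.all(np.abs(0.0 - j_true) < np.abs(x_neg - j_true))
print("the zero estimate is closer to the truth in every case")
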
First off, you may find it useful to avoid such words as
absurd and indisputable fact.  I know political correctness
may sometimes be overrated, but if you actually plan to have a
meaningful discussion, let's assume that everyone responding
to your posts is just trying to help figure this out.
I apologize for offending and for using strong words --- my
intention was not to offend.  This is just how I talk when
brainstorming with my colleagues around a blackboard, but of
course there you can see that I smile when I say it.

To address your point, you are right that J=0 is closer to
the "true intensity" than a negative value.  The problem is that
we are not after a single intensity, but rather all of them,
as they all contribute to the electron density reconstruction.
If you replace negative Iobs with E(J), you would
systematically inflate the averages, which may turn
problematic in some cases.
So, I get the point.  But even then, using any reasonable
criterion, the whole estimated dataset will be closer to the
true data if you set all "negative" intensity estimates to 0.

It is probably better to stick with "raw intensities" and
construct theoretical predictions properly to account for
their properties.

What I was trying to tell you is that observed intensities are
what we get from experiment.
But they are not what you get from the detector.  The detector
spits out a positive value for what's inside the spot.  It is
we, as human agents, who later manipulate and massage that data
value by subtracting the background estimate.  A value that has
been subjected to a crude background subtraction is not the raw
experimental value.  It has been modified, and there must be
some logic to why we massage the data in that particular
manner.  I agree, of course, that the background should be
accounted for somehow.  But why just subtract it away?  There
are other ways to massage the data --- see my other post to
Ian.  My argument is that however we massage the experimentally
observed value, the procedure should be physically informed, and
allowing negative intensity estimates violates the basic physics.
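
One example of such a physically informed massage (a sketch only; this is
not the French & Wilson procedure, which uses a Wilson-distribution prior
rather than the flat positive prior assumed here): with Iobs | J ~ N(J,
sigma^2) and a flat prior on J >= 0, the posterior for J is a truncated
normal whose mean is always positive.

import math

def posterior_mean_intensity(iobs, sigma):
    # Mean of N(iobs, sigma^2) truncated to J >= 0 (flat positive prior):
    #   E[J | Iobs] = Iobs + sigma * phi(t) / Phi(t),  with t = Iobs / sigma
    t = iobs / sigma
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))         # standard normal cdf
    return iobs + sigma * pdf / cdf

for iobs in (-2.0, -0.5, 0.0, 3.0):
    print(iobs, "->", round(posterior_mean_intensity(iobs, 1.0), 3))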

[snip]

These observed intensities can be negative because, while
their true underlying value is positive, random errors may
result in Iback>Ispot.  There is absolutely nothing
unphysical here.
Yes there is.  The only way you can get a negative estimate
is to make unphysical assumptions.  Namely, the estimate
Ispot-Iback=Iobs assumes that both the true value of I and
the background noise come from a Gaussian distribution that
is allowed to have negative values.  Both of those
assumptions are unphysical.
See, I have a problem with this.  Both common sense and the laws
of physics dictate that the number of photons hitting a spot on a
detector is a positive number.  There is no law of physics
that dictates that under no circumstances could there be
Ispot<Iback.
That's not what I'm saying.  Sure, Ispot can be less than Iback
randomly.  That does not mean we have to estimate the detected
intensity as negative, after accounting for background.

Yes, E(Ispot)>=E(Iback).  Yes, E(Ispot-Iback)>=0.  But
P(Ispot-Iback<0)>0, and therefore experimental sampling of
Ispot-Iback is bound to occasionally produce negative values.
What law of physics is broken when, for a given reflection, the
total number of photons in the spot pixels is less than the total
number of photons in an equal number of pixels in the
surrounding background mask?

Cheers,

Ed.

--
Oh, suddenly throwing a giraffe into a volcano to make
water is crazy?
   Julian, King of Lemurs
--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

--

__________________________________________

Dr. Loes Kroon-Batenburg
Dept. of Crystal and Structural Chemistry
Bijvoet Center for Biomolecular Research
Utrecht University
Padualaan 8, 3584 CH Utrecht
The Netherlands

E-mail : [email protected]
phone  : +31-30-2532865
fax    : +31-30-2533940
__________________________________________

