This has been said already but just to emphasise my view.

1) Measured intensities should never be "lost", and the TRUNCATE output should (and does by default) keep them.

2) TRUNCATE extracts Fs from Imeas values in bins; it is far from perfect, but we need amplitude estimates for some purposes and this is better than just square-rooting the Imeas. NC translation should be taken into account, but it is hard to model.

3) All the amplitude extraction can be done better once the model is built and I guess that is another reason for keeping the intensities.

4) The estimated B factor is just a rough estimate and should not be taken too seriously. It assumes isotropic diffraction and complete data, neither of which is necessarily present.

Eleanor

George M. Sheldrick wrote:
Dear Ian and Phil,

I am very reluctant to touch the experimental data, so I am a
bit concerned about the argument that it is better to refine
against <F^2> than Imeas. Are you sure that the 'best estimate' of the intensity is also unbiassed?

To take an extreme example, suppose that we have processed the data to a higher resolution than the crystal actually diffracts.
The mean of Imeas in the outer shell should then be zero. The
'best estimate' of <F^2> must however be slightly positive for
all reflections, and recycling the estimation procedure will
not change this. The effect on the refinement will be to reduce
all B-values a little, i.e. it will lead to a biassed model.
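
A minimal simulation of this extreme case, offered as a sketch only. It assumes Gaussian measurement errors and, in place of the Wilson prior that Truncate uses, a flat prior restricted to Itrue >= 0, for which the posterior mean has a closed form. The function names and the sigma value are illustrative and are not taken from any program discussed in this thread.

```python
import math
import random

def positive_posterior_mean(i_meas, sigma):
    """Posterior mean of Itrue for Gaussian error and a flat prior on Itrue >= 0.

    This is the mean of a normal distribution N(i_meas, sigma) truncated at zero.
    """
    x = i_meas / sigma
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
    return i_meas + sigma * pdf / cdf

def simulate_empty_shell(n=20000, sigma=1.0, seed=0):
    """Simulate a shell with Itrue = 0 for every reflection (assumed noise model)."""
    rng = random.Random(seed)
    measured = [rng.gauss(0.0, sigma) for _ in range(n)]
    corrected = [positive_posterior_mean(i, sigma) for i in measured]
    return sum(measured) / n, sum(corrected) / n, min(corrected)

mean_meas, mean_corr, min_corr = simulate_empty_shell()
print("mean Imeas           :", round(mean_meas, 3))
print("mean 'best estimate' :", round(mean_corr, 3))
print("smallest estimate    :", round(min_corr, 3))
```

Under these assumptions the mean of Imeas scatters around zero while every corrected value is positive, which is the bias being described.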

This is of course a purely theoretical discussion, there is no
need for anyone to retract their published structures.

Best wishes, George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Sun, 24 Aug 2008, Ian Tickle wrote:

Phil

Squaring the <F> (i.e. the best estimate of Ftrue) that Truncate
estimates does not give you <F^2> (the best estimate of Itrue):

        <F>      = Integral[0:infinity] sqrt(J) p(J|I) dJ

        var(F)   = Integral[0:infinity] (sqrt(J)-<F>)^2 p(J|I) dJ

        <F^2>    = Integral[0:infinity] J p(J|I) dJ

        var(F^2) = Integral[0:infinity] (J-<F^2>)^2 p(J|I) dJ

where

      p(J|I) = p(I|J) p(J) / p(I)   [Bayes' theorem]

and I = Imeas, J = Itrue.

From the eqn for var(F) we see that var(F) = <F^2> - <F>^2, or <F^2> =
<F>^2 + var(F), i.e. <F^2> is not the same as <F>^2.  If you want to use
the true F^2 in a calculation, isn't <F^2> the best estimate by
definition?
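
The four integrals can be evaluated numerically as a rough check of the relation <F^2> = <F>^2 + var(F). The sketch below assumes an acentric Wilson prior p(J) = exp(-J/S)/S with S the mean intensity in the shell, and a Gaussian p(I|J) with standard deviation sigma; the function name, the midpoint-rule integration, and the input numbers are illustrative choices, not Truncate's implementation.

```python
import math

def posterior_moments(i_meas, sigma, mean_intensity, n=20000):
    """Return (<F>, var(F), <F^2>, var(F^2)) by midpoint-rule integration over J."""
    # Integrate far enough past Imeas that the posterior weight is negligible.
    j_max = max(i_meas, 0.0) + 10.0 * sigma
    dj = j_max / n
    js = [(k + 0.5) * dj for k in range(n)]
    # Unnormalised posterior p(I|J) p(J): Gaussian error times Wilson prior.
    ws = [math.exp(-0.5 * ((i_meas - j) / sigma) ** 2 - j / mean_intensity)
          for j in js]
    norm = sum(ws)
    f_mean = sum(math.sqrt(j) * w for j, w in zip(js, ws)) / norm
    f2_mean = sum(j * w for j, w in zip(js, ws)) / norm
    var_f = sum((math.sqrt(j) - f_mean) ** 2 * w for j, w in zip(js, ws)) / norm
    var_f2 = sum((j - f2_mean) ** 2 * w for j, w in zip(js, ws)) / norm
    return f_mean, var_f, f2_mean, var_f2

# A weak, slightly negative measurement (made-up numbers).
f, var_f, f2, var_f2 = posterior_moments(i_meas=-1.0, sigma=1.0, mean_intensity=2.0)
print("<F>           =", f)
print("<F>^2         =", f * f)
print("<F^2>         =", f2)
print("<F>^2+var(F)  =", f * f + var_f)
```

For a weak reflection <F^2> comes out larger than <F>^2 by var(F); for a strong reflection with small sigma the two nearly coincide.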

Using Imeas in a calculation where Itrue is called for is exactly
equivalent to assuming a uniform Bayesian prior p(J) on Itrue, i.e.
you're assuming that all true values of I (constrained at least to be >=
0 by physics and constrained further by the assumed distribution of atom
positions) from -infinity to +infinity are equally likely!! - so you're
throwing away all prior knowledge about the data.  It doesn't seem
logical to make use of prior info to estimate F but then not use it to
estimate F^2.  By using <F^2> wouldn't you be making the best use of
your prior knowledge?
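
The equivalence can be written out in one line, assuming (as an illustration, since the error model is not stated above) a Gaussian p(I|J) with standard deviation sigma. With p(J) constant over the whole real line, the posterior is just the Gaussian centred on Imeas, so its mean is Imeas:

```latex
p(J \mid I) \;\propto\; \exp\!\left[-\frac{(I - J)^2}{2\sigma^2}\right] \cdot \text{const}
\quad\Longrightarrow\quad
\langle J \rangle = \int_{-\infty}^{\infty} J \, p(J \mid I)\, dJ = I_{\mathrm{meas}}
```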

Cheers

-- Ian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Evans
Sent: 24 August 2008 20:08
To: [email protected]
Subject: Re: [ccp4bb] Wilson plot from truncated.mtz

You're right. The smoothed <I> is used for the Truncate procedure, though it is difficult in the very low resolution bins. Also it doesn't allow for pseudo-translations, which it should.

As you say, the linear fit is only used to put data on a very rough absolute scale. This isn't necessary, but it doesn't hurt.

There's no reason why truncate shouldn't give a "best" estimate of |F|^2, but I'm not sure why you would want this. I would think that refinement is better done against the measured I, which may be slightly negative.

Phil


On 24 Aug 2008, at 19:23, Ian Tickle wrote:

Phil

OK I admit I didn't delve very deeply into the code, but looking at the
printer output it obviously does do a linear fit to the Wilson plot to
get an overall B & scale. However, looking at the man page (if all else
fails read the documentation!) I see that it does say that this is only
used to put the data on an approximately absolute scale.  This is
unnecessary of course - absolutely scaled data isn't needed for HA
phasing & MR, and the refinement will give a much more accurate scale
factor anyway.

Thanks for the info.

Cheers

-- Ian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Evans
Sent: 23 August 2008 08:09
To: [email protected]
Subject: Re: [ccp4bb] Wilson plot from truncated.mtz

Actually Truncate (& ctruncate) do use a smoothed <I> in resolution
shells (using a spline fit), not a linear Wilson plot.

Phil



On 22 Aug 2008, at 20:30, Ian Tickle wrote:

This indeed raises the question of whether the assumed Wilson
distribution is valid, and it's another point I was in fact going to
bring up.  As presently constructed, Truncate fits a straight line to
the Wilson plot (based on Imeas) in order to determine the overall scale
& B, but to avoid the problem that the low res data for a typical
protein deviates markedly from the theoretical distribution, it uses
resolution limits determined by the RSCALE option.  According to the man
page, the low resolution limit is by default set to 4 Ang if the high
res limit is higher than 3.5 Ang.  This usually means that the straight
line fit is good for the high res data, but very poor for the low res
data, but at least this means that the assumed distribution is valid at
high res where most of the weak data (i.e. the data most affected by the
Bayes correction) is located, but not valid for any weak data at low res
(there will obviously be some). As you point out translational NCS
invalidates these assumptions since the form of the distribution
changes, but then probably only a few % of structures suffer from this
type of NCS.
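
A generic straight-line Wilson fit can be sketched as follows. This is not Truncate's code: it assumes the standard form ln(<I>/sum f^2) = ln(k) - 2B (sin theta/lambda)^2 with (sin theta/lambda)^2 = 1/(4 d^2), takes per-shell values of <I>/sum f^2 as already computed, and uses a hypothetical `d_low` argument to mimic a 4 Ang low resolution cutoff like the one described above. The shell values are synthetic.

```python
import math

def wilson_fit(shells, d_low=4.0):
    """Least-squares line through the Wilson plot.

    shells: list of (d_angstrom, mean_I_over_sum_f2) per resolution shell.
    Only shells with d <= d_low are used. Returns (scale k, overall B).
    """
    pts = [(1.0 / (4.0 * d * d), math.log(y))
           for d, y in shells if d <= d_low and y > 0.0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Slope of the plot is -2B, intercept is ln(k).
    return math.exp(intercept), -slope / 2.0

# Synthetic shells generated with k = 2 and B = 30 (made-up values).
d_values = [6.0, 5.0, 4.0, 3.5, 3.0, 2.8, 2.6, 2.4, 2.2, 2.0]
shells = [(d, 2.0 * math.exp(-2.0 * 30.0 / (4.0 * d * d))) for d in d_values]
scale, b_overall = wilson_fit(shells)
print("scale =", scale, " B =", b_overall)
```

Real low resolution shells would fall off this line, which is why they are excluded from the fit here.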

A better way to deal with this would surely be to forget about the
Wilson plot and simply determine the average I in resolution shells,
with perhaps spline interpolation between the bin means (this is
actually the 'k-curve' method for determining E's, attributed originally
to Karle & Hauptman I believe).  There would still be an assumption of
the Wilson distribution but now only within the bins, i.e. the Wilson
distribution parameter would vary with resolution, when previously it
was a single number independent of resolution.  I think this would go
some way towards addressing your objections.
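
The shell-average idea can be sketched as below. To stay self-contained the sketch swaps the spline for plain linear interpolation between bin means, uses equal-count bins in 1/d^2, and runs on synthetic intensities; the function names are hypothetical and this is not the ECALC or Truncate implementation.

```python
import math

def shell_means(reflections, n_bins=10):
    """Return [(mean 1/d^2, mean I)] for equal-count resolution shells.

    reflections: list of (d_angstrom, intensity).
    """
    data = sorted((1.0 / (d * d), i) for d, i in reflections)
    size = max(1, len(data) // n_bins)
    curve = []
    for start in range(0, len(data), size):
        chunk = data[start:start + size]
        mean_s2 = sum(s2 for s2, _ in chunk) / len(chunk)
        mean_i = sum(i for _, i in chunk) / len(chunk)
        curve.append((mean_s2, mean_i))
    return curve

def expected_intensity(curve, d):
    """Linearly interpolate the shell means at resolution d (flat beyond the ends)."""
    s2 = 1.0 / (d * d)
    if s2 <= curve[0][0]:
        return curve[0][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if s2 <= x1:
            t = (s2 - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return curve[-1][1]

# Synthetic data: intensities falling off smoothly with 1/d^2 (made-up numbers).
reflections = [(1.0 / math.sqrt(0.01 + 0.0025 * k),
                1000.0 * math.exp(-10.0 * (0.01 + 0.0025 * k)))
               for k in range(100)]
curve = shell_means(reflections, n_bins=10)
# Resolution-dependent normalisation: E^2 = I / <I>(d).
e_squared = [i / expected_intensity(curve, d) for d, i in reflections]
print(curve[:3])
```

The interpolated curve plays the role of a resolution-dependent Wilson parameter: each reflection is compared with the local mean intensity rather than with a single straight-line model.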

In fact my prog ECALC already uses the k-curve method to determine E's
and it would probably not be too much work to have it optionally read
the IMEAN column, perform the Bayes probability integrals and output
both <F> and <F^2> (also <E> & <E^2> as now). However I've no idea
whether my iteration idea for re-using the <F^2> values for the k-curve
will work - it may well diverge!

Cheers

-- Ian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of George M. Sheldrick
Sent: 22 August 2008 18:57
To: [email protected]
Subject: Re: [ccp4bb] Wilson plot from truncated.mtz

In addition to Ian's circular argument, there is the problem that
the assumed distribution is only approximately valid, indeed in the
presence of (translational) NCS it could well be a poor approximation.
Refinement against suitably weighted measured intensities (which may of
course be slightly negative because of experimental errors) avoids this
problem but we still need F(obs) (and hence TRUNCATE) to calculate a
map.

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Fri, 22 Aug 2008, Ian Tickle wrote:

This goes back to the issue I was raising, namely that <F>^2 (from the
Truncate output mtz F column) is not the same as Imeas (in the IMEAN
column) so you won't get exactly the same results from the Wilson plot,
particularly at high res where the average I/sigma is low.  Since the
plot actually demands F^2 then it seems to me that logically you need to
use <F^2> which AFAICS is not possible using Truncate since it never
calculates that.

This gets you into a circular argument because you need the correct
Wilson plot results in order to perform the Bayes correction to the
intensities (i.e. it gives you the prior distribution parameter),
however you need the Bayes-corrected intensities to correctly calculate
the Wilson plot! Possibly iterating (from the initial Wilson plot
results calculated using Imeas) will sort this out.

Also referring to an earlier response by Phil, Truncate clearly outputs
the scaled Imeas, not <F^2>, in the IMEAN column as I had originally
assumed, since the column has a -ve min value from mtzdump (<F^2> can
never be < 0), and logically it's <F^2> not Imeas or <F> that you need
for applications (such as MR and F^2 based refinement) which demand F^2.

Cheers

-- Ian

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Eleanor Dodson
Sent: 22 August 2008 14:16
To: [EMAIL PROTECTED]
Cc: [email protected]; [EMAIL PROTECTED]
Subject: Re: Wilson plot from truncated.mtz

rerun truncate with input amplitudes.
eleanor

James Pauff wrote:
If I've lost my SCALA MTZ, and have only the truncated.mtz
for my dataset, which program is the quickest means of
obtaining a Wilson plot?
Thank you again,
Jim


--- On Wed, 8/20/08, Eleanor Dodson <[EMAIL PROTECTED]> wrote:
From: Eleanor Dodson <[EMAIL PROTECTED]>
Subject: Re: [ccp4bb] Lower completeness, decent R factors, but low B factor...
To: [email protected]
Date: Wednesday, August 20, 2008, 4:30 AM
James Pauff wrote:

Hello all,

I have a refined structure at 2.6 angstroms that is about 73% complete
at this resolution.  The I/sigma is about 2.0 at 2.6 angstroms, and the
omit density for my ligands is great contoured at 3.0 sigma.  My Rcryst
is 19 or so and the Rfree is 24.5 or so.

HOWEVER, my mean B value is 13.9, whereas my other 2 structures (at 2.2
and 2.3 angstroms, same protein, >95% completeness) have mean B values
of 22+.  Any suggestions as to what is going on here?  I'm having
trouble explaining this.

Thank you,
Jim







Have you used TLS? The listed B factors will then be given relative to
the TLS parameters. You need to run TLSANL to get a more realistic
value.
Eleanor


But in fact temperature factors are rather harder to estimate at lower
resolutions than higher. Look at your <Fo> and <Fc> curves v resolution
(part of a REFMAC loggraph) and you can see that sometimes the overall
scaling struggles to get a reasonable fit.





Disclaimer
This communication is confidential and may contain
privileged information intended solely for the named
addressee(s). It may not be used or disclosed except for the
purpose for which it has been sent. If you are not the
intended recipient you must not review, use, disclose, copy,
distribute or take any action in reliance upon it. If you
have received this communication in error, please notify
Astex Therapeutics Ltd by emailing
[EMAIL PROTECTED] and destroy all copies of the
message and any attached documents.
Astex Therapeutics Ltd monitors, controls and protects all
its messaging traffic in compliance with its corporate email
policy. The Company accepts no liability or responsibility
for any onward transmission or use of emails and attachments
having left the Astex Therapeutics domain.  Unless expressly
stated, opinions in this message are those of the individual
sender and not of Astex Therapeutics Ltd. The recipient
should check this email and any attachments for the presence
of computer viruses. Astex Therapeutics Ltd accepts no
liability for damage caused by any virus transmitted by this
email. E-mail is susceptible to data corruption,
interception, unauthorized amendment, and tampering, Astex
Therapeutics Ltd only send and receive e-mails on the basis
that the Company is not liable for any such alteration or any
consequences thereof.
Astex Therapeutics Ltd., Registered in England at 436
Cambridge Science Park, Cambridge CB4 0QA under number 3751674