Phil

Squaring the <F> (i.e. the best estimate of Ftrue) that Truncate
estimates does not give you <F^2> (the best estimate of Itrue):

        <F>      = Integral[0:infinity] sqrt(J) p(J|I) dJ

        var(F)   = Integral[0:infinity] (sqrt(J)-<F>)^2 p(J|I) dJ

        <F^2>    = Integral[0:infinity] J p(J|I) dJ

        var(F^2) = Integral[0:infinity] (J-<F^2>)^2 p(J|I) dJ

where

      p(J|I) = p(I|J) p(J) / p(I) [Bayes Theorem] and I = Imeas, J =
Itrue.

Expanding the square in the eqn for var(F) we see that var(F) = <F^2> -
<F>^2, or <F^2> = <F>^2 + var(F), i.e. <F^2> is not the same as <F>^2.
If you want to use the true F^2 in a calculation, isn't <F^2> the best
estimate by definition?
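
To make the integrals above concrete, here is a minimal numerical sketch
(it is not Truncate's code). It assumes a Gaussian p(I|J) with standard
deviation sigma and an acentric Wilson prior p(J) = exp(-J/S)/S for
J >= 0; the function name posterior_moments and the parameters sigma and
mean_i (standing for S) are illustrative choices of mine, not anything
taken from the program.

```python
import math

def posterior_moments(i_meas, sigma, mean_i, n=20001):
    """Return (<F>, var(F), <F^2>, var(F^2)) by summing over a uniform grid in J.

    Assumptions (not taken from Truncate): Gaussian p(I|J) with standard
    deviation sigma, and an acentric Wilson prior p(J) = exp(-J/mean_i)/mean_i
    for J >= 0.
    """
    # Beyond this limit the unnormalised posterior is negligible.
    upper = max(i_meas, 0.0) + 10.0 * sigma
    step = upper / (n - 1)
    grid = [k * step for k in range(n)]
    # log of p(I|J) p(J) with constants dropped; p(I) cancels on normalisation.
    log_w = [-0.5 * ((i_meas - j) / sigma) ** 2 - j / mean_i for j in grid]
    peak = max(log_w)
    w = [math.exp(v - peak) for v in log_w]
    total = sum(w)
    w = [v / total for v in w]  # discrete stand-in for p(J|I) dJ

    f_mean = sum(p * math.sqrt(j) for p, j in zip(w, grid))
    f_var = sum(p * (math.sqrt(j) - f_mean) ** 2 for p, j in zip(w, grid))
    f2_mean = sum(p * j for p, j in zip(w, grid))
    f2_var = sum(p * (j - f2_mean) ** 2 for p, j in zip(w, grid))
    return f_mean, f_var, f2_mean, f2_var

# A weak reflection with a slightly negative measured intensity.
f_mean, f_var, f2_mean, f2_var = posterior_moments(i_meas=-0.5, sigma=1.0, mean_i=2.0)
print(f_mean ** 2, f2_mean)             # <F>^2 versus <F^2>
print(f2_mean - (f_mean ** 2 + f_var))  # close to zero
```

The last line checks <F^2> = <F>^2 + var(F) on the grid; the gap between
<F>^2 and <F^2> is largest for weak data like this, where var(F) is not
small compared with <F>^2.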

Using Imeas in a calculation where Itrue is called for is exactly
equivalent to assuming a uniform Bayesian prior p(J) on Itrue, i.e.
you're assuming that all values of Itrue from -infinity to +infinity are
equally likely!!  In reality Itrue is constrained at least to be >= 0 by
physics, and constrained further by the assumed distribution of atom
positions, so you're throwing away all prior knowledge about the data.
It doesn't seem logical to make use of prior info to estimate F but then
not use it to estimate F^2.  By using <F^2> wouldn't you be making the
best use of your prior knowledge?
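
As a rough illustration of the difference the prior makes, the sketch
below compares the two posterior means of J. It assumes (my assumptions,
not Truncate's code) a Gaussian p(I|J) with standard deviation sigma and,
for the informative case, an acentric Wilson prior exp(-J/S)/S on J >= 0.
With a flat prior over the whole real line the posterior is just a
Gaussian centred on Imeas, so its mean is Imeas itself. With the Wilson
prior the posterior is a Gaussian centred on Imeas - sigma^2/S and
truncated at zero, whose mean has a closed form. The name
mean_j_wilson_prior and the parameter mean_i (standing for S) are
hypothetical.

```python
import math

def mean_j_wilson_prior(i_meas, sigma, mean_i):
    """Posterior mean <F^2> of J for Gaussian errors and an acentric Wilson prior.

    The posterior is N(mu, sigma^2) truncated to J >= 0, where
    mu = i_meas - sigma^2 / mean_i (from completing the square).
    """
    mu = i_meas - sigma ** 2 / mean_i
    alpha = -mu / sigma  # truncation point at J = 0 in standardised units
    pdf = math.exp(-0.5 * alpha ** 2) / math.sqrt(2.0 * math.pi)
    # Upper-tail probability 1 - Phi(alpha). This underflows to zero for very
    # large positive alpha, so the sketch is only meant for moderately
    # negative i_meas / sigma.
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    return mu + sigma * pdf / tail

i_meas, sigma, mean_i = -0.5, 1.0, 2.0
flat_prior_mean = i_meas  # flat prior: posterior is N(i_meas, sigma^2)
print(flat_prior_mean, mean_j_wilson_prior(i_meas, sigma, mean_i))
```

For a strong reflection the two estimates nearly coincide; for a weak or
negative Imeas the flat-prior value can be negative while the
Wilson-prior value cannot.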

Cheers

-- Ian

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Phil Evans
> Sent: 24 August 2008 20:08
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Wilson plot from truncated.mtz
> 
> You're right. The smoothed <I> is used for the Truncate procedure,  
> though it is difficult in the very low resolution bins. Also it  
> doesn't allow for pseudo-translations, which it should.
> 
> As you say, the linear fit is only used to put data on a very rough  
> absolute scale. This isn't necessary, but it doesn't hurt
> 
> There's no reason why truncate shouldn't give a "best" estimate of
> |F|^2 but I'm not sure why you would want this. I would think that
> refinement is better done against the measured I, which may be
> slightly negative
> 
> Phil
> 
> 
> On 24 Aug 2008, at 19:23, Ian Tickle wrote:
> 
> >
> > Phil
> >
> > OK I admit I didn't delve very deeply into the code, but looking at
> > the printer output it obviously does do a linear fit to the Wilson
> > plot to get an overall B & scale.  However, looking at the man page
> > (if all else fails read the documentation!) I see that it does say
> > that this is only used to put the data on an approximately absolute
> > scale.  This is unnecessary of course - absolutely scaled data isn't
> > needed for HA phasing & MR, and the refinement will give a much more
> > accurate scale factor anyway.
> >
> > Thanks for the info.
> >
> > Cheers
> >
> > -- Ian
> >
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED]
> >> [mailto:[EMAIL PROTECTED] On Behalf Of Phil Evans
> >> Sent: 23 August 2008 08:09
> >> To: CCP4BB@JISCMAIL.AC.UK
> >> Subject: Re: [ccp4bb] Wilson plot from truncated.mtz
> >>
> >> Actually Truncate (& ctruncate) do use a smoothed <I> in resolution
> >> shell (using a spline fit), not a linear Wilson plot
> >>
> >> Phil
> >>
> >>
> >>
> >> On 22 Aug 2008, at 20:30, Ian Tickle wrote:
> >>
> >>> This indeed raises the question of whether the assumed Wilson
> >>> distribution is valid, and it's another point I was in fact going
> >>> to bring up.  As presently constructed, Truncate fits a straight
> >>> line to the Wilson plot (based on Imeas) in order to determine the
> >>> overall scale & B, but to avoid the problem that the low res data
> >>> for a typical protein deviates markedly from the theoretical
> >>> distribution, it uses resolution limits determined by the RSCALE
> >>> option.  According to the man page, the low resolution limit is by
> >>> default set to 4 Ang if the high res limit is higher than 3.5 Ang.
> >>> This usually means that the straight line fit is good for the high
> >>> res data, but very poor for the low res data, but at least this
> >>> means that the assumed distribution is valid at high res where
> >>> most of the weak data (i.e. the data most affected by the Bayes
> >>> correction) is located, but not valid for any weak data at low res
> >>> (there will obviously be some).  As you point out translational
> >>> NCS invalidates these assumptions since the form of the
> >>> distribution changes, but then probably only a few % of structures
> >>> suffer from this type of NCS.
> >>>
> >>> A better way to deal with this would surely be to forget about the
> >>> Wilson plot and simply determine the average I in resolution
> >>> shells, with perhaps spline interpolation between the bin means
> >>> (this is actually the 'k-curve' method for determining E's,
> >>> attributed originally to Karle & Hauptman I believe).  There would
> >>> still be an assumption of the Wilson distribution but now only
> >>> within the bins, i.e. the Wilson distribution parameter would vary
> >>> with resolution, when previously it was a single number
> >>> independent of resolution.  I think this would go some way towards
> >>> addressing your objections.
> >>>
> >>> In fact my prog ECALC already uses the k-curve method to determine
> >>> E's and it would probably not be too much work to have it
> >>> optionally read the IMEAN column, perform the Bayes probability
> >>> integrals and output both <F> and <F^2> (also <E> & <E^2> as now).
> >>> However I've no idea whether my iteration idea for re-using the
> >>> <F^2> values for the k-curve will work - it may well diverge!
> >>>
> >>> Cheers
> >>>
> >>> -- Ian
> >>>
> >>>> -----Original Message-----
> >>>> From: [EMAIL PROTECTED]
> >>>> [mailto:[EMAIL PROTECTED] On Behalf Of George M. Sheldrick
> >>>> Sent: 22 August 2008 18:57
> >>>> To: CCP4BB@JISCMAIL.AC.UK
> >>>> Subject: Re: [ccp4bb] Wilson plot from truncated.mtz
> >>>>
> >>>> In addition to Ian's circular argument, there is the problem that
> >>>> the assumed distribution is only approximately valid, indeed in
> >>>> the presence of (translational) NCS it could well be a poor
> >>>> approximation.  Refinement against suitably weighted measured
> >>>> intensities (which may of course be slightly negative because of
> >>>> experimental errors) avoids this problem but we still need F(obs)
> >>>> (and hence TRUNCATE) to calculate a map.
> >>>>
> >>>> George
> >>>>
> >>>> Prof. George M. Sheldrick FRS
> >>>> Dept. Structural Chemistry,
> >>>> University of Goettingen,
> >>>> Tammannstr. 4,
> >>>> D37077 Goettingen, Germany
> >>>> Tel. +49-551-39-3021 or -3068
> >>>> Fax. +49-551-39-22582
> >>>>
> >>>>
> >>>> On Fri, 22 Aug 2008, Ian Tickle wrote:
> >>>>
> >>>>> This goes back to the issue I was raising, namely that <F>^2
> >>>>> (from the Truncate output mtz F column) is not the same as Imeas
> >>>>> (in the IMEAN column) so you won't get exactly the same results
> >>>>> from the Wilson plot, particularly at high res where the average
> >>>>> I/sigma is low.  Since the plot actually demands F^2 then it
> >>>>> seems to me that logically you need to use <F^2> which AFAICS is
> >>>>> not possible using Truncate since it never calculates that.
> >>>>>
> >>>>> This gets you into a circular argument because you need the
> >>>>> correct Wilson plot results in order to perform the Bayes
> >>>>> correction to the intensities (i.e. it gives you the prior
> >>>>> distribution parameter), however you need the Bayes-corrected
> >>>>> intensities to correctly calculate the Wilson plot!  Possibly
> >>>>> iterating (from the initial Wilson plot results calculated using
> >>>>> Imeas) will sort this out.
> >>>>>
> >>>>> Also referring to an earlier response by Phil, Truncate clearly
> >>>>> outputs the scaled Imeas, not <F^2>, in the IMEAN column as I
> >>>>> had originally assumed, since the column has a -ve min value
> >>>>> from mtzdump (<F^2> can never be < 0), and logically it's <F^2>
> >>>>> not Imeas or <F> that you need for applications (such as MR and
> >>>>> F^2 based refinement) which demand F^2.
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> -- Ian
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: [EMAIL PROTECTED]
> >>>>>> [mailto:[EMAIL PROTECTED] On Behalf Of Eleanor Dodson
> >>>>>> Sent: 22 August 2008 14:16
> >>>>>> To: [EMAIL PROTECTED]
> >>>>>> Cc: CCP4BB@jiscmail.ac.uk; [EMAIL PROTECTED]
> >>>>>> Subject: Re: Wilson plot from truncated.mtz
> >>>>>>
> >>>>>> rerun truncate with input amplitudes..
> >>>>>> eleanor
> >>>>>>
> >>>>>> James Pauff wrote:
> >>>>>>> If I've lost my SCALA MTZ, and have only the truncated.mtz
> >>>>>>> for my dataset, which program is the quickest means of
> >>>>>>> obtaining a Wilson plot?
> >>>>>>>
> >>>>>>> Thank you again,
> >>>>>>> Jim
> >>>>>>>
> >>>>>>>
> >>>>>>> --- On Wed, 8/20/08, Eleanor Dodson <[EMAIL PROTECTED]> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>> From: Eleanor Dodson <[EMAIL PROTECTED]>
> >>>>>>>> Subject: Re: [ccp4bb] Lower completeness, decent R factors, but low B factor...
> >>>>>>>> To: CCP4BB@JISCMAIL.AC.UK
> >>>>>>>> Date: Wednesday, August 20, 2008, 4:30 AM
> >>>>>>>> James Pauff wrote:
> >>>>>>>>
> >>>>>>>>> Hello all,
> >>>>>>>>>
> >>>>>>>>> I have a refined structure at 2.6 angstroms that is at about
> >>>>>>>>> 73% completeness at this resolution.  The I/sigma is about
> >>>>>>>>> 2.0 at 2.6 angstroms, and the omit density for my ligands is
> >>>>>>>>> great contoured at 3.0sigma.  My Rcryst is 19 or so and the
> >>>>>>>>> Rfree is 24.5 or so.
> >>>>>>>>>
> >>>>>>>>> HOWEVER, my mean B value is 13.9, whereas my other 2
> >>>>>>>>> structures (at 2.2 and 2.3 angstroms, same protein, >95%
> >>>>>>>>> completeness) have mean B values of 22+.  Any suggestions as
> >>>>>>>>> to what is going on here?  I'm having trouble explaining
> >>>>>>>>> this.
> >>>>>>>>>
> >>>>>>>>> Thank you,
> >>>>>>>>> Jim
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>> Have you used TLS - listed B factors will then be given
> >>>>>>>> relative to the TLS parameters. You need to run TLSANL to get
> >>>>>>>> a more realistic value.
> >>>>>>>> Eleanor
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> But in fact temperature factors are rather harder to estimate
> >>>>>>>> at lower resolutions than higher. Look at your <Fo> and <Fc>
> >>>>>>>> curves v resolution (part of a REFMAC loggraph) and you can
> >>>>>>>> see that sometimes the overall scaling struggles to get a
> >>>>>>>> reasonable fit.
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> Disclaimer
> >>>>> This communication is confidential and may contain
> >>>> privileged information intended solely for the named
> >>>> addressee(s). It may not be used or disclosed except for the
> >>>> purpose for which it has been sent. If you are not the
> >>>> intended recipient you must not review, use, disclose, copy,
> >>>> distribute or take any action in reliance upon it. If you
> >>>> have received this communication in error, please notify
> >>>> Astex Therapeutics Ltd by emailing
> >>>> [EMAIL PROTECTED] and destroy all copies of the
> >>>> message and any attached documents.
> >>>>> Astex Therapeutics Ltd monitors, controls and protects all
> >>>> its messaging traffic in compliance with its corporate email
> >>>> policy. The Company accepts no liability or responsibility
> >>>> for any onward transmission or use of emails and attachments
> >>>> having left the Astex Therapeutics domain.  Unless expressly
> >>>> stated, opinions in this message are those of the individual
> >>>> sender and not of Astex Therapeutics Ltd. The recipient
> >>>> should check this email and any attachments for the presence
> >>>> of computer viruses. Astex Therapeutics Ltd accepts no
> >>>> liability for damage caused by any virus transmitted by this
> >>>> email. E-mail is susceptible to data corruption,
> >>>> interception, unauthorized amendment, and tampering, Astex
> >>>> Therapeutics Ltd only send and receive e-mails on the basis
> >>>> that the Company is not liable for any such alteration or any
> >>>> consequences thereof.
> >>>>> Astex Therapeutics Ltd., Registered in England at 436
> >>>> Cambridge Science Park, Cambridge CB4 0QA under number 3751674
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
> >
> 
> 


