Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-06 Thread Pavel Afonine
Hi,


> Any time you do a thought experiment you make a fake-data data set, the
> "true" phases and "true" amplitudes become the ones you put into the
> simulation process.  This is by definition.  Is there potential for
> circular reasoning?  Of course!  But you can do controls:
>

This is so true! It is what I have been doing for development and
testing ever since I started working for Phenix. Fully controlled
thought/numerical experiments done this way are super helpful (but
obviously have limitations!).


> If you start with an ordinary single-conformer coordinate model and
> flat bulk solvent from refmac to make your Ftrue, then what you will
> find is that even after adding all plausible experimental errors to the
> data the final Rwork/Rfree invariably drop to small-molecule levels of
> 3-4%.  This is true even if you prune the structure back, shake it, and
> rebuild it in various ways.  The difference features always guide you
> back to Rwork/Rfree = 3/4%. However, if you refine with phenix.refine,
> you will find Rwork/Rfree stall at around 10-11%.  This is because Ftrue
> came from refmac and refmac and phenix.refine have somewhat different
> bulk solvent models.  If Ftrue comes from phenix and you refine with
> refmac you get similar "high" R values.  High for a small molecule
> anyway. And, of course, if you get Ftrue from phenix and refine with
> phenix you also get final Rwork/Rfree = 3/4%. If you do more things that
> automated building doesn't do, like multi-headed side chains, or get the
> bulk solvent from an MD simulation, then you can get "realistic"
> Rwork/Rfree in the 20%s.  All of this is the main conclusion from this
> paper: https://dx.doi.org/10./febs.12922


Even within Phenix alone this is true if you switch between different
scaling/bulk-solvent models or play with automation levels (such as
ignoring reflection outliers, etc.).

Pavel



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1


Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-06 Thread Dale Tronrud
   What may be counter-intuitive when looked at in one way may be
perfectly expected from another point of view.  I look at these maps as
the result of a single cycle of steepest descent refinement where the
parameters are the density values of the map sampled on the grid.  If
you start with a map calculated from the coefficients

(|Fcalc|, PhiCalc)

One cycle of steepest descent gives the shift

2(|Fobs|-|Fcalc|, PhiCalc)

giving a new and improved map with the coefficients

(|Fcalc|, PhiCalc) + 2(|Fobs|-|Fcalc|, PhiCalc)
   = (|Fcalc| + 2|Fobs| -2|Fcalc|, PhiCalc)
   = (2|Fobs| - |Fcalc|, PhiCalc)

or the classic 2Fo-Fc map.
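
The steepest-descent argument above can be checked numerically. What follows is only a minimal sketch under stated assumptions: a one-dimensional toy density on a grid, NumPy's unnormalized FFT standing in for the structure-factor calculation, and a least-squares residual, the sum over reflections of (|Fobs| - |Fcalc|)^2. The names rho_true, rho_calc and residual are invented for illustration. The step size 1/n is chosen to match this FFT normalization, and that choice is what produces the factor of 2 here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
rho_true = rng.random(n)                            # toy "true" density on a 1-D grid
rho_calc = rho_true + 0.2 * rng.standard_normal(n)  # imperfect starting map

f_obs_amp = np.abs(np.fft.fft(rho_true))  # only amplitudes are "observed"
f_calc = np.fft.fft(rho_calc)
phi_calc = np.angle(f_calc)


def residual(rho):
    """Least-squares amplitude residual for a trial density."""
    return np.sum((f_obs_amp - np.abs(np.fft.fft(rho))) ** 2)


# Analytic gradient with respect to each grid value of the density:
# -2n times the map with coefficients (|Fobs| - |Fcalc|, PhiCalc).
diff_coeffs = (f_obs_amp - np.abs(f_calc)) * np.exp(1j * phi_calc)
grad = -2.0 * n * np.fft.ifft(diff_coeffs).real

# Central finite differences as an independent check of the gradient.
eps = 1e-6
grad_fd = np.empty(n)
for x in range(n):
    bump = np.zeros(n)
    bump[x] = eps
    grad_fd[x] = (residual(rho_calc + bump) - residual(rho_calc - bump)) / (2 * eps)

# One steepest-descent step with step size 1/n (an assumption, see above).
rho_new = rho_calc - grad / n
new_coeffs = np.fft.fft(rho_new)
two_fo_fc = (2.0 * f_obs_amp - np.abs(f_calc)) * np.exp(1j * phi_calc)

print("gradient matches finite differences:",
      np.allclose(grad, grad_fd, rtol=1e-4, atol=1e-5))
print("updated map has 2Fo-Fc coefficients:",
      np.allclose(new_coeffs, two_fo_fc))
```

With a different FFT normalization or step size the multiplier in front of (|Fobs| - |Fcalc|) changes, but the shift still carries the calculated phase.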

   If you start with (|Fobs|, PhiObs) then your shift will be zero
because the R value is already perfect.  You cannot improve an
experimental map unless you refine against other criteria.

   On the other hand, if you start with (|Fcalc|, PhiObs) you have to
question your sanity a bit because |Fobs| is so much better, in fact
perfect.  If you decide to press ahead anyway you find that the
coefficients of the updated map are (2|Fobs| - |Fcalc|, PhiObs).  These
are better than (|Fcalc|, PhiObs) but still not as good as (|Fobs|, PhiObs).
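
The three coefficient sets that start from PhiObs can be compared in the same toy setting. This is a sketch, not a real crystallographic test: it assumes noise-free "observed" structure factors from a one-dimensional random density, and the helpers density and map_cc are hypothetical names. Only the perfect score of (|Fobs|, PhiObs) is guaranteed by construction; the ranking of the other two is whatever the toy data give.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
rho_true = rng.random(n)                             # toy "true" density
rho_model = rho_true + 0.3 * rng.standard_normal(n)  # imperfect model density

f_obs = np.fft.fft(rho_true)    # noise-free complex "observations"
f_calc = np.fft.fft(rho_model)
phi_obs = np.angle(f_obs)


def density(amplitudes, phases):
    """Map on the grid from amplitudes and phases."""
    return np.fft.ifft(amplitudes * np.exp(1j * phases)).real


def map_cc(a, b):
    """Pearson correlation coefficient between two maps."""
    return float(np.corrcoef(a, b)[0, 1])


maps = {
    "(|Fcalc|, PhiObs)": density(np.abs(f_calc), phi_obs),
    "(2|Fobs|-|Fcalc|, PhiObs)": density(2 * np.abs(f_obs) - np.abs(f_calc), phi_obs),
    "(|Fobs|, PhiObs)": density(np.abs(f_obs), phi_obs),
}
cc = {name: map_cc(m, rho_true) for name, m in maps.items()}

for name, value in cc.items():
    print(f"{name:28s} CC to true map = {value:.4f}")
```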

   There really is no justification for simply attaching the observed
phases to the calculated amplitudes.  The reason we are doing atomic
model refinement (instead of density map refinement as described above)
is to impose a lot of external knowledge such as atomic shape, solvent
flatness, and various relationships between atoms.  All of that
information gets encoded in the Fcalc's (complex numbers) so their
amplitudes and phases are tightly coupled.  It is no surprise that just
ripping out half and replacing it with something else would lower the
quality of the contained information.

   If you want a map to help evaluate your model when you have
experimental phase information you should run one cycle of steepest
descent optimization on the map with both amplitudes and phases
restrained.  If I ignore the complication of much larger uncertainty of
the phase relative to the amplitude, I believe the single cycle shift is
a map calculated from the complex coefficients 2(Fobs-Fcalc) and this is
your "difference map".  The 2Fo-Fc equivalent would have the
coefficients 2Fobs-Fcalc.  Remember they are all complex numbers with
their proper phases.
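
As a sketch of the complex-coefficient maps just described, again in a one-dimensional noise-free toy and ignoring the phase-uncertainty weighting (the m's and D's), the code below builds both coefficient sets and checks two simple consequences. All variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 64
rho_true = rng.random(n)                            # toy "true" density
rho_calc = rho_true + 0.3 * rng.standard_normal(n)  # imperfect model density

f_obs = np.fft.fft(rho_true)   # complex: amplitudes with their own phases
f_calc = np.fft.fft(rho_calc)

diff_coeffs = 2.0 * (f_obs - f_calc)   # complex "difference map" coefficients
full_coeffs = 2.0 * f_obs - f_calc     # the 2Fo-Fc equivalent

diff_map = np.fft.ifft(diff_coeffs).real
full_map = np.fft.ifft(full_coeffs).real

# The vector difference is never smaller than the amplitude difference,
# because it also registers disagreement in phase.
vector_diff = np.abs(f_obs - f_calc)
amplitude_diff = np.abs(np.abs(f_obs) - np.abs(f_calc))

print("full map = model map + difference map:",
      np.allclose(full_map, rho_calc + diff_map))
print("vector difference >= amplitude difference:",
      bool(np.all(vector_diff >= amplitude_diff - 1e-9)))
```

In this error-free toy, adding half of the difference map to the model density returns the true density exactly. With real data the phase uncertainty would have to be weighted in.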

   I did this derivation in Least-Squares formalism so I can't be
confident of the m's and D's.  I also assumed that Friedel's Law holds,
but that assumption was made with the traditional maps as well.

Dale Tronrud



Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-06 Thread James Holton

Sorry for the confusion, I was going for brevity.

Any time you do a thought experiment in which you make a fake-data data
set, the "true" phases and "true" amplitudes become the ones you put into the
simulation process.  This is by definition.  Is there potential for 
circular reasoning?  Of course!  But you can do controls:


  If you start with an ordinary single-conformer coordinate model and 
flat bulk solvent from refmac to make your Ftrue, then what you will 
find is that even after adding all plausible experimental errors to the 
data the final Rwork/Rfree invariably drop to small-molecule levels of 
3-4%.  This is true even if you prune the structure back, shake it, and 
rebuild it in various ways.  The difference features always guide you 
back to Rwork/Rfree = 3/4%. However, if you refine with phenix.refine, 
you will find Rwork/Rfree stall at around 10-11%.  This is because Ftrue 
came from refmac and refmac and phenix.refine have somewhat different 
bulk solvent models.  If Ftrue comes from phenix and you refine with 
refmac you get similar "high" R values.  High for a small molecule 
anyway. And, of course, if you get Ftrue from phenix and refine with 
phenix you also get final Rwork/Rfree = 3/4%. If you do more things that 
automated building doesn't do, like multi-headed side chains, or get the 
bulk solvent from an MD simulation, then you can get "realistic" 
Rwork/Rfree in the 20%s.  All of this is the main conclusion from this 
paper: https://dx.doi.org/10./febs.12922
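
For readers who want to reproduce this kind of control, the R values quoted above are conventionally the sum of | |Fobs| - k|Fcalc| | over the sum of |Fobs|, computed separately for the working and free reflections. A minimal sketch, assuming amplitudes are already on a common scale (k = 1 by default) and using hypothetical function names, could look like this. Real refinement programs also fit the scale and bulk-solvent terms, which is exactly where refmac and phenix.refine differ.

```python
import numpy as np


def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R: sum| |Fobs| - k|Fcalc| | / sum|Fobs|."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    return float(np.sum(np.abs(f_obs - scale * f_calc)) / np.sum(f_obs))


def r_work_free(f_obs, f_calc, free_flags, scale=1.0):
    """Return (Rwork, Rfree) given a boolean mask marking free reflections."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    free = np.asarray(free_flags, dtype=bool)
    r_work = r_factor(f_obs[~free], f_calc[~free], scale)
    r_free = r_factor(f_obs[free], f_calc[free], scale)
    return r_work, r_free


# Tiny made-up example: three reflections, the last one flagged as free.
amplitudes_obs = [10.0, 20.0, 30.0]
amplitudes_calc = [9.0, 22.0, 30.0]
print(r_factor(amplitudes_obs, amplitudes_calc))
print(r_work_free(amplitudes_obs, amplitudes_calc, [False, False, True]))
```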


But, in all these situations with various types of "systematic error" 
thrown in, because you know Ftrue and PHItrue you can compare different 
kinds of maps to this ground "truth" and see which is closest when you 
compare electron density. In my experience, this is the 2mFo-DFc map, 
phased with PHIcalc from the model. You might think that replacing 
PHIcalc with PHItrue would make the map even better because PHItrue is a
"better" phase than PHIcalc, but it turns out this actually makes things
worse!  That's what is counter-intuitive: 2mFo-DFc amplitudes are 
"designed" to be used with the slightly-wrong phase of PHIcalc, not 
PHItrue.


That's what I was trying to say.

-James Holton
MAD Scientist


On 12/5/2018 7:36 PM, Keller, Jacob wrote:

That said, model phases are not so bad.  In fact, in all my experiments with fake data the 
model-phased 2mFo-DFc map always has the best correlation to the "true" map.  If you 
substitute the "true" phases and use the 2mFo-DFc coefficients you actually make things 
worse. Counter-intuitive, but true.

I don't understand what you mean by true and fake here--can you clarify? How 
are the true map and phases generated (from an original true model, I assume?), 
and how are the fake data generated? (Also from the true model?) I am wondering 
whether there is some circular reasoning?

JPK






Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-06 Thread Phoebe A. Rice
A point I like to make to new trainees trying to solve structures:
Model phases from a good model are pretty good, but from a practical viewpoint, 
if your initial model for molecular replacement is that good, the resulting 
structure will probably not tell you much you didn't already know (with 
exceptions of course).
If you only have an existing structure for, say, half of what's in your 
asymmetric unit, SAD phases will be less biased than model phases from 
molecular replacement, even though both may be noisy. 

~~~
Phoebe A. Rice
Dept. of Biochem & Mol. Biol. and
  Committee on Microbiology
https://voices.uchicago.edu/phoebericelab/
 
 






[ccp4bb] Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-06 Thread Herman . Schreuder
I think Jacob is right. As long as protein crystals contain about 10% "dark 
matter"* not accounted for in any model, we cannot fake a "true" electron 
density map, so it is not surprising that a 2mFo-DFc map is closer to a 
model-based fake map than a map based on experimental phases is.
HS

*This "dark matter" causes the best R-factors for protein crystals to be ~15% 
instead of the ~5% expected from measurement errors, and may include disorder, 
anisotropy, imperfectly modelled solvent, etc.



-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Keller, Jacob
Sent: Thursday, 6 December 2018 04:37
To: CCP4BB@JISCMAIL.AC.UK
Subject: [EXTERNAL] Re: [ccp4bb] Experimental phasing vs molecular replacement

>>That said, model phases are not so bad.  In fact, in all my experiments with 
>>fake data the model-phased 2mFo-DFc map always has the best correlation to 
>>the "true" map.  If you substitute the "true" phases and use the 2mFo-DFc 
>>coefficients you actually make things worse. Counter-intuitive, but true.

I don't understand what you mean by true and fake here--can you clarify? How 
are the true map and phases generated (from an original true model, I assume?), 
and how are the fake data generated? (Also from the true model?) I am wondering 
whether there is some circular reasoning?

JPK





Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-05 Thread Keller, Jacob
>>That said, model phases are not so bad.  In fact, in all my experiments with 
>>fake data the model-phased 2mFo-DFc map always has the best correlation to 
>>the "true" map.  If you substitute the "true" phases and use the 2mFo-DFc 
>>coefficients you actually make things worse. Counter-intuitive, but true.

I don't understand what you mean by true and fake here--can you clarify? How 
are the true map and phases generated (from an original true model, I assume?), 
and how are the fake data generated? (Also from the true model?) I am wondering 
whether there is some circular reasoning?

JPK





Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-05 Thread James Holton
It is true that MAD phasing can give you hyper-accurate phases. This is 
because you are measuring the heavy atom signal in both directions on 
the Harker diagram, allowing the phase to be solved analytically.  The 
phasing signal is noisy, of course, but you can fix that either with a 
bigger heavy atom (more signal) or by doing a lot of averaging (less 
noise). You will probably need more than one crystal.


SAD, on the other hand, can never give you very good phases because the 
phase probability distributions are all bimodal. The technology that 
makes SAD practical is solvent flattening, but as soon as you start 
doing things like solvent flattening you are already imposing a model, 
and every model comes with some amount of bias.  How important that bias 
is depends on the question you are trying to answer.
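
The bimodality mentioned above can be seen from a single Bijvoet pair. The sketch below is an idealized illustration, not a phasing program: it assumes error-free amplitudes, an exactly known anomalous (f'') substructure contribution, and that everything else scatters with real factors. Under those assumptions |F+|^2 - |F-|^2 = 4 |F_N| |F_A| cos(phi - psi), which fixes only the cosine of the phase difference and so leaves two candidates. The function name sad_phase_candidates and the numbers are made up for the example.

```python
import numpy as np


def sad_phase_candidates(f_plus_amp, f_minus_amp, f_anom):
    """Return the two phases (radians) consistent with one Bijvoet pair.

    f_anom is the complex anomalous (f'') contribution of the substructure,
    assumed to be known exactly in this toy setting.
    """
    a = abs(f_anom)
    psi = np.angle(f_anom)
    # |F+|^2 + |F-|^2 = 2|F_N|^2 + 2a^2 gives the non-anomalous amplitude.
    f_n_amp = np.sqrt(0.5 * (f_plus_amp**2 + f_minus_amp**2) - a**2)
    # |F+|^2 - |F-|^2 = 4 |F_N| a cos(phi - psi) fixes only the cosine.
    cos_term = (f_plus_amp**2 - f_minus_amp**2) / (4.0 * f_n_amp * a)
    delta = np.arccos(np.clip(cos_term, -1.0, 1.0))
    return psi + delta, psi - delta


# Made-up reflection: true phase 1.0 rad, anomalous contribution at 0.3 rad.
f_n = 100.0 * np.exp(1j * 1.0)
f_anom = 5.0 * np.exp(1j * 0.3)
f_plus_amp = abs(f_n + f_anom)    # |F(+h)|
f_minus_amp = abs(f_n - f_anom)   # |F(-h)|, via the conjugate of the Friedel mate

candidates = sad_phase_candidates(f_plus_amp, f_minus_amp, f_anom)
print("candidate phases (rad):", candidates)
```

One of the two candidates is the true phase and the other is its mirror image about the anomalous-scatterer phase. A second, independent measurement, such as the dispersive difference in MAD, is what breaks the tie.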


MIR, like MAD, can get arbitrarily accurate phases, but this and every 
other technique requires a high degree of isomorphism.


In practice, essentially all experimental phasing attempts are really 
trying to get you just over that ever-elusive tipping point of phase 
quality where solvent flattening and model building can take you the 
rest of the way.  So, in the end what you have are model phases, just 
like if you had done MR.  It's sad really how fleeting the involvement 
of experimental phases is in essentially all MAD/SAD structure 
determinations.  Pun intended.


That said, model phases are not so bad.  In fact, in all my experiments 
with fake data the model-phased 2mFo-DFc map always has the best 
correlation to the "true" map.  If you substitute the "true" phases and 
use the 2mFo-DFc coefficients you actually make things worse.  
Counter-intuitive, but true.


-James Holton
MAD Scientist

On 12/5/2018 12:07 AM, 香川 亘 wrote:

Dear all,

It is my understanding that experimental phasing (e.g. Se-SAD), in principle, 
yields better electron density maps than molecular replacement for protein 
regions with weak electron density (partially disordered or flexible).  I 
would appreciate it if someone could comment on whether my understanding 
is correct.  If there are any good examples or literature on this issue, I 
would be grateful to know about them.

I thank you in advance.

Wataru Kagawa




Re: [ccp4bb] Experimental phasing vs molecular replacement

2018-12-05 Thread Frederic Vellieux

Hello,

I think what you are alluding to is model bias in (macromolecular) 
crystallography. What you should consult are the publications associated 
with this topic, and those on the map coefficients used to compute 
electron density maps (e.g. SIGMAA weighting), OMIT maps, current 
refinement techniques...


"Heavy-atom" phasing also suffers (or may suffer) from "imperfections" 
due to the heavy atom model used for phasing. Some of us remember 
ripples in electron density maps.


Fred.

On 2018-12-05 09:07, 香川 亘 wrote:

Dear all,

It is my understanding that experimental phasing (e.g. Se-SAD), in
principle, yields better electron density maps than molecular
replacement for protein regions with weak electron density
(partially disordered or flexible).  I would appreciate it if someone
could comment on whether my understanding is correct.
If there are any good examples or literature on this issue, I would be
grateful to know about them.

I thank you in advance.

Wataru Kagawa




[ccp4bb] Experimental phasing vs molecular replacement

2018-12-05 Thread 香川 亘
Dear all,

It is my understanding that experimental phasing (e.g. Se-SAD), in principle, 
yields better electron density maps than molecular replacement for protein 
regions with weak electron density (partially disordered or flexible).  I 
would appreciate it if someone could comment on whether my understanding 
is correct.  If there are any good examples or literature on this issue, I 
would be grateful to know about them.

I thank you in advance.

Wataru Kagawa 

