Re: [ccp4bb] Experimental phasing vs molecular replacement
Hi,

> Any time you do a thought experiment you make a fake-data data set, the
> "true" phases and "true" amplitudes become the ones you put into the
> simulation process. This is by definition. Is there potential for
> circular reasoning? Of course! But you can do controls:

This is so true! This is what I have been doing for development and testing ever since I started working for Phenix. Fully controlled thought/numerical experiments done this way are super helpful (but obviously have limitations!).

> If you start with an ordinary single-conformer coordinate model and
> flat bulk solvent from refmac to make your Ftrue, then what you will
> find is that even after adding all plausible experimental errors to the
> data the final Rwork/Rfree invariably drop to small-molecule levels of
> 3-4%. This is true even if you prune the structure back, shake it, and
> rebuild it in various ways. The difference features always guide you
> back to Rwork/Rfree = 3/4%. However, if you refine with phenix.refine,
> you will find Rwork/Rfree stall at around 10-11%. This is because Ftrue
> came from refmac and refmac and phenix.refine have somewhat different
> bulk solvent models. If Ftrue comes from phenix and you refine with
> refmac you get similar "high" R values. High for a small molecule
> anyway. And, of course, if you get Ftrue from phenix and refine with
> phenix you also get final Rwork/Rfree = 3/4%. If you do more things that
> automated building doesn't do, like multi-headed side chains, or get the
> bulk solvent from an MD simulation, then you can get "realistic"
> Rwork/Rfree in the 20%s. All of this is the main conclusion from this
> paper: https://dx.doi.org/10./febs.12922

Even within Phenix alone this is true if you switch between different scaling/bulk-solvent models or play with automation levels (such as ignoring reflection outliers, etc.).

Pavel

To unsubscribe from the CCP4BB list, click the following link: https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
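
For readers less familiar with the Rwork/Rfree numbers quoted in this message, here is a minimal Python sketch of the standard definition, R = sum| |Fobs| - k|Fcalc| | / sum|Fobs|, evaluated separately on working and free reflections. The function names, the flag convention (True marks a free reflection) and the simple least-squares scale k are assumptions of this sketch; refmac and phenix.refine use more elaborate scaling and bulk-solvent models, which is exactly the difference the message describes.

```python
def r_factor(f_obs, f_calc, k=1.0):
    """R = sum| |Fobs| - k*|Fcalc| | / sum|Fobs| over the given amplitudes."""
    num = sum(abs(o - k * c) for o, c in zip(f_obs, f_calc))
    return num / sum(f_obs)


def r_work_free(f_obs, f_calc, free_flags):
    """Return (Rwork, Rfree); free_flags[i] is True for test-set reflections.

    The overall scale k is a plain least-squares scale fitted on the
    working set only (an assumption of this sketch).
    """
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    k = sum(o * c for o, c in work) / sum(c * c for _, c in work)
    r_work = r_factor([o for o, _ in work], [c for _, c in work], k)
    r_free = r_factor([o for o, _ in free], [c for _, c in free], k)
    return r_work, r_free


if __name__ == "__main__":
    f_obs = [10.0, 20.0, 30.0, 40.0]
    f_calc = [9.0, 22.0, 29.0, 41.0]
    flags = [False, False, True, False]
    print(r_work_free(f_obs, f_calc, flags))
```

Because Rfree is computed on reflections that were never used to fit the model (or, here, the scale), it is the number to watch when comparing refinement protocols.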
Re: [ccp4bb] Experimental phasing vs molecular replacement
What may be counter-intuitive when looked at in one way may be perfectly expected from another point of view.

I look at these maps as the result of a single cycle of steepest descent refinement where the parameters are the density values of the map sampled on the grid. If you start with a map calculated from the coefficients

    (|Fcalc|, PhiCalc)

one cycle of steepest descent gives the shift

    2(|Fobs| - |Fcalc|, PhiCalc)

giving a new and improved map with the coefficients

    (|Fcalc|, PhiCalc) + 2(|Fobs| - |Fcalc|, PhiCalc)
      = (|Fcalc| + 2|Fobs| - 2|Fcalc|, PhiCalc)
      = (2|Fobs| - |Fcalc|, PhiCalc)

or the classic 2Fo-Fc map.

If you start with (|Fobs|, PhiObs) then your shift will be zero because the R value is already perfect. You cannot improve an experimental map unless you refine against other criteria.

On the other hand, if you start with (|Fcalc|, PhiObs) you have to question your sanity a bit because |Fobs| is so much better, in fact perfect. If you decide to press ahead anyway you find that the coefficients of the updated map are (2|Fobs| - |Fcalc|, PhiObs). These are better than (|Fcalc|, PhiObs) but still not as good as (|Fobs|, PhiObs). There really is no justification for simply attaching the observed phases to the calculated amplitudes.

The reason we are doing atomic model refinement (instead of density map refinement as described above) is to impose a lot of external knowledge such as atomic shape, solvent flatness, and various relationships between atoms. All of that information gets encoded in the Fcalc's (complex numbers), so their amplitudes and phases are tightly coupled. It is no surprise that just ripping out half and replacing it with something else would lower the quality of the contained information.

If you want a map to help evaluate your model when you have experimental phase information, you should run one cycle of steepest descent optimization on the map with both amplitudes and phases restrained.
If I ignore the complication of the much larger uncertainty of the phase relative to the amplitude, I believe the single cycle shift is a map calculated from the complex coefficients 2(Fobs-Fcalc), and this is your "difference map". The 2Fo-Fc equivalent would have the coefficients 2Fobs-Fcalc. Remember they are all complex numbers with their proper phases.

I did this derivation in the least-squares formalism, so I can't be confident of the m's and D's. I also assumed that Friedel's law holds, but that assumption was made with the traditional maps as well.

Dale Tronrud

On 12/6/2018 11:01 AM, James Holton wrote:
> Sorry for the confusion, I was going for brevity.
>
> Any time you do a thought experiment you make a fake-data data set, the
> "true" phases and "true" amplitudes become the ones you put into the
> simulation process. This is by definition. Is there potential for
> circular reasoning? Of course! But you can do controls:
>
> If you start with an ordinary single-conformer coordinate model and
> flat bulk solvent from refmac to make your Ftrue, then what you will
> find is that even after adding all plausible experimental errors to the
> data the final Rwork/Rfree invariably drop to small-molecule levels of
> 3-4%. This is true even if you prune the structure back, shake it, and
> rebuild it in various ways. The difference features always guide you
> back to Rwork/Rfree = 3/4%. However, if you refine with phenix.refine,
> you will find Rwork/Rfree stall at around 10-11%. This is because Ftrue
> came from refmac and refmac and phenix.refine have somewhat different
> bulk solvent models. If Ftrue comes from phenix and you refine with
> refmac you get similar "high" R values. High for a small molecule
> anyway. And, of course, if you get Ftrue from phenix and refine with
> phenix you also get final Rwork/Rfree = 3/4%. If you do more things that
> automated building doesn't do, like multi-headed side chains, or get the
> bulk solvent from an MD simulation, then you can get "realistic"
> Rwork/Rfree in the 20%s. All of this is the main conclusion from this
> paper: https://dx.doi.org/10./febs.12922
>
> But, in all these situations with various types of "systematic error"
> thrown in, because you know Ftrue and PHItrue you can compare different
> kinds of maps to this ground "truth" and see which is closest when you
> compare electron density. In my experience, this is the 2mFo-DFc map,
> phased with PHIcalc from the model. You might think that replacing
> PHIcalc with PHItrue would make the map even better because PHItrue is a
> "better" phase than PHIcalc, but it turns out this actually makes things
> worse! That's what is counter-intuitive: 2mFo-DFc amplitudes are
> "designed" to be used with the slightly-wrong phase of PHIcalc, not
> PHItrue.
>
> That's what I was trying to say.
>
> -James Holton
> MAD Scientist
>
> On 12/5/2018 7:36 PM, Keller, Jacob wrote:
> > That said, model phases are not so bad. In fact, in all my experime
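
The coefficient bookkeeping in Dale Tronrud's derivation can be checked numerically for a single reflection. The Python sketch below is only an illustration under stated assumptions: the function names and the example amplitudes and phases are made up, figure-of-merit and D weights are ignored (m = D = 1), and the "steepest descent" step is taken directly in the form given in the message (start coefficients plus twice the residual) rather than derived from a target function.

```python
import cmath


def sd_update_amplitude_only(f_obs_amp, f_start):
    """Start from complex coefficient f_start = (|F|, phi) and add the shift
    2(|Fobs| - |F|, phi): only amplitudes are restrained, the phase is kept."""
    phi = cmath.phase(f_start)
    shift = cmath.rect(2.0 * (f_obs_amp - abs(f_start)), phi)
    return f_start + shift


def sd_update_complex(f_obs, f_calc):
    """Amplitudes and phases both restrained: the shift is the complex
    difference 2(Fobs - Fcalc), so the result is 2Fobs - Fcalc."""
    return f_calc + 2.0 * (f_obs - f_calc)


# Hypothetical single reflection: observed and calculated structure factors.
f_obs = cmath.rect(120.0, 0.9)   # (|Fobs|, PhiObs)
f_calc = cmath.rect(100.0, 0.6)  # (|Fcalc|, PhiCalc)

# (|Fcalc|, PhiCalc) -> (2|Fobs| - |Fcalc|, PhiCalc), the classic 2Fo-Fc.
classic = sd_update_amplitude_only(abs(f_obs), f_calc)

# (|Fobs|, PhiObs) -> zero shift, the coefficient is unchanged.
unchanged = sd_update_amplitude_only(abs(f_obs), f_obs)

# (|Fcalc|, PhiObs) -> (2|Fobs| - |Fcalc|, PhiObs).
mixed = sd_update_amplitude_only(abs(f_obs), cmath.rect(abs(f_calc), cmath.phase(f_obs)))

# Fully complex version: 2Fobs - Fcalc with proper phases.
restrained = sd_update_complex(f_obs, f_calc)

for name, value in [("classic", classic), ("unchanged", unchanged),
                    ("mixed", mixed), ("restrained", restrained)]:
    print(name, round(abs(value), 3), round(cmath.phase(value), 3))
```

Note that `restrained` generally differs from `classic` in both amplitude and phase, which is the point of treating the coefficients as complex numbers.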
Re: [ccp4bb] Experimental phasing vs molecular replacement
Sorry for the confusion, I was going for brevity.

Any time you do a thought experiment you make a fake-data data set, the "true" phases and "true" amplitudes become the ones you put into the simulation process. This is by definition. Is there potential for circular reasoning? Of course! But you can do controls:

If you start with an ordinary single-conformer coordinate model and flat bulk solvent from refmac to make your Ftrue, then what you will find is that even after adding all plausible experimental errors to the data the final Rwork/Rfree invariably drop to small-molecule levels of 3-4%. This is true even if you prune the structure back, shake it, and rebuild it in various ways. The difference features always guide you back to Rwork/Rfree = 3/4%. However, if you refine with phenix.refine, you will find Rwork/Rfree stall at around 10-11%. This is because Ftrue came from refmac and refmac and phenix.refine have somewhat different bulk solvent models. If Ftrue comes from phenix and you refine with refmac you get similar "high" R values. High for a small molecule anyway. And, of course, if you get Ftrue from phenix and refine with phenix you also get final Rwork/Rfree = 3/4%. If you do more things that automated building doesn't do, like multi-headed side chains, or get the bulk solvent from an MD simulation, then you can get "realistic" Rwork/Rfree in the 20%s. All of this is the main conclusion from this paper: https://dx.doi.org/10./febs.12922

But, in all these situations with various types of "systematic error" thrown in, because you know Ftrue and PHItrue you can compare different kinds of maps to this ground "truth" and see which is closest when you compare electron density. In my experience, this is the 2mFo-DFc map, phased with PHIcalc from the model. You might think that replacing PHIcalc with PHItrue would make the map even better because PHItrue is a "better" phase than PHIcalc, but it turns out this actually makes things worse! That's what is counter-intuitive: 2mFo-DFc amplitudes are "designed" to be used with the slightly-wrong phase of PHIcalc, not PHItrue.

That's what I was trying to say.

-James Holton
MAD Scientist

On 12/5/2018 7:36 PM, Keller, Jacob wrote:
>> That said, model phases are not so bad. In fact, in all my experiments
>> with fake data the model-phased 2mFo-DFc map always has the best
>> correlation to the "true" map. If you substitute the "true" phases and
>> use the 2mFo-DFc coefficients you actually make things worse.
>> Counter-intuitive, but true.
>
> I don't understand what you mean by true and fake here--can you clarify?
> How are the true map and phases generated (from an original true model,
> I assume?), and how are the fake data generated? (Also from the true
> model?) I am wondering whether there is some circular reasoning?
>
> JPK
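
The kind of fake-data comparison described in this message can be sketched on a one-dimensional toy "crystal" in Python. Everything here is an assumption made for illustration: the grid size, the Gaussian atoms, the incomplete and slightly shifted model, the absence of noise, and the use of unweighted 2Fo-Fc coefficients (m = D = 1) in place of proper 2mFo-DFc. It shows how maps built from different coefficient/phase combinations are scored against a known "true" map; it is not expected to reproduce the ranking reported above, which depends on realistic errors and weighting.

```python
import cmath
import math

N = 64  # grid points along a hypothetical 1D unit cell


def density(atoms):
    """Periodic sum of Gaussians; atoms is a list of (position, height)."""
    rho = [0.0] * N
    for x0, height in atoms:
        for i in range(N):
            d = min(abs(i - x0), N - abs(i - x0))  # periodic distance
            rho[i] += height * math.exp(-0.5 * (d / 1.5) ** 2)
    return rho


def dft(rho):
    """Structure factors F(h) from sampled density (plain O(N^2) transform)."""
    return [sum(r * cmath.exp(-2j * math.pi * h * i / N) for i, r in enumerate(rho))
            for h in range(N)]


def idft(coeffs):
    """Map from coefficients; the real part is taken since Friedel symmetry holds."""
    return [sum(c * cmath.exp(2j * math.pi * h * i / N) for h, c in enumerate(coeffs)).real / N
            for i in range(N)]


def cc(a, b):
    """Pearson correlation coefficient between two maps."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


def compare_maps():
    true_atoms = [(10.0, 1.0), (25.0, 0.8), (40.0, 1.2), (52.0, 0.6)]
    model_atoms = [(10.5, 1.0), (25.0, 0.8), (40.0, 1.2)]  # one atom missing, one shifted
    rho_true = density(true_atoms)
    f_true = dft(rho_true)              # plays the role of (Ftrue, PHItrue)
    f_calc = dft(density(model_atoms))  # (Fcalc, PHIcalc)
    coeffs = {
        "|Fo|, phi_true": f_true,
        "2|Fo|-|Fc|, phi_calc": [cmath.rect(2 * abs(t) - abs(c), cmath.phase(c))
                                 for t, c in zip(f_true, f_calc)],
        "2|Fo|-|Fc|, phi_true": [cmath.rect(2 * abs(t) - abs(c), cmath.phase(t))
                                 for t, c in zip(f_true, f_calc)],
    }
    return {name: cc(idft(c), rho_true) for name, c in coeffs.items()}


results = compare_maps()
for name, value in results.items():
    print(f"{name:24s} CC to true map = {value:.4f}")
```

Adding simulated measurement error to the amplitudes and repeating the comparison with properly weighted coefficients would bring the toy closer to the experiment described in the message.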
Re: [ccp4bb] Experimental phasing vs molecular replacement
A point I like to make to new trainees trying to solve structures: model phases from a good model are pretty good, but from a practical viewpoint, if your initial model for molecular replacement is that good, the resulting structure will probably not tell you much you didn't already know (with exceptions, of course). If you only have an existing structure for, say, half of what's in your asymmetric unit, SAD phases will be less biased than model phases from molecular replacement, even though both may be noisy.

~~~
Phoebe A. Rice
Dept. of Biochem & Mol. Biol. and Committee on Microbiology
https://voices.uchicago.edu/phoebericelab/

On 12/5/18, 9:14 PM, "CCP4 bulletin board on behalf of James Holton" wrote:

> It is true that MAD phasing can give you hyper-accurate phases. This is
> because you are measuring the heavy atom signal in both directions on
> the Harker diagram, allowing the phase to be solved analytically. The
> phasing signal is noisy, of course, but you can fix that either with a
> bigger heavy atom (more signal) or by doing a lot of averaging (less
> noise). You will probably need more than one crystal.
>
> SAD, on the other hand, can never give you very good phases because the
> phase probability distributions are all bimodal. The technology that
> makes SAD practical is solvent flattening, but as soon as you start
> doing things like solvent flattening you are already imposing a model,
> and every model comes with some amount of bias. How important that bias
> is depends on the question you are trying to answer.
>
> MIR, like MAD, can get arbitrarily accurate phases, but this and every
> other technique requires a high degree of isomorphism. In practice,
> essentially all experimental phasing attempts are really trying to get
> you just over that ever-elusive tipping point of phase quality where
> solvent flattening and model building can take you the rest of the way.
> So, in the end what you have are model phases, just like if you had done
> MR. It's sad really how fleeting the involvement of experimental phases
> is in essentially all MAD/SAD structure determinations. Pun intended.
>
> That said, model phases are not so bad. In fact, in all my experiments
> with fake data the model-phased 2mFo-DFc map always has the best
> correlation to the "true" map. If you substitute the "true" phases and
> use the 2mFo-DFc coefficients you actually make things worse.
> Counter-intuitive, but true.
>
> -James Holton
> MAD Scientist
>
> On 12/5/2018 12:07 AM, 香川 亘 wrote:
> > Dear all,
> >
> > It is my understanding that experimental phasing (e.g. Se-SAD), in
> > principle, yields better electron density maps than molecular
> > replacement for protein regions with weak electron densities
> > (partially disordered or flexible). I would appreciate it if someone
> > could comment on whether my understanding is correct or not. If there
> > are any good examples or literature on this issue, I would be grateful
> > to know about them.
> >
> > I thank you in advance.
> >
> > Wataru Kagawa
[ccp4bb] AW: [ccp4bb] Experimental phasing vs molecular replacement
I think Jacob is right. As long as protein crystals contain about 10% "dark matter"* not accounted for in any model, we cannot fake a "true" electron density map, and it is then not surprising that a 2mFo-DFc map is closer to a model-based fake map than a map based on experimental phases.

HS

*This "dark matter" causes the best R-factors for protein crystals to be ~15% instead of the ~5% expected from measurement errors, and may include disorder, anisotropy, imperfectly modelled solvent, etc.

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Keller, Jacob
Sent: Thursday, 6 December 2018 04:37
To: CCP4BB@JISCMAIL.AC.UK
Subject: [EXTERNAL] Re: [ccp4bb] Experimental phasing vs molecular replacement

>> That said, model phases are not so bad. In fact, in all my experiments
>> with fake data the model-phased 2mFo-DFc map always has the best
>> correlation to the "true" map. If you substitute the "true" phases and
>> use the 2mFo-DFc coefficients you actually make things worse.
>> Counter-intuitive, but true.

I don't understand what you mean by true and fake here--can you clarify? How are the true map and phases generated (from an original true model, I assume?), and how are the fake data generated? (Also from the true model?) I am wondering whether there is some circular reasoning?

JPK
Re: [ccp4bb] Experimental phasing vs molecular replacement
>> That said, model phases are not so bad. In fact, in all my experiments
>> with fake data the model-phased 2mFo-DFc map always has the best
>> correlation to the "true" map. If you substitute the "true" phases and
>> use the 2mFo-DFc coefficients you actually make things worse.
>> Counter-intuitive, but true.

I don't understand what you mean by true and fake here--can you clarify? How are the true map and phases generated (from an original true model, I assume?), and how are the fake data generated? (Also from the true model?) I am wondering whether there is some circular reasoning?

JPK
Re: [ccp4bb] Experimental phasing vs molecular replacement
It is true that MAD phasing can give you hyper-accurate phases. This is because you are measuring the heavy atom signal in both directions on the Harker diagram, allowing the phase to be solved analytically. The phasing signal is noisy, of course, but you can fix that either with a bigger heavy atom (more signal) or by doing a lot of averaging (less noise). You will probably need more than one crystal.

SAD, on the other hand, can never give you very good phases because the phase probability distributions are all bimodal. The technology that makes SAD practical is solvent flattening, but as soon as you start doing things like solvent flattening you are already imposing a model, and every model comes with some amount of bias. How important that bias is depends on the question you are trying to answer.

MIR, like MAD, can get arbitrarily accurate phases, but this and every other technique requires a high degree of isomorphism. In practice, essentially all experimental phasing attempts are really trying to get you just over that ever-elusive tipping point of phase quality where solvent flattening and model building can take you the rest of the way. So, in the end what you have are model phases, just like if you had done MR. It's sad really how fleeting the involvement of experimental phases is in essentially all MAD/SAD structure determinations. Pun intended.

That said, model phases are not so bad. In fact, in all my experiments with fake data the model-phased 2mFo-DFc map always has the best correlation to the "true" map. If you substitute the "true" phases and use the 2mFo-DFc coefficients you actually make things worse. Counter-intuitive, but true.

-James Holton
MAD Scientist

On 12/5/2018 12:07 AM, 香川 亘 wrote:
> Dear all,
>
> It is my understanding that experimental phasing (e.g. Se-SAD), in
> principle, yields better electron density maps than molecular
> replacement for protein regions with weak electron densities
> (partially disordered or flexible). I would appreciate it if someone
> could comment on whether my understanding is correct or not. If there
> are any good examples or literature on this issue, I would be grateful
> to know about them.
>
> I thank you in advance.
>
> Wataru Kagawa
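
The two-fold phase ambiguity behind the "bimodal" remark in this message, and the way a second independent measurement resolves it, can be illustrated with the Harker construction. The Python sketch below uses the simpler isomorphous (SIR/MIR) geometry as a stand-in: from |FPH| = |FP + FH| the law of cosines gives cos(phiP - phiH) = (|FPH|^2 - |FP|^2 - |FH|^2) / (2|FP||FH|), so one derivative leaves two candidate protein phases and a second derivative picks the common one. The function names and the numerical values are hypothetical, the data are error-free, and a real SAD or MAD treatment works with probability distributions rather than exact intersections.

```python
import cmath
import math


def candidate_phases(fp_amp, fph_amp, fh):
    """Two protein phases consistent with |FPH| = |FP + FH| for a known
    heavy-atom structure factor fh (complex)."""
    fh_amp, phi_h = abs(fh), cmath.phase(fh)
    cos_d = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    cos_d = max(-1.0, min(1.0, cos_d))  # guard against rounding just outside [-1, 1]
    d = math.acos(cos_d)
    return (phi_h + d, phi_h - d)


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))


def resolve(c1, c2):
    """Pick the candidate from c1 that best agrees with some candidate in c2."""
    return min((angular_distance(a, b), a) for a in c1 for b in c2)[1]


# Hypothetical error-free example: the "true" protein phase is 1.0 rad.
fp = cmath.rect(100.0, 1.0)
fh1 = cmath.rect(20.0, 0.3)   # heavy-atom contribution, derivative 1
fh2 = cmath.rect(15.0, 2.0)   # heavy-atom contribution, derivative 2

c1 = candidate_phases(abs(fp), abs(fp + fh1), fh1)  # two-fold ambiguity
c2 = candidate_phases(abs(fp), abs(fp + fh2), fh2)
phi = resolve(c1, c2)

print("derivative 1 candidates:", c1)
print("derivative 2 candidates:", c2)
print("resolved phase:", phi)
```

With noisy amplitudes the two circles no longer intersect exactly at the same point, which is where the "more signal or more averaging" trade-off mentioned in the message comes in.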
Re: [ccp4bb] Experimental phasing vs molecular replacement
Hello,

I think what you are alluding to is model bias in (macromolecular) crystallography. You should consult the publications on this topic, and those on the map coefficients used to compute electron density maps (e.g. SIGMAA weighting), OMIT maps, current refinement techniques...

"Heavy-atom" phasing also suffers (or may suffer) from "imperfections" due to the heavy-atom model used for phasing. Some of us remember ripples in electron density maps.

Fred.

On 2018-12-05 09:07, 香川 亘 wrote:
> Dear all,
>
> It is my understanding that experimental phasing (e.g. Se-SAD), in
> principle, yields better electron density maps than molecular
> replacement for protein regions with weak electron densities
> (partially disordered or flexible). I would appreciate it if someone
> could comment on whether my understanding is correct or not. If there
> are any good examples or literature on this issue, I would be grateful
> to know about them.
>
> I thank you in advance.
>
> Wataru Kagawa
[ccp4bb] Experimental phasing vs molecular replacement
Dear all,

It is my understanding that experimental phasing (e.g. Se-SAD), in principle, yields better electron density maps than molecular replacement for protein regions with weak electron densities (partially disordered or flexible). I would appreciate it if someone could comment on whether my understanding is correct or not. If there are any good examples or literature on this issue, I would be grateful to know about them.

I thank you in advance.

Wataru Kagawa