[ccp4bb] Re: Where to cut the data in this medium resolution dataset
Dear Stefan,

Did you have a look at the NCS-related helices? To me it looks like your NCS restraints on B-factors are too strong, or not valid for your crystal packing.

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of Stefan Gajewski
Sent: Wednesday, 24 July 2013 07:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Where to cut the data in this medium resolution dataset

Nat,

> What do correct B-factors look like? What refinement strategy did you use for them?

1) I see strong positive density in the Fo-Fc map along the backbone of two turns of a correctly placed alpha helix, so the B-factors are too high in that region. The model after refinement suggests less scattering in that region than is observed, which is most likely explained by incorrectly high B-factors.

2) xyz, TLS and individual B-factors (no grouping); secondary-structure and Ramachandran restraints on coordinates, and NCS restraints on coordinates and B-factors.

> > Note that the R-free value in the 3.4A shell is lower than the R-work (and also the Rpim in that shell!) which clearly indicates this refinement was not stable.
>
> I don't think it indicates anything about the stability of refinement - my guess would be that the NCS is biasing R-free. I suppose it could also indicate that the data in the 3.6-3.4 range are basically noise, although if the maps look better then that would suggest the opposite.

I think the refinement is not parametrized correctly.

Thank you,
S.
Re: [ccp4bb] Where to cut the data in this medium resolution dataset
Hi Stefan,

you write

> The diffraction pattern looks great, the 3.4A reflections are visible by eye and the edge of the detector is about 2.8A.

and for the 3.4A data

> Mean((I)/sd(I)) in the highest shell is 2.3

I'm tempted to ask: what prevents you from using higher resolution data to, say, 3.2 or 3.0 A - what do you gain by throwing reflections away? Using higher resolution _will_ reduce overfitting, and should improve the model.

In the presence of NCS, Rfree will be biased towards Rwork. In your case of high-order NCS, you might consider choosing the free reflections in thin shells in reciprocal space.

HTH,
Kay
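Kay's suggestion of picking free reflections in thin shells can be sketched as follows. This is a minimal illustration, not any program's actual implementation: it bins reflections by 1/d^3 (so shells enclose roughly equal reciprocal-space volume) and marks whole shells as free, so that NCS-related working reflections at the same resolution cannot leak information into the test set. The function name and parameters are invented for the example.

```python
import numpy as np

def thin_shell_free_flags(d_spacings, n_shells=100, free_fraction=0.05, seed=0):
    """Assign R-free flags in thin resolution shells (sketch).

    With high-order NCS, randomly chosen free reflections are correlated
    with working reflections at the same resolution; flagging entire thin
    shells reduces that bias.

    d_spacings : array of d-spacings (Angstrom), one per unique reflection.
    Returns a boolean array, True = free reflection.
    """
    # Bin on 1/d^3 so each shell covers a similar reciprocal-space volume
    s3 = 1.0 / np.asarray(d_spacings, dtype=float) ** 3
    edges = np.linspace(s3.min(), s3.max(), n_shells + 1)
    shell = np.clip(np.digitize(s3, edges) - 1, 0, n_shells - 1)

    # Pick a random subset of whole shells to be the free set
    rng = np.random.default_rng(seed)
    n_free = max(1, int(round(n_shells * free_fraction)))
    free_shells = rng.choice(n_shells, size=n_free, replace=False)
    return np.isin(shell, free_shells)
```

In practice one would use the thin-shell options of the standard free-flag tools rather than roll one's own, but the geometry of the selection is the same.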
Re: [ccp4bb] Where to cut the data in this medium resolution dataset
Hi

I'd agree with Kay here, since the edge of the detector is at ~2.8Å. It is almost always worthwhile integrating to a higher resolution than you can see spots on the images - for what I would call normal datasets, I would always integrate to ~0.2Å higher (as a first estimate), then after examining scaling statistics (e.g. correlation coefficients!) decide if you can actually integrate even higher. For modern extra-fine phi slicing, it's usually worthwhile integrating to an even higher resolution before making any decisions about the true resolution, especially if you have non-negligible background on the images.

There are a couple of issues with integrating to *much* higher resolution than you actually have - one is that the crystal/detector parameter refinement becomes less stable if you include too many reflections with insignificant I/sig(I) - i.e. refining against noise - and the other is in optimising the profile measurement boxes (again, using noise to determine these will not lead to optimal values).

BTW, unless you have a *really* good reason for scaling with SCALA, I would seriously consider updating your CCP4 installation and using Aimless instead. Phil Evans is no longer developing SCALA, and doesn't seem to have updated the SCALA release notes since 2010, so I suspect that any newer versions (most recent is 3.3.21) only contain minor bug fixes (but I could be wrong).

On 23 Jul 2013, at 08:11, Kay Diederichs wrote:

> Hi Stefan, you write "The diffraction pattern looks great, the 3.4A reflections are visible by eye and the edge of the detector is about 2.8A." and for the 3.4A data "Mean((I)/sd(I)) in the highest shell is 2.3". I'm tempted to ask: what prevents you from using higher resolution data to, say, 3.2 or 3.0 A - what do you gain by throwing reflections away? Using higher resolution _will_ reduce overfitting, and should improve the model. In the presence of NCS, Rfree will be biased towards Rwork. In your case of high-order NCS, you might consider choosing the free reflections in thin shells in reciprocal space. HTH, Kay
>
> > The diffraction pattern looks great, the 3.4A reflections are visible by eye and the edge of the detector is about 2.8A. The crystals were 10x20x50 um in size and spacegroup is P6522.

Harry
--
** note change of address **
Dr Harry Powell, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH
Chairman of European Crystallographic Association SIG9 (Crystallographic Computing)
Re: [ccp4bb] Where to cut the data in this medium resolution dataset
Nat,

> What do correct B-factors look like? What refinement strategy did you use for them?

1) I see strong positive density in the Fo-Fc map along the backbone of two turns of a correctly placed alpha helix, so the B-factors are too high in that region. The model after refinement suggests less scattering in that region than is observed, which is most likely explained by incorrectly high B-factors.

2) xyz, TLS and individual B-factors (no grouping); secondary-structure and Ramachandran restraints on coordinates, and NCS restraints on coordinates and B-factors.

> > Note that the R-free value in the 3.4A shell is lower than the R-work (and also the Rpim in that shell!) which clearly indicates this refinement was not stable.
>
> I don't think it indicates anything about the stability of refinement - my guess would be that the NCS is biasing R-free. I suppose it could also indicate that the data in the 3.6-3.4 range are basically noise, although if the maps look better then that would suggest the opposite.

I think the refinement is not parametrized correctly.

Thank you,
S.
[ccp4bb] Where to cut the data in this medium resolution dataset
Hey!

I was reading a lot lately on data processing and the ongoing debate in the community on how to report Table 1. Here is an example of medium-resolution data integrated with XDS, merged in SCALA and preliminarily refined in phenix.

Cut at 3.4A:

                                  Overall   InnerShell   OuterShell
 Low resolution limit               49.06      49.06        3.58
 High resolution limit               3.40      10.74        3.40
 Rmerge                             0.224      0.050       1.324
 Rmerge in top intensity bin        0.049      -           -
 Rmeas (within I+/I-)               0.235      0.052       1.391
 Rmeas (all I+ & I-)                0.235      0.052       1.391
 Rpim (within I+/I-)                0.067      0.014       0.407
 Rpim (all I+ & I-)                 0.067      0.014       0.407
 Fractional partial bias            0.000      0.000       0.000
 Total number of observations      275312      8491       39783
 Total number unique                21137       765        3016
 Mean((I)/sd(I))                     10.2      31.2         2.3
 Completeness                        98.4      95.7        98.4
 Multiplicity                        13.0      11.1        13.2

 r_work = 0.2461 (0.3998)   r_free = 0.2697 (0.3592)
 SigmaA highest shell = 0.78
 scale factor highest shell (phenix.refine) = 0.87

Cut at 3.6A:

                                  Overall   InnerShell   OuterShell
 Low resolution limit               49.59      49.59        3.80
 High resolution limit               3.60      11.39        3.60
 Rmerge                             0.231      0.048       0.819
 Rmerge in top intensity bin        0.045      -           -
 Rmeas (within I+/I-)               0.242      0.050       0.860
 Rmeas (all I+ & I-)                0.242      0.050       0.860
 Rpim (within I+/I-)                0.069      0.014       0.249
 Rpim (all I+ & I-)                 0.069      0.014       0.249
 Fractional partial bias            0.000      0.000       0.000
 Total number of observations      230945      6997       33402
 Total number unique                17794       646        2531
 Mean((I)/sd(I))                     11.5      30.7         3.7
 Completeness                        98.3      95.5        98.7
 Multiplicity                        13.0      10.8        13.2

 r_work = 0.2372 (0.3585)   r_free = 0.2663 (0.3770)
 SigmaA highest shell = 0.79
 scale factor highest shell (phenix.refine) = 0.95

XSCALE gives significantly lower average Rrim and Rmerge for both integrations (~18%), and CC(1/2) is above 0.7 in all bins.

The diffraction pattern looks great, the 3.4A reflections are visible by eye and the edge of the detector is about 2.8A. The crystals were 10x20x50 um in size and the spacegroup is P6522. The maps show signs of overfitting; the B-factors do not look correct in my opinion.
Note that the R-free value in the 3.4A shell is lower than the R-work (and also lower than the Rpim in that shell!), which clearly indicates this refinement was not stable. The structure contains no beta sheets, and refinement also profits greatly from very rigid high-order NCS. The maps are very detailed - in fact better than some 2.8A maps I've seen before. The 0.2A in question here are actually quite helpful for increasing the map quality, so I keep wondering if I should deposit the structure with them or keep them only for my own interpretation.

Before I continue optimizing the integration/refinement, I would like to hear suggestions from the experts: where should I make the resolution cut-off in this case? Do I have all the information I need to make that decision? What arguments should I present when dealing with the reviewers? I mean, the Rrim/Rmerge values are really very high.

Thank you for your input,
Stefan Gajewski
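Since CC(1/2) is the statistic Stefan cites in favour of keeping the 3.4A data, here is a minimal sketch of how it is defined: the observations of each unique reflection are split at random into two halves, and CC(1/2) is the Pearson correlation of the two half-dataset mean intensities within a resolution shell. The function name and inputs are invented for illustration; real programs (XDS/XSCALE, Aimless) do this per shell with more care.

```python
import numpy as np

def cc_half(groups, seed=0):
    """Estimate CC(1/2) for one resolution shell (sketch).

    groups : list of arrays of repeated intensity measurements, one
    array per unique reflection in the shell.
    """
    rng = np.random.default_rng(seed)
    half_a, half_b = [], []
    for obs in groups:
        obs = np.asarray(obs, dtype=float)
        if len(obs) < 2:
            continue  # need at least one observation in each half
        perm = rng.permutation(len(obs))
        half = len(obs) // 2
        half_a.append(obs[perm[:half]].mean())
        half_b.append(obs[perm[half:]].mean())
    # Pearson correlation between half-dataset mean intensities
    return np.corrcoef(half_a, half_b)[0, 1]
```

The appeal of CC(1/2) for the cutoff question is exactly the one in this thread: unlike Rmerge, it stays interpretable at high multiplicity and in weak shells, so "CC(1/2) above 0.7 in all bins" is evidence of real signal at 3.4A.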
Re: [ccp4bb] Where to cut the data in this medium resolution dataset
On Mon, Jul 22, 2013 at 10:19 AM, Stefan Gajewski <sgajew...@gmail.com> wrote:

> The maps show signs of overfitting; the B-factors do not look correct in my opinion.

What do correct B-factors look like? What refinement strategy did you use for them?

> Note that the R-free value in the 3.4A shell is lower than the R-work (and also the Rpim in that shell!) which clearly indicates this refinement was not stable.

I don't think it indicates anything about the stability of refinement - my guess would be that the NCS is biasing R-free. I suppose it could also indicate that the data in the 3.6-3.4 range are basically noise, although if the maps look better then that would suggest the opposite.

> The structure contains no beta sheets and refinement also profits greatly from very rigid high-order NCS. The maps are very detailed, in fact better than some 2.8A maps I've seen before. The 0.2A in question here are actually quite helpful to increase the map quality, so I keep wondering if I should deposit the structure with them or keep them only for my own interpretation.

I would deposit the data to 3.4Å in any case; what cutoff you refine the structure to is a separate decision.

> Before I continue optimizing the integration/refinement I would like to hear suggestions from the experts where to make the resolution cut-off in this case? Do I have all information I need to make that decision? What arguments should I present when dealing with the reviewers? I mean, the Rrim/Rmerge values are really very high.

Do what Karplus & Diederichs suggest: take the structure refined to 3.4Å and recalculate the R-factors for that model with the data cut to 3.6Å. If the R-free calculated this way is below the R-free for the model refined to only 3.6Å, then the extra 0.2Å is contributing real information and improving the quality of your model, which is the best justification for extending to higher resolution.

-Nat
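Nat's paired-refinement recipe reduces to a single comparison once the two R-free values are in hand. The sketch below just encodes that decision rule; the recomputed R-free in the usage comment is hypothetical (the thread does not report it), while 0.2663 is the 3.6A R-free from Stefan's table.

```python
def extra_shell_helps(rfree_hi_at_lo, rfree_lo_at_lo, tol=0.0):
    """Karplus/Diederichs-style paired-refinement check (sketch).

    rfree_hi_at_lo : R-free of the model refined against the full
        3.4A data, recomputed against only the 3.6A subset.
    rfree_lo_at_lo : R-free of the model refined against the 3.6A
        data alone (0.2663 in this thread).

    Both numbers are computed over the same reflections, so they are
    directly comparable; the raw 3.4A R-free is not, because adding a
    weak shell inflates R values even when the model improves.
    """
    return rfree_hi_at_lo < rfree_lo_at_lo - tol

# Hypothetical example: if the 3.4A model gave R-free = 0.262 on the
# 3.6A subset, the extra 0.2A of data improved the model:
# extra_shell_helps(0.262, 0.2663)  ->  True
```

The `tol` parameter is an optional margin for deciding that a difference is too small to act on; the original paired-refinement protocol simply compares the two values.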