Dear Arpita,

> Apologies if the below query seems very naive!

Your query is not at all naive; it is very probing. Sorry for the necessarily long reply to your questions - but there are a number of topics you raise where we think a large amount of confusion still exists. Please note that this reply represents /our/ view of things only, of course.

> This is to query on the consensus to use Staraniso for pdb submission. We
> have solved a structure previously at 2.3 A resolution.

So you had a dataset where you decided on a sphere in reciprocal space with a radius corresponding to 2.3A as a cut-off surface - based on some kind of local analysis that convinced you that all the measured reflections within that sphere (i.e. 2.3A and lower resolution) are observed and should be kept, while all measured reflections outside that sphere should be regarded as unobserved and can be discarded as pure noise.

> The same data (after reindexing the diffraction images in autoPROC) and
> after reprocessing by ellipsoidal scaling in Staraniso gave structure at
> ~2.16 A.

OK, first some clarification:

* The scaling of the unmerged reflection data in autoPROC (using AIMLESS) is neither spherical nor ellipsoidal in itself: it uses the data as it is with the typical scale parameterisation in AIMLESS, i.e. a scale k and an image B-factor (plus some absorption), all with default smoothing. This then leads to two output reflection files:

    aimless_alldata_unmerged.mtz = Scaled and unmerged reflections without cut-off.
    aimless_alldata.mtz          = Scaled and merged reflections without cut-off.

* The latter (scaled and merged reflection data without any cut-off) is then given to STARANISO to do the following:

  (a) Compute various local statistics that are then used to define a cut-off surface.

  (b) Assume that all reflections within that cut-off surface should be kept (and could have been observed) and all those outside should be ignored.

      ==> See how that is extremely similar to the type of analysis you did with the initial 2.3A data?
      One type of analysis (local 1D-shells of data in d*) led to an isotropic sphere as a cut-off surface, while another (local 3D-spheres in reciprocal space) led to an anisotropic cut-off surface. Remember that "anisotropic" just means "not isotropic" - it doesn't mean "ellipsoidal" (diffraction from a cubic crystal can be anisotropic since the [100], [110] and [111] directions have quite different properties, yet attempts to fit an ellipsoid to it will produce a sphere). The cut-off surface assigned this way by STARANISO can have any shape really (including being a sphere) because the analysis via local spheres doesn't assume/enforce isotropy - while the analysis via spherical shells does.

      So up to that point there is no difference really between the two approaches: using a criterion to define a cut-off surface and considering data within the surface as observable and data outside as unobservable. It is only the assumptions on which the criterion is based that differ: one assumes the data is isotropic, while the other doesn't.

      ==> The notion of "resolution" is a bit complicated in general here: if your crystal diffracted better in some directions than in others, a better description is the use of "diffraction limit" in some directions - e.g. defined as the principal axes of an ellipsoid fitted to the cut-off surface. This is what autoPROC/STARANISO provides.

  (c) Analyse the anisotropic fall-off in intensity of the data within the cut-off surface to derive anisotropic correction factors and apply them to the data. This is similar to the anisotropic scaling a refinement program would perform using the current model as a reference (to anisotropically scale the observed data to the model). Here we apply an internal anisotropic scaling, without a reference to any model.

> The previously solved structure did not have significant anisotropy
> according to Aimless, so anisotropic scaling was not performed that time.

See above: you most likely did use anisotropic scaling during refinement (with the model as the reference).

Please note that using AIMLESS alone is not the best way to detect anisotropy; that is not its main purpose. As far as we know, it looks only along the crystal axes for anisotropy (whereas STARANISO looks in all directions). That means it will not detect anisotropy eigenvectors lying close to diagonals as can happen in monoclinic (a*-c* plane only since an anisotropy eigenvector is constrained to be parallel to the b* axis) and triclinic lattices. So if AIMLESS says there is significant anisotropy you can believe it; OTOH if it says no anisotropy was detected you should definitely perform further checks. Absence of evidence is not evidence of absence! In higher-symmetry lattices all eigenvectors are constrained to be parallel to crystal axes so this problem doesn't arise and AIMLESS should detect anisotropy correctly in those cases.

> The overall spherical completeness of Staraniso structure is low (~73%)
> while Ellipsoidal completeness is ~94%.

Let's see what "completeness" means: what percentage of observable reflections did we actually observe? So the important point is to define "observable" ... which you have done above already! The cut-off surface defines the region in reciprocal space that you deemed "observable", i.e. either a sphere (assuming isotropy) or a general cut-off surface (that can be simplified through a fitted ellipsoid).

If you want to judge the completeness for data analysed by STARANISO, you can /not/ use the spherical completeness: the completeness computation needs to be done with the cut-off surface employed - which in the case of STARANISO data is the anisotropic cut-off surface (simplified through a fitted ellipsoid). So you /have/ to use the ellipsoidal completeness as a measure here.
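
As a rough illustration of the completeness argument (a toy sketch only, and not how STARANISO or any other program actually computes it), the following counts reciprocal lattice points for a made-up orthogonal cell. The cell edges, the diffraction limits along a*, b*, c*, and the rule that arbitrarily drops about 1 in 17 reflections are all hypothetical values chosen for the example; symmetry is ignored.

```python
from itertools import product

# Hypothetical toy inputs, chosen for illustration only (not the real data set).
CELL = (40.0, 40.0, 40.0)    # orthogonal cell edges a, b, c in Angstrom
LIMITS = (2.16, 2.16, 3.0)   # assumed diffraction limits along a*, b*, c* in Angstrom
D_MIN = min(LIMITS)          # the enclosing sphere is set by the best limit


def s_squared(hkl, cell):
    """Squared reciprocal-lattice vector length (1/d^2) for an orthogonal cell."""
    return sum((i / edge) ** 2 for i, edge in zip(hkl, cell))


def in_ellipsoid(hkl, cell, limits):
    """True if hkl lies inside the ellipsoid with semi-axes 1/limit along a*, b*, c*."""
    return sum((i * lim / edge) ** 2 for i, edge, lim in zip(hkl, cell, limits)) <= 1.0


def was_measured(hkl):
    """Stand-in for 'has a real observation': arbitrarily drop about 1 in 17."""
    h, k, l = hkl
    return (h + 2 * k + 3 * l) % 17 != 0


# All lattice points inside the sphere at D_MIN (no symmetry reduction, origin excluded).
hmax = [int(edge / D_MIN) for edge in CELL]
sphere = [hkl for hkl in product(*(range(-m, m + 1) for m in hmax))
          if hkl != (0, 0, 0) and s_squared(hkl, CELL) <= 1.0 / D_MIN ** 2]

# Points deemed "observable" by the anisotropic (here ellipsoidal) cut-off surface.
ellipsoid = [hkl for hkl in sphere if in_ellipsoid(hkl, CELL, LIMITS)]

# Only points inside the cut-off surface carry observations.
measured = [hkl for hkl in ellipsoid if was_measured(hkl)]

spherical = len(measured) / len(sphere)
ellipsoidal = len(measured) / len(ellipsoid)
print(f"spherical completeness:   {spherical:.1%}")
print(f"ellipsoidal completeness: {ellipsoidal:.1%}")
```

With these made-up numbers the ellipsoidal value comes out near 94% while the spherical one stays below 70%, even though the set of measured reflections is identical in both cases: only the denominator (what is counted as "observable") has changed.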

Using the spherical completeness for data that went through STARANISO is the same as pretending that a crystal diffracts to 1.0A when it only ever gives observable intensities to, say, 1.5A - and still computing completeness to the 1.0A limit. The overall completeness would be ridiculously low and no-one would choose those two different cut-off surfaces (one representing actual data, the other some over-optimistic assumption) and do that kind of computation, right? In the same way, we shouldn't pretend that an anisotropically diffracting crystal could provide us with all observations within a sphere in reciprocal space: there are no observations in certain directions (in the same way that there aren't any observations at 1.0A if the crystal only diffracts to 1.5A).

The bottom line is: if the ellipsoidal completeness is significantly higher than the spherical one it means that there is significant anisotropy.

> Parallel isotropic scaling gives structure with 99.6% completeness (but
> 2.3 A resolution).

Remember that completeness computations only look at Miller indices (reciprocal lattice points): it is the responsibility of whoever calls that computational step to provide only reciprocal lattice points with actual observations ... we are after all interested in knowing what fraction of the possible data we collected (and not how many HKL values we have in a file).

It might be easier to visualise those very simple concepts by looking at

  https://staraniso.globalphasing.org/anisotropy_about.html

> The statistics (R merge and others) are better for Staraniso structure
> (also benefited from removing specific frames with high R merge as
> indicated by Staraniso).

Are you sure you removed images based on Rmerge values in STARANISO? - autoPROC is doing a fair bit of analysis to determine poor image ranges (not based on R-values though) ... so maybe you mean that feature?

> Also the interatomic distances in regions of interest in the staraniso
> structure is on par with parallel molecular dynamics simulation data.

So your model is "better" in terms of explaining some other, externally determined results when using autoPROC/STARANISO data? Very good.

> The questions are:
>
> 1. Can the Staraniso structure be submitted to pdb saying reprocessed
> structure at higher resolution (through Staraniso)?

Your model is more meaningful as judged by external information (MD simulations) ... so why should one not deposit the data as-is when it clearly was instrumental in providing you that added information? Remember: anything happening in STARANISO is done without seeing anything of your model (so no model bias is possible at all)!

It would be effectively a new deposition with different data, as opposed to a re-refinement of the structure with the same data, so yes. Unless you plan to obsolete the first deposition, you should make a reference to it explaining how it's related.

As part of autoPROC/STARANISO processing, we are providing a deposition-ready mmCIF file that contains multiple datablocks - since different downstream programs and methods might require different stages of data processing and analysis. For the full history and background, please see:

* "Introduce Global Phasing Extensions to v50 dictionary" (February 2021):
  https://github.com/wwpdb-dictionaries/mmcif_pdbx/commit/81a037c4bac0ccebdd8772717857d3527cb47db3
* "Improved support for extended PDBx/mmCIF structure factor files" (January 2022):
  https://www.rcsb.org/news/feature/61df48320fea311d064aa4de
* https://www.globalphasing.com/buster/wiki/index.cgi?DepositionMmCif
* https://www.wwpdb.org/deposition/preparing-pdbx-mmcif-files

> 2. What is the factor more important for a structure: completeness
> (spherical vs ellipsoidal) or R statistics?

A model is better if the X-Ray data it is refined against contains all the information and leaves out pure noise.
Obviously, that is an ideal that is hard to achieve - so during processing we try to define a cut-off surface that will include most signal and exclude most noise. That approach is taken both for isotropically diffracting crystals and for anisotropically diffracting ones.

Completeness (see discussion above) is important, handling of poor image ranges can help (crystal moving out of beam etc.), R-values are just numbers (especially ignore Rmerge!), <I/sigI> and CC_1/2 (the latter except for significantly anisotropic data) are good metrics ... but again: these are just numbers. If the density is clearer because the model refinement works better /and/ the model interpretation is more meaningful as judged by external results: that's all that matters, right?

Everything else is just numbers ... but if you are looking at numbers: the ellipsoidal completeness out to the cut-off based on the local average I/sigma(I) is far more important than the merging stats (Rs and CC_1/2), as seems to be confirmed by your gratifying observation that the interatomic distances in regions of interest from the data processed by autoPROC+STARANISO accord with expectations.

The merging statistics are not very reliable indicators of data quality; CC_1/2 especially can sometimes be unreliable as a measure of data quality in the presence of significant anisotropy because the common anisotropy between the half-sets can create a dominant contribution to the correlation between them - even if the data themselves don't agree particularly well.

> 3. Why is the extra resolution not detected during indexing by iMOSFLM or
> XDS (using default setup)? The indexed outputs of either of them did
> not give extra resolution (through anisotropic scaling) in Staraniso,
> although it said some data was missing.

It is not clear what you mean here: iMOSFLM and XDS do not apply diffraction cut-offs to the data by default - unless those defaults have been changed by whatever program/tool is driving those programs (maybe done via some automatic processing pipelines at a synchrotron beamline?). You might want to check again how (and by what system) those programs were run: there is nothing to prevent them from also outputting the scaled+merged data without any cut-off (which could then go into STARANISO).

If STARANISO (through the STARANISO webserver) complained about missing data, then some kind of cut-off was applied before the data was given to STARANISO. Most likely an isotropic cut-off (a sphere as the cut-off surface) was used - resulting in exclusion of observable data in the well-diffracting direction(s) and inclusion of noise in the poorly diffracting direction(s).

> 4. Is there any option for using all reflections detection (like
> autoPROC) in iMOSFLM or XDS?

It is not clear what you mean by 'all-reflections detection': MOSFLM/iMOSFLM and XDS already output all valid reflections. Maybe you could clarify that point further?

Hope this helps, get back to us if not.

Regards

Clemens, Ian & Gerard (for the autoPROC+STARANISO team)

########################################################################

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk.