Re: [ccp4bb] Help sought for problem dataset

Lijun Liu Tue, 25 Nov 2008 09:49:48 -0800

James,

1) It does not seem to have 222 symmetry, judging from the self-rotation function under P2/P21.

2) Besides what Eleanor suggested, there might be another possibility: that it is twinned, with a P2/P21 cell of a' (= a), b' (~a, = b/2), c' (= c), 90, 90, 90. The unique axis is along a' or b', but not c'.

3) If it were twinned perfectly (50%), the data may be "equivalently" processed as P1 with 2(X)2 symmetry (cell: a', 2b', c'; i.e., a, b, c). But because the two twinning-related 2/21 axes are only perpendicular to each other and do not intersect, they do not make a real 2-fold symmetry in the 3rd direction; i.e., no 222 symmetry is generated, only a regional 2+2 symmetry. Special consideration may be needed to tell if it is really P2/P21 or P1. In this case, your anomalous data might be useful.

4) Thus, I would suggest P1 MR to find 4 solutions without 222 restraints; i.e., 2+2 may be used, but not the 2X2 symmetry.
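
The cell relationship in point 2 can be checked numerically. The sketch below is only an illustration: it uses the cell quoted in James's message (a=42.6, b=85.3, c=108.5) and assumes, as one possible reading of the suggestion, a twin operation that exchanges a' and b' in the halved cell. The function names are made up for this example and do not come from any crystallographic package.

```python
# Cell from James's message (Angstrom); all angles are 90 degrees.
a, b, c = 42.6, 85.3, 108.5

def sub_cell(a, b, c):
    """Proposed smaller cell: a' = a, b' = b/2, c' = c."""
    return a, b / 2.0, c

def relative_mismatch(x, y):
    """Fractional difference between two axis lengths."""
    return abs(x - y) / ((x + y) / 2.0)

def swap_hk(hkl):
    """Hypothetical twin operation (an assumption, not from the thread).

    (h, k, l) -> (k, h, -l) is a 2-fold about the a'b' diagonal; it only
    maps the lattice onto itself when a' and b' are (nearly) equal.
    """
    h, k, l = hkl
    return (k, h, -l)

a_p, b_p, c_p = sub_cell(a, b, c)
print(f"a' = {a_p:.2f}, b' = {b_p:.2f}, c' = {c_p:.2f}")
print(f"a'/b' mismatch = {100 * relative_mismatch(a_p, b_p):.2f} %")
print("twin mate of (1, 2, 3):", swap_hk((1, 2, 3)))
```

With these numbers a' and b' differ by only about 0.1 %, which is why the sub-cell lattice looks metrically tetragonal even if the true symmetry is lower.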


Only my speculations!  Good luck!

Lijun

On Nov 24, 2008, at 10:23 PM, James Irving wrote:

Dear ccp4bb,
I'm wrestling with a crystal form that is causing us a great deal of trouble, and I was hoping for some general suggestions that might get us working in the right direction. The protein in question is 40 kDa. Details are provided below. The summary is that at the model building/refinement stage this data is behaving as though it is twinned, or merged in the wrong space group, or there is some other fundamental anomaly that needs to be accounted for, but the tests we've performed indicate that it is solved, not twinned, in the correct space group.
Here are some details:
1. It integrates and merges well in P2(1)2(1)2(1) without any trouble (see table pasted at bottom). We've collected two datasets, around 2.5-2.6A each; one is of a selenomethionine derivative. To get data as strong as possible for MIRAS we collected ~400 deg of the derivative, and around 205 deg of the native. Scaling and merging in XDS (although we've used mosflm/scala as well) gives rather good stats at low resolution (Rmerge 1-2% <3.7A), but they deteriorate quite markedly in the high resolution bin (~50% at 2.55-2.7A), with I/sigI of 60 for the former and 3.0 for the latter. Systematic absences very strongly support a P212121 space group.
2. There is no visible sign that this is part of a superlattice, or comprises a superlattice. Virtually all spots are accounted for during integration. By eye there are no systematically weak and strong reflections. Playing with the minimum I/sigI at the indexing stage doesn't do anything, nor does deleting strong reflections and indexing only with weak ones, nor does indexing using only high, or only low, resolution spots.
3. Unit cell: a=42.6 b=85.3 c=108.5 90 90 90. b is almost exactly 2*a.
4. Wilson plot looks normal. There is no detected pseudotranslation. Cumulative intensity distribution in truncate appears *very slightly* sigmoidal. Is it twinned? More on that in a moment...
5. There is a reasonably close homolog of this protein that has been crystallised (~50% identity) - we were expecting an easy MR solution. Phaser gives Z-score in the rotation function of ~17, and ~11 in the translation function for a single molecule in the AU, as expected. 2FoFc & FoFc maps look absolutely rubbish, very much worse than would be expected for this protein at this resolution with this solution. Correction for anisotropy doesn't improve maps much here or at any other stage in the building/refinement.
6. Scaling and merging in P21 on the off-chance of perfect twinning or pseudosymmetry gives exactly two solutions with very good Z-scores. Maps still look rubbish. Phenix.xtriage, as would be expected, suggests a twinning operator with alpha ~0.5 that is identical to a crystallographic operator in P212121. Rigid body refinement using phenix.refine and this twinning operator gives Rfactor/Rfree that are low but again maps are uninterpretable. Conclusion: this isn't perfectly pseudomerohedrally twinned in P21.
7. Went back to basics in P1.  Same deal as P21.
8. All other enantiomorphs in monoclinic and orthorhombic give significantly lower, and poorly distinguished, translation Z-scores in MR.
9. The selenomethionine dataset was solved using MIRAS in SHARP/autoSHARP. The experimentally phased electron density yields contiguous tracts of density in the right place; unbiased density indicates a good solution. Model building was conducted in P212121, initially into the experimental maps and later with refinement in refmac using HL-coefficient-based restraints. In some regions, sequence can easily be deduced from clean electron density (for the resolution). In other regions, side chains are missing, and in others, density is completely inconsistent with the connectivity of the chains and highly conserved structural elements. As occurs sometimes with twinned data, many loops cannot be modelled at all, and the Rfree does not drop below 0.41 with an Rfactor of 0.34. The result is a model that is about 60-70% complete. Refinement was performed with and without B-factor refinement.
10. Using density modification in SOLOMON, a structure-based solvent mask in DM or statistical modification in PIRATE fails to elucidate these additional, significant missing regions (which include three helices, 1.5 beta sheets and several loops). Tellingly, comparing final models to the original experimentally phased maps shows truncation of the model at the same places as "truncation" occurs in the electron density. During the building process as much care was taken as possible that the structure was not being built into a "local minimum".
11. phenix.autobuild is able to build a polyalanine model that covers about 25% of the molecule.
12. The native and derivative datasets scale extremely well together: they are strongly consistent. This is often not the case with twinned crystals.
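
The twinning checks mentioned in points 4 and 6 rest on standard intensity statistics for acentric reflections. As a rough sketch (this is not the TRUNCATE or phenix.xtriage implementation, and the function names are invented here), the theoretical cumulative distributions N(z) and the second-moment ratio <I^2>/<I>^2 can be written as below. `intensities` is assumed to be a plain list of acentric intensities from a narrow resolution shell.

```python
import math

def n_untwinned(z):
    """Theoretical N(z) for untwinned acentric data, z = I/<I>."""
    return 1.0 - math.exp(-z)

def n_perfect_twin(z):
    """Theoretical N(z) for a perfect (alpha = 0.5) hemihedral twin, acentric."""
    return 1.0 - (1.0 + 2.0 * z) * math.exp(-2.0 * z)

def second_moment_ratio(intensities):
    """<I^2>/<I>^2: expected ~2.0 untwinned, ~1.5 perfectly twinned (acentric)."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)

# Compare the two theoretical curves at a few values of z.
for z in (0.1, 0.2, 0.5, 1.0):
    print(f"z={z:.1f}  untwinned N(z)={n_untwinned(z):.3f}  "
          f"perfect twin N(z)={n_perfect_twin(z):.3f}")
```

The twinned curve starts lower at small z (fewer very weak reflections), which is what gives the cumulative plot its sigmoidal shape.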
Any suggestions would be greatly appreciated!

Thanks,
James

OUTPUT FROM CORRECT IN XDS:
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION       NUMBER OF REFLECTIONS COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA  R-meas  Rmrgd-F  Anomal  SigAno   Nano
     LIMIT  OBSERVED  UNIQUE  POSSIBLE      OF DATA  observed  expected                                        Corr
      7.02      4906     697       746        93.4%      1.9%      2.2%      4897    77.46    2.0%     1.1%     11%   0.849    447
      5.04      9074    1152      1152       100.0%      2.1%      2.5%      9074    66.65    2.3%     1.4%      1%   0.797    890
      4.13     11742    1445      1446        99.9%      2.3%      2.6%     11742    65.87    2.5%     1.5%     -7%   0.755   1178
      3.59     13929    1685      1685       100.0%      3.4%      3.6%     13929    45.70    3.6%     2.8%      2%   0.763   1428
      3.21     15675    1884      1884       100.0%      6.5%      6.5%     15675    28.42    6.9%     5.7%     -1%   0.811   1618
      2.94     17324    2066      2066       100.0%     14.1%     14.2%     17324    14.14   15.0%    14.5%      0%   0.787   1804
      2.72     18743    2232      2232       100.0%     27.5%     27.7%     18743     7.61   29.3%    29.9%      0%   0.749   1965
      2.55     20107    2388      2388       100.0%     46.1%     45.4%     20107     4.85   49.1%    51.7%     -2%   0.717   2123
      2.40     20056    2466      2545        96.9%     73.6%     72.2%     20023     3.04   78.5%    81.1%      1%   0.707   2144
     total    131556   16015     16144        99.2%      5.4%      5.6%    131514    26.34    5.7%    10.6%      0%   0.758  13597
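
Multiplicity is not printed in this table, but it follows from the observed and unique reflection counts. A small sketch, with the numbers copied from the CORRECT output above (the `multiplicity` helper is just a name chosen for this example):

```python
# (resolution limit in A, observed, unique, I/sigma), copied from the table
shells = [
    (7.02, 4906, 697, 77.46),
    (5.04, 9074, 1152, 66.65),
    (4.13, 11742, 1445, 65.87),
    (3.59, 13929, 1685, 45.70),
    (3.21, 15675, 1884, 28.42),
    (2.94, 17324, 2066, 14.14),
    (2.72, 18743, 2232, 7.61),
    (2.55, 20107, 2388, 4.85),
    (2.40, 20056, 2466, 3.04),
]

def multiplicity(observed, unique):
    """Average number of observations per unique reflection."""
    return observed / unique

total_obs = sum(s[1] for s in shells)
total_unique = sum(s[2] for s in shells)

for limit, obs, uniq, i_sig in shells:
    print(f"{limit:5.2f} A  multiplicity {multiplicity(obs, uniq):4.1f}  "
          f"I/sigma {i_sig:6.2f}")
print(f"total    multiplicity {multiplicity(total_obs, total_unique):4.1f}")
```

The per-shell sums also reproduce the "total" row (131556 observed, 16015 unique), which is a quick sanity check that no shell was lost when the log was pasted.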
Attached figures:
Data scaled in P2, self-rotation function in MOLREP
Data scaled in P222, self-rotation function in MOLREP
Cumulative intensity distribution in TRUNCATE

<P222_srf.jpg><cumul_intensity_081008.jpg><P2_srf.jpg>


Lijun Liu, PhD
Institute of Molecular Biology
HHMI & Department of Physics
University of Oregon
Eugene, OR 97403
541-346-5176
