James,
1) Judging from the self-rotation function calculated in P2/P21, it
does not seem to have 222 symmetry.
2) Besides what Eleanor suggested, another possibility is that it is
twinned, with a P2/P21 cell of a' (= a), b' (~a, = b/2), c' (= c),
90, 90, 90. The unique axis is along a' or b', but not c'.
3) If it were perfectly twinned (50%), the data could be "equivalently"
processed as P1 with 2(X)2 symmetry (cell: a', 2b', c'; i.e., a, b, c).
But because the two twin-related 2/21 axes are only perpendicular to
each other and do not intersect, they do not make a real 2-fold
symmetry in the 3rd direction, i.e., no 222 symmetry is generated,
only a regional 2+2 symmetry. Special consideration may be needed to
tell whether it is really P2/P21 or P1. In this case, your anomalous
data might be useful.
4) Thus, I would suggest P1 MR to find 4 solutions without 222
restraints, i.e., 2+2 may be used, but not the 2X2 symmetry.
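To make the cell relationship concrete, here is a minimal Python sketch using the cell quoted in the original post. The function name `halved_b_cell` and the variable names are hypothetical, chosen only for illustration. It checks how closely b/2 matches a, and counts how many copies a P1 search in the full cell would need to place, assuming one molecule per asymmetric unit in P212121 as reported.

```python
# Cell from the original post (Angstrom); all angles are 90 degrees.
a, b, c = 42.6, 85.3, 108.5


def halved_b_cell(a, b, c):
    """Return the hypothetical smaller cell (a', b', c') = (a, b/2, c)."""
    return a, b / 2.0, c


a_p, b_p, c_p = halved_b_cell(a, b, c)

# Fractional difference between a' and b' (how pseudo-tetragonal the
# smaller cell would look in the a'b' plane).
mismatch = abs(b_p - a_p) / a_p

# P212121 has 4 symmetry operators. With one molecule per asymmetric
# unit (as reported), the same cell treated as P1 holds 4 copies.
n_ops_p212121 = 4
molecules_per_asu = 1
copies_in_p1 = n_ops_p212121 * molecules_per_asu

print(f"a' = {a_p:.2f}, b' = {b_p:.2f}, c' = {c_p:.2f}")
print(f"a'/b' mismatch = {mismatch:.2%}")
print(f"copies to place in P1 (a, b, c): {copies_in_p1}")
```

The mismatch is a small fraction of a percent, which is why the two axes are hard to tell apart from the cell metric alone.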
Only my speculations! Good luck!
Lijun
On Nov 24, 2008, at 10:23 PM, James Irving wrote:
Dear ccp4bb,
I'm wrestling with a crystal form that is causing us a great deal of
trouble, and I was hoping for some general suggestions that might
get us working in the right direction. The protein in question is
40kDa. Details are provided below. The summary is that at the
model building/refinement stage this data is behaving as though it
is twinned, or merged in the wrong space group, or there is some
other fundamental anomaly that needs to be accounted for, but the
tests we've performed indicate that it is solved, not twinned, in
the correct space group.
Here are some details:
1. It integrates and merges well in P2(1)2(1)2(1) without any
trouble (see table pasted at bottom). We've collected two datasets,
around 2.5-2.6A each, one is of a selenomethionine derivative. To
get as strong data as possible for MIRAS we collected ~400 deg of
the derivative, and around 205 deg of the native. Scaling and
merging in XDS (although we've used mosflm/scala as well) gives
rather good stats at low resolution (Rmerge 1-2% <3.7A), but these
deteriorate quite markedly in the high resolution bin (~50% at
2.55-2.7A), with I/sigI of 60 for the former and 3.0 for the
latter. Systematic absences very strongly support a P212121 space
group.
2. There is no visible sign that this is part of a superlattice, or
comprises a superlattice. Virtually all spots are accounted for
during integration. By eye there are no systematically weak and
strong reflections. Playing with the minimum I/sigI at the indexing
stage doesn't do anything, nor does deleting strong reflections and
indexing only with weak ones, nor does indexing using only high, or
only low, resolution spots.
3. Unit cell: a=42.6 b=85.3 c=108.5 90 90 90. b is almost
exactly 2*a.
4. Wilson plot looks normal. There is no detected
pseudotranslation. Cumulative intensity distribution in truncate
appears *very slightly* sigmoidal. Is it twinned? More on that in
a moment...
5. There is a reasonably close homolog of this protein that has been
crystallised (~50% identity) - we were expecting an easy MR
solution. Phaser gives Z-score in the rotation function of ~17, and
~11 in the translation function for a single molecule in the AU, as
expected. 2FoFc & FoFc maps look absolutely rubbish, very much
worse than would be expected for this protein at this resolution
with this solution. Correction for anisotropy doesn't improve maps
much here or at any other stage in the building/refinement.
6. Scaling and merging in P21 on the off-chance of perfect twinning
or pseudosymmetry gives exactly two solutions with very good
Z-scores. Maps still look rubbish. Phenix.xtriage, as would be
expected, suggests a twinning operator with alpha ~0.5 that is
identical to a crystallographic operator in P212121. Rigid body
refinement using phenix.refine and this twinning operator gives
Rfactor/Rfree that are low but again maps are uninterpretable.
Conclusion: this isn't perfectly pseudomerohedrally twinned in P21.
7. Went back to basics in P1. Same deal as P21.
8. All other enantiomorphs in monoclinic and orthorhombic give
significantly lower, and poorly distinguished translation Z-scores
in MR.
9. The selenomethionine dataset was solved using MIRAS in
SHARP/autoSHARP. The experimentally phased electron density yields
contiguous tracts of density in the right place; the unbiased density
indicates a good solution. Model building was conducted in P212121,
initially into the experimental maps and later with refinement in
refmac using HL-coefficient-based restraints. In some regions,
sequence can easily be deduced from clean electron density (for the
resolution). In other regions, side chains are missing, and in
others, density is completely inconsistent with the connectivity of
the chains and highly conserved structural elements. As occurs
sometimes with twinned data, many loops cannot be modelled at all,
and the Rfree does not drop below 0.41 with an Rfactor of 0.34. The
result is a model that is about 60-70% complete. Refinement was
performed with and without B-factor refinement.
10. Density modification in SOLOMON, a structure-based solvent
mask in DM, or statistical modification in PIRATE all fail to reveal
these additional, significant missing regions (which include three
helices, 1.5 beta sheets and several loops). Tellingly, comparing
final models to the original experimentally phased maps shows
truncation of the model at the same places as "truncation" occurs in
the electron density. During the building process as much care was
taken as possible that the structure was not being built into a
"local minimum".
11. phenix.autobuild is able to build a polyalanine model that
covers about 25% of the molecule.
12. The native and derivative datasets scale extremely well
together: they are strongly consistent. This is often not the case
with twinned crystals.
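As a sanity check on the "single molecule in the AU" expectation in point 5, the Matthews coefficient can be estimated from the numbers above. This is only a rough sketch: the function name `matthews` is hypothetical, the molecular weight is taken as exactly 40 kDa, and the solvent fraction uses the common approximation 1 - 1.23/Vm.

```python
def matthews(a, b, c, n_sym_ops, mw_da, copies_per_asu=1):
    """Matthews coefficient (A^3/Da) and approximate solvent fraction
    for an orthogonal cell (all angles 90 degrees)."""
    volume = a * b * c
    vm = volume / (n_sym_ops * copies_per_asu * mw_da)
    solvent = 1.0 - 1.23 / vm  # standard approximation, not exact
    return vm, solvent


# Cell and molecular weight from the post; P212121 has 4 operators.
cell = (42.6, 85.3, 108.5)
vm1, solv1 = matthews(*cell, n_sym_ops=4, mw_da=40_000, copies_per_asu=1)
vm2, solv2 = matthews(*cell, n_sym_ops=4, mw_da=40_000, copies_per_asu=2)

print(f"1 copy/ASU:   Vm = {vm1:.2f} A^3/Da, solvent ~ {solv1:.0%}")
print(f"2 copies/ASU: Vm = {vm2:.2f} A^3/Da, solvent ~ {solv2:.0%}")
```

With these assumptions one copy per asymmetric unit gives roughly 50% solvent, while two copies would leave essentially no solvent, so the cell volume itself is consistent with a single molecule per AU in P212121.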
Any suggestions would be greatly appreciated!
Thanks,
James
OUTPUT FROM CORRECT IN XDS:
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION       NUMBER OF REFLECTIONS  COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA  R-meas  Rmrgd-F  Anomal  SigAno   Nano
     LIMIT  OBSERVED  UNIQUE  POSSIBLE       OF DATA  observed  expected                                        Corr
      7.02      4906     697       746         93.4%      1.9%      2.2%      4897    77.46    2.0%     1.1%     11%   0.849    447
      5.04      9074    1152      1152        100.0%      2.1%      2.5%      9074    66.65    2.3%     1.4%      1%   0.797    890
      4.13     11742    1445      1446         99.9%      2.3%      2.6%     11742    65.87    2.5%     1.5%     -7%   0.755   1178
      3.59     13929    1685      1685        100.0%      3.4%      3.6%     13929    45.70    3.6%     2.8%      2%   0.763   1428
      3.21     15675    1884      1884        100.0%      6.5%      6.5%     15675    28.42    6.9%     5.7%     -1%   0.811   1618
      2.94     17324    2066      2066        100.0%     14.1%     14.2%     17324    14.14   15.0%    14.5%      0%   0.787   1804
      2.72     18743    2232      2232        100.0%     27.5%     27.7%     18743     7.61   29.3%    29.9%      0%   0.749   1965
      2.55     20107    2388      2388        100.0%     46.1%     45.4%     20107     4.85   49.1%    51.7%     -2%   0.717   2123
      2.40     20056    2466      2545         96.9%     73.6%     72.2%     20023     3.04   78.5%    81.1%      1%   0.707   2144
     total    131556   16015     16144         99.2%      5.4%      5.6%    131514    26.34    5.7%    10.6%      0%   0.758  13597
Attached figures:
Data scaled in P2, self-rotation function in MOLREP
Data scaled in P222, self-rotation function in MOLREP
Cumulative intensity distribution in TRUNCATE
<P222_srf.jpg><cumul_intensity_081008.jpg><P2_srf.jpg>
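As context for the cumulative intensity figure: for ideal acentric data the normalized intensities follow an exponential (Wilson) distribution, giving <I^2>/<I>^2 = 2.0, whereas a perfect twin (alpha = 0.5) averages two independent intensities and gives 1.5, which is also what makes the cumulative curve sigmoidal. The simulation below is only an illustrative sketch on synthetic numbers, not an analysis of this dataset; the function name `second_moment` is hypothetical.

```python
import random


def second_moment(intensities):
    """Return <I^2> / <I>^2 for a list of intensities."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)


random.seed(0)  # fixed seed so the sketch is reproducible
n = 200_000

# Synthetic acentric intensities: exponential with mean 1 (Wilson).
untwinned = [random.expovariate(1.0) for _ in range(n)]
partner = [random.expovariate(1.0) for _ in range(n)]

# Perfect twin: each observation is an equal mix of two independent
# twin-related intensities.
alpha = 0.5
twinned = [(1 - alpha) * i1 + alpha * i2
           for i1, i2 in zip(untwinned, partner)]

print(f"untwinned <I^2>/<I>^2 = {second_moment(untwinned):.2f}")
print(f"perfect twin <I^2>/<I>^2 = {second_moment(twinned):.2f}")
```

A value between the two limits on real data would be consistent with partial twinning, though pseudosymmetry and anisotropy can shift these statistics as well.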
Lijun Liu, PhD
Institute of Molecular Biology
HHMI & Department of Physics
University of Oregon
Eugene, OR 97403
541-346-5176