[ccp4bb]: SUMMARY: Molecular replacement in the "twilight zone" and space group problems

david lawson (JIC) Mon, 19 Jun 2006 06:47:57 -0700

Dear All,

Sorry for the (very) long delay in posting a summary on this one. I suppose at 
the back of my mind I was hoping that we would be able to resolve all the 
problems before doing so! Anyway, to remind you, here is the original posting: 

---------------------------------------

We are having a lot of trouble with a particularly awkward protein and would be 
extremely grateful for any helpful suggestions.

The protein appears to be a tetramer of 30 kDa subunits, as judged by dynamic 
light scattering and gel filtration, and so far we have obtained two crystal 
forms. 

Crystal form 1 - processes nicely as C2 with the following cell
parameters: a = 86.5, b = 139.2, c = 100.7 Ang, beta = 101.6 deg. We have 
collected complete data to 2.1 Ang resolution. We estimate a solvent content of 
50% for one tetramer per ASU. 
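
As a rough check of the 50% figure, the solvent content can be estimated from the Matthews coefficient. The sketch below is not part of the original posting. It assumes a 120 kDa tetramer (4 x 30 kDa), 4 asymmetric units per C2 cell, and the conventional factor of 1.23 A^3/Da for protein; the function names are made up for illustration.

```python
import math

def monoclinic_volume(a, b, c, beta_deg):
    """Unit-cell volume (A^3) of a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def solvent_fraction(cell_volume, n_asu, mass_da_per_asu):
    """Matthews estimate: Vm = V / (Z * M), solvent fraction ~ 1 - 1.23 / Vm."""
    vm = cell_volume / (n_asu * mass_da_per_asu)
    return vm, 1.0 - 1.23 / vm

# Crystal form 1 cell as quoted above; C2 has 4 asymmetric units per cell.
volume = monoclinic_volume(86.5, 139.2, 100.7, 101.6)
# Assumption: one tetramer of 30 kDa subunits per asymmetric unit.
vm, solvent = solvent_fraction(volume, n_asu=4, mass_da_per_asu=4 * 30_000)
print(f"Vm = {vm:.2f} A^3/Da, solvent = {solvent:.0%}")
```

Under these assumptions Vm comes out at about 2.47 A^3/Da and the solvent content at roughly 50%, in line with the estimate quoted for one tetramer per ASU.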

Crystal form 2 - processes equally well as rhombohedral, I centred tetragonal 
or F centred cubic (at the end of this message I have given a typical 
autoindexing run from Denzo). The crystals diffract strongly to at least 2 Ang 
resolution, but data collection much beyond 2.5 Ang in-house is difficult 
because of the long cell parameters. If the data are processed as F432, for 
example, this gives a solvent content of 65% for one tetramer per ASU.

So far we have a molecular replacement solution for xtal form 1 that I think is 
essentially correct, as it gives sensible packing and a NCS 4-fold that is 
consistent with a peak in a self-rot function. However, the calculated phases 
are very poor because the template structure represents only about 65% of the 
target structure and the sequence identity is in the low 20s (the template was 
produced using CHAINSAW).
Refinement on the structure beyond fitting in AMORE seems to make things worse. 
Running DM with 4-fold averaging seems to help, but the maps are heavily 
biased. I have taken advice from Kevin on which recipes to use, but so far have 
seen little improvement. My feeling is that if we could somehow refine the 
AMORE solution without it falling apart, perhaps we could get a better set of 
input phases for DM - any ideas here?

Molecular replacement with data from xtal form 2, in a variety of space groups, 
doesn't yield any sensible solutions. We have some "derivative" data at 2.2 Ang 
resolution from crystals grown using tungstate as the 
precipitant. These were collected at the W L1 edge and appear to show a very 
significant anomalous signal. Eleanor took a look at these data (processed as 
F432) and found a native Patterson peak at 1/4,1/4,1/4, suggesting 2 molecules 
per ASU (this would give a solvent content of 30%). She also found 2 heavy atom 
sites with SHELX at 1/2,0,0 and the related 3/4,1/4,1/4, but these are not much 
use for anomalous phasing.
Perhaps the anomalous signal is due to a lot of low occupancy binding sites. 
BTW, the data seem to pass all the tests in TRUNCATE for not being twinned. Any 
advice here would be appreciated.

Clearly, working with xtal form 1 is the most straightforward, but if we can 
get some phase info for xtal form 2, then cross-xtal averaging becomes a 
possibility. Unfortunately, now we can only get crystals of form 2. Also the 
structure is unlikely to have any ordered Mets, so we have introduced 3 - now 
we can't get soluble Se-Met protein - but that's another story! 

Any helpful suggestions gratefully received. 

Many thanks,

Dave Lawson

---------------------------------

I got a very good response from the BB and rather than paraphrase all the 
suggestions, I have appended the responses at the end of this message. 
Apologies if I have forgotten to include someone.  

It turns out that the C2 (crystal form 1) MR solution was correct and we 
eventually managed to get somewhere from the rather poor starting phases as 
follows:

1) Chainsaw to give model that represents ~65% of target structure (~20% 
identity).
2) AMORE with tetramer (incidentally I ran PHASER and it gives the same top 
solution).
3) SUPERPOSE to put the full search model (before Chainsaw) onto the AMORE 
solution.
4) Use model from step 3 to generate a "fuller" mask of tetramer with 5A radius 
in NCSMASK. (In hindsight, a mask around the MR solution only was good enough)
5) 10 cycles of REFMAC RB refi of individual subunits from AMORE soln at 5Ang.
6) 1000 cycles of DM with 4-fold averaging and phase extension from 5A - 2.2A 
using calculated phases from step 5 (after running CAD to put back the higher 
res data). Refining sym ops and updating mask every 5 cycles. (In hindsight, 
running the same phase extension but for 200 cycles did not yield an 
interpretable map!). 
7) Resultant map looked good with regions of structure present that were not 
there in the AMORE soln. 
8) Arp/wArp to build starting model from "experimental phases". This fitted 
about 86% of the sequence.
9) Current model has R/Rfree values of 16/22%
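
To give a feel for what "phase extension from 5A - 2.2A" over 1000 cycles means in step 6, here is a minimal sketch of one possible resolution schedule. It is purely illustrative and is not DM's own extension scheme: it assumes the limit is moved in equal steps of 1/d^3, so that roughly the same number of new reflections enters at every cycle. The function name is hypothetical.

```python
def extension_schedule(d_start, d_end, n_cycles):
    """Resolution limit (A) for each cycle, stepping evenly in 1/d^3.

    The number of reflections inside a resolution sphere scales as 1/d^3,
    so even steps in 1/d^3 add about the same number of reflections per cycle.
    Requires n_cycles >= 2.
    """
    s3_start, s3_end = d_start ** -3, d_end ** -3
    limits = []
    for i in range(n_cycles):
        s3 = s3_start + (s3_end - s3_start) * i / (n_cycles - 1)
        limits.append(s3 ** (-1.0 / 3.0))
    return limits

limits = extension_schedule(5.0, 2.2, 1000)
for cycle in (0, 249, 499, 749, 999):
    print(f"cycle {cycle + 1:4d}: resolution limit {limits[cycle]:.2f} A")
```

Note that the 5 A sphere holds only about (2.2/5)^3, i.e. roughly 8.5%, of the reflections out to 2.2 A, so almost all of the final phases come from the averaging and extension rather than directly from the MR model.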

I guess the take home message is that the 5A phases were probably pretty good, 
and we got this idea from a couple of the BB replies, e.g. from Nicholas Glykos 
and Massimo Degano. 

Unfortunately the situation with crystal form 2 is still not resolved. All the 
indications were that F centred cubic was the symmetry, as suggested by 
indexing in MOSFLM (see Harry Powell's email) and analysis using both POINTLESS 
and XPREP (cell: 154.49 154.49 154.49  90.00  90.00  90.00). However, MR using 
the crystal form 1 model didn't work for F centred cubic space groups. Instead 
partial solutions were found for other symmetries (mainly using PHASER).  The 
tetramers have a concave and a convex surface. Common to all solutions was that 
tetramers were arranged such that they came together in pairs with the convex 
surfaces forming a cavity in the middle of an octamer. Searches were tried with 
monomers, dimers and tetramers. I show only the "best" results:

I422 (cell: 154.53 154.53 218.99  90.00  90.00  90.00)
3 monomers found. This leaves a hole in the packing that could accommodate 
another monomer, but there was no interpretable density here. Solvent content 
54% for 4 dimers. 

F222 (cell: 218.47 218.60 218.99  90.00  90.00  90.00)
4 dimers found, but the fourth is poorly defined. Incidentally, Eleanor tried 
this and got a different orientation for the fourth dimer! Solvent content 54% 
for 4 dimers.

C2 (cell: 267.46 154.41 154.65  90.00 125.04  90.00)
3 tetramers found. Solvent content 65% for 3 tetramers (4 would give 54%). 
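
The three cells listed above are closely related: each can be built from vectors of a single F-centred cubic lattice. The sketch below illustrates this geometry only. The cubic edge of 218.5 A is an assumed value chosen to be close to the F222 axes, and the particular choice of lattice vectors is one of several equivalent ones.

```python
import math

A_CUBIC = 218.5  # assumed F-cubic edge in A, close to the F222 axes above

def scaled(v):
    """Scale a fractional cubic vector to Angstroms."""
    return tuple(A_CUBIC * x for x in v)

def length(v):
    return math.sqrt(sum(x * x for x in v))

def angle_deg(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(dot / (length(u) * length(v))))

# I-centred tetragonal description: a and b are half face diagonals,
# c is a cubic axis.
tet_a = scaled((0.5, 0.5, 0.0))
tet_b = scaled((-0.5, 0.5, 0.0))
tet_c = scaled((0.0, 0.0, 1.0))

# C-centred monoclinic description built from F-lattice vectors;
# b is perpendicular to both a and c, as required for a unique b axis.
mono_a = scaled((1.0, 0.5, 0.5))
mono_b = scaled((0.0, 0.5, -0.5))
mono_c = scaled((0.0, -0.5, -0.5))

print("I4 cell:", *(f"{length(v):.2f}" for v in (tet_a, tet_b, tet_c)))
print("C2 cell:", *(f"{length(v):.2f}" for v in (mono_a, mono_b, mono_c)),
      f"beta = {angle_deg(mono_a, mono_c):.2f}")
```

With the assumed edge this gives about 154.5, 154.5, 218.5 A for the tetragonal cell and about 267.6, 154.5, 154.5 A with beta near 125.3 deg for the monoclinic one, close to the observed I422 and C2 cells. This is why autoindexing alone cannot distinguish between these symmetries.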

Using the tungstate data we also looked at anomalous difference Fouriers 
(phased on MR solutions) to help find missing molecules, but this didn't help. 

As you might imagine none of these solutions refined particularly well, the 
best being the F222 solution which sticks with an Rfree of ~31%. 

Data from crystal form 2 were analysed by the experts at the MAX-INF workshop 
in Barcelona (including George Sheldrick, Isabel Usón, Phil Evans and Garib 
Murshudov) and the conclusion was that there was some kind of twinning not 
detectable by the usual tests in TRUNCATE. Incidentally, the maps for the F222 
solution looked very clear for 3 out of the 4 dimers and actually didn't show 
any significant differences with the crystal form 1 structure. Therefore we 
have finally decided to shelve the data from crystal form 2 - we have already 
wasted too much time on it!!!

Very many thanks to all who contributed, especially Eleanor!

Dave Lawson

------------------------------------

I have pasted responses to the original message below:

---------------------------------

Hi Dave

you can do two things to try to get a better handle on the true symmetry -

(1) index with Mosflm and in the autoindex dialogue answer "Y" to the question 
"Do you want to try enhanced solution picking", and answer "Y" to all except 
the question regarding minimum cell edge - this gives much more information 
than available from Denzo or earlier versions of Mosflm (even includes a Denzo 
style distortion index for people unfamiliar with Mosflm's penalties) - the 
figure to look for in the output table is probably the SDCELL - for good 
solutions it is around the same as the triclinic basis set.

(2) run Phil Evans' program Pointless on your unmerged dataset (I'm afraid I 
don't know how to do this with Denzo output) - this gives very clear 
indications of the highest likely symmetry with your data.

I hope this helps.

Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre, Hills Road, 
Cambridge, CB2 2QH

--------------------

Dear David,
Just a couple of comments, and probably not much help:

If I recall correctly, you will always get a rhombohedral solution if the data 
is cubic (you can compose a cube from 3 identical rhomboids), so the chances 
of it actually being rhombohedral are probably quite low.
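
One way to see why an F-centred cubic lattice also indexes as rhombohedral is to look at its primitive cell. The short sketch below is an illustration only; the cubic edge of 218.5 A is an assumed value, not one given in the original posting.

```python
import math

a_cubic = 218.5  # assumed F-cubic cell edge in A

# Primitive vectors of an F-centred cubic lattice point to three face centres.
half = a_cubic / 2.0
primitive = [(0.0, half, half), (half, 0.0, half), (half, half, 0.0)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_deg(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(dot / (norm(u) * norm(v))))

edges = [norm(v) for v in primitive]
angles = [angle_deg(primitive[i], primitive[j]) for i, j in ((1, 2), (0, 2), (0, 1))]
print("edges :", [f"{e:.2f}" for e in edges])
print("angles:", [f"{a:.2f}" for a in angles])
```

All three edges are equal (a_cubic divided by the square root of 2) and all three angles are 60 deg, which is a rhombohedral cell by construction, so an autoindexing program will always offer it for F-cubic data.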

On form 1, you might try RESOLVE rather than DM to do your averaging (the 
'prime-and-switch' method is designed to overcome model bias when working from 
MR solutions).

If I think of anything else, I'll let you know.

Cheers,
Charlie

-- 
Dr Charles S. Bond        University of Dundee   Tel: +44-1382-388325
Honorary Lecturer               Dow St, Dundee   Fax: +44-1382-345764
BBSRC David Phillips Fellow  DD1 5EH, Scotland  [EMAIL PROTECTED]
School of Life Sciences      http://stein.bioch.dundee.ac.uk/~charlie

------------------------------

Dear David,

        I suggest you try and refine it with Buster-TNT, declaring a sensible 
solvent content and switching on the modelling of the missing atoms based on 
the current phases - or - even better - the DM map.

        The program is available at http://www.globalphasing.com/buster/

        A description of how the program can help refinement and completion of 
severely incomplete structures is in:
Blanc E, Roversi P, Vonrhein C, Flensburg C, Lea SM, Bricogne G., Refinement of 
severely incomplete structures with maximum likelihood in BUSTER-TNT.
Acta Crystallogr D Biol Crystallogr. 2004 Dec;60(Pt 12 Pt 1):2210-21.

        Good luck!

        Pietro
--
Pietro Roversi - Laboratory of Molecular Biophysics - Biochemistry Dept.
University of Oxford - South Parks Road - Oxford OX1 3QU - England, UK Tel. 
0044 (0)1865 275385 - Fax. 0044 (0)1865 275182 
http://biop.ox.ac.uk/www/lea/website/index.htm

-----------------------------

Hi, Dave Lawson,

Why do you say a native Patterson peak at 1/4,1/4,1/4 suggests 2 molecules per 
ASU (this would give a solvent content of 30%)? Is it possible that there is a 
pseudo-translation in your crystal? If so, you could take it into account and 
try MOLREP or other MR software.

By the way, do you mean the sequence identity is 20%?

Heli Liu  

------------------------------

Hi David,
you do not give any info about your post-AMORE refinement protocol for crystal 
form 1.  Could you please elaborate a little bit on that?

In any case, I would recommend giving CNS and its simulated annealing protocol 
a chance (with NCS restraints), at least in the first couple of rounds of your 
refinement, to help eliminate grossly incorrect main-chain/side-chain 
conformations 'trapped' at the interfaces of your tetramer and/or elsewhere. 
The way you carried out your MR (e.g. a monomer vs dimer vs tetramer as search 
models) could also be a factor.
We had a case recently with three dimers in the asu (also in C2) in which 
searching with a monomer in PHASER using data to 2.5 angs (i.e. higher res. 
than the 15-4 angs one uses in AMORE) resulted in a better behaving model 
during early refinement rounds than searching with a dimer. In that case 
SA protocols in CNS were also important to help us get out of an early rut. 
Once the model was fairly robust we switched to REFMAC.

Trying to reproduce the molecular replacement solution with other programs 
(e.g. PHASER, MOLREP) may not be a bad idea after all, as this could help you 
better evaluate the solution you already have.

I hope this helps a little bit.
Best wishes
Savvas

_________________________________________
Savvas N. Savvides
Ghent University
Laboratory for Protein Biochemistry
K.L.Ledeganckstraat 35
9000 Ghent, BELGIUM
Phone: +32-(0)9-264.51.24 ; +32-(0)472-92.85.19
FAX: +32-(0)9-264.53.38
Email: [EMAIL PROTECTED]
________________________________________

-------------------------

On Tue, 20 Dec 2005, david lawson (JIC) wrote:

> Running DM with 4-fold averaging seems to help, but the maps are 
> heavily biased.

With initial phases from molecular replacement, it is essential to allow the 
real space averaging to permit escape from model bias.

The initial phases are wrong (not completely wrong...) since they are biased by 
the model, by the wrong sequence, by the wrong loops... Being model phases they 
dominate during a phase combination process.

In other words, if you average (4-fold averaging) your initial density, you 
will move away from the wrong, model-like density. By combining the resulting 
phases with the molecular replacement phases you ensure that your resulting, 
phase-combined map, is essentially like your initial map, i.e. you do not 
escape from model bias.

Just ensure that you do not combine phases, i.e. use the model phases only once 
(to generate the initial map) then throw them away. For this you must ensure 
that the NCS operators are correct (4-body rigid body minimization to the 
highest resolution of the diffraction data) and the envelope must be correct. 
Envelope free averaging can help you to decide if your envelope is too small or 
if it encompasses the object entirely.

Fred.

-- 

s-mail: F.M.D. Vellieux (B.Sc., Ph.D.)
        Institut de Biologie Structurale J.-P. Ebel CEA CNRS UJF
        41 rue Jules Horowitz
        38027 Grenoble Cedex 01
        France
Tel:    (+33) (0) 438789605
Fax:    (+33) (0) 438785494
e-mail: [EMAIL PROTECTED]

-------------------------------------

Hi David,

  Maybe refinement of individual secondary structure elements with the 
geometric terms on (in xplor or cns) could improve the model without 
overfitting ? I have used the method once going down to residues per body, but 
this was with a small protein, see 
http://www.mbg.duth.gr/~glykos/polyAla_reprint.pdf for more details.

> So far we have a molecular replacement solution for xtal form 1 that I 
> think is essentially correct, as it gives sensible packing and a NCS 
> 4-fold that is consistent with a peak in a self-rot function. However, 
> the calculated phases are very poor because the template structure 
> represents only about 65% of the target structure and the sequence 
> identity is in the low 20s (the template was produced using CHAINSAW).
> Refinement on the structure beyond fitting in AMORE seems to make 
> things worse. Running DM with 4-fold averaging seems to help, but the 
> maps are heavily biased. I have taken advice from Kevin on which 
> recipes to use, but so far have seen little improvement. My feeling is 
> that if we could somehow refine the AMORE solution without it falling 
> apart, perhaps we could get a better set of input phases for DM - any ideas 
> here?

-- 

            Dr Nicholas M. Glykos, Department of Molecular 
        Biology and Genetics, Democritus University of Thrace,
    Dimitras 19, 68100 Alexandroupolis, Greece, Fax ++302551030613
     Tel ++302551030620 (77620),  http://www.mbg.duth.gr/~glykos/

--------------------------------

Dear David,

since you are in academia, you can try a good long run of Arp/wArp with model 
building. In a previous life I had success with this approach in a similar 
case, just let it run over the weekend or so and see what comes out of it.... 
2.1 Ang should be enough.

Flip

----------------------

Dear David,

A suggestion, which you might have already explored, using 4-fold averaging.
Start with phases to low resolution (5A, even 6A) calculated from the rigid 
body refined MR solution. Then extend the phases in DM to 2.5Å in hundreds (up 
to 1000) of cycles. We had a very similar case (4-fold averaging with a model 
that has approx 20% identity with the target structure, 30% solvent content, 
weak data to 2.5Å), and other averaging protocols gave heavily biased maps. 
Switching to the phase extension procedure gave us beautiful maps that allowed 
the rebuilding of the entire molecule. Also, if you can tell where the missing 
parts of your molecule are you can build a mask that covers that region. This 
could help you improve the model & phases to see the 35% not included in your 
search molecule. Or at least to get a solution with your other crystal form.
Good luck,

        Massimo

--
Dr. Massimo Degano
Biocrystallography Unit
Dibit Fondazione San Raffaele
via Olgettina 58
20132 Milano
Italy
Office: (+39) 0226437152
Lab: (+39) 0226434921
FAX: (+39) 0226434153
email: [EMAIL PROTECTED]
http://www.sanraffaele.org/research/degano

-------------------------------

David-

You might want to try refinement in buster-tnt.  The suite explicitly takes 
into account the scattering from missing parts of your protein, and may give 
you better results...  I've had some success with a fairly incomplete model; I 
think most of the improvement came from buster-tnt's refinement of atomic 
B-factors, for what that's worth.  If nothing else, it will give you something to 
do for a day (install the program) :).

Josh

_____________________________________________
-  Joshua Warren, PhD ([EMAIL PROTECTED])   -
-        212 Nanaline H. Duke               -
-                 DUMC                      -
-        home: (919) 918 7860               -
-        work: (919) 681 5266               -
_____________________________________________

-------------------------

Resolve has a very nice script called RESOLVE_BUILD that can rebuild your model 
and eliminate model bias. It is not the same as the "prime and switch". 
It takes longer to run but it is well worth the time. Just let it run overnight 
and the next day you may find some amazing results. In my case, I had 2.2 A 
data, 2 molecules in the asu. I used an mtz file after 1 refmac run and input 
FWT and PHIC to RESOLVE_BUILD. In your case, you could add 4-fold averaging 
into the script. If you find the result to be better than the input, you can do 
some manual building/correction, refine with refmac for another round or two, 
and repeat the RESOLVE_BUILD process using the new FWT and PHIC. I found that 
this iterative process helped improve the map quite significantly.

Is it possible that for some reason, your protein in crystal form II has a 
different structure from crystal form I? I had a case where MR took me a very 
long time. For that one, I had 2 crystal forms crystallizing in the same 
condition but in different space groups. In crystal form II, one helix moves 
out of its original place, and searching with the full model from crystal 
form I for some reason did not give a solution. In another case, my 
5-alpha + 5-beta protein crystallized as a 4-alpha + 4-beta structure. 
Searching with a full model did not work either.

George Wisedchaisri

-------------------------------

Dear David,
is there a chance that your crystal form 1 is twinned? You say that truncate 
did not give any such indications for form 2 crystals but you did not say 
anything about form 1. What is the crystal morphology? I am wondering whether 
there is some kind of a twin relationship lurking in your lattice because I 
could not help noticing that your b axis is approximately a+0.5*c. Also what 
other indexing options do you get for crystal form 1?

Savvas

---------------------------

Not to cause undue concern, but I am aware of at least one contaminant band of 
~30 kDa that is a degradation product and minor species in protein preps.  
It also fails N-terminal sequencing due to its degraded N terminus.  The 
crystals are small but diffract very well, to 2 angstroms, regardless of x-ray 
source and have about 65% solvent, 1 molecule in the asymmetric unit. If you 
use nickel resin to purify your protein, you may be concerned if your data is 
F432 with cell parameters around 220 angstroms.  It binds metal, and can be 
solved by MAD (I and 2 other lab mates have done it, I used platinum, but 
mercury also works).  It is a fragment of E2O, but I forget the PDB code off 
the top of my head.  You may want to check it out.  Some references to the 
protein are Knapp et al., 1998-2000.

Just a quick heads up.  hopefully this is not the case.

Timothy I.Wood

-------------------------------

hi,
don't know if this really helps but i seem to have the experience that for 
whatever reason rigid body refinement in refmac does not necessarily work when 
CNS can still refine it quite OK to an R of around or below 40% (as the first 
step after mol. replacement) -- something like this has happened to me several 
times.

If someone knows what that is all about, it would be interesting to hear.

(although i guess with an incomplete model maybe the arp idea is best, or solve 
etc?)

tommi

-----------------------------

Hi David,
Have you tried Resolve with the prime and switch command? We found it very 
useful to get really nice phases in an MR involving a search model in the 
sub-30 percent overall sequence identity range using 3Å data. 
Just make sure to define the initial NCS and adjust the input mtz following the 
nice instructions on Terwilliger's home page. An additional approach that you 
might already have tried, at this resolution, is to ARP/wARP the MR solution to 
death.

Good luck!!!

Karl-Magnus Larsson

------------------------

hi dave,

well i had a case where i had very poor phases from MIR and no NCS....

the protocol that worked for me was...

RESOLVE then put these phases in DMMULTI (you do not need phases from both the 
crystal forms, you just need phases from one of them and rotation and 
translation matrices relating the two molecules in the two crystal 
forms....) ...and then put these phases in Arp/wArp (as so many people have 
suggested)...i would like to add one more thing...even if arp/warp is not able 
to build your model completely due to poor phase information....even then 
...just use the phases given in the output mtz (of arp/warp) to calculate maps 
with the help of fft.com of ccp4 (do not use the maps directly output by 
arp/warp ..they are never that good..i don't know why...)...these calculated 
maps would be much better and would be a great help in manual model building...

and yes i would testify to the fact that putting your MR solution in CNS rigid 
body refinement and annealing first and then going over to refmac is much 
better than starting with refmac straight away...it has worked for me too (and 
even i don't know why...)...so you could put your MR solutions in CNS first 
(rigid body and simulated annealing)...use the pdb you get from there to 
calculate phases (using sfall and SIGMAA of ccp4)..then put these phases in 
RESOLVE...and then the rest will be as described above...

BEST OF LUCK for your endeavours..

Sameeta

-------------------------------

Dr. David M. Lawson
Biological Chemistry Dept.,
John Innes Centre,
Norwich,
NR4 7UH, UK.
Tel: +44-(0)1603-450725
Fax: +44-(0)1603-450018
Email: [EMAIL PROTECTED]
Web: http://www.jic.bbsrc.ac.uk/staff/david-lawson/index.htm
