Re: [ccp4bb] Why MAD didn't work but SAD works well

2013-08-23 Thread James Holton
My default MAD strategy is to do single-image inverse beam with round 
robin wavelength changes.  That is:

energy   phi
peak       0
peak     180
remote     0
remote   180
peak       1
peak     181
remote     1
etc.

with one image taken for each line above.   I do this until a full 
sphere is collected for each wavelength (720 images) with an exposure 
time short enough so that the final dose is less than 5 MGy.  On ALS 
8.3.1 that's about 1 second/image.  The 5 MGy comes from the half-dose 
of the fastest-decaying SeMet site I have ever seen (Holton, 2007).  
Once the initial 5 MGy pass is done, then I quadruple the exposure time 
and move the detector a little closer to the sample for another 
sphere.  Moving the detector is to try and put the spots on fresh 
pixels and average over the systematic error associated with using 
exactly the same part of the detector over and over again.  This becomes 
important for Bijvoet ratios less than ~2%.   It is also a good idea to 
always do a full sphere for the same reason: never use the same pixel 
twice.
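
For anyone who wants to script the collection order above, here is a minimal sketch. It assumes 1-degree images and just the two energy labels from the table; the function name and its parameters are made up for illustration and are not taken from any beamline control software.

```python
def inverse_beam_round_robin(energies=("peak", "remote"), step_deg=1, wedge_deg=180):
    """Return (energy, phi) pairs, one per image, in collection order.

    For each phi in the first half-sphere, every energy is visited in turn,
    and each energy gets the image at phi followed by its inverse-beam mate
    at phi + 180.  step_deg and wedge_deg are illustrative assumptions.
    """
    schedule = []
    for phi in range(0, wedge_deg, step_deg):
        for energy in energies:
            schedule.append((energy, phi))        # direct image
            schedule.append((energy, phi + 180))  # inverse-beam mate
    return schedule


schedule = inverse_beam_round_robin()
for energy, phi in schedule[:8]:
    print(f"{energy:<8}{phi:>4}")
print(len(schedule), "images in total")
```

With these assumptions the list starts with the same lines as the table and ends up covering phi 0-359 once at each energy.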


  The quadrupling of the exposure time is mainly for expediency. Given 
that any rad dam reaction will be essentially exponential, but with an 
unknown half-dose, the best way to sample the curve is with a geometric 
series of exposure times.  Doubling the exposure time increases 
signal/noise by not more than 40%, which seems hardly worth it.  
Quadrupling the exposure doubles the S/N for counting statistics.  So: 
1s, 4s, 16s, and then the crystal is usually pretty dead (at ~0.5 
MGy/min).  This then gives the user the opportunity to do RIP using 
the long exposures as the native.  Or, if there is little damage, they 
can just merge everything together and get the best signal.  The 
influence of read-out noise (if any) also gets effectively washed out in 
the longer exposures.
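
The arithmetic behind the 1s, 4s, 16s series can be sketched as follows. This assumes pure counting (Poisson) statistics, so S/N scales as the square root of the exposure, and it assumes dose is simply proportional to exposure time at constant flux. The function names are illustrative only.

```python
import math


def snr_gain(exposure_factor):
    """S/N gain from counting statistics alone: sqrt of the exposure factor."""
    return math.sqrt(exposure_factor)


def exposure_series(first_s=1.0, ratio=4.0, passes=3):
    """Geometric series of per-image exposure times, in seconds."""
    return [first_s * ratio**i for i in range(passes)]


times = exposure_series()
# Cumulative dose after each pass, in units of the dose of the first pass
# (assumes dose is proportional to exposure time).
cumulative = [sum(times[: i + 1]) / times[0] for i in range(len(times))]

print("doubling gains a factor of", round(snr_gain(2), 2))
print("quadrupling gains a factor of", snr_gain(4))
for t, c in zip(times, cumulative):
    print(f"{t:5.0f} s/image -> {c:4.0f} x first-pass dose accumulated")
```

Each pass in a geometric series delivers more dose than all earlier passes combined, which is why a few passes are enough to sample a decay curve whose half-dose is not known in advance.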


Now, what I call "peak" is actually a compromise between the usual 
"peak" and the "inflection".  What I do is split the difference between 
these two for an "inf-eak" or "pea-lection" wavelength.  From a 
signal-vs-damage point of view this seems to be optimal in my hands.  
Two wavelengths are about twice as good as one, even if the f'' and f' to 
the remote are only 80% of what they would be at their maxima.  Three 
wavelengths are better than two, but only ~20% better.  I judge this 
by looking at map correlations and the number of sites I can leave 
unmodeled in a 3-wavelength dataset and still get the same map quality 
as a 2-wavelength dataset.  I call such a 2-wavelength dataset "DAD", or 
sometimes Bijvoet Anomalous and Dispersive Anomalous Scattering (BADAS).


The only time using the same pixel twice could actually be an advantage 
is if you could somehow put the same spot on the same pixel at two 
different wavelengths.  You can sort of do this by moving the detector 
by a distance proportional to the change in wavelength.  Doesn't work 
exactly because the Ewald sphere is curved and the detector isn't, but 
you can get some spots close.  This might be why Gonzalez et al. 
(2007) noticed that using inflection-and-remote tended to perform better 
than using just the peak.  I haven't done an experiment of my own to 
show this is due to pixel calibration, however.
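
A rough way to see why the detector-shift trick only works approximately is sketched below. It assumes a flat detector normal to the beam and centered on it, so a spot with d-spacing d lands at radius D*tan(2*theta) with sin(theta) = lambda/(2d). The wavelengths and the 200 mm distance are illustrative values, not taken from any particular experiment.

```python
import math


def spot_radius_mm(d_spacing_A, wavelength_A, distance_mm):
    """Radial spot position on a flat detector normal to the beam (Bragg's law)."""
    theta = math.asin(wavelength_A / (2.0 * d_spacing_A))
    return distance_mm * math.tan(2.0 * theta)


def matched_distance_mm(distance_mm, wavelength_from_A, wavelength_to_A):
    """Small-angle guess: scale the distance inversely with the wavelength."""
    return distance_mm * wavelength_from_A / wavelength_to_A


lam1, lam2 = 0.9793, 0.9537   # assumed "peak-ish" and "remote" wavelengths, in Angstrom
dist1 = 200.0                 # assumed detector distance at lam1, in mm
dist2 = matched_distance_mm(dist1, lam1, lam2)

mismatch = {}
for d in (20.0, 5.0, 2.0):    # d-spacings in Angstrom
    r1 = spot_radius_mm(d, lam1, dist1)
    r2 = spot_radius_mm(d, lam2, dist2)
    mismatch[d] = abs(r1 - r2)
    print(f"d = {d:4.1f} A: spot moves by {mismatch[d]:.4f} mm")
```

The low-angle spots land almost exactly where they were, and the mismatch grows toward high resolution, which is the "curved Ewald sphere, flat detector" problem in numbers.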


Of course, for most test crystals it doesn't really matter how you 
collect the data because the anomalous signal is so strong relative to 
pixel calibration, or almost any other source of error for that matter.  
The problem with differentiating the efficacy of one strategy over 
another is that the transition between solvable and unsolvable is 
very very sharp.  Basically, phase improvement methods either make your 
phases better or they don't, and then you iterate.  But in a 
radiation-damage-limited world (such as a very, very small test crystal), the best 
strategy will prevail. The minimum crystal size you should need if you 
do everything right is what is reported by this web page:

http://bl831.als.lbl.gov/xtalsize.html

As for the terms, inverse beam I think came from Stout and Jensen in 
their description of absorption corrections.  It is supposed to be a 
variation on normal beam (which is where the x-ray beam is 
perpendicular to the spindle).  But like most things, the widespread use 
of the term arises because a popular piece of software (BLU-ICE) chose 
to put those words next to a button on the GUI.


The term round robin I take from a simple load-distribution technique 
in computer science where each CPU, network card, etc takes turns 
getting the next job.  This way each of the things being switched up 
gets the same amount of exposure with minimal granularity.  
Apparently, this name is derived from competitive sporting events where 
the athletes do pretty much the same thing.


One final word to the wise: My strategy of single-image-round-robin is 
not appropriate to all beamlines! Some shutters are better than others 
(even the electronic shutter used for shutterless data collections can 
experience some jitter), 

Re: [ccp4bb] Why MAD didn't work but SAD works well

2013-08-22 Thread James Holton


Yafang,

I'm afraid that just because you still have spots at the end of your 
dataset does not mean radiation damage was not a problem.  The 
reactions that disorder your heavy atom sites go to completion at doses 
that can be as little as 1/30th of the dose required to noticeably fade 
your spots.  There are a number of nice reviews written about this:

http://dx.doi.org/10.1107/S0909049509004361
http://dx.doi.org/10.1107/S0909049512050418
http://dx.doi.org/10.1107/S0909049506048898
http://dx.doi.org/10.1107/S0907444907019580

Also, if your datasets were collected one wavelength at a time, such as 
a complete dataset at the peak, then another complete dataset at the 
inflection, and then, after all that, you collect the reference 
dataset at the remote, then what you have is not a MAD dataset.  This is 
a series of SAD datasets (M-SAD).  Of these three SAD datasets only the 
peak is at the optimum energy for anomalous, and also has the least 
radiation damage, so that one will work better than the other two.  I 
use the term M-SAD instead of MAD because you are effectively using a 
different crystal for each wavelength, and that means the 
inter-wavelength differences are dominated by non-isomorphism. 
Non-isomorphism can easily bury an anomalous signal, and radiation 
damage is a pretty efficient way to make a crystal non-isomorphous with 
its former self.


By looking at examples in the literature (such as Banumathi et al. 
2004), one can guesstimate that the degree of non-isomorphism induced by 
radiation damage is about 1% per MGy of dose.  You can look up the 
nominal dose rate of the beamline you collected these data at here:

http://bl831.als.lbl.gov/damage_rates.pdf
I try to keep the numbers in this document up to date, but most 
beamlines are attenuated to the point where they deliver about 1 MGy per 
minute of shutter-open time. That's for a crystal with  ~20 mM heavy 
atoms, and unattenuated beam.


So, if the dispersive signal you are looking for is 3%, then once your 
crystal has endured more than ~3 minutes of shutter-open time, the 
non-isomorphism will start to overwhelm that signal, and then trying to 
use dispersive (inter-wavelength) differences becomes 
counterproductive.  This is because the software is trying to reconcile 
all the observed differences in terms of heavy-atom positions, and when 
half the differences are coming from non-isomorphism, the equations all 
fall apart.  This is probably why treating your M-SAD dataset as a MAD 
experiment fails.  Anomalous (Bijvoet) differences, however, tend to 
come up fairly close together in phi because once a spot passes 
through the Ewald sphere its Friedel mate will generally pop up on the 
opposite side of the beamstop a few degrees later.  Basically, if you're 
measuring a difference, it is best to measure the two numbers you are 
going to subtract as close together in time as possible.  This is why 
inverse beam with round robin wavelength changes is the approach 
that is most robust to damage effects.  Yes, you still get damage, but 
at least the differences you are subtracting are close together, and 
therefore comparing apples to apples.
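
The back-of-the-envelope numbers above can be put into a few lines. This is only a sketch of a linear model, assuming ~1% non-isomorphism per MGy and ~1 MGy per minute of shutter-open time as quoted above; the 1 s/image and 360 images per wavelength are further assumptions for illustration, and the function names are made up.

```python
def minutes_until_swamped(signal_percent, noniso_percent_per_mgy=1.0,
                          dose_rate_mgy_per_min=1.0):
    """Shutter-open minutes until non-isomorphism equals the signal (linear model)."""
    return signal_percent / (noniso_percent_per_mgy * dose_rate_mgy_per_min)


def dose_between_mgy(images_between, seconds_per_image=1.0,
                     dose_rate_mgy_per_min=1.0):
    """Dose absorbed between the two measurements that get subtracted."""
    return images_between * seconds_per_image / 60.0 * dose_rate_mgy_per_min


budget_min = minutes_until_swamped(3.0)   # 3% dispersive signal

# One wavelength at a time: the same phi at the next wavelength is a whole
# dataset later (assumed 360 images).  Round robin: it is two images later.
sequential = dose_between_mgy(360)
interleaved = dose_between_mgy(2)

print(f"budget: about {budget_min:.0f} min of shutter-open time")
print(f"sequential gap:  {sequential:.2f} MGy between the paired measurements")
print(f"round-robin gap: {interleaved:.3f} MGy between the paired measurements")
```

Under these assumptions the sequential gap alone already exceeds the dose at which non-isomorphism matches a 3% signal, while the interleaved gap is a small fraction of it.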


I suppose it was the advent of sagittal-focusing monochromators, which 
made wavelength changes more difficult, and more recently the advent of 
so-called shutterless data collection, that has led to more M-SAD 
data collections than MAD ones.  This is a pity, really, because as George 
has already said, MAD gives you significantly better phases than SAD.  
It just requires a little more patience to collect it properly.


-James Holton
MAD Scientist




On Tue, Aug 20, 2013 at 2:05 PM, Yafang Chen yafangche...@gmail.com wrote:


   Hi All,

   I have three datasets of SeMet-incorporated protein at peak, infl
   and high wavelength respectively. SAD with peak dataset works well
   to solve the phase problem. However, MAD with all three datasets
   didn't work at all. The completeness of all three datasets is more
   than 99%. So I think radiation damage should not be a problem. Does
   anyone have any idea about the possible reasons that MAD didn't work
   in this case? Thank you so much for any of your help!

   Best,
   Yafang

   -- 
   Yafang Chen

   Graduate Research Assistant
   Mesecar Lab
   Department of Biological Sciences
   Purdue University
   Hockmeyer Hall of Structural Biology
   240 S. Martin Jischke Drive
   West Lafayette, IN 47907




Re: [ccp4bb] Why MAD didn't work but SAD works well

2013-08-20 Thread George Sheldrick

Dear Yafang,

If radiation damage is not a major problem, MAD should give you more 
phase information than SAD, i.e. better maps, especially at low 
resolution. If SAD works but MAD doesn't, there are several possible 
explanations:


1. (most likely) your datasets are inconsistently indexed. This can 
happen in various ways depending on the Laue group and the unit-cell. If 
you are using hkl2map the plots of the anomalous CC between the 
different datasets are a good quick check. Some programs (e.g. the 
current shelxc) will try to detect this and correct it automatically.


2. You have mixed up the wavelengths or labels of the datasets and so 
the dispersive difference comes out with the wrong sign.


3. You have significant radiation damage and the RIP and dispersive 
differences have opposite signs. Always measure the inflection 
dataset last so that they reinforce each other rather than canceling.

4. You have severe radiation damage and only the first (peak) dataset is 
usable.


Best wishes, George


--
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-33021 or -33068
Fax. +49-551-39-22582


Re: [ccp4bb] Why MAD didn't work but SAD works well

2013-08-20 Thread Bosch, Juergen
Dear George,

if #4 is correct, shouldn't he be able to get good SIRAS using the peak dataset 
as HA and the last collected dataset as native?

Jürgen

..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://lupo.jhsph.edu
