I agree that simple truncation is not a great way to create a
lower-resolution dataset. However, neither is simply applying a B
factor. It is harder than that to fool the downstream phasing
programs you will probably be running.
That said, the combination of a B factor with a resolution cutoff does
effectively suppress Fourier ripples. The ripples are always there, but
the rms error they contribute to the map is just the rms value of all
the structure factors beyond the resolution limit, divided by the cell
volume. So, if you apply a big enough B factor, everything beyond the
resolution limit will be essentially zero. I recommend as a rule of
thumb combining a resolution cutoff of d with the B factor taken from
the general trend of the PDB:
B = 4*d^2 + 12
where B is the average atomic B factor from structures claiming
resolution d. That is, if you download every PDB entry with a
resolution of 2 A, and then take the average value of the B factor of
all the atoms in all those files, you'll get ~28. So, if you start with
a 1.8 A data set, chances are it will have an average atomic (aka
Wilson) B factor of 25. If you apply a B-factor of 45 to the observed
data with CAD, then the Wilson B will become 70, and the structure
factors at 3.8 A will now have about the same average magnitude as the
1.8 A data had in the original set. So, you can now cut off the data at
3.8 A without changing the maps in any serious way. The maps will look
like 3.8 A data. This is actually how I made my resolution example movie:
http://bl831.als.lbl.gov/~jamesh/movies/index.html#reso
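To make the arithmetic concrete, here is a little Python sketch of the
rule of thumb (the function names are mine, just for illustration):

    def wilson_b_from_reso(d):
        """Rule-of-thumb average atomic B factor (A^2) at resolution d (A): B = 4*d^2 + 12."""
        return 4.0 * d**2 + 12.0

    def b_to_apply(d_start, d_target):
        """B factor to add (e.g. with CAD) so d_start data fall off like d_target data."""
        return wilson_b_from_reso(d_target) - wilson_b_from_reso(d_start)

    print(wilson_b_from_reso(1.8))   # ~25, the expected Wilson B of a 1.8 A data set
    print(b_to_apply(1.8, 3.8))      # ~45, the B factor to give to CAD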
This treatment is fine for map calculation, but if you are trying to
test the effect of resolution on something more complicated, like
phasing or refinement, you will run into problems. For example, if you
calculate the isomorphism of the old 1.8 A dataset to the new 3.8 A
dataset with SCALEIT, you will find the R-factor between them is zero.
This is because the standard procedure for calculating an R factor is to
scale the two datasets together first, and scaling generally implies
fitting a B factor as well as an overall scale. In this case the
relative B factor (aka scaling B factor) will be 45, the number you
gave to CAD above. So, if you take a coordinate file refined against
the 1.8 A data and refine it against your new 3.8 A data, all the atomic
B factors will simply increase by 45, the atoms will hardly move, and
the R and Rfree will be a little better than they were with the 1.8 A
data (because the noisy high-angle stuff is now cut off). You will
also find that the quality of the anomalous differences is largely
unaffected by applying a B factor. This is because if you scale all the
Fs and sigFs on a Harker diagram by a constant, it doesn't change the
phase. Yes, the refined B factor of the heavy atom sites will increase
by 45, but the phasing power, etc. will be the same. I imagine this is
not what you had in mind?
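If you want to see the scaling effect for yourself, here is a toy numpy
demonstration. This is not the actual SCALEIT algorithm, just a
least-squares fit of an overall scale and a relative B to noise-free
fake data, but it makes the point:

    import numpy as np

    rng = np.random.default_rng(0)
    d = rng.uniform(1.8, 20.0, 1000)      # fake d-spacings (A)
    F = rng.rayleigh(100.0, 1000)         # fake structure factor amplitudes

    # what applying B = 45 to the Fs does: exp(-B/(4*d^2))
    F_smeared = F * np.exp(-45.0 / (4 * d**2))

    # "scale them together": fit ln(F_smeared/F) = ln(k) - B/(4*d^2)
    x = 1.0 / (4 * d**2)
    slope, intercept = np.polyfit(x, np.log(F_smeared / F), 1)
    B_rel, k = -slope, np.exp(intercept)

    # put the smeared data back on the original scale and compute R
    F_rescaled = F_smeared / (k * np.exp(-B_rel / (4 * d**2)))
    R = np.sum(np.abs(F - F_rescaled)) / np.sum(F)
    print(B_rel, k, R)                    # ~45, ~1, and R ~ 0

With no noise the R factor after scaling comes out at machine precision,
which is the point: a B factor alone is invisible to anything that
scales the data together first.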
Clearly, you have to add some noise in addition to applying the B factor
and cutting off the resolution. But what sort of noise? You have the
sigmas from the original dataset, but those are not noise, they are an
estimate of the noise that is already there, hidden in the value of F
itself. Nevertheless, it's all you've got, so it is helpful to consider
where SIGF comes from.
SIGF begins its life as the estimate of the number of photons that were
counted in a given spot area on the detector. The error in the
background-subtracted spot intensity is (at least) the square root of
the _total_ number of photons that hit in the spot region. That is,
background plus spot. You might have a hope of reconstructing the spot
intensity using F^2 and some sort of overall scale factor (related to
the illuminated crystal volume, beam intensity, etc.), but the background
level is lost in the scaling and merging process. After all, different
observations of the same or symmetry-equivalent hkls will generally have
different background levels. They also have different intensities, due
to the Lorentz and polarization factors. This latter fact is often
neglected, but if you take the average value of the Lp factor (Holton &
Frankel, 2010) vs resolution for a typical data collection situation
(wavelength = 1 A, resolution up to ~1.5 A), you will find that it is a
fairly straight line:
L*p*frac_obs ~ 1.55*d
where d is the d spacing of the spot. So, yes, high-angle spot
intensities are weaker than low-angle spot intensities not just because
F is smaller, but because d is smaller as well, and the actual spot
intensities on the detector are not proportional to F^2, but rather
d*F^2. On average. Most data sets have a few hkls that by chance are
very close to the rotation axis and stay in contact with the Ewald
sphere for the entire rotation range. These will accumulate a VERY
large number of counts. On the other hand, a spot that appears on the
equator won't register very many counts.
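To make the photon-counting picture concrete, here is a toy Python model
of a single background-subtracted spot. The overall scale, the 1.55*d
slope from above, and the background level are illustrative assumptions,
not calibrated values:

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate_spot(F, d, scale=0.01, background=50.0):
        """Toy photon-counting model of one spot observation."""
        mean_spot = scale * 1.55 * d * F**2      # spot counts go as ~d*F^2, not F^2
        spot_region = rng.poisson(mean_spot + background)  # photons in the spot region
        bg_estimate = rng.poisson(background)    # background estimated nearby
        I = spot_region - bg_estimate            # background-subtracted intensity
        sigI = np.sqrt(spot_region + bg_estimate)  # error is (at least) sqrt of total counts
        return I, sigI

    # same F, but the low-angle spot collects far more counts than the high-angle one
    print(simulate_spot(F=100.0, d=3.8))
    print(simulate_spot(F=100.0, d=1.8))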