The dominant source of error in an intensity measurement actually
depends on the magnitude of the intensity. For intensities near zero
and with zero background, the read-out noise of image plate or
CCD-based detectors becomes important. On most modern CCD detectors,
however, the read-out noise is quite low: equivalent to the noise
induced by having only a few extra photons/pixel (if any). For
intensities of more than ~1000 photons, the calibration of the detector
(~2-3% error) starts to dominate. It is only in the midrange between
~2 photons/pixel and ~1000 integrated photons that shot noise (aka
photon-counting error, or Poisson statistics) plays the major role.
So it is perhaps a bit ironic that the photon counting error we worry
so much about is only significant for a very narrow range of intensities
in any given data set.
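To put rough numbers on this, here is a minimal Python sketch of a
simple additive error model (the read-noise and calibration figures
are assumed round numbers, and per-pixel vs integrated quantities are
lumped together purely for illustration):

import numpy as np

read_noise = 2.0    # assumed read-out noise, photon-equivalents/pixel
calib_frac = 0.03   # assumed ~3% detector calibration error

for I in [1, 10, 100, 1000, 10000]:
    shot = np.sqrt(I)         # Poisson: sigma = sqrt(counts)
    read = read_noise         # independent of intensity
    calib = calib_frac * I    # proportional to intensity
    dominant = max((shot, "shot"), (read, "read-out"),
                   (calib, "calibration"))[1]
    print(f"I = {I:6d}: shot {shot:6.1f}  read {read:4.1f}  "
          f"calib {calib:6.1f}  -> {dominant} noise dominates")

With these (assumed) numbers the crossovers land at I = read_noise^2 =
4 photons and I = 1/calib_frac^2 ~ 1100 photons, which is where the
narrow shot-noise window above comes from.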
But yes, there does seem to be something wrong with ctruncate. It can
throw out a great many hkls that both xdsconv and the old truncate
keep. A graph of the resulting Wilson plots is here:
http://bl831.als.lbl.gov/~jamesh/bugreports/ctruncate/truncated_wilsons.png
and the script for producing the data for this plot from scratch:
http://bl831.als.lbl.gov/~jamesh/bugreports/ctruncate/truncate_notes.com
Note that only 3 bins are even populated in the ctruncate result,
whereas truncate and xdsconv seem to reproduce the true Wilson plot
faithfully down to well below the noise, which in this case is a
Gaussian deviate with RMS = 1.0 added to each F^2.
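For anyone who wants to play with this without running the full
script, here is a minimal Python sketch of the same kind of test (the
B factor, scale, and binning are my own assumptions, not the values in
truncate_notes.com):

import numpy as np
rng = np.random.default_rng(0)

# 100,000 fake hkls on a perfectly straight Wilson plot (B = 20,
# scale = 1), then a Gaussian deviate with RMS = 1.0 added to each F^2
B = 20.0
s = rng.uniform(0.0, 0.5, 100000)      # sin(theta)/lambda
Itrue = np.exp(-2.0 * B * s**2)
Iobs = Itrue + rng.normal(0.0, 1.0, s.size)

# bin-averaged Wilson plot: the true fall-off is recovered far below
# the per-spot noise level of 1.0, until ~1/sqrt(N_bin) takes over
edges = np.linspace(0.0, 0.25, 26)     # bins in s^2
idx = np.digitize(s**2, edges)
for i in range(1, len(edges)):
    sel = idx == i
    print(f"s^2 = {edges[i]:.3f}  <Iobs> = {Iobs[sel].mean():+9.5f}"
          f"  <Itrue> = {Itrue[sel].mean():9.5f}")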
The plateau in the result from xdsconv is something I've been working
with Kay to understand, but it seems to be a problem with the
French-Wilson algorithm itself, and not any particular implementation of
it. Basically, French and Wilson did not want to assume that the Wilson
plot was straight, and therefore did not use the prior information that
if the intensities drop into the noise at 2.0 A, then the average
value of F at 1.0 A is much, much less than sigma! As a result, the
French-Wilson values for F far above the traditional resolution
limit can be overestimated by as much as a factor of a million.
Perhaps this is why truncate and ctruncate complain bitterly about data
beyond the useful resolution limit.
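To see where the overestimate comes from, here is a minimal numerical
sketch of the acentric French-Wilson idea: a Wilson prior p(J) =
exp(-J/Sigma)/Sigma on the true intensity J, a Gaussian likelihood
centered on Iobs, and the posterior mean of F = sqrt(J). This is
brute-force integration for illustration only, not the actual French &
Wilson (1978) algorithm or any program's implementation of it:

import numpy as np

def fw_acentric_F(Iobs, sigI, Sigma):
    # posterior <F> = <sqrt(J)> with Wilson prior exp(-J/Sigma) and
    # Gaussian likelihood N(Iobs, sigI^2), truncated to J >= 0
    J = np.linspace(0.0, Iobs + 10.0 * sigI, 200001)
    post = np.exp(-J / Sigma - 0.5 * ((Iobs - J) / sigI) ** 2)
    return (np.sqrt(J) * post).sum() / post.sum()

# a dead spot (Iobs = 0, sigI = 1) under a flat, resolution-independent
# prior (Sigma = 100, i.e. no fall-off) still comes out with F ~ 0.8:
print(fw_acentric_F(0.0, 1.0, 100.0))   # ~0.8
# telling the prior that <I> has fallen to 0.01 at this resolution
# pulls the estimate down by an order of magnitude:
print(fw_acentric_F(0.0, 1.0, 0.01))    # ~0.09

Make Sigma follow the observed fall-off of the Wilson plot and the F
estimates beyond the traditional limit collapse toward zero instead of
plateauing.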
A shame, really, because if the Wilson plot of the truncated data were
made to follow the linear trend we see in the low-angle data, then we
wouldn't need to argue so much. After all, the only reason we apply a
resolution cutoff is to try and suppress the noise coming from all
those background-only spots at high angle. But, on the other hand, we
don't want to cut the data too harshly or we will get series-termination
errors. So, we must strike a compromise between these two sources of
error and call that the resolution cutoff. But, if the conversion of
I to F actually used the prior knowledge of the fall-off of the Wilson
plot with resolution, then there would be no need for a resolution
cutoff at all. The current situation is portrayed in this graph:
http://bl831.als.lbl.gov/~jamesh/wilson/error_breakdown.png
which shows the noise induced in an electron density map by
applying a resolution cutoff to otherwise perfect data, vs the error
due to adding noise and running truncate. If the noisy data were
down-weighted only a little bit, then the total noise curve would
continue to drop, even at infinite resolution.
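Here is a minimal sketch of what I mean, using complex structure
factors with Wilson statistics and a Wiener-style weight w =
Sigma/(Sigma + sigma^2). The weight is an assumed form chosen for
illustration, not what truncate, refmac, or phenix.refine actually
does:

import numpy as np
rng = np.random.default_rng(1)

n = 100000
s = rng.uniform(0.0, 1.0, n)      # 1/d for fake hkls, out to d = 1 A
Sigma = np.exp(-15.0 * s**2)      # assumed Wilson fall-off of <|F|^2>
sig = 0.1                         # assumed noise per structure factor

Ftrue = (rng.normal(size=n) + 1j * rng.normal(size=n)) * np.sqrt(Sigma / 2)
Fobs = Ftrue + (rng.normal(size=n) + 1j * rng.normal(size=n)) * sig / np.sqrt(2)

def map_mse(Fest):
    # by Parseval, map-space noise is the mean-square error over hkl
    return np.mean(np.abs(Fest - Ftrue) ** 2)

w = Sigma / (Sigma + sig**2)      # Wiener-style down-weight
cut = Sigma > sig**2              # hard cutoff where signal ~ noise
print("no cutoff  :", map_mse(Fobs))                    # noise from junk spots
print("hard cutoff:", map_mse(np.where(cut, Fobs, 0)))  # series termination
print("weighted   :", map_mse(w * Fobs))                # smallest of the three

In this toy model the weighted set beats both the hard cutoff and
keeping everything unweighted, because each weighted reflection
contributes Sigma*sig^2/(Sigma + sig^2), which is always less than the
Sigma you pay for truncating it: the total noise curve just keeps
dropping as resolution is extended.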
I think it is also important to point out here that the resolution
cutoff of the data you provide to refmac or phenix.refine is not
necessarily the resolution of the structure. This latter quantity,
although emotionally charged, really does need to be better defined
by this community, preferably in a way that is historically
stable. You can't just take data that goes to 5.0A and call it 4.5A
data by changing your criterion. Yes, it is better to refine out to
4.5A when the intensities drop into the noise at 5A, but that is never
going to be as good as using data that does not drop into the noise
until 4.5A.
-James Holton
MAD Scientist
On 6/27/2013 9:30 AM, Ian Tickle wrote:
On 22 June 2013 19:39, Douglas Theobald <dtheob...@brandeis.edu> wrote:
So I'm no detector expert by any means, but I have been assured by
those who are that there are non-Poissonian sources of noise --- I
believe mostly in the readout, when photon counts get amplified.
Of course this will depend on the exact type of detector, maybe
the newest have only Poisson noise.
Sorry for the delay in responding; I've been thinking about it. It's
indeed possible that the older detectors had non-Poissonian noise as
you say, but AFAIK all detectors return _unsigned_ integers (unless
possibly the number is to be interpreted as a flag to indicate some
error condition, but then obviously you wouldn't interpret it as a
count). So whatever the detector AFAIK it's physically impossible for
it to return a negative number that is to be