Re: [ccp4bb] lossy compression of diffraction images

James Holton Sun, 09 May 2010 09:02:28 -0700

Frank von Delft wrote:

Just looked at the algorithm, how it stores the average "non-spot" through all the images.
What happens with datasets where the "non-spot" (e.g. background) changes systematically through the dataset, i.e. anisotropic datasets or thin crystals lying flat in a thin loop? How much worse is compression for that?
Cheers
phx

Well, what will happen in that case (with the current "algorithm") is that once a background pixel deviates from the median level by more than 4 "sigmas", it will start to get stored losslessly. Essentially, such pixels will be treated as "spots" and the overall compression ratio will start to approach that of bzip2.
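[Editor's illustration, not the actual code from the page linked below.] The 4-"sigma" rule described above can be sketched roughly as follows. The function name, the toy pixel values and, in particular, the choice of sigma = sqrt(median) (pure counting noise) are assumptions made for this example; the post does not say how sigma is estimated.

```python
import math
from statistics import median

def flag_lossless_pixels(images, n_sigma=4.0):
    """Return the per-pixel medians and, for each image, the indices of
    pixels that would be treated as "spots" (stored losslessly)."""
    # Per-pixel median across all images: the shared "non-spot" background.
    n_pixels = len(images[0])
    medians = [median(img[i] for img in images) for i in range(n_pixels)]
    flagged = []
    for img in images:
        # ASSUMPTION: sigma = sqrt(median), i.e. pure counting noise.
        flagged.append([
            i for i, (value, med) in enumerate(zip(img, medians))
            if abs(value - med) > n_sigma * math.sqrt(max(med, 1.0))
        ])
    return medians, flagged

# Toy data: three 4-pixel "images".
stable = [[100, 101, 99, 100], [100, 99, 101, 500], [101, 100, 100, 100]]
drifting = [[100] * 4, [150] * 4, [200] * 4]

_, stable_flags = flag_lossless_pixels(stable)
_, drifting_flags = flag_lossless_pixels(drifting)
```

With the stable background only the one genuine spot is flagged. With the drifting background every pixel of the first and last image exceeds the threshold, so most of the data would end up stored losslessly, which is the degradation described above.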

A "workaround" for this is simply to store the data set in "chunks" where the background level is similar, but I suppose a more intelligent thing to do would be to simply "scale" each image to the median background image, and store the scale factors (a list of 100 numbers for a 100-image data set) along with the other ancillary data. I haven't done that yet. Didn't want to spend too much time on this in case I incited some kind of revolt.
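[Editor's illustration.] Since the scaling idea above has not been implemented, the following is only one possible sketch of it. The function names are hypothetical, and estimating each scale factor as the median of the per-pixel ratios to the median background image is an assumption chosen because it is insensitive to a few bright spots.

```python
from statistics import median

def background_scale_factors(images):
    """Return the median background image and one scale factor per image."""
    n_pixels = len(images[0])
    reference = [median(img[i] for img in images) for i in range(n_pixels)]
    scales = []
    for img in images:
        # ASSUMPTION: scale = median ratio to the reference background,
        # so that a handful of spot pixels does not bias the estimate.
        ratios = [v / r for v, r in zip(img, reference) if r > 0]
        scales.append(median(ratios))
    return reference, scales

def rescale(image, scale):
    """Bring one image onto the level of the median background image."""
    return [v / scale for v in image]

# Toy data: background drifts from 100 to 200 counts, one spot at the end.
drifting = [[100] * 4, [150] * 4, [200, 200, 200, 900]]
reference, scales = background_scale_factors(drifting)
rescaled = [rescale(img, s) for img, s in zip(drifting, scales)]
```

Here the three scale factors come out at roughly 2/3, 1 and 4/3, and after rescaling every background pixel sits near 150 counts while the spot remains far above it. Only the list of scale factors (one number per image) would need to be stored alongside the other ancillary data.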


-James Holton
MAD Scientist

On 07/05/2010 06:07, James Holton wrote:
Ian Tickle wrote:
I found an old e-mail from James Holton where he suggested lossy
compression for diffraction images (as long as it didn't change the
F's significantly!) - I'm not sure whether anything came of that!
Well, yes, something did come of this.... But I don't think Gerard Bricogne is going to like it.
Details are here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/
Short version is that I found a way to compress a test lysozyme dataset by a factor of ~33 with no apparent ill effects on the data. In fact, anomalous differences were completely unaffected, and Rfree dropped from 0.287 for the original data to 0.275 when refined against Fs from the compressed images. This is no doubt a fluke of the excess noise added by compression, but I think it highlights how the errors in crystallography are dominated by the inadequacies of the electron density models we use, and not the quality of our data.
The page above lists two data sets: "A" and "B", and I am interested to know if and how anyone can "tell" which one of these data sets was compressed. The first image of each data set can be found here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/firstimage.tar.bz2

-James Holton
MAD Scientist
