Hi Matt,
I seem to remember that there already is a compression filter in HDF5 that
allows storing an IEEE floating-point number as an integer value, with a
floating-point scale and offset parameter attached to it. On reading you
then get floating-point values back, limited to the precision of the
underlying integers, but also benefiting from integer compression schemes.
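As a rough sketch of the idea (illustration only, not the HDF5 library's own code - the real filter is configured on a dataset creation property list), the scale/offset transform amounts to:

```python
# Sketch of scale/offset encoding as used conceptually by HDF5's
# scale-offset filter. Illustration only, not the library's code.
def encode(values, scale, offset):
    # Each float becomes an integer: round((x - offset) / scale).
    return [round((x - offset) / scale) for x in values]

def decode(ints, scale, offset):
    # Decoding recovers floats, limited to the precision of the
    # underlying integers (here, the 0.01 scale step).
    return [i * scale + offset for i in ints]

data = [1.23, 4.56, 7.89]
packed = encode(data, scale=0.01, offset=0.0)
restored = decode(packed, scale=0.01, offset=0.0)
# restored agrees with data to within half the scale step.
```

The integers in `packed` then compress better than the raw floats, which is the point of the filter.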
For 'special values' it might be best to use an attribute on the dataset
that tells which value, or even which value range, means something
particular, such as a mask. You could even use HDF5's internal default
fill value for this purpose - this value is returned if, for instance, you
read a chunk of a chunked dataset that doesn't exist on disk, which makes
it a good candidate for marking an 'undefined' region in the data.
http://www.hdfgroup.org/HDF5/doc_resource/H5Fill_Values.html
You would just need to set a fill value that does not occur as valid data.
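A minimal pure-Python illustration of that convention (standing in for what HDF5 does internally; the sentinel value and function names here are hypothetical, not HDF5 API):

```python
# Illustration of the fill-value convention: chunks of a chunked
# dataset that were never written to disk read back as the fill
# value, which an application can treat as an 'undefined' region.
# Hypothetical sketch, not the HDF5 implementation.
FILL = -9999.0  # chosen so it cannot occur as valid data

def read_chunk(stored_chunks, index, chunk_size):
    # Chunks missing from storage are materialized as fill values.
    return stored_chunks.get(index, [FILL] * chunk_size)

stored = {0: [1.5, 2.5, 3.5]}       # only chunk 0 exists on disk
chunk1 = read_chunk(stored, 1, 3)   # chunk 1 was never written
undefined = [v == FILL for v in chunk1]   # all True: masked region
```

In real HDF5 the fill value is set once at dataset creation time and travels with the file, so every reader sees the same convention.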
Shifting data values to circumvent limitations of the visualization
software sounds like a last-resort solution, but if required, such a
shift should be fairly quick with a compression-like filter that is
specific to this software. In the HDF5 file itself you might prefer to
keep data values as close to their original values as possible, so doing
the dynamic shift (specific to each viz software that you use) upon
reading sounds better than doing it during writing. I would expect the
computational cost to be negligible compared to disk I/O - however,
there is an overhead in terms of RAM usage, since compression filters
require additional buffer memory. This might be an issue for large
data sets, but it should be controllable with bug-free coding (i.e.
having no memory leaks) and sufficiently small chunking of the data sets.
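Such a read-time shift filter would amount to a per-chunk transform like the following (a sketch, assuming the shift amount is stored as dataset metadata; names and chunk sizes are hypothetical):

```python
def shift_chunk(chunk, shift):
    # One addition per value as the chunk is read; the cost is
    # negligible compared to disk I/O, but the output buffer adds to
    # the per-chunk RAM footprint, so keep chunks reasonably small.
    return [v + shift for v in chunk]

# Example: shift so the minimum maps to 1.0, leaving 0.0 free
# as a reserved mask value.
chunk = [-3.0, 0.5, 2.0]
shift = 1.0 - min(chunk)            # here 4.0
shifted = shift_chunk(chunk, shift) # [1.0, 4.5, 6.0]
```

Because the shift is a single add per value, the filter stays I/O-bound for any realistic chunk size.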
Werner
On Sat, 15 Jun 2013 15:46:58 -0500, Dougherty, Matthew T
<[email protected]> wrote:
I am seeking an opinion as to the computational load involving a simple
HDF filter, which will dictate how to encapsulate images with HDF5.
Problem:
Most of the software that generates images in cryo-EM creates a single
density modality, 3D lattice. Generally the types of numerical values
are IEEE floating point, but on occasion they are unsigned byte; several
other numerical representations are permissible. The range of the
values is fairly arbitrary, which is the source of the problem.
Typically these values range over +/- 10,000. The value of zero has no
significant meaning, which causes a lot of visualization problems
for data between -1 and +1, particularly involving division, or the fact
that some programs attach a special meaning to zero, such as a void used
in masking/clamping. This introduces a new problem: density and mask
values become co-mingled indistinguishably, forever altering the
distinction and the histogram.
Proposed solution:
Shift all density values to positive, and start with the number one as
the minimum value. Zero would be reserved to indicate nothingness, such
as clamping to exclude density used in mask segmentation. An alternate
approach would be to use NaN, but this has several problems including
breaking a lot of software.
When encapsulating my 3D image into HDF5 I could perform the simple
shift, creating metadata indicating the shift. Doing this does not
require supporting a shift filter.
The alternate approach is to keep the density files as-is during
encapsulating, and upon reading the file I dynamically shift the density
values using an HDF5 filter.
The image sizes range from 30GB to 4TB.
My inclination is to go with the first method, for reasons of
computation, and not having to maintain/distribute the filter.
But I am curious whether there is any significant computational
cost to the second method.
Matthew Dougherty
National Center for Macromolecular Imaging
Baylor College of Medicine
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578-4809 Fax.: +1 225 578-5362