Hello all,

here at the DKRZ (German Climate Computing Center) we have to store
large volumes of climate data, some of which is stored in HDF5-files.
So, during the last few months we have been doing some research into
climate data compression.

The results are very interesting: we were able to shrink our test data
set to 38.76% of the original file size. That is a compression factor of
more than 2.5, significantly better than all the standard methods we
tested (bzip2 = 54%, gzip = 58%, sldc = 66% and lzma = 46%). Among these
standard algorithms, lzma clearly performed best.
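For anyone who wants a quick feel for the codec gap, here is a small
sketch using Python's standard-library bindings for the same three
general-purpose algorithms. The data is synthetic (smoothly varying
doubles as a stand-in for a climate variable), so the exact percentages
will differ from the numbers above, which were measured on our real
test set:

```python
import bz2
import lzma
import math
import struct
import zlib

# Synthetic stand-in for a climate variable: a smooth floating-point
# field packed as raw doubles. (Assumption: our real measurements were
# taken on actual HDF5 climate files, not on this toy data.)
values = [math.sin(i / 100.0) * 280.0 for i in range(50_000)]
data = struct.pack(f"{len(values)}d", *values)

# Compressed size as a fraction of the original, per codec.
ratios = {
    "gzip (zlib)": len(zlib.compress(data, 9)) / len(data),
    "bzip2": len(bz2.compress(data, 9)) / len(data),
    "lzma": len(lzma.compress(data)) / len(data),
}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {ratio:7.2%} of original size")
```

The same round-trip guarantee holds for all three codecs, which is what
makes lzma a drop-in candidate for an HDF5 filter: it is lossless, only
the size changes.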

Even though we designed our methods with climate data in mind, the
features we exploit for compression are very general and likely to
apply to other scientific data as well. We are therefore confident
that many of you could profit from these methods, and we would be
happy to share our results with the rest of the community.

The filtering mechanism in HDF5 makes it the natural first place for us
to share our algorithms. As a first step, we would be very interested
to see the lzma algorithm integrated as an optional filter; this should
be easy to do and would offer large benefits to all users.

Since we are willing to do the necessary work, we just need to know the
(in)formal requirements for integrating a new filter. And, of course,
we would be very interested to hear about other recent work that
addresses compression in HDF5, and to get in touch with whoever is
working on it.

Best regards,
Nathanael Hübbe

http://wr.informatik.uni-hamburg.de/
http://wr.informatik.uni-hamburg.de/people/nathanael_huebbe


_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org