Hmm. I thought the HDF5 library was generally 'smart' about compressing
and would abort a compression if the resulting 'compressed' byte stream
turned out to be larger than the UNcompressed byte stream. For small
datasets, say less than a few hundred characters, I suspect it's highly
likely that 'compression' in fact turns out to be a bit larger.
Those two behaviours could explain why some of your datasets wind up
compressed and others not.
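
The small-payload effect is easy to demonstrate with Python's zlib, which implements the same DEFLATE algorithm as the gzip filter shown in your dumps (a standalone sketch of the general principle, not using HDF5 itself; the payload contents are made up for illustration):

```python
import zlib

# A short payload: DEFLATE framing overhead (2-byte zlib header,
# 4-byte Adler-32 checksum, block headers) exceeds any savings,
# so the "compressed" output is larger than the input.
small = b"detector_bank_07"
small_z = zlib.compress(small, 6)   # level 6, as in the h5dump FILTERS output
print(len(small), "->", len(small_z))

# A large, repetitive payload (like a mostly-zero detector array)
# compresses dramatically.
big = bytes(4 * 1024 * 1024)        # 4 MiB of zeros
big_z = zlib.compress(big, 6)
print(len(big), "->", len(big_z))
```

Whether a given chunk ends up stored compressed then depends on whether the filter output actually shrank, which can vary with the data in each file.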
Best I can explain given the short time I've thought about the answer ;)
Mark
On Tue, 2012-11-13 at 07:59 -0800, ylegoc wrote:
> Our instrument control software uses HDF5 files to store neutron
> acquisition data.
> As the size of the "data" group grows, compression is applied seemingly
> at random: sometimes the dataset is compressed, sometimes not. Here is the
> dump of two files containing the same dataset but with different resulting
> compression:
>
> Bad file :
>
> HDF5 "000028.nxs" {
> GROUP "/" {
> ATTRIBUTE "HDF5_Version" {
> DATATYPE H5T_STRING {
> STRSIZE 5;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> GROUP "entry0" {
> ATTRIBUTE "NX_class" {
> DATATYPE H5T_STRING {
> STRSIZE 7;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> GROUP "data" {
> ATTRIBUTE "NX_class" {
> DATATYPE H5T_STRING {
> STRSIZE 6;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> DATASET "data" {
> DATATYPE H5T_STD_I32LE
> DATASPACE SIMPLE { ( 384, 256, 1024 ) / ( 384, 256, 1024 ) }
> STORAGE_LAYOUT {
> CHUNKED ( 384, 256, 1024 )
> SIZE 402653184 (1.000:1 COMPRESSION)
> }
> FILTERS {
> COMPRESSION DEFLATE { LEVEL 6 }
> }
> FILLVALUE {
> FILL_TIME H5D_FILL_TIME_IFSET
> VALUE 0
> }
> ALLOCATION_TIME {
> H5D_ALLOC_TIME_INCR
> }
> ATTRIBUTE "signal" {
> DATATYPE H5T_STD_I32LE
> DATASPACE SCALAR
> }
> }
> }
>
> Correct file :
>
> HDF5 "000029.nxs" {
> GROUP "/" {
> ATTRIBUTE "HDF5_Version" {
> DATATYPE H5T_STRING {
> STRSIZE 5;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> GROUP "entry0" {
> ATTRIBUTE "NX_class" {
> DATATYPE H5T_STRING {
> STRSIZE 7;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> GROUP "data" {
> ATTRIBUTE "NX_class" {
> DATATYPE H5T_STRING {
> STRSIZE 6;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> DATASPACE SCALAR
> }
> DATASET "data" {
> DATATYPE H5T_STD_I32LE
> DATASPACE SIMPLE { ( 384, 256, 1024 ) / ( 384, 256, 1024 ) }
> STORAGE_LAYOUT {
> CHUNKED ( 384, 256, 1024 )
> SIZE 139221680 (2.892:1 COMPRESSION)
> }
> FILTERS {
> COMPRESSION DEFLATE { LEVEL 6 }
> }
> FILLVALUE {
> FILL_TIME H5D_FILL_TIME_IFSET
> VALUE 0
> }
> ALLOCATION_TIME {
> H5D_ALLOC_TIME_INCR
> }
> ATTRIBUTE "signal" {
> DATATYPE H5T_STD_I32LE
> DATASPACE SCALAR
> }
> }
> }
>
> compression type : NX_COMP_LZW
> HDF5 version 1.8.3 called by the NeXus library 4.3.0
>
> Is there an explanation for such random behaviour? Any solutions?
>
> --
> View this message in context:
> http://hdf-forum.184993.n3.nabble.com/hdf5-compression-problem-tp4025575.html
> Sent from the hdf-forum mailing list archive at Nabble.com.
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
--
Mark C. Miller, Lawrence Livermore National Laboratory
!!!!!!!!!!!!!!!!!!LLNL BUSINESS ONLY!!!!!!!!!!!!!!!!!!
[email protected] urgent: [email protected]
T:8-6 (925)-423-5901 M/W/Th:7-12,2-7 (530)-753-8511