On Nov 13, 2012, at 11:51 AM, Mark Miller <[email protected]> wrote:

> Hmm. I thought that the HDF5 library was generally 'smart' about compression
> and would abort a compression if the resulting 'compressed' byte stream
> turned out to be larger than the uncompressed byte stream. For small
> datasets, say less than a few hundred characters, I suppose it's highly
> likely that the 'compressed' result in fact turns out to be a bit larger.
> 
> Those two behaviours could explain why some of your datasets wind up
> compressed and others not.

        Yup.  That's very likely part of what's going on.
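
A rough way to see the small-payload effect Mark describes, outside HDF5
altogether, is to run DEFLATE directly through Python's zlib module at
level 6. This is only a sketch: the two payloads and the deflate_report
helper are made up for illustration, and the code says nothing about what
the HDF5 library itself does when a filter produces larger output.

```python
import zlib

# Hypothetical payloads, chosen only for illustration.
small = bytes(range(32))          # 32 distinct bytes, nothing to exploit
large = bytes(4 * 1024 * 1024)    # 4 MiB of zeros, highly compressible

def deflate_report(raw, level=6):
    """Return (raw size, DEFLATE-compressed size) for a byte string."""
    return len(raw), len(zlib.compress(raw, level))

for name, payload in (("small", small), ("large", large)):
    raw_size, packed_size = deflate_report(payload)
    verdict = "grew" if packed_size > raw_size else "shrank"
    print(f"{name}: {raw_size} -> {packed_size} bytes ({verdict})")
```

The tiny payload comes out larger because the zlib header, checksum and
block framing outweigh any savings; the large, repetitive one shrinks
dramatically.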

                Quincey

> 
> Best I can explain given the short time I've thought about the answer ;)
> 
> Mark
> 
> On Tue, 2012-11-13 at 07:59 -0800, ylegoc wrote:
>> Our instrument control software uses HDF5 files to store neutron
>> acquisition data.
>> As the "data" group grows, compression becomes erratic: sometimes the
>> dataset is compressed, sometimes not. Here is the dump of two files
>> containing the same dataset but with different resulting compression:
>> 
>> Bad file :
>> 
>> HDF5 "000028.nxs" {
>> GROUP "/" {
>>   ATTRIBUTE "HDF5_Version" {
>>      DATATYPE  H5T_STRING {
>>            STRSIZE 5;
>>            STRPAD H5T_STR_NULLTERM;
>>            CSET H5T_CSET_ASCII;
>>            CTYPE H5T_C_S1;
>>         }
>>      DATASPACE  SCALAR
>>   }
>>   GROUP "entry0" {
>>      ATTRIBUTE "NX_class" {
>>         DATATYPE  H5T_STRING {
>>               STRSIZE 7;
>>               STRPAD H5T_STR_NULLTERM;
>>               CSET H5T_CSET_ASCII;
>>               CTYPE H5T_C_S1;
>>            }
>>         DATASPACE  SCALAR
>>      }
>>      GROUP "data" {
>>         ATTRIBUTE "NX_class" {
>>            DATATYPE  H5T_STRING {
>>                  STRSIZE 6;
>>                  STRPAD H5T_STR_NULLTERM;
>>                  CSET H5T_CSET_ASCII;
>>                  CTYPE H5T_C_S1;
>>               }
>>            DATASPACE  SCALAR
>>         }
>>         DATASET "data" {
>>            DATATYPE  H5T_STD_I32LE
>>            DATASPACE  SIMPLE { ( 384, 256, 1024 ) / ( 384, 256, 1024 ) }
>>            STORAGE_LAYOUT {
>>               CHUNKED ( 384, 256, 1024 )
>>               SIZE 402653184 (1.000:1 COMPRESSION)
>>             }
>>            FILTERS {
>>               COMPRESSION DEFLATE { LEVEL 6 }
>>            }
>>            FILLVALUE {
>>               FILL_TIME H5D_FILL_TIME_IFSET
>>               VALUE  0            
>>            }
>>            ALLOCATION_TIME {
>>               H5D_ALLOC_TIME_INCR
>>            }
>>            ATTRIBUTE "signal" {
>>               DATATYPE  H5T_STD_I32LE
>>               DATASPACE  SCALAR
>>            }
>>         }
>>      }
>> 
>> Correct file :
>> 
>> HDF5 "000029.nxs" {
>> GROUP "/" {
>>   ATTRIBUTE "HDF5_Version" {
>>      DATATYPE  H5T_STRING {
>>            STRSIZE 5;
>>            STRPAD H5T_STR_NULLTERM;
>>            CSET H5T_CSET_ASCII;
>>            CTYPE H5T_C_S1;
>>         }
>>      DATASPACE  SCALAR
>>   }
>>   GROUP "entry0" {
>>      ATTRIBUTE "NX_class" {
>>         DATATYPE  H5T_STRING {
>>               STRSIZE 7;
>>               STRPAD H5T_STR_NULLTERM;
>>               CSET H5T_CSET_ASCII;
>>               CTYPE H5T_C_S1;
>>            }
>>         DATASPACE  SCALAR
>>      }
>>      GROUP "data" {
>>         ATTRIBUTE "NX_class" {
>>            DATATYPE  H5T_STRING {
>>                  STRSIZE 6;
>>                  STRPAD H5T_STR_NULLTERM;
>>                  CSET H5T_CSET_ASCII;
>>                  CTYPE H5T_C_S1;
>>               }
>>            DATASPACE  SCALAR
>>         }
>>         DATASET "data" {
>>            DATATYPE  H5T_STD_I32LE
>>            DATASPACE  SIMPLE { ( 384, 256, 1024 ) / ( 384, 256, 1024 ) }
>>            STORAGE_LAYOUT {
>>               CHUNKED ( 384, 256, 1024 )
>>               SIZE 139221680 (2.892:1 COMPRESSION)
>>             }
>>            FILTERS {
>>               COMPRESSION DEFLATE { LEVEL 6 }
>>            }
>>            FILLVALUE {
>>               FILL_TIME H5D_FILL_TIME_IFSET
>>               VALUE  0            
>>            }
>>            ALLOCATION_TIME {
>>               H5D_ALLOC_TIME_INCR
>>            }
>>            ATTRIBUTE "signal" {
>>               DATATYPE  H5T_STD_I32LE
>>               DATASPACE  SCALAR
>>            }
>>         }
>>      }
>> 
>> compression type : NX_COMP_LZW
>> hdf5 version 1.8.3 called by the Nexus library 4.3.0
>> 
>> Is there an explanation for such random behaviour? Any solutions?
>> 
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://hdf-forum.184993.n3.nabble.com/hdf5-compression-problem-tp4025575.html
>> Sent from the hdf-forum mailing list archive at Nabble.com.
>> 
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
> 
> -- 
> Mark C. Miller, Lawrence Livermore National Laboratory
> !!!!!!!!!!!!!!!!!!LLNL BUSINESS ONLY!!!!!!!!!!!!!!!!!!
> [email protected]      urgent: [email protected]
> T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511
> 
> 

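For reference, the sizes in the two h5dump listings quoted above can be
checked with a few lines of Python. The numbers are copied from the dumps;
the variable names and the ratio helper are just for illustration. The
check shows that the "bad" file stores exactly the raw size of the dataset,
i.e. its single chunk appears to have been written without any compression
at all, while the "good" file matches the ratio h5dump reports.

```python
# Figures copied from the two h5dump listings quoted in this thread.
dims = (384, 256, 1024)      # DATASPACE SIMPLE
itemsize = 4                 # H5T_STD_I32LE is a 4-byte integer
stored_bad = 402653184       # SIZE reported for 000028.nxs
stored_good = 139221680      # SIZE reported for 000029.nxs

# Uncompressed size of the dataset in bytes.
raw_bytes = dims[0] * dims[1] * dims[2] * itemsize

def ratio(raw, stored):
    """Compression ratio rounded to three decimals, as h5dump prints it."""
    return round(raw / stored, 3)

print(raw_bytes == stored_bad)          # is the "bad" file stored raw?
print(ratio(raw_bytes, stored_bad))     # ratio for 000028.nxs
print(ratio(raw_bytes, stored_good))    # ratio for 000029.nxs
```

Note also that the chunk shape equals the full dataset shape, so each file
holds the whole 384 MiB array as one chunk that the filter has to process
in a single pass.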
