Hi,

I've figured out what the problem with the GZIP compression was. It turned
out I was using the wrong version of hdf5dll.dll. Replacing this file solved
the issues I was having.

Thanks for your help guys!

Regards,

Bas


From: Bas Schoen <[email protected]>
Date: Sat, Jul 9, 2011 at 10:51 PM
Subject: Re: [Hdf-forum] Fwd: GZIP Compression and h5repack
To: HDF Users Discussion List <[email protected]>


Hi Gerd,

Thank you very much for your reply. I managed to reproduce your sample in C#
with the same results. I'm not really sure what I was doing wrong the whole
time, but at least I've got a working sample now. It even gives the same
result as h5repack, which is great!

I'll try to implement this in the real project and see if it works. Thanks
to both of you (Jonathan & Gerd).

Once I have figured out my mistake, I will post it here so others can
benefit from it.

Regards,

Bas


On Fri, Jul 8, 2011 at 9:00 PM, Gerd Heber <[email protected]> wrote:

>
>
> Bas, how are you? Attached is an IronPython (http://ironpython.codeplex.com/)
> script that you might want to run and recreate in C#. (IronPython uses
> HDF5DotNet.dll the same way you do from C#.)
>
> It creates a compressed, chunked dataset of a compound type (int, float,
> double).
>
> "h5dump -p -H SDScompound.h5" yields:
>
> HDF5 "SDScompound.h5" {
> GROUP "/" {
>    DATASET "ArrayOfStructures" {
>       DATATYPE  H5T_COMPOUND {
>          H5T_STD_I32LE "a_name";
>          H5T_IEEE_F32LE "b_name";
>          H5T_IEEE_F64LE "c_name";
>       }
>       DATASPACE  SIMPLE { ( 1024 ) / ( 1024 ) }
>       STORAGE_LAYOUT {
>          CHUNKED ( 128 )
>          SIZE 4944 (3.314:1 COMPRESSION)
>       }
>       FILTERS {
>          COMPRESSION DEFLATE { LEVEL 9 }
>       }
>       FILLVALUE {
>          FILL_TIME H5D_FILL_TIME_IFSET
>          VALUE  {
>             0,
>             0,
>             0
>          }
>       }
>       ALLOCATION_TIME {
>          H5D_ALLOC_TIME_INCR
>       }
>    }
> }
> }
>
>
>
> Can you reproduce that?
>
>
>
> Best, G.
>
> *From:* [email protected] [mailto:
> [email protected]] *On Behalf Of *Bas Schoen
> *Sent:* Friday, July 08, 2011 11:01 AM
> *To:* HDF Users Discussion List
> *Subject:* Re: [Hdf-forum] Fwd: GZIP Compression and h5repack
>
>
>
> Hi Jonathan,
>
>
>
> I've attached the output from h5dump in 3 files: original, compressed (using
> my code), and repacked.
>
>
>
> I ran h5repack with the following command: "h5repack -f SHUF -f GZIP=9
> <input.h5> <output.h5>".
>
>
>
> Just to make sure: the problems I'm having are not really related to
> h5repack. My implementation of GZIP compression just doesn't compress the
> HDF5 file at all. Or even worse, if I use a small chunk size (say 20), the
> file size increases compared to the original HDF5 file.
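A size increase with tiny chunks is, in fact, expected behavior: each HDF5 chunk is deflated as an independent stream, so every chunk pays its own stream overhead. A pure-Python sketch with zlib (standing in for HDF5's deflate filter; this is not the HDF5DotNet API) illustrates the effect:

```python
import random
import zlib

random.seed(0)

# 32508 random bytes: essentially incompressible, playing the role of
# the rand.Next() int array in the test snippet below.
data = bytes(random.getrandbits(8) for _ in range(32508))

# One deflate stream for the whole dataset (one big chunk).
whole = len(zlib.compress(data, 9))

# One independent deflate stream per 20-byte chunk, mimicking how HDF5
# compresses each chunk separately: every chunk pays its own overhead.
chunked = sum(len(zlib.compress(data[i:i + 20], 9))
              for i in range(0, len(data), 20))

print(len(data), whole, chunked)  # the chunked total exceeds even the raw size
```

With 20-byte chunks the per-stream header and checksum overhead repeats over 1,600 times, and since random values cannot be compressed, each stored chunk comes out larger than its input, so the total grows past the original.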
>
>
>
> I've tried to make things easier and have written a small test function that
> doesn't write a compound datatype, just an int array. But my compression
> still doesn't work. I've got the feeling I'm missing an important step in
> the compression process.
>
>
>
> This is what I tried:
>
>
>
> Random rand = new Random();
> int[] data = new int[32508];
>
> // Fill with some dummy data
> for (int i = 0; i < 32508; i++)
>     data[i] = rand.Next();
>
> // Create DataSpace
> H5DataSpaceId dataSpace = H5S.create_simple(1, new long[] { data.Length });
>
> // Create dataset creation property list
> H5PropertyListId compressProperty =
>     H5P.create(H5P.PropertyListClass.DATASET_CREATE);
> H5P.setShuffle(compressProperty);
> H5P.setDeflate(compressProperty, 9);
> H5P.setChunk(compressProperty, new long[] { data.Length });
>
> // Create DataSet with compression enabled
> H5DataSetId dataSet = H5D.create(fileId, "Test", H5T.H5Type.STD_I32LE,
>     dataSpace, new H5PropertyListId(H5P.Template.DEFAULT), compressProperty,
>     new H5PropertyListId(H5P.Template.DEFAULT)); // this line has been used to
>                                                  // turn compression on
> //H5DataSetId dataSet = H5D.create(fileId, "Test", H5T.H5Type.STD_I32LE,
> //    dataSpace); // this line has been used to turn compression off
>
> // Write data to file
> H5D.write(dataSet, new H5DataTypeId(H5T.H5Type.NATIVE_INT),
>     new H5DataSpaceId(H5S.H5SType.ALL), new H5DataSpaceId(H5S.H5SType.ALL),
>     new H5PropertyListId(H5P.Template.DEFAULT), new H5Array<int>(data));
>
> H5P.close(compressProperty);
> H5D.close(dataSet);
> H5S.close(dataSpace);
>
>
>
> Regards,
>
>
>
> Bas
>
>
>
> On Fri, Jul 8, 2011 at 5:03 PM, Jonathan Kim <[email protected]> wrote:
>
> Hi Bas,
>
> It sounds like h5repack did its job, but your code didn't work as expected
> if the result files are the same size.  Just to mention, h5repack can
> potentially reduce the size further when dealing with the entire file, as it
> refreshes all the objects from previous changes.
>
> According to your reply there are 3 HDF5 files: 1. the original, 2. the
> result from h5repack, 3. the result from your code.
> Could you send us either the outputs from "h5dump -p -H <HDF5 file>" or
> snapshots from HDFView's 'show properties' pop-up window for the 3 files?
>
> Also, could you send me how you ran h5repack?
>
> Regards,
>
> Jonathan
>
> On 7/8/2011 2:44 AM, Bas Schoen wrote:
>
> Hi Jonathan,
>
>
>
> Thanks for your reply.
>
>
>
> 1. The size difference is between the file sizes of two HDF5 files: one
> compressed with my code, and one with h5repack. (There is also an original
> uncompressed file, which is the same size as the one from my compression
> code.)
>
> 2. The chunk size I used is the number of items written to the dataset,
> which in this case was 32509. If I open the two files with the HDF5 viewer,
> this value is shown in the properties of both files.
>
>
>
> When creating the dataset I am not really sure whether to use the
> dataTypeFile or the dataTypeMem. I tried both and the results are the same
> (at least the difference between my code and h5repack stays the same; the
> file size of both files does change, however).
>
>
>
>
>
> Regards,
>
>
>
> Bas
>
>
>
> On Thu, Jul 7, 2011 at 6:15 PM, Jonathan Kim <[email protected]> wrote:
>
> Hi Bas,
>
> I have a couple of questions.
>   1. About the size difference between h5repack and your code, is it the
> size of the HDF5 file or of the dataset?
>   2. About the chunk, what size of chunk was used for h5repack and for your
> code?
>
> Jonathan
>
> On 7/7/2011 10:00 AM, Bas Schoen wrote:
>
> Hi,
>
>
>
> I'm trying to create an HDF5 file with some compound datatypes with GZIP
> compression. The development is done in C# using the HDF5DotNet dll.
>
> I need these compression options: shuffle & gzip=9, and I would like to
> achieve the same compression ratio as h5repack.
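For fixed-width numeric data, the shuffle + deflate combination is typically where h5repack's ratio comes from. A pure-Python sketch (zlib standing in for HDF5's deflate filter; these are not HDF5DotNet calls) shows why byte-shuffling 4-byte integers before compressing improves the ratio:

```python
import struct
import zlib

# 10000 small ints as 4-byte little-endian records, similar to one
# fixed-width member of a compound dataset.
raw = b"".join(struct.pack("<i", v) for v in range(10000))

# The shuffle filter regroups bytes: all byte 0s of every record first,
# then all byte 1s, and so on, making each byte plane highly repetitive.
shuffled = b"".join(raw[i::4] for i in range(4))

plain_size = len(zlib.compress(raw, 9))
shuffled_size = len(zlib.compress(shuffled, 9))

print(plain_size, shuffled_size)  # shuffling first compresses noticeably better
```

Note that this only helps data with byte-level regularity; purely random values will not compress regardless of shuffling.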
>
>
>
> The problem, however, is that the compressed file is the same size as the
> uncompressed file. If I use h5repack on that file, the result is 10 times
> smaller. Can someone see what I am doing wrong?
>
>
>
> Part of my implementation:
>
>
>
>
>
> // We want to write a compound datatype, which is a struct containing
> // an int and some bytes
>
> DataStruct[] data = new DataStruct[] { ... }; // data has been filled
>
>
>
> // Create the compound datatype for memory
> H5DataTypeId dataTypeMem = H5T.create(H5T.CreateClass.COMPOUND,
>     (int)Marshal.SizeOf(default(DataStruct)));
> H5T.insert(dataTypeMem, "A", (int)Marshal.OffsetOf(typeof(DataStruct), "A"),
>     H5T.H5Type.NATIVE_INT);
> H5T.insert(dataTypeMem, "B", (int)Marshal.OffsetOf(typeof(DataStruct), "B"),
>     H5T.H5Type.NATIVE_UCHAR);
> H5T.insert(dataTypeMem, "C", (int)Marshal.OffsetOf(typeof(DataStruct), "C"),
>     H5T.H5Type.NATIVE_UCHAR);
> H5T.insert(dataTypeMem, "D", (int)Marshal.OffsetOf(typeof(DataStruct), "D"),
>     H5T.H5Type.NATIVE_UCHAR);
> H5T.insert(dataTypeMem, "E", (int)Marshal.OffsetOf(typeof(DataStruct), "E"),
>     H5T.H5Type.NATIVE_UCHAR);
>
> // Create the compound datatype for the file. Because the standard
> // types we are using for the file may have different sizes than
> // the corresponding native types, we must manually calculate the
> // offset of each member.
> int offset = 0;
> H5DataTypeId dataTypeFile = H5T.create(H5T.CreateClass.COMPOUND, (int)(4 +
> 1 + 1 + 1 + 1));
> H5T.insert(dataTypeFile, "A", offset, H5T.H5Type.STD_U32BE);
> offset += 4;
> H5T.insert(dataTypeFile, "B", offset, H5T.H5Type.STD_U8BE);
> offset += 1;
> H5T.insert(dataTypeFile, "C", offset, H5T.H5Type.STD_U8BE);
> offset += 1;
> H5T.insert(dataTypeFile, "D", offset, H5T.H5Type.STD_U8BE);
> offset += 1;
> H5T.insert(dataTypeFile, "E", offset, H5T.H5Type.STD_U8BE);
> offset += 1;
>
> long[] dims = { (long)data.Count() };
>
> try
> {
>     // Create dataspace, with maximum = current
>     H5DataSpaceId dataSpace = H5S.create_simple(1, dims);
>
>     // Create compression properties
>     long[] chunk = dims; // What value should be used as the chunk?
>     H5PropertyListId compressProperty =
>         H5P.create(H5P.PropertyListClass.DATASET_CREATE);
>     H5P.setShuffle(compressProperty);
>     H5P.setDeflate(compressProperty, 9);
>     H5P.setChunk(compressProperty, chunk);
>
>     // Create the dataset
>     H5DataSetId dataSet = H5D.create(fileId, "NAME", dataTypeFile, dataSpace,
>         new H5PropertyListId(H5P.Template.DEFAULT), compressProperty,
>         new H5PropertyListId(H5P.Template.DEFAULT));
>
>     // Write data to it
>     H5D.write(dataSet, dataTypeMem, new H5DataSpaceId(H5S.H5SType.ALL),
>         new H5DataSpaceId(H5S.H5SType.ALL),
>         new H5PropertyListId(H5P.Template.DEFAULT),
>         new H5Array<DataStruct>(data));
>
>     // Cleanup
>     H5T.close(dataTypeMem);
>     H5T.close(dataTypeFile);
>     H5D.close(dataSet);
>     H5P.close(compressProperty);
>     H5S.close(dataSpace);
> }
> catch
> {
>     ...
> }
>
>
>
>
>
> In short, the steps are: creating datatypes for both file and memory,
> creating the dataspace, creating the dataset with shuffle and compression
> creation properties, and finally writing the data to the file.
>
> It might be a bit difficult to check this code, but are there any steps
> missing or incorrect?
>
>
>
> Help appreciated.
>
>
>
> Best regards,
>
>
> Bas Schoen
>
>
>
> _______________________________________________
>
> Hdf-forum is for HDF software users discussion.
>
> [email protected]
>
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org