Hi everyone, I am using HDF5 via h5py to store simulation data. The data are hierarchical, so I store them in a nested tree of HDF5 Groups. Each Group has about three small Datasets, three Attributes, and fewer than ten child Groups.
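For reference, here is a minimal sketch of how such a tree might be written with h5py; the names, shapes, fanout, and depth below are placeholders, not my actual simulation code:

```python
import numpy as np
import h5py

def write_node(group, depth, fanout=3, max_depth=3):
    # About three small datasets per group; gzip filter enabled on each.
    for name in ("t", "x", "v"):
        group.create_dataset(name, data=np.random.rand(4),
                             compression="gzip")
    # About three attributes per group.
    group.attrs["id"] = group.name
    group.attrs["depth"] = depth
    group.attrs["n_children"] = 0 if depth == max_depth else fanout
    # Fewer than ten child groups, recursing down the tree.
    if depth < max_depth:
        for i in range(fanout):
            write_node(group.create_group("child%d" % i), depth + 1)

with h5py.File("br_example.h5", "w") as f:
    write_node(f, 0)
```

Even at this toy size the per-object overhead dominates, since each dataset holds only a few values.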
My problem is that writing is rather slow and the files are large. They also seem highly redundant: compressing the whole file with gzip gives almost a 20x compression ratio, while turning on gzip compression for the individual datasets has almost no effect on file size. I also tried the new compact/indexed Group storage format, which reduces the file size only slightly.

Am I doing something wrong in the layout of the file? The actual data hierarchy cannot be changed, but maybe I can arrange the data differently? Here is a link to an example file if anyone would like to have a look: http://dl.dropbox.com/u/5077634/br_0.h5.tar.gz (760k compressed, 3500 Groups, 7000 Datasets).

Thanks for any hints!

--
View this message in context: http://hdf-forum.184993.n3.nabble.com/file-h5-tar-gz-is-20-times-smaller-what-s-wrong-tp2509949p2509949.html
Sent from the hdf-forum mailing list archive at Nabble.com.
