Hi Nils,

On Feb 17, 2011, at 10:45 AM, nls wrote:

> Hi everyone,
>
> thanks for the helpful comments.
>
> I did check that
> 1. compression is actually on
> 2. I am using the new 1.8 group format
>    (this actually required me to write my first nontrivial Cython wrapper,
>    since h5py does not provide access to LIBVER_LATEST)
>
> Following the helpful advice on the chunk-index overhead, I tried to use
> contiguous storage. Unfortunately, I again ran into an unsupported feature:
> h5py only supports resizable datasets when they are chunked, even when
> using the "low-level" functions that wrap the HDF5 C API:
>
>   "h5py._stub.NotImplementedError: Extendible contiguous non-external
>    dataset (Dataset: Feature is unsupported)"
>
> Since I do need resizing, I guess I am stuck with chunked datasets for now.
> I tried different chunk sizes, but that did not make a noticeable
> difference.
> In conclusion, I see no way to get less than about 15x file-size overhead
> when using HDF5 with h5py for my data...

Hmmm, that doesn't make sense; I think we can do better. Would it be
possible for you to write a C program that does the same thing as your
Python script? Can you send us the output of "h5dump -H -p" on your file?
Also, could you please run h5stat on the file and post that output too?

Thank you!

Elena

> cheers, Nils
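P.S. For reference, here is a minimal sketch of the pattern you describe on
the Python side: a resizable (hence chunked), gzip-compressed dataset written
with the 1.8 ("latest") file format. Note that the libver keyword assumes a
newer h5py than the one you mention (which is why you needed the Cython
wrapper); the dataset name, dtype, chunk size, and block sizes are
illustrative only.

    import h5py
    import numpy as np

    # "latest" requests the HDF5 1.8 file format (compact group storage).
    with h5py.File("data.h5", "w", libver="latest") as f:
        dset = f.create_dataset(
            "samples",
            shape=(0,),           # start empty
            maxshape=(None,),     # resizable, so the layout must be chunked
            chunks=(65536,),      # one chunk = 512 KiB of float64
            dtype="f8",
            compression="gzip",
            compression_opts=4,
        )
        for block in (np.random.rand(100000) for _ in range(3)):
            n = dset.shape[0]
            dset.resize((n + block.size,))  # grow first, then write the tail
            dset[n:] = block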

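P.P.S. While you are gathering the h5stat output, a quick way to see which
datasets carry the overhead is to compare allocated storage against logical
size from within h5py itself; get_storage_size() is part of h5py's low-level
dataset API. The file name here is a placeholder.

    import h5py

    def report_overhead(path):
        """Print allocated vs. logical size for every dataset in the file."""
        with h5py.File(path, "r") as f:
            def visit(name, obj):
                if isinstance(obj, h5py.Dataset):
                    logical = obj.size * obj.dtype.itemsize  # uncompressed bytes
                    stored = obj.id.get_storage_size()       # bytes on disk
                    ratio = stored / logical if logical else float("nan")
                    print(name, "logical:", logical, "stored:", stored,
                          "ratio: %.2f" % ratio)
            f.visititems(visit)

    report_overhead("data.h5")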