On a related note, I've just found this piece of information which might accelerate our program as well:
---
http://www.hdfgroup.org/HDF5/doc/ADGuide/CompatFormat180.html

H5Pset_libver_bounds( hid_t fapl_id, H5F_libver_t low, H5F_libver_t high )
H5Pget_libver_bounds( hid_t fapl_id, H5F_libver_t* low, H5F_libver_t* high )

Compact-or-indexed groups enable much-compressed link storage for groups
with very few members and improved efficiency and performance for groups
with very large numbers of members. The efficiency and performance impacts
are most noticeable at the extremes: all unnecessary overhead is eliminated
for groups with zero members; groups with tens of thousands of members may
see as much as a 100-fold performance gain.

Default behavior: If H5Pset_libver_bounds is not called with low equal to
H5F_LIBVER_LATEST, then the HDF5 Library provides the greatest possible
format compatibility. It does this by creating objects with the earliest
possible format that will handle the data being stored and accommodate the
action being taken.
---

Although the 30GB file I was talking about was written with HDF5 1.8.4, if I
understand correctly it will not make use of these new features, because the
library tries to maintain backward compatibility with 1.6. Correct?

Is there a tool available that converts an existing file to the new file
format so it can take advantage of all these performance improvements? Or
should I hack this into h5repack.c myself? I'd like to avoid that...

Cheers,
Thorben

On Wednesday 03 March 2010 11:47:18 Thorben Kröger wrote:
> Hello,
> We have a ~30GB HDF5 file with something like 100 million small datasets
> in it. We need to iterate through all of them, and doing so is very slow
> because each one has to be loaded from disk. I also don't know whether it
> is possible to work out a sensible ordering to traverse them, so I suspect
> a lot of disk seeks may be necessary as well.
>
> Maybe it isn't such a good idea to have so many small objects in the file,
> but I'm stuck with this format now. What options do I have?
>
> I'm now working on a machine with 128GB of RAM, so my file would fit
> comfortably inside. Is it possible to load the file completely into memory
> to avoid all of the above problems?
>
> Thanks,
> Thorben
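
For illustration, here is a minimal sketch of how the version bounds quoted
above would be applied when creating a file, so that new objects are written
in the 1.8 format; the file name and the lack of error checking are just
placeholders:

#include "hdf5.h"

int main(void)
{
    /* Ask the library for the newest object formats (e.g. compact/indexed
       group storage) instead of staying 1.6-compatible by default. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

    /* A file created with this FAPL writes its objects in the latest format. */
    hid_t file = H5Fcreate("new_format.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... create groups and datasets here ... */

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}

As for converting an existing file, h5repack in the 1.8 series should already
accept a -L / --latest option that rewrites a file using the newest format,
which may avoid having to patch h5repack.c at all; worth confirming with
h5repack --help on your installation.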
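
On the question in the quoted message about holding the whole file in RAM:
one option is the HDF5 core (memory) file driver, which reads an existing
file entirely into memory when it is opened, so later dataset reads avoid
disk seeks. A minimal sketch, assuming read-only access, with the file name
and the increment size as placeholders:

#include "hdf5.h"

int main(void)
{
    /* Core (in-memory) file driver: the increment is how much extra memory
       is allocated whenever the in-memory image must grow; backing_store = 0
       means nothing is written back to the file on disk when it is closed. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_core(fapl, (size_t)64 * 1024 * 1024, 0);

    /* Opening an existing file with this driver reads the whole file into
       memory; subsequent dataset reads are then served from RAM. */
    hid_t file = H5Fopen("data.h5", H5F_ACC_RDONLY, fapl);

    /* ... iterate over the many small datasets here ... */

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}

A ~30GB file image should fit comfortably in 128GB of RAM, though opening it
this way still pays the one-time cost of reading the whole file from disk.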
