Malcolm,

Please try using the latest file format when you create the file. It should 
handle groups with a large number of objects more efficiently.

See the H5Pset_libver_bounds function 
(http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds); use 
H5F_LIBVER_LATEST for the last two parameters.

You may repack an existing file with h5repack using the -L flag.

Elena

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal  The HDF Group  http://hdfgroup.org   
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



On Sep 5, 2012, at 4:25 AM, Malcolm MacLeod wrote:

> Hello,
> 
> Our software has for a long time made use of the HDF5 library without any 
> issues. Recently we have started to run into datasets far larger than what 
> was previously used and some scalability issues appear to be showing.
> 
> The HDF5 file in question contains a single group with many datasets. A 
> specific piece of code opens every dataset one at a time and reads from it 
> via H5Dread.
> 
> Previously it was rare to have more than ~90000 datasets here so this was 
> never noticed, but after H5Dread has been called ~60000 times 
> subsequent calls become increasingly slow; by ~80000 
> calls it slows to a crawl (instead of processing 1000s a second it is 
> processing only two or three per second).
> 
> I have tried upgrading from 1.8.8 -> 1.8.9 and this seems to have helped 
> slightly: it now becomes unbearable at ~100000 instead of ~80000 calls.
> 
> 
> Some observations:
> 1) This does not appear to be due to a seek delay, larger datasets in the 
> middle, or anything like that. I have tried e.g. starting at the back of a 
> group of ~500000 datasets instead of the front and the same thing happens. I 
> have also tried starting at various spots towards the middle, and the same 
> behaviour can be observed.
> 2) If I cancel the loop, allow the software to idle for a while and then give 
> it another go the same thing happens (it is fast again until a certain 
> quantity of reads) - so it appears that HDF5 may be doing something in the 
> background once it is not busy that allows reads to be fast again?
> 
> 
> I would greatly appreciate any thoughts on this or ideas as to what might be 
> going on.
> 
> Regards,
> Malcolm MacLeod
> 
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

