Hello Elena,

Sorry, I should have mentioned that: I am already setting H5F_LIBVER_LATEST and have recreated the file (which is what gave the slight speed boost I mentioned originally when upgrading), but the same issue is unfortunately still present.
- Malcolm

> Malcolm,
>
> Please try to use the latest file format when you create a file. It should
> be more efficient in handling groups with a big number of objects.
>
> See the H5Pset_libver_bounds function
> (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds);
> use H5F_LIBVER_LATEST for the last two parameters.
>
> You may repack an existing file with h5repack using the -L flag.
>
> Elena
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Elena Pourmal  The HDF Group  http://hdfgroup.org
> 1800 So. Oak St., Suite 203, Champaign IL 61820
> 217.531.6112
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> On Sep 5, 2012, at 4:25 AM, Malcolm MacLeod wrote:
>
> > Hello,
> >
> > Our software has for a long time made use of the HDF5 library without any
> > issues. Recently we have started to run into datasets far larger than what
> > was previously used, and some scalability issues appear to be showing.
> >
> > The HDF5 file in question contains a single group with many datasets. A
> > specific piece of code opens every dataset one at a time and reads from it
> > via H5Dread.
> >
> > Previously it was rare to have more than ~90000 datasets here, so this was
> > never noticed. But after H5Dread has been called about ~60000 times,
> > subsequent calls start to become increasingly slow; by about ~80000 calls
> > it slows to a crawl (instead of processing thousands per second, it
> > processes only two or three per second).
> >
> > I have tried upgrading from 1.8.8 to 1.8.9 and this seems to have helped
> > slightly; it now becomes unbearable at around ~100000 calls instead of
> > ~80000.
> >
> > Some observations:
> > 1) This does not appear to be due to a seek delay (or larger datasets in
> > the middle) or anything like that. I have tried e.g. starting at the back
> > of a group of ~500000 datasets instead of the front, and the same thing
> > happens. I have also tried starting in various spots towards the middle,
> > and the same behaviour can be observed.
> > 2) If I cancel the loop, allow the software to idle for a while, and then
> > give it another go, the same thing happens (it is fast again until a
> > certain quantity of reads). So it appears that HDF5 may be doing
> > something in the background once it is not busy that allows reads to be
> > fast again?
> >
> > I would greatly appreciate any thoughts on this or ideas as to what might
> > be going on.
> >
> > Regards,
> > Malcolm MacLeod
> >
> > _______________________________________________
> > Hdf-forum is for HDF software users discussion.
> > [email protected]
> > http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
