Hello Elena,

Sorry, I should have mentioned that: I am already setting H5F_LIBVER_LATEST and have recreated the file (which is what gave the slight speed boost I mentioned originally when upgrading), but the same issue is unfortunately still present.
- Malcolm

> Malcolm,
>
> Please try to use the latest file format when you create a file. It should
> be more efficient in handling groups with a big number of objects.
>
> See the H5Pset_libver_bounds function
> (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds);
> use H5F_LIBVER_LATEST for the last two parameters.
>
> You may repack an existing file with h5repack using the -L flag.
>
> Elena
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Elena Pourmal  The HDF Group  http://hdfgroup.org
> 1800 So. Oak St., Suite 203, Champaign IL 61820
> 217.531.6112
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> On Sep 5, 2012, at 4:25 AM, Malcolm MacLeod wrote:
>
> > Hello,
> >
> > Our software has for a long time made use of the HDF5 library without any
> > issues. Recently we have started to run into datasets far larger than what
> > was previously used, and some scalability issues appear to be showing.
> >
> > The HDF5 file in question contains a single group with many datasets. A
> > specific piece of code opens every dataset one at a time and reads from it
> > via H5Dread.
> >
> > Previously it was rare to have more than ~90000 datasets here, so this was
> > never noticed. But after H5Dread has been called about ~60000 times,
> > subsequent calls start to become increasingly slow; by about ~80000 calls
> > it slows to a crawl (instead of processing thousands per second, it
> > processes only two or three per second).
> >
> > I have tried upgrading from 1.8.8 to 1.8.9 and this seems to have helped
> > slightly; it now becomes unbearable at around ~100000 calls instead of
> > ~80000.
> >
> > Some observations:
> > 1) This does not appear to be due to a seek delay (or larger datasets in
> > the middle) or anything like that. I have tried e.g. starting at the back
> > of a group of ~500000 datasets instead of the front, and the same thing
> > happens. I have also tried starting in various spots towards the middle,
> > and the same behaviour can be observed.
> > 2) If I cancel the loop, allow the software to idle for a while, and then
> > give it another go, the same thing happens (it is fast again until a
> > certain quantity of reads). So it appears that HDF5 may be doing
> > something in the background once it is not busy that allows reads to be
> > fast again?
> >
> > I would greatly appreciate any thoughts on this or ideas as to what might
> > be going on.
> >
> > Regards,
> > Malcolm MacLeod
> >
> > _______________________________________________
> > Hdf-forum is for HDF software users discussion.
> > [email protected]
> > http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
