Hi Malcolm,

That doesn't sound good ;-). Would it be possible to submit a program that demonstrates the issue to [email protected] so we can take a look?
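Something along the following lines would be a good starting point. It is only a rough sketch of the access pattern you describe, assuming the 1.8 C API; the file name, dataset count, and dataset size are placeholders, so please adjust them to match your setup:

#include "hdf5.h"
#include <stdio.h>

#define FILENAME "many_dsets.h5"   /* placeholder name */
#define NDSETS   200000            /* enough to reach the slow region */

int main(void)
{
    hid_t   fapl, file, space, dset;
    hsize_t dims[1] = {16};
    int     buf[16] = {0};
    char    name[32];
    int     i;

    /* Create the file with the latest file format, as discussed below */
    fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

    file  = H5Fcreate(FILENAME, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    space = H5Screate_simple(1, dims, NULL);

    /* One group (here the root group) holding many small datasets */
    for (i = 0; i < NDSETS; i++) {
        sprintf(name, "dset%07d", i);
        dset = H5Dcreate2(file, name, H5T_NATIVE_INT, space,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
        H5Dclose(dset);
    }
    H5Sclose(space);
    H5Fclose(file);

    /* Re-open and read every dataset back, one at a time */
    file = H5Fopen(FILENAME, H5F_ACC_RDONLY, fapl);
    for (i = 0; i < NDSETS; i++) {
        sprintf(name, "dset%07d", i);
        dset = H5Dopen2(file, name, H5P_DEFAULT);
        H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
        H5Dclose(dset);                 /* each dataset is closed here */
        if (i % 10000 == 0)
            printf("read %d datasets so far\n", i);
    }
    H5Fclose(file);
    H5Pclose(fapl);

    return 0;
}

One thing worth double-checking against your real code: the sketch closes each dataset right after reading it. If identifiers are left open, they accumulate inside the library and could contribute to the slowdown you are seeing.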
Thank you!
Elena
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal  The HDF Group  http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Sep 8, 2012, at 3:31 AM, Malcolm MacLeod wrote:

> Hello Elena,
>
> Sorry, I should have mentioned that. I am already setting H5F_LIBVER_LATEST
> and have recreated the file (which is what gave the slight speed boost I
> mentioned originally when upgrading), but the same issue is unfortunately
> still present.
>
> - Malcolm
>
>> Malcolm,
>>
>> Please try using the latest file format when you create a file. It should
>> be more efficient at handling groups with a large number of objects.
>>
>> See the H5Pset_libver_bounds function
>> (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds);
>> use H5F_LIBVER_LATEST for the last two parameters.
>>
>> You may also repack an existing file with h5repack using the -L flag.
>>
>> Elena
>>
>> On Sep 5, 2012, at 4:25 AM, Malcolm MacLeod wrote:
>>> Hello,
>>>
>>> Our software has made use of the HDF5 library for a long time without
>>> any issues. Recently we have started to run into datasets far larger
>>> than what we previously used, and some scalability issues appear to be
>>> showing.
>>>
>>> The HDF5 file in question contains a single group with many datasets. A
>>> specific piece of code opens every dataset one at a time and reads from
>>> it via H5Dread.
>>>
>>> Previously it was rare to have more than ~90,000 datasets here, so this
>>> was never noticed. But after H5Dread has been called about ~60,000
>>> times, subsequent calls become increasingly slow; by about ~80,000 calls
>>> it slows to a crawl (instead of processing thousands per second, it
>>> processes only two or three per second).
>>>
>>> I have tried upgrading from 1.8.8 to 1.8.9, and this seems to have
>>> helped slightly: it now becomes unbearable at around ~100,000 calls
>>> instead of ~80,000.
>>>
>>> Some observations:
>>> 1) This does not appear to be due to a seek delay (or larger datasets in
>>> the middle) or anything like that. I have tried, e.g., starting at the
>>> back of a group of ~500,000 datasets instead of the front, and the same
>>> thing happens. I have also tried starting at various spots towards the
>>> middle, and the same behaviour can be observed.
>>> 2) If I cancel the loop, allow the software to idle for a while, and
>>> then give it another go, the same thing happens (it is fast again until
>>> a certain quantity of reads). So it appears that HDF5 may be doing
>>> something in the background once it is not busy that allows reads to be
>>> fast again?
>>>
>>> I would greatly appreciate any thoughts on this or ideas as to what
>>> might be going on.
>>>
>>> Regards,
>>> Malcolm MacLeod
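P.S. Regarding the h5repack option mentioned above, in case it is useful to others: rewriting an existing file with the latest file format is a one-line operation, roughly (file names are placeholders):

h5repack -L olddata.h5 newdata.h5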
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
