Hello,

Our software has made use of the HDF5 library for a long time without any issues. Recently we have started to run into datasets far larger in number than what was previously used, and some scalability issues appear to be showing.
The HDF5 file in question contains a single group with many datasets. A specific piece of code opens every dataset one at a time and reads from it via H5Dread. Previously it was rare to have more than ~90000 datasets here, so this was never noticed, but after H5Dread has been called about ~60000 times, subsequent calls start to become increasingly slow; by about ~80000 calls it slows to a crawl (instead of processing thousands per second it processes only two or three per second).

I have tried upgrading from 1.8.8 to 1.8.9 and this seems to have helped slightly: it now becomes unbearable at around ~100000 calls instead of ~80000.

Some observations:

1) This does not appear to be due to a seek delay, or larger datasets in the middle, or anything like that. I have tried e.g. starting at the back of a group of ~500000 datasets instead of the front, and the same thing happens. I have also tried starting in various spots towards the middle, and the same behaviour can be observed.

2) If I cancel the loop, allow the software to idle for a while, and then give it another go, the same thing happens (it is fast again until a certain quantity of reads). So it appears that HDF5 may be doing something in the background once it is not busy that allows reads to be fast again?

I would greatly appreciate any thoughts on this, or ideas as to what might be going on.

Regards,
Malcolm MacLeod
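P.S. For reference, a minimal sketch of the access pattern, using the HDF5 1.8 C API. The file, group, and dataset names, the loop bound, the naming scheme, and the single-double element type are illustrative assumptions, not our actual code:

    #include <hdf5.h>
    #include <stdio.h>

    int main(void)
    {
        /* Open the file and the single group containing the datasets
         * ("big.h5" and "/data" are hypothetical names). */
        hid_t file  = H5Fopen("big.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
        hid_t group = H5Gopen2(file, "/data", H5P_DEFAULT);

        char   name[32];
        double value; /* assume each dataset holds a single double */

        for (int i = 0; i < 500000; i++) {
            /* Hypothetical sequential naming scheme. */
            snprintf(name, sizeof(name), "dset%06d", i);

            hid_t dset = H5Dopen2(group, name, H5P_DEFAULT);
            if (dset < 0)
                continue;

            /* One H5Dread per dataset; the slowdown appears after
             * roughly 60000-100000 of these calls. */
            H5Dread(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                    H5P_DEFAULT, &value);

            /* Each dataset handle is closed before opening the next. */
            H5Dclose(dset);
        }

        H5Gclose(group);
        H5Fclose(file);
        return 0;
    }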