Lately I am using the buffered write (backing store) feature of HDF5 to write multiple time levels to HDF5 files. Each time level is its own group (named with a zero-padded character string), and the 3D floating point variables are members of each group.
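In case the details matter, here is a stripped-down sketch of what my write path looks like. Function names, variable names, and sizes are placeholders, and I'm assuming here that "buffered write" means the core (in-memory) file driver with backing_store enabled, so the image is flushed to disk on close:

#include <stdio.h>
#include <hdf5.h>

int write_time_level(const char *fname, int level,
                     const float *data, hsize_t nx, hsize_t ny, hsize_t nz)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    /* core driver: grow the in-memory image in 64 MiB increments;
     * backing_store = 1 writes the image to fname when the file closes */
    H5Pset_fapl_core(fapl, (size_t)64 * 1024 * 1024, 1);

    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* one group per time level, zero-padded name, e.g. "000120" */
    char gname[16];
    snprintf(gname, sizeof gname, "%06d", level);
    hid_t grp = H5Gcreate2(file, gname, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* one of the 3D float variables that live in each group */
    hsize_t dims[3] = { nx, ny, nz };
    hid_t space = H5Screate_simple(3, dims, NULL);
    hid_t dset  = H5Dcreate2(grp, "u", H5T_NATIVE_FLOAT, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset); H5Sclose(space); H5Gclose(grp);
    H5Fclose(file); H5Pclose(fapl);
    return 0;
}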
My concern, perhaps unfounded, is that the very small bits of what I call metadata (integers, lists of which variables are in the file, and other small items that describe the 3D data and are necessary for my reader code) will be placed after the huge 3D data, so that accessing them requires long seeks through the 3D arrays.

The only reason I am worried about this is that an h5dump of one of my small metadata datasets took more than 10 seconds to produce output on one of my files. I got the impression that h5dump was having to make its way through the 3D arrays before reaching the metadata. However, my C code seems to access the metadata quickly, so perhaps it is just an h5dump issue.

So I guess my question is: should I not worry about the order in which data is written to an HDF5 file, and assume the layout is intelligent enough that small structures/arrays/integers are accessible quickly? If not, how do I force the small stuff to sit near the beginning of the file so it is quickly accessible?

I will be looking at thousands of files, each tens of GB in size and possibly holding dozens of groups (each of which has dozens of 3D floating point arrays), so I am looking for every way to squeeze out the fastest I/O I can.
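For what it's worth, the two knobs I've been eyeing are a larger metadata block size on the file access list (so HDF5 aggregates its internal metadata into fewer, larger contiguous blocks) and compact dataset layout for the small stuff, which, as I understand it, stores raw data under 64 KB inside the object header itself rather than in a separate data block. A sketch of what I mean (names and sizes are placeholders; I don't know whether either actually addresses my concern):

#include <hdf5.h>

hid_t create_with_big_meta_blocks(const char *fname)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    /* aggregate internal metadata into 1 MiB contiguous blocks */
    H5Pset_meta_block_size(fapl, (hsize_t)1024 * 1024);
    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);
    return file;
}

void write_small_metadata(hid_t loc, const char *name,
                          const int *vals, hsize_t n)
{
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    /* compact layout: raw data (< 64 KB) is stored in the object
     * header, so it travels with the metadata */
    H5Pset_layout(dcpl, H5D_COMPACT);

    hid_t space = H5Screate_simple(1, &n, NULL);
    hid_t dset  = H5Dcreate2(loc, name, H5T_NATIVE_INT, space,
                             H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, vals);

    H5Dclose(dset); H5Sclose(space); H5Pclose(dcpl);
}

I'm also aware that attributes are stored in the object header, so maybe they are the more idiomatic home for this sort of thing. Any advice appreciated.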
Leigh

--
Leigh Orf
Associate Professor of Atmospheric Science
Department of Earth and Atmospheric Science
Central Michigan University
