Hello everyone, I am new to HDF and am trying to understand whether it might be a suitable file format for my application. The data I'm interested in storing is usually written by the collecting instrument to basic binary files of concatenated packets (think C structures). Each packet contains a header with a time stamp, packet format, packet identifier, and packet size, followed by the data itself (arrays) and associated metadata. There are tens of packet types, which may arrive in any order and are usually written to the file sequentially. Each packet contains 10 to 100 fields, some of which may be arrays of data of various sizes.
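To make the packet layout concrete, here is a minimal Python sketch of such a stream, together with a scan that reads only the headers and seeks past each payload. The header layout (field order, widths, byte order) and the function names are made up for illustration; they are not any manufacturer's actual format.

```python
import io
import struct

# Hypothetical header layout (an assumption, not a real instrument format):
# little-endian float64 time stamp, uint16 packet format, uint16 packet
# identifier, uint32 payload size in bytes.
HEADER = struct.Struct("<dHHI")


def write_packet(stream, timestamp, fmt, packet_id, payload):
    """Append one packet: a fixed-size header followed by the raw payload."""
    stream.write(HEADER.pack(timestamp, fmt, packet_id, len(payload)))
    stream.write(payload)


def build_index(stream):
    """Scan the stream, parsing only headers and seeking past payloads."""
    index = []
    stream.seek(0)
    while True:
        offset = stream.tell()
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            break  # end of stream (or a truncated trailing header)
        timestamp, fmt, packet_id, size = HEADER.unpack(raw)
        index.append((offset, timestamp, packet_id, size))
        stream.seek(size, io.SEEK_CUR)  # skip the payload without reading it
    return index


def read_payload(stream, entry):
    """Jump straight to one packet's payload using its index entry."""
    offset, _, _, size = entry
    stream.seek(offset + HEADER.size)
    return stream.read(size)


def demo():
    """Write three packets of two types, index them, and pull one type."""
    buf = io.BytesIO()
    write_packet(buf, 1.0, 1, 10, struct.pack("<3f", 1.0, 2.0, 3.0))
    write_packet(buf, 1.5, 1, 20, struct.pack("<2h", 7, 8))
    write_packet(buf, 2.0, 1, 10, struct.pack("<3f", 4.0, 5.0, 6.0))
    index = build_index(buf)
    wanted = [entry for entry in index if entry[2] == 10]
    values = [struct.unpack("<3f", read_payload(buf, e)) for e in wanted]
    return index, values
```

Calling `demo()` returns the index plus the decoded payloads of the packets with identifier 10, retrieved by offset rather than by reading the stream in order.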
This format allows one to index a file relatively quickly by passing through it once and parsing only the headers. One can then use the index to pull subsets of the data in non-sequential order, sometimes from multiple threads simultaneously, for quite fast reading. The problem is that every instrument manufacturer has its own method of encoding packets, and a single format is needed for archival purposes.

My question to you is: how might a similar model be implemented in HDF5 such that the same kind of indexing and parallel data retrieval is possible? What I want to avoid is having to read through a file sequentially to reach the fields to be extracted.

It seems like HDF5 should handle this kind of thing well. However, I am inexperienced, and most people using it seem to be storing relatively small numbers of very large arrays (imagery in many cases) rather than large numbers of small records, each with a modest number of fields and small arrays. It is therefore not clear to me how such an implementation would perform. So I guess I'm also asking: what is the relative penalty for writing lots of small sets of data?

I hope this makes sense.

Thanks in advance,

Val

------------------------------------------------------
Val Schmidt
CCOM/JHC
University of New Hampshire
Chase Ocean Engineering Lab
24 Colovos Road
Durham, NH 03824
e: vschmidt [AT] ccom.unh.edu
m: 614.286.3726
