On Fri, Aug 5, 2011 at 11:09 PM, Charles Darwin <trok...@yahoo.com> wrote:
> Hi,

Hi Charles,

> 1. Is this the expected performance or am I possibly doing something wrong?

Expected performance is tricky to answer, given how many variables are
involved. A good way to check the upper bound of the performance you can
expect from HDF5 on your machine is to run h5perf_serial, which ships with
the HDF5 tools (see the example invocation at the end of this message).

> 2. If I understand correctly, even though I'm writing 1 row at a time, the
> data isn't actually being written to the disk until the chunk is evicted
> from the cache and it is only at that point that the entire chunk gets
> written to the disk (until then it's only writing to the chunk in the
> cache), if this is true then I would expect the performance to be similar to
> writing bulks of 5500 x 28 (chunk size * compound data type size) = 154,000
> bytes to the HDD which I would expect to perform at least 5x better.
>
> Is my understanding correct?
>
> Does writing 1 record at a time cause overhead? If it does where is the
> overhead coming from?

All of the operations involved in writing data to a dataset have a
non-zero cost: resizing the dataset, allocating the read/write dataspaces,
and performing the write all take some time, time that you have now
brought into your innermost loop. A general strategy for investigating
possible optimizations, one that should serve you well as you continue
with HDF5, is to try several configurations and compare their performance.
That said, I would definitely investigate writing more than one row to the
dataset at a time, along the lines of the sketch below.
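Here is a minimal sketch of that idea: buffer a block of records in memory,
then append them with a single write. Everything in it is illustrative
rather than taken from your code: NROWS, record_t, and append_block are
made-up names, and I am assuming a 1-D dataset of a compound type that was
created chunked and extendible (H5S_UNLIMITED max dimension).

#include "hdf5.h"

#define NROWS 5500                 /* buffer one chunk's worth of rows */

typedef struct {                   /* stand-in for your compound record */
    int    id;
    double values[3];
} record_t;

/* Append nrows records from buf to the end of a 1-D extendible dataset.
 * *total holds the number of rows already written and is updated on
 * success.  Note there is one H5Dset_extent and one H5Dwrite per block,
 * rather than one per row. */
static herr_t append_block(hid_t dset, hid_t memtype, hsize_t *total,
                           const record_t *buf, hsize_t nrows)
{
    hsize_t start   = *total;
    hsize_t newsize = *total + nrows;

    /* Grow the dataset once for the whole block. */
    if (H5Dset_extent(dset, &newsize) < 0)
        return -1;

    /* Select the freshly added rows in the file... */
    hid_t filespace = H5Dget_space(dset);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET,
                        &start, NULL, &nrows, NULL);

    /* ...describe the in-memory buffer... */
    hid_t memspace = H5Screate_simple(1, &nrows, NULL);

    /* ...and write all nrows records in one call. */
    herr_t status = H5Dwrite(dset, memtype, memspace, filespace,
                             H5P_DEFAULT, buf);

    H5Sclose(memspace);
    H5Sclose(filespace);
    if (status >= 0)
        *total = newsize;
    return status;
}

Filling a record_t buffer and calling append_block once per NROWS rows
turns thousands of resize/select/write cycles into one, which is where I
would expect most of your per-row overhead to be going.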
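As for the baseline in question 1: assuming the HDF5 command-line tools are
installed, the serial benchmark can be run with its built-in defaults,

  $ h5perf_serial

and the reported bandwidth gives you a rough ceiling to compare your
application's numbers against.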
--
Mike Davis
mikeda...@uchicago.edu