Speaking of h5perf_serial, I ran it (v1.8.6) on my test machine (Windows 2008, 32-bit) and I'm seeing strange results for the HDF5 Read measurement. For a single run, I see 0.0 MB/s, or 19 MB/s, or some other value. When I run many iterations, I get something like Maximum Throughput @ 0.0 MB/s, Average Throughput @ 72 MB/s, and Minimum Throughput @ 19 MB/s. I'm guessing a couple of values aren't really being set for this case. The results for the other cases do look believable.
Scott

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Mike Davis
> Sent: Monday, August 15, 2011 8:47 AM
> To: HDF Users Discussion List
> Subject: Re: [Hdf-forum] Writing Performance
>
> On Fri, Aug 5, 2011 at 11:09 PM, Charles Darwin <[email protected]> wrote:
> > Hi,
>
> Hi Charles,
>
> > 1. Is this the expected performance or am I possibly doing something wrong?
>
> Expected performance is tricky to answer, given how many variables are
> involved. A good way to check the upper level of performance you can
> expect from HDF5 on your machine is to use h5perf_serial.
>
> > 2. If I understand correctly, even though I'm writing 1 row at a time,
> > the data isn't actually being written to the disk until the chunk is
> > evicted from the cache; only at that point does the entire chunk get
> > written to the disk (until then it's only writing to the chunk in the
> > cache). If this is true, then I would expect the performance to be
> > similar to writing blocks of 5500 x 28 (chunk size * compound data
> > type size) = 154,000 bytes to the HDD, which I would expect to
> > perform at least 5x better.
> >
> > Is my understanding correct?
> >
> > Does writing 1 record at a time cause overhead? If it does, where is
> > the overhead coming from?
>
> All of the operations involved in writing data to a dataset have a
> non-zero cost. Resizing the dataset, allocating the read/write
> dataspaces, and performing the write all take some time, time that
> you've now brought into your innermost loop.
>
> A general strategy for investigating possible optimizations, one that
> should serve you well as you continue with HDF5, is to try several
> configurations and compare their performance. That said, I would
> definitely investigate writing more than one row to the dataset at a
> time.
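Mike's advice about batching rows can be sketched as follows. This is an illustration in Python (h5py bindings) rather than the poster's actual C code; the field names, batch size, and row count are hypothetical, but the 28-byte compound record and the 5500-row chunk match the numbers quoted in the thread.

```python
import numpy as np
import h5py

# Hypothetical compound record; the fields are illustrative and chosen so
# the record is 28 bytes, matching the 5500 x 28 = 154,000-byte chunk above.
rec = np.dtype([("t", np.float64),    # 8 bytes
                ("x", np.float64),    # 8 bytes
                ("y", np.float64),    # 8 bytes
                ("flag", np.int32)])  # 4 bytes -> 28 bytes total
assert rec.itemsize == 28

CHUNK = 5500   # chunk size from the thread
BATCH = 1000   # rows per write call; worth varying and benchmarking
N = 20000      # total rows to append (illustrative)

with h5py.File("batched.h5", "w") as f:
    # One-dimensional, unlimited, chunked dataset of compound records.
    dset = f.create_dataset("table", shape=(0,), maxshape=(None,),
                            chunks=(CHUNK,), dtype=rec)
    written = 0
    while written < N:
        n = min(BATCH, N - written)
        buf = np.zeros(n, dtype=rec)
        buf["t"] = np.arange(written, written + n, dtype=np.float64)
        # One resize and one write per batch instead of per row: the
        # per-call cost (extending the dataset, setting up the selection)
        # is amortized over n rows, which is exactly the overhead Mike
        # describes being dragged into the innermost loop.
        dset.resize((written + n,))
        dset[written:written + n] = buf
        written += n
```

Comparing this against a `BATCH = 1` run on the same machine is one concrete way to follow the "try several configurations and compare" strategy suggested above.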
>
> --
> Mike Davis
> [email protected]
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
