I've finally gotten things to the point where I can collect useful performance
numbers for my application. So far things look good. But when I look
at the performance numbers I see behavior I don't expect -- namely,
that my write throughput is almost 2x my read throughput.
My system: x64 Windows XP, NTFS file system, C++, HDF5 compiled with VS
2008 (thread-safe).
My data: dummy GIS data scattered over a region. We have a known grid of
geocells that the data is split into, and we store the data in some
number of HDF5 files such that a given file contains data for
neighboring geocells. I take the GIS data, clip it to lat/lon
boundaries (not really lat/lon, but it's sort of equivalent to
lat/lon), determine which HDF5 file the clipped region should be
stored within, and write each clipped GIS dataset to a separate HDF5
dataset (each dataset is converted to a 1D stream of 32-bit integer
opcodes/data).
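For context, each per-geocell write boils down to something like the sketch
below (illustrative only -- the function and names are made up, not my actual
wrapper API; the chunked/compressed variants swap in a different
dataset-creation property list):

#include <hdf5.h>
#include <cstdint>
#include <string>
#include <vector>

// Write one clipped geocell as a 1-D stream of 32-bit opcodes/data.
void write_geocell(hid_t file, const std::string& name,
                   const std::vector<int32_t>& opcodes)
{
    hsize_t dims[1] = { opcodes.size() };
    hid_t space = H5Screate_simple(1, dims, NULL);

    // Default (contiguous) layout shown here; the tests pass a chunked and/or
    // compressed dataset-creation property list instead of H5P_DEFAULT.
    hid_t dset = H5Dcreate2(file, name.c_str(), H5T_STD_I32LE, space,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_INT32, H5S_ALL, H5S_ALL, H5P_DEFAULT,
             opcodes.data());

    H5Dclose(dset);
    H5Sclose(space);
}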
I've created a thin wrapper around HDF5 that retains an LRU cache of
recently opened HDF5 files and datasets. It also hides the details of
our HDF5 file hierarchy and the configuration details of our datasets
from its client applications.
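The file-handle side of that wrapper is essentially an LRU map from file path
to open hid_t, roughly like the minimal sketch below (the class name and
capacity handling here are mine for illustration; the real wrapper also caches
dataset handles and hides the file-naming scheme):

#include <hdf5.h>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class FileHandleCache {
public:
    explicit FileHandleCache(size_t capacity) : cap_(capacity) {}
    ~FileHandleCache() { for (auto& e : lru_) H5Fclose(e.second); }

    hid_t open(const std::string& path, unsigned flags = H5F_ACC_RDWR) {
        auto it = index_.find(path);
        if (it != index_.end()) {                    // hit: move to front
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->second;
        }
        if (lru_.size() >= cap_) {                   // full: evict least recently used
            H5Fclose(lru_.back().second);
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        hid_t f = H5Fopen(path.c_str(), flags, H5P_DEFAULT);
        lru_.emplace_front(path, f);
        index_[path] = lru_.begin();
        return f;
    }

private:
    using LruList = std::list<std::pair<std::string, hid_t>>;
    size_t cap_;
    LruList lru_;
    std::unordered_map<std::string, LruList::iterator> index_;
};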
Everything appears to be working great and I've been doing some
performance testing to determine the effects of
compression/chunksize/contiguous-vs-chunked/etc.
The attached images are the results of running some performance tests
to look at read/write throughput versus chunk size. At each chunk size,
I ran the test 8 times, throwing out the min and max. Each node in the
graph is the mean of the remaining 6 runs; the error bars represent
the stddev.
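In other words, each plotted point is a trimmed mean, computed along these
lines (illustrative helper, not the actual test harness):

#include <algorithm>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

// 8 timed runs in; min and max dropped; mean and sample stddev of the rest out.
std::pair<double, double> trimmed_stats(std::vector<double> runs)
{
    std::sort(runs.begin(), runs.end());
    std::vector<double> kept(runs.begin() + 1, runs.end() - 1);  // drop min/max
    double mean = std::accumulate(kept.begin(), kept.end(), 0.0) / kept.size();
    double var = 0.0;
    for (double v : kept) var += (v - mean) * (v - mean);
    return { mean, std::sqrt(var / (kept.size() - 1)) };
}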
The test data was 2 million randomly generated GIS points, split into a
few hundred HDF5 datasets across about 25 HDF5 files.
None - chunked datasets without compression
NoneNoChunk - contiguous datasets
lzf - chunked with LZF compression
zlib1, zlib4, zlib9 - chunked with zlib at different compression levels
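For reference, those variants map onto the dataset-creation property list
roughly as follows (sketch only; chunk_elems is whatever chunk size a given
run is testing, and LZF is a third-party filter that has to be registered
before it can be used):

#include <hdf5.h>

hid_t make_dcpl(bool chunked, int zlib_level, hsize_t chunk_elems)
{
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    if (chunked) {
        hsize_t chunk[1] = { chunk_elems };
        H5Pset_chunk(dcpl, 1, chunk);           // "None" and the compressed variants
        if (zlib_level > 0)
            H5Pset_deflate(dcpl, zlib_level);   // "zlib1" / "zlib4" / "zlib9"
        // For "lzf": the LZF filter (registered filter id 32000) must first be
        // registered via H5Zregister(), then enabled via H5Pset_filter().
    }
    // If !chunked, the layout stays contiguous: the "NoneNoChunk" case.
    return dcpl;
}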
The compression ratios show what I expect: LZF isn't as good as zlib
at compression, and there's minimal difference between the various zlib
levels.
Not shown here are the runs I did with the shuffle filter, which for
my data didn't help compression and just slowed things down. The
compression ratio for NoneNoChunk threw me off for a bit, until I realized
I was seeing the increased file size due to the file space allocated
for partially-used chunks.
The write throughput graph shows LZF considerably better for my data
than the other options at every chunk size. zlib's MB/sec
throughput is significantly worse -- worse even than the contiguous and
uncompressed-chunked cases.
The read graph looks better for zlib -- it outperforms the
no-compression options. But again, LZF has better throughput than zlib.
So I confirmed what I had expected performance-wise. But then I
looked at both the read and write graphs.
On read throughput, my datasets with LZF average 70-80 MB/sec.
But on write throughput, my datasets with LZF average 125 MB/sec.
It doesn't seem to be related to just the compression filter, either. The write
throughput for my contiguous dataset runs (NoneNoChunk) was ~60
MB/sec, and their read throughput was ~45 MB/sec.
Unfortunately, I cannot share my code. Any ideas where to look for
what might be causing this? Or, any hints for how to diagnose these
differences myself?
Writing all this down, I'm starting to wonder whether comparing my
read and write throughput is a valid comparison at all. The way my
performance-testing application writes out the data is different from
the way it reads it.
In both cases, I read/write the same total amount of data and traverse the
same datasets. However, the order in which I traverse the datasets is
different.
My geocell datasets end up laid out similar to a 2D array. In the table below,
each 2-digit number represents a dataset. The spacing represents how those
datasets are stored in separate HDF5 files -- e.g. datasets 00-03,
10-13, 20-23, 30-33 are stored in a single file.
00 01 02 03   04 05 06 07   08 09
10 11 12 13   14 15 16 17   18 19
20 21 22 23   24 25 26 27   28 29
30 31 32 33   34 35 36 37   38 39

40 41 42 43   44 45 46 47   48 49
50 51 52 53   54 55 56 57   58 59
60 61 62 63   64 65 66 67   68 69
70 71 72 73   74 75 76 77   78 79
In my read test, I do a row-major traversal of the datasets (00-09,
10-19, 20-29, etc.). In the write test, that's not the case -- every
dataset is held in a hash map before being written to disk, so the writes
come out in the hash map's iteration order rather than grid order.
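Concretely, for the example grid above, the two orders look something like
this (illustrative only; the dataset names and grid size are just those of the
example):

#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

int main()
{
    std::vector<std::string> read_order;     // "00".."09", "10".."19", ...
    std::unordered_map<std::string, std::vector<int32_t>> pending_writes;

    for (int row = 0; row < 8; ++row)
        for (int col = 0; col < 10; ++col) {
            char name[3];
            std::snprintf(name, sizeof name, "%d%d", row, col);
            read_order.push_back(name);      // read test: row-major traversal
            pending_writes[name] = {};       // write test: hashed, unordered
        }
    // The read test walks read_order; the write test drains pending_writes,
    // whose iteration order has no relation to the on-disk file grouping.
    return 0;
}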
Maybe the unexpected throughput behavior is due to my wrapper library's
LRU cache of file handles. The file-handle cache is small
(<5 entries), and depending on the length of a row, by the time the read test
reaches the end of the row and wraps around to the next dataset stored in the
first file, that first file may have fallen out of the cache.
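A tiny model of the suspected effect (the numbers are assumptions for
illustration: 4 files touched per row, a handle cache of capacity 3): with an
LRU cache smaller than the number of files a row touches, every return to the
first file is a miss and forces a reopen.

#include <algorithm>
#include <cstdio>
#include <deque>

int main()
{
    const int files_per_row = 4, capacity = 3, rows = 8;
    std::deque<int> cache;                        // front = most recently used
    int opens = 0;

    for (int r = 0; r < rows; ++r)
        for (int f = 0; f < files_per_row; ++f) { // row-major read hits files cyclically
            auto it = std::find(cache.begin(), cache.end(), f);
            if (it == cache.end()) {
                ++opens;                          // miss: (re)open the file
                if ((int)cache.size() == capacity)
                    cache.pop_back();             // evict least recently used
            } else {
                cache.erase(it);                  // hit: refresh recency
            }
            cache.push_front(f);
        }
    std::printf("%d opens for %d file visits\n", opens, rows * files_per_row);
    return 0;
}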
Will have to fiddle with my cache configuration and see if that
eliminates this behavior.