Hi, I've been looking into HDF5 as a possible data store for an application.
I'd appreciate some help understanding write performance. The data being written is a 28-byte compound type: one unsigned 8-byte int, four 4-byte floats, and one unsigned 4-byte int (1 x STD_U64LE, 4 x IEEE_F32LE, 1 x STD_U32LE). The chunk size is set to 5500 elements and the chunk cache settings are left at their defaults. The writing is done one row at a time, appending to the end of the dataset through the dataset API (not the packet table or table APIs); a stripped-down sketch of the loop is included below. The performance I get is around 200,000 rows per second, which is below my expectations.

1. Is this the expected performance, or am I possibly doing something wrong?

2. If I understand correctly, even though I write one row at a time, the data isn't actually written to disk until the chunk is evicted from the chunk cache, and only at that point is the entire chunk written out; until then each write only updates the chunk in the cache. If that is true, I would expect performance similar to writing blocks of 5500 x 28 bytes (chunk size * record size) = 154,000 bytes to the HDD, which I would expect to be at least 5x faster. Is my understanding correct? Does writing one record at a time cause overhead, and if so, where does that overhead come from?
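Here is roughly what I mean by the append loop. This is a simplified sketch rather than my actual code: the field names, struct layout, file name, and row count are placeholders, and error checking is omitted.

#include <hdf5.h>
#include <stdint.h>

/* Placeholder field names; the real layout is 1 x STD_U64LE,
   4 x IEEE_F32LE, 1 x STD_U32LE (28 bytes packed in the file,
   possibly padded to 32 bytes in memory). */
typedef struct {
    uint64_t id;
    float    f0, f1, f2, f3;
    uint32_t flags;
} record_t;

int main(void)
{
    /* In-memory compound type matching the C struct. */
    hid_t mtype = H5Tcreate(H5T_COMPOUND, sizeof(record_t));
    H5Tinsert(mtype, "id",    HOFFSET(record_t, id),    H5T_NATIVE_UINT64);
    H5Tinsert(mtype, "f0",    HOFFSET(record_t, f0),    H5T_NATIVE_FLOAT);
    H5Tinsert(mtype, "f1",    HOFFSET(record_t, f1),    H5T_NATIVE_FLOAT);
    H5Tinsert(mtype, "f2",    HOFFSET(record_t, f2),    H5T_NATIVE_FLOAT);
    H5Tinsert(mtype, "f3",    HOFFSET(record_t, f3),    H5T_NATIVE_FLOAT);
    H5Tinsert(mtype, "flags", HOFFSET(record_t, flags), H5T_NATIVE_UINT32);

    /* Packed 28-byte file type using the little-endian types above. */
    hid_t ftype = H5Tcreate(H5T_COMPOUND, 28);
    H5Tinsert(ftype, "id",    0,  H5T_STD_U64LE);
    H5Tinsert(ftype, "f0",    8,  H5T_IEEE_F32LE);
    H5Tinsert(ftype, "f1",    12, H5T_IEEE_F32LE);
    H5Tinsert(ftype, "f2",    16, H5T_IEEE_F32LE);
    H5Tinsert(ftype, "f3",    20, H5T_IEEE_F32LE);
    H5Tinsert(ftype, "flags", 24, H5T_STD_U32LE);

    hid_t file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* 1-D unlimited dataset, chunked at 5500 records, default chunk cache. */
    hsize_t dims = 0, maxdims = H5S_UNLIMITED, chunk = 5500;
    hid_t fspace0 = H5Screate_simple(1, &dims, &maxdims);
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, &chunk);
    hid_t dset = H5Dcreate2(file, "records", ftype, fspace0,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Append one record per iteration: extend the dataset by one,
       select a 1-element hyperslab at the end, and write. */
    hsize_t one = 1;
    hid_t mspace = H5Screate_simple(1, &one, NULL);
    record_t rec = {0};
    for (hsize_t i = 0; i < 1000000; i++) {   /* placeholder row count */
        hsize_t newsize = i + 1;
        H5Dset_extent(dset, &newsize);
        hid_t fspace = H5Dget_space(dset);
        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &i, NULL, &one, NULL);
        H5Dwrite(dset, mtype, mspace, fspace, H5P_DEFAULT, &rec);
        H5Sclose(fspace);
    }

    H5Sclose(mspace); H5Sclose(fspace0);
    H5Tclose(mtype); H5Tclose(ftype); H5Pclose(dcpl);
    H5Dclose(dset); H5Fclose(file);
    return 0;
}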

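For comparison, the kind of bulk write my 154,000-byte estimate in question 2 corresponds to would look something like the sketch below: buffer one chunk's worth of records in memory and issue a single extend plus H5Dwrite per block. Again just a sketch; record_t and the memory type mtype are the same as in the loop above.

#include <hdf5.h>
#include <stdint.h>

/* Same record as in the sketch above. */
typedef struct { uint64_t id; float f0, f1, f2, f3; uint32_t flags; } record_t;

#define CHUNK 5500  /* one chunk's worth of records: 5500 x 28 bytes packed */

/* Append nrows buffered records with a single extend + write. */
static void append_block(hid_t dset, hid_t mtype,
                         const record_t *buf, hsize_t nrows)
{
    /* Current number of rows already in the dataset. */
    hsize_t cur;
    hid_t fspace = H5Dget_space(dset);
    H5Sget_simple_extent_dims(fspace, &cur, NULL);
    H5Sclose(fspace);

    /* Grow the dataset once and write the whole block in one call. */
    hsize_t newsize = cur + nrows;
    H5Dset_extent(dset, &newsize);

    fspace = H5Dget_space(dset);
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &cur, NULL, &nrows, NULL);
    hid_t mspace = H5Screate_simple(1, &nrows, NULL);
    H5Dwrite(dset, mtype, mspace, fspace, H5P_DEFAULT, buf);
    H5Sclose(mspace);
    H5Sclose(fspace);
}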