You have to pay attention to the chunk shape and the access pattern.
For example, a 3-dim dataset with shape [1000,1000,1000] can be accessed
in many ways, such as successive xy-planes, xz-planes or yz-planes. The
chunk shape can be defined such that a certain pattern is favoured. If you
always access successive xz-planes, it makes sense to define a chunk shape
[c1,1,c3]. However, if you sometimes also access xy-planes, such a chunk
shape is very bad: each chunk covers only a single y-value, so you need
1000 chunks to get all data along the y-axis, and the question is whether
you have sufficient memory to keep all those chunks cached (so that the
next xy-plane can be serviced without rereading them).
So if you have different access patterns, the chunk shape has to be a
compromise such that all patterns can be serviced reasonably well.
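
To make the trade-off concrete, here is a minimal sketch (not the attached
test program, and it does not call HDF5) that counts how many chunks one
full plane touches and how many bytes of chunk data that represents. The
function names, the [x,y,z] index order and the element size passed in are
assumptions made for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Index order is assumed to be [x, y, z], matching shape [1000,1000,1000].
using Shape = std::array<std::size_t, 3>;

// Number of chunks intersected by one full plane in which axis `fixedAxis`
// is held constant (0 = yz-plane, 1 = xz-plane, 2 = xy-plane).
std::size_t chunksPerPlane(const Shape& data, const Shape& chunk,
                           std::size_t fixedAxis)
{
    std::size_t n = 1;
    for (std::size_t i = 0; i < 3; ++i) {
        assert(chunk[i] > 0);
        if (i != fixedAxis) {
            n *= (data[i] + chunk[i] - 1) / chunk[i];  // ceil(data / chunk)
        }
    }
    return n;
}

// Bytes of chunk data that must be read for one such plane, counting whole
// chunks. Keeping this much cached avoids rereading for the next plane.
std::size_t bytesPerPlane(const Shape& data, const Shape& chunk,
                          std::size_t fixedAxis, std::size_t elemSize)
{
    return chunksPerPlane(data, chunk, fixedAxis)
           * chunk[0] * chunk[1] * chunk[2] * elemSize;
}
```

Assuming 4-byte elements and c1 = c3 = 100, a chunk shape [100,1,100]
gives 100 chunks (4,000,000 bytes) per xz-plane but 10,000 chunks
(400,000,000 bytes) per xy-plane. A compromise such as [32,32,32] gives
1,024 chunks (134,217,728 bytes) for a plane in any of the three
orientations.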

I attach a C++ test program I wrote some time ago to test various access
patterns. I hope it is clear enough.

Ger

>>> Alexander Tzokev  08/22/13 7:58 AM >>>
Hello Ger,

thank you for the reply. We noticed that the dataset was small enough to
fit into memory, so we performed new experiments with a single 8 GB
dataset. The RAM on the test workstation is 4 GB.


Unfortunately the results are the same. There is no difference in
execution time or I/O operations with different values for mdc_nelmts,
rdcc_nelmts, rdcc_nbytes and rdcc_w0.


If needed I can post the source code for the example program and some
test results.


Attachment: tHDF5.cc
Description: Binary data

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
