Hi Albert and Mohamad,
I haven't received e-mails with your replies :( so I cannot reply to them
specifically (or I do not know how). So, I am replying to my original message...
@Albert:
All the clusters I used run the Lustre file system, which, I believe,
should be scalable, at least to some extent. Apparently, it is scalable
for the single-file-per-process strategy.
I understand the note about memory-to-kernel writes. However, again, I
am comparing the single-file and multiple-file strategies, and they give
quite different results. Moreover, in my measurements the multiple-file
case matches the listed peak I/O bandwidth of the storage systems, while
the single-file case is much, much worse. Obviously, the bottleneck is
not the memory-to-kernel copying.
Thanks for the link to h5perf, I will try it.
@Mohamad:
Thanks for the hint. All the file systems are Lustre-based, indeed with
the default stripe count of 1. I will rerun my measurements with
different stripe sizes/counts and post the results.
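As a first try, I plan to pass the striping hints through an MPI_Info
object when creating the file. A minimal sketch, assuming an MPI-IO
(ROMIO) layer that honors the Lustre striping_factor/striping_unit
hints; the stripe count and size below are just placeholder values:

  /* Request a larger Lustre stripe count/size via MPI-IO hints.
     These only take effect for newly created files, and only if the
     MPI-IO implementation supports them. */
  MPI_Info info;
  MPI_Info_create(&info);
  MPI_Info_set(info, "striping_factor", "16");    /* stripe count (placeholder) */
  MPI_Info_set(info, "striping_unit", "4194304"); /* 4 MiB stripe size (placeholder) */

  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
  hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
  /* ... write datasets ... */
  H5Fclose(file);
  H5Pclose(fapl);
  MPI_Info_free(&info);

Alternatively, I could set the striping on the target directory
beforehand with lfs setstripe, so that newly created files inherit it.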
Daniel
On 30. 8. 2013 16:05, Daniel Langr wrote:
I've run a benchmark where, within an MPI program, each process wrote
3 plain 1D arrays to 3 datasets of an HDF5 file. I used the following
writing strategies:
1) each process writes to its own file,
2) each process writes to its own dataset in a shared file,
3) all processes write to a single shared dataset in a shared file.
I've tested 1)-3) with both fixed-size and chunked datasets (chunk size
1024), and I've tested 2)-3) with both the independent and collective
options of the MPI-IO driver. I've also run the measurements on 3
different clusters (all quite modern).
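For the shared-file strategies, the setup looks roughly like this (a
simplified sketch of what the benchmark does, not the exact code; file
and dataset names are illustrative). Strategy 2), one dataset per
process, with independent transfers:

  /* Simplified sketch of strategy 2): one shared file, one dataset
     per process, independent raw-data transfers. */
  #include <hdf5.h>
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      int rank, nprocs;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      const hsize_t n = 100 * 1024 * 1024 / sizeof(double); /* ~100 MB */
      double *buf = malloc(n * sizeof(double)); /* contents irrelevant here */

      /* One shared file opened through the MPI-IO driver. */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
      hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
      H5Pclose(fapl);

      /* Metadata operations are collective in parallel HDF5, so all
         processes participate in creating every per-rank dataset. */
      hid_t space = H5Screate_simple(1, &n, NULL);
      hid_t my_dset = -1;
      for (int r = 0; r < nprocs; ++r) {
          char name[32];
          snprintf(name, sizeof name, "data_%d", r);
          hid_t d = H5Dcreate2(file, name, H5T_NATIVE_DOUBLE, space,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
          if (r == rank) my_dset = d; else H5Dclose(d);
      }

      /* Each process writes only its own dataset, independently. */
      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
      H5Dwrite(my_dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, dxpl, buf);

      H5Pclose(dxpl);
      H5Dclose(my_dset);
      H5Sclose(space);
      H5Fclose(file);
      free(buf);
      MPI_Finalize();
      return 0;
  }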
As a result, the running (storage) times of the shared-file strategies,
i.e., 2) and 3), were orders of magnitude longer than the running times
of the separate-files strategy. For illustration:
cluster #1, 512 MPI processes, each process stores 100 MB of data,
fixed datasets:
1) separate files: 2.73 [s]
2) single file, independent calls, separate datasets: 88.54 [s]

cluster #2, 256 MPI processes, each process stores 100 MB of data,
chunked datasets (chunk size 1024):
1) separate files: 10.40 [s]
2) single file, independent calls, shared datasets: 295 [s]
3) single file, collective calls, shared datasets: 3275 [s]
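For strategy 3), each process writes its contiguous slice of one shared
chunked dataset through a hyperslab selection, roughly as follows (again
a simplified fragment; rank, nprocs, n, buf, and file are as in the
sketch above):

  /* Strategy 3) sketch: one shared chunked dataset; each rank writes
     its contiguous slice via a hyperslab selection. */
  hsize_t total = (hsize_t)nprocs * n;
  hsize_t chunk = 1024;                    /* chunk size used in the tests */
  hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
  H5Pset_chunk(dcpl, 1, &chunk);

  hid_t fspace = H5Screate_simple(1, &total, NULL);
  hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, fspace,
                          H5P_DEFAULT, dcpl, H5P_DEFAULT);

  hsize_t start = (hsize_t)rank * n, count = n;
  H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &start, NULL, &count, NULL);
  hid_t mspace = H5Screate_simple(1, &count, NULL);

  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE); /* or H5FD_MPIO_INDEPENDENT */
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf);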
Any idea why the single-file strategy gives such poor write performance?
Daniel