Rob,

thanks a lot for the hints. I will look at the suggested option and try some experiments with it :).

Daniel



On 17. 9. 2013 15:34, Rob Latham wrote:
On Tue, Sep 17, 2013 at 11:15:02AM +0200, Daniel Langr wrote:
separate files: 1.36 [s]
single file, 1 stripe: 133.6 [s]
single file, best result: 17.2 [s]

(I did multiple runs with various combinations of stripe count and
size; the results above are the best I obtained.)
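
For context, stripe settings like these can be applied either with lfs
setstripe on the output directory or by passing MPI-IO hints when the
shared file is created. Below is a minimal sketch of the hint route in C;
the hint names striping_factor and striping_unit are the ones ROMIO and
the Cray MPI-IO layer recognize, and the file name and values are
placeholders of mine, not taken from the actual benchmark:

  #include <mpi.h>
  #include <hdf5.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      /* Illustrative stripe settings -- tune for your file system. */
      MPI_Info info;
      MPI_Info_create(&info);
      MPI_Info_set(info, "striping_factor", "32");     /* stripe count  */
      MPI_Info_set(info, "striping_unit", "4194304");  /* 4 MiB stripes */

      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);

      /* On Lustre, striping hints only take effect at file creation. */
      hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

      H5Fclose(file);
      H5Pclose(fapl);
      MPI_Info_free(&info);
      MPI_Finalize();
      return 0;
  }

Whether the hints are actually honored depends on the MPI-IO
implementation; dumping the effective hints back (see the sketch further
down) is the safest check.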

Increasing the number of stripes obviously helped a lot, but compared
with the separate-files strategy, the writing time is still more than
ten times slower. Do you think that is "normal"?

It might be "normal" for Lustre, but it's not good.  I wish I had
more experience tuning the Cray/MPI-IO/Lustre stack, but I do not.
The ADIOS folks report that tuned HDF5 writing to a single shared file
runs about 60% slower than ADIOS writing to multiple files, not 10x
slower, so it seems there is room for improvement.

I've asked them about the kinds of things "tuned HDF5" entails, and
they didn't know (!).

There are quite a few settings documented in the intro_mpi(3) man
page.  MPICH_MPIIO_CB_ALIGN will probably be the most important thing
you can try.  I'm sorry to report that in my limited experience, the
documentation and reality are sometimes out of sync, especially with
respect to which settings are default or not.
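
One way to see which settings are actually in effect on a given system,
rather than trusting the man page, is to open a file directly with MPI-IO
and dump the hints the library reports back. A minimal sketch, with a
placeholder file name; note that MPICH_MPIIO_CB_ALIGN itself is an
environment variable set at job launch, not an MPI_Info key:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      MPI_File fh;
      MPI_File_open(MPI_COMM_WORLD, "hints_probe.out",
                    MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

      /* Ask the MPI-IO layer which hints it actually applied. */
      MPI_Info info;
      MPI_File_get_info(fh, &info);

      if (rank == 0) {
          int nkeys;
          MPI_Info_get_nkeys(info, &nkeys);
          for (int i = 0; i < nkeys; i++) {
              char key[MPI_MAX_INFO_KEY + 1], value[MPI_MAX_INFO_VAL + 1];
              int flag;
              MPI_Info_get_nthkey(info, i, key);
              MPI_Info_get(info, key, MPI_MAX_INFO_VAL, value, &flag);
              if (flag)
                  printf("%s = %s\n", key, value);
          }
      }

      MPI_Info_free(&info);
      MPI_File_close(&fh);
      MPI_Finalize();
      return 0;
  }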

==rob

Thanks,
Daniel

On 30. 8. 2013 16:05, Daniel Langr wrote:
I've run a benchmark in which, within an MPI program, each process wrote
3 plain 1D arrays to 3 datasets of an HDF5 file. I used the following
writing strategies:

1) each process writes to its own file,
2) all processes write to the same file, each to its own dataset,
3) all processes write to the same file and to shared datasets.

I've tested 1)-3) with both fixed and chunked datasets (chunk size 1024),
and 2)-3) with both the independent and collective modes of the MPI-IO
driver. I've also used 3 different clusters for the measurements (all
quite modern).
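
For reference, a minimal sketch of what strategy 3) looks like with the
MPI-IO file driver; the file name, dataset name and sizes are placeholders
of mine, not the actual benchmark code, and the transfer property at the
end switches between independent and collective I/O:

  #include <mpi.h>
  #include <hdf5.h>
  #include <stdlib.h>

  #define N 1024  /* elements per process; illustrative only */

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      int rank, nprocs;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      double *buf = malloc(N * sizeof(double));
      for (int i = 0; i < N; i++) buf[i] = rank;

      /* One shared file opened with the MPI-IO driver (strategies 2 and 3). */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
      hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

      /* Strategy 3: one dataset shared by all processes,
       * each process writing its own contiguous hyperslab. */
      hsize_t dims[1] = { (hsize_t)nprocs * N };
      hid_t filespace = H5Screate_simple(1, dims, NULL);
      hid_t dset = H5Dcreate2(file, "x", H5T_NATIVE_DOUBLE, filespace,
                              H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

      hsize_t start[1] = { (hsize_t)rank * N };
      hsize_t count[1] = { N };
      H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
      hid_t memspace = H5Screate_simple(1, count, NULL);

      /* Independent vs. collective I/O is chosen per transfer. */
      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);  /* or H5FD_MPIO_INDEPENDENT */

      H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

      H5Pclose(dxpl);
      H5Sclose(memspace);
      H5Sclose(filespace);
      H5Dclose(dset);
      H5Fclose(file);
      H5Pclose(fapl);
      free(buf);
      MPI_Finalize();
      return 0;
  }

Strategy 2) differs only in that each process creates and writes its own
dataset instead of selecting a hyperslab in a shared one.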

As a result, the running (storage) times of the same-file strategies,
i.e. 2) and 3), were orders of magnitude longer than the running times
of the separate-files strategy. For illustration:

cluster #1, 512 MPI processes, each process stores 100 MB of data, fixed
datasets:

1) separate files: 2.73 [s]
2) single file, independent calls, separate datasets: 88.54 [s]

cluster #2, 256 MPI processes, each process stores 100 MB of data,
chunked datasets (chunk size 1024):

1) separate files: 10.40 [s]
2) single file, independent calls, shared datasets: 295 [s]
3) single file, collective calls, shared datasets: 3275 [s]

Any idea why the single-file strategies give such poor writing performance?

Daniel
