Hi Raul,

HDF5 uses MPI-IO for parallel I/O, so under the covers you will get two-phase I/O in most cases when you do collective I/O.

All you need to do is set the file access property list to use the MPIO VFD (virtual file driver) and create the file with it. Then, whenever you do raw data I/O (H5Dread/H5Dwrite), you can set the data transfer property list to do collective I/O, which should translate to whatever two-phase I/O algorithm your MPI library implements.

For more detailed information you can look through these tutorials:
http://www.hdfgroup.org/HDF5/PHDF5/

Thanks,
Mohamad

On 7/2/2013 12:05 PM, Raúl de la Cruz wrote:
Hi there,

I am developing a parallel scientific application in which each MPI process
writes very small buffers to large datasets at each I/O step. As
expected, performance is really poor because of the file system
block-locking issue (the block size is 2 MB on our GPFS, while the buffers
are 4 KB to 16 KB per MPI process) and the small amounts written per
iteration.

Checking the literature, I came across 'two-phase I/O' (a.k.a.
collective buffering) (papers: 'Improved parallel I/O via a two-phase
run-time access strategy' and 'Collective I/O buffering: Improving
parallel I/O performance').

I was wondering whether it is possible to use these techniques directly in
HDF5 with MPIO and GPFS. In other words, I would like to know whether
these I/O strategies are already implemented in the HDF5/MPI-IO/GPFS layers
or whether I would have to implement them myself.

Thank you in advance.

Best regards,
Raúl


