We use hdf5 for parallel I/O in VORPAL, our laser plasma simulation code. For 
the most part, it works fine, but on certain machines (e.g., early Cray and 
BG/P) and certain types of filesystems, we've noticed that parallel I/O hangs, 
so we instituted a -id (individual dump) option which causes each MPI rank to 
dump its own hdf5 file, and once the simulation is complete, we merge the 
individual dump files.

We have a customer for whom parallel I/O is hanging, and they are using -id as 
described above. We're trying to pinpoint why parallel I/O is not working on 
their system, which is CentOS 5.5 cluster.

In the past we ourselves have had problems with parallel I/O failing on ext3 
filesystems, so we reformatted as XFS and the problem went away. Our customer 
did this, but the problem still persists.

Anyone have any words of wisdom as to what other things could cause parallel 
I/O to hang?

Thanks for any help!
Dave
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Reply via email to