On Tue, 2011-04-12 at 10:39 -0700, Leigh Orf wrote:

> Understand that I just discovered the ability to do buffered I/O with
> hdf5. I wasn't aware of the core serial driver until Friday!

Yeah, there are a lot of interesting dark corners of the HDF5 library
that are useful to know about. The core driver is definitely one of
them. It has saved my behind a few times when we've been in a bind on
performance.

> I am going to look carefully at your code. At first glance, it
> appears to be a similar approach to what I have tried, but in my case
> I created new MPI communicators which spanned any number of cores
> (but it has to divide evenly into the full problem, unlike with your
> approach). In my case, each subcommunicator would use pHDF5 collective
> calls to concurrently write to its own file, and I could choose the
> number of files. I still had lousy performance with all my choices of
> number of files.
>
> It is not entirely clear to me that you are doing true collective
> parallel HDF5 (where I have had problems but have been led to believe

That's right. There is NOTHING I/O-wise that is parallel. That code is
designed to work with a SERIAL-compiled HDF5. The only parallel parts
are the file management to orchestrate parallel I/O to multiple files
concurrently. It is the 'Poor Man's' approach to parallel I/O. It is
described a bit in the pmpio.h header file and in more detail here...

http://visitbugs.ornl.gov/projects/hpc-hdf5/wiki/Poor_Man's_vs_Rich_Mans'_Parallel_IO
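In rough strokes, the pattern looks like the sketch below. To be clear,
this is an off-the-cuff illustration and not the actual pmpio.h
interface; NFILES, the baton tag, and the dump_NNN.h5 file names are
all made up. Each rank does nothing but ordinary serial HDF5 calls, and
MPI is used only to decide who writes to which file and when. I have
also layered the core driver underneath (my addition here, not
something the baton passing requires) so each rank's writes are
buffered in memory and hit disk in one shot at H5Fclose:

/*
 * Off-the-cuff sketch of the baton-passing idea (NOT the pmpio.h API).
 * Ranks are split into NFILES groups; within a group, ranks take turns
 * doing ordinary serial HDF5 I/O to that group's file. The core VFD is
 * used so each rank's writes are buffered in memory until H5Fclose.
 */
#include <mpi.h>
#include <hdf5.h>
#include <stdio.h>

#define NFILES    8        /* how many files to write concurrently (made up) */
#define BATON_TAG 12345    /* arbitrary tag for the hand-off message */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int group  = rank % NFILES;   /* which file this rank belongs to   */
    int turn   = rank / NFILES;   /* this rank's turn within its group */
    int nturns = (size - group + NFILES - 1) / NFILES;

    char fname[64];
    snprintf(fname, sizeof(fname), "dump_%03d.h5", group);

    int baton = 0;
    if (turn > 0)   /* wait for the previous rank in my group to finish */
        MPI_Recv(&baton, 1, MPI_INT, rank - NFILES, BATON_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Core driver: the whole file lives in memory and is flushed to
       disk (backing_store = 1) when the file is closed.              */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_core(fapl, (size_t)(1 << 20), 1);

    hid_t file = (turn == 0)
        ? H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl)
        : H5Fopen(fname, H5F_ACC_RDWR, fapl);

    /* ... ordinary serial H5Gcreate/H5Dcreate/H5Dwrite calls go here,
       writing this rank's piece into, say, a group named after the
       rank ... */

    H5Fclose(file);
    H5Pclose(fapl);

    if (turn < nturns - 1)  /* pass the baton to the next rank in my group */
        MPI_Send(&baton, 1, MPI_INT, rank + NFILES, BATON_TAG, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}

The number of files then becomes a knob you can turn: NFILES equal to
the rank count gives you file-per-processor, NFILES of 1 funnels
everything through a single file, and something in between is usually
where the filesystem is happiest. Note that all of this links against a
plain serial build of HDF5; none of it needs the parallel library.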
> it is a path to happiness) as you do not call h5pset_dxpl_mpio and
> set the H5FD_MPIO_COLLECTIVE flag. You also do not construct a
> property list and pass it to h5dwrite, instructing each I/O core to
> write its own piece of an hdf5 file using offset arrays,
> h5sselect_hyperslab calls, etc., which is what the examples I have
> found led me to. It seems you are effectively doing serial hdf5 in
> parallel, which is what I am leaning towards at this point. Your
> approach is more elegant than mine, but I am (a) stuck with fortran
> and (b) not a programmer by training, although C is my preferred
> language for I/O. Not sure if I could call your code from fortran
> easily without going through contortions (again forgive me, I am a
> weather guy who pretends he is a programmer).
>
> I fully embraced parallel hdf5 because I thought it could give me all
> the flexibility I needed to essentially tune

So, I find the all-collective-all-the-time API for parallel HDF5 to be
way too 'inflexible' to handle sophisticated I/O patterns where data
type, size, shape, and even existence vary substantially from processor
to processor. For bread-and-butter data-parallel apps where essentially
the same few data structures (distributed arrays) are distributed
across processors, it works ok. But none of the simulation apps I
support have that kind of (simple) I/O pattern, nor even approximate
it, especially for plot outputs.

--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]      urgent: [email protected]
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
