Thanks for the information. After I sent my email I realized I left out some
relevant information: I am using regular HDF5, not pHDF5, but in a parallel
environment. The only reason I am doing this is that I want the ability to
write compressed HDF5 files (gzip, szip, scale-offset, nbit, etc.). As I
understand it, at this point (and maybe forever) pHDF5 cannot do
compression.
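For reference, this is the sort of serial call sequence I mean: enabling the gzip filter on a chunked dataset. The dataset name "field" and the sizes here are just placeholders. This works when each process has exclusive (serial) access to the file, which is exactly what pHDF5's collective writes rule out.

```c
#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]  = {1024, 1024};
    hsize_t chunk[2] = {64, 64};

    hid_t file  = H5Fcreate("rank_local.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    /* Filters such as gzip require chunked storage. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    H5Pset_deflate(dcpl, 6);           /* gzip, compression level 6 */

    hid_t dset = H5Dcreate2(file, "field", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```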

I have currently tried two approaches to compression with HDF5 in a
parallel environment: (1) each MPI rank writes its own compressed HDF5 file;
(2) I create a new MPI communicator (call it subcomm) which operates on a
sub-block of the entire domain. Each instance of subcomm (which could, for
instance, operate on one multicore chip) does an MPI_GATHER to rank 0 of
subcomm, and that root core does the compression and writes to disk. The
problem with (1) is that there are too many files with large simulations;
the problem with (2) is that rank 0 is operating on a lot of data and the
compression code slows things down dramatically: rank 0 cranks away while
the other ranks sit at a barrier. So I am trying a third approach where you
still have subcomm, but instead of doing the MPI_GATHER, each core writes,
in round-robin fashion, to the file created by rank 0 of subcomm. I am
hoping that I'll get the benefits of compression (being done in parallel)
without suffering a huge penalty for the round-robin approach.
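In case it helps the discussion, here is a rough sketch of how I picture approach (3): token passing within each subcomm so that only one rank at a time has the (serial) HDF5 file open. The grouping of consecutive world ranks, the file name, and the "write my dataset" step are all placeholders for my actual decomposition.

```c
#include <stdio.h>
#include <mpi.h>
#include "hdf5.h"

/* Ranks in a subcommunicator take turns opening the HDF5 file that
 * subcomm rank 0 created, writing their own compressed dataset, and
 * passing a token to the next rank. */
void round_robin_write(MPI_Comm world, int ranks_per_subcomm)
{
    int wrank;
    MPI_Comm_rank(world, &wrank);

    /* Group every ranks_per_subcomm consecutive world ranks together. */
    MPI_Comm subcomm;
    MPI_Comm_split(world, wrank / ranks_per_subcomm, wrank, &subcomm);

    int srank, ssize;
    MPI_Comm_rank(subcomm, &srank);
    MPI_Comm_size(subcomm, &ssize);

    char fname[64];
    snprintf(fname, sizeof fname, "block_%d.h5",
             wrank / ranks_per_subcomm);

    int token = 0;
    if (srank == 0) {
        hid_t f = H5Fcreate(fname, H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
        /* ... rank 0 writes its compressed dataset here ... */
        H5Fclose(f);
    } else {
        /* Block until the previous rank is done with the file. */
        MPI_Recv(&token, 1, MPI_INT, srank - 1, 0, subcomm,
                 MPI_STATUS_IGNORE);
        hid_t f = H5Fopen(fname, H5F_ACC_RDWR, H5P_DEFAULT);
        /* ... this rank writes its compressed dataset here ... */
        H5Fclose(f);
    }
    if (srank + 1 < ssize)
        MPI_Send(&token, 1, MPI_INT, srank + 1, 0, subcomm);

    MPI_Comm_free(&subcomm);
}
```

The serialization within each subcomm is the "penalty" I mentioned; the hope is that with many subcomms compressing and writing concurrently, the overall throughput is still much better than funneling everything through a single root rank.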

If there were a way to do compressed pHDF5, I'd just do a hybrid approach
where each subcomm root node wrote (in parallel) to its HDF5 file. In this
case, I would presume that the computationally expensive compression
algorithms would be parallelized efficiently. Our goal is to reduce the
number of compressed HDF5 files: not all the way to 1 file, but not 1 file
per MPI rank, either. We are not using OpenMP and probably will not be in
the future.

Leigh


-- 
Leigh Orf
Associate Professor of Atmospheric Science
Department of Geology and Meteorology
Central Michigan University
Currently on sabbatical at the National Center for Atmospheric Research
in Boulder, CO
NCAR office phone: (303) 497-8200
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
