Hi, I'm helping a user at NERSC modify an out-of-core matrix calculation code to use HDF5 for temporary storage. Each of his 30 MPI tasks writes to its own file using the MPI-IO VFD in independent mode, with the file opened on the MPI_COMM_SELF communicator. Each task creates about 20,000 datasets and writes anywhere from 4 KB to 32 MB to each one. In I/O profiles, we are seeing a huge spike in <1 KB writes (roughly 100,000 of them). A stripped-down sketch of the per-rank setup follows.
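For reference, here is roughly what each rank is doing. This is a simplified stand-in, not the actual code: the file/dataset names, the fixed double type, and the placeholder dataset size are mine, but the structure (MPI-IO VFD on MPI_COMM_SELF, one H5Dcreate2 + H5Dwrite per block) matches his.

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each task writes its own scratch file through the MPI-IO VFD,
       opened on MPI_COMM_SELF so all I/O is independent. */
    char fname[64];
    snprintf(fname, sizeof(fname), "scratch_%04d.h5", rank);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_SELF, MPI_INFO_NULL);
    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ~20,000 datasets per file; real payloads range from 4 KB to 32 MB. */
    for (int i = 0; i < 20000; i++) {
        char dname[32];
        snprintf(dname, sizeof(dname), "block_%05d", i);

        hsize_t n = 512;  /* placeholder: 512 doubles = 4 KB */
        hid_t space = H5Screate_simple(1, &n, NULL);
        hid_t dset = H5Dcreate2(file, dname, H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        /* H5Dwrite() of the block's data happens here in the real code. */
        H5Dclose(dset);
        H5Sclose(space);
    }

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}

My questions are: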
* Are these small writes associated with dataset metadata?
* Is there a "best practice" for handling this number of datasets? For instance, is it better to pre-allocate the datasets before writing to them?

Thanks,
Mark
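P.S. To clarify the second question, below is the kind of tuning I had in mind: early space allocation so dataset storage is reserved at H5Dcreate2 time, plus a larger metadata block size so HDF5 aggregates small metadata allocations. These are just candidate settings I found in the property-list API; whether they are actually the right approach here is exactly what I'm asking.

/* Candidate tweaks (untested on our workload). */
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(fapl, MPI_COMM_SELF, MPI_INFO_NULL);
H5Pset_meta_block_size(fapl, 1024 * 1024);    /* 1 MB metadata blocks */

hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_alloc_time(dcpl, H5D_ALLOC_TIME_EARLY); /* allocate at create time */

/* ...then pass dcpl as the DCPL argument to each H5Dcreate2() call. */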
