Hi, I'm helping a user at NERSC modify an out-of-core matrix calculation code to use HDF5 for temporary storage. Each of his 30 MPI tasks writes to its own file using the MPI-IO VFD in independent mode, with the file opened on the MPI_COMM_SELF communicator. Each task creates about 20,000 datasets and writes anywhere from 4 KB to 32 MB to each one. In I/O profiles, we are seeing a huge spike in <1 KB writes (roughly 100,000 of them). A stripped-down sketch of the per-rank setup follows.
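For reference, here is roughly what each rank is doing. This is a simplified stand-in, not the actual code: the file/dataset names, the fixed double type, and the placeholder dataset size are mine, but the structure (MPI-IO VFD on MPI_COMM_SELF, one H5Dcreate2 + H5Dwrite per block) matches his.

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each task writes its own scratch file through the MPI-IO VFD,
       opened on MPI_COMM_SELF so all I/O is independent. */
    char fname[64];
    snprintf(fname, sizeof(fname), "scratch_%04d.h5", rank);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_SELF, MPI_INFO_NULL);
    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ~20,000 datasets per file; real payloads range from 4 KB to 32 MB. */
    for (int i = 0; i < 20000; i++) {
        char dname[32];
        snprintf(dname, sizeof(dname), "block_%05d", i);

        hsize_t n = 512;  /* placeholder: 512 doubles = 4 KB */
        hid_t space = H5Screate_simple(1, &n, NULL);
        hid_t dset = H5Dcreate2(file, dname, H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        /* H5Dwrite() of the block's data happens here in the real code. */
        H5Dclose(dset);
        H5Sclose(space);
    }

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}

My questions are: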
* Are these small writes associated with dataset metadata?
* Is there a "best practice" for handling this number of datasets? For instance, is it better to pre-allocate the datasets before writing to them?

Thanks,
Mark
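P.S. To clarify the second question, below is the kind of tuning I had in mind: early space allocation so dataset storage is reserved at H5Dcreate2 time, plus a larger metadata block size so HDF5 aggregates small metadata allocations. These are just candidate settings I found in the property-list API; whether they are actually the right approach here is exactly what I'm asking.

/* Candidate tweaks (untested on our workload). */
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(fapl, MPI_COMM_SELF, MPI_INFO_NULL);
H5Pset_meta_block_size(fapl, 1024 * 1024);    /* 1 MB metadata blocks */

hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_alloc_time(dcpl, H5D_ALLOC_TIME_EARLY); /* allocate at create time */

/* ...then pass dcpl as the DCPL argument to each H5Dcreate2() call. */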
