Hi,
I am writing a very large (572 GB) file with HDF5 on a Lustre
filesystem, using 960 MPI processes spread across 120 nodes.
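In case it matters, the file is created collectively through the MPI-IO
driver, roughly like this (a sketch with placeholder names, not my
actual code):

#include <mpi.h>
#include <hdf5.h>

/* all 960 ranks create the file together through the MPI-IO driver */
hid_t open_parallel_file(const char *path)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    hid_t file_id = H5Fcreate(path, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);
    return file_id;
}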
I am monitoring the I/O on the filesystem while this happens. I see a
very large peak, around 2 GB/s, for roughly 3-4 minutes. My internal
timers (covering creating the dataset, selecting the memory hyperslab,
writing the dataset, and closing the dataset) tell me the write takes
180 s, which corresponds to the peak I see on our Lustre servers.
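The timed section looks roughly like this (a simplified 1-D sketch with
placeholder names; the real dataset and decomposition differ):

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

void write_big_dataset(hid_t file_id, const double *buf,
                       hsize_t global_len, hsize_t local_len,
                       hsize_t local_offset)
{
    double t0 = MPI_Wtime();

    hsize_t gdims[1] = { global_len };
    hsize_t ldims[1] = { local_len };
    hsize_t start[1] = { local_offset };

    /* create the dataset over the full file dataspace */
    hid_t filespace = H5Screate_simple(1, gdims, NULL);
    hid_t dset = H5Dcreate2(file_id, "big_data", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* each rank selects its own hyperslab of the file dataspace */
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, ldims, NULL);
    hid_t memspace = H5Screate_simple(1, ldims, NULL);

    /* collective MPI-IO transfer */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);

    printf("big dataset write: %.1f s\n", MPI_Wtime() - t0);
}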
After writing the big dataset, I write a small "metadata" dataset
containing details of the run. This dataset is very small and holds
various data types, while the big dataset contains only doubles.
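For reference, the metadata step is along these lines (illustrative
names and values only; every rank executes the calls, since dataset
creation is collective in parallel HDF5):

#include <hdf5.h>

void write_metadata(hid_t file_id, int n_steps, double dt)
{
    hid_t grp = H5Gcreate2(file_id, "metadata", H5P_DEFAULT,
                           H5P_DEFAULT, H5P_DEFAULT);
    hid_t scalar = H5Screate(H5S_SCALAR);

    hid_t d_steps = H5Dcreate2(grp, "n_steps", H5T_NATIVE_INT, scalar,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(d_steps, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, &n_steps);
    H5Dclose(d_steps);

    hid_t d_dt = H5Dcreate2(grp, "dt", H5T_NATIVE_DOUBLE, scalar,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(d_dt, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, &dt);
    H5Dclose(d_dt);

    H5Sclose(scalar);
    H5Gclose(grp);
}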
My problem: the H5 file keeps being updated (I am watching its
last-modified date) for another 10-15 minutes after the big dataset is
written.
Is it possible that writing the small dataset at the end takes that
much time, when the big dataset is so quick to write? During those
10-15 minutes after the big dataset is written, I see next to nothing
happening on our Lustre filesystem.
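For what it is worth, I am planning to time the tail of the run like
this, to separate the metadata write from the file close (a sketch;
write_metadata is the routine above):

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

void write_metadata(hid_t file_id, int n_steps, double dt);

void finish_file(hid_t file_id, int n_steps, double dt)
{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t1 = MPI_Wtime();
    write_metadata(file_id, n_steps, dt);  /* the small mixed-type datasets */
    double t2 = MPI_Wtime();
    H5Fclose(file_id);   /* flushes cached HDF5 metadata and closes the MPI file */
    double t3 = MPI_Wtime();

    if (rank == 0)
        printf("metadata write: %.1f s, file close: %.1f s\n",
               t2 - t1, t3 - t2);
}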
Any idea what may be going on?
--
---------------------------------
Maxime Boissonneault
Computing analyst - Calcul Québec, Université Laval
Ph.D. in Physics