Hi Mohamad,
On 2015-01-28 10:32, Mohamad Chaarawi wrote:
Ha... I bet that writeMetaDataDataset is the culprit here.
So you are saying that you create a scalar dataset (with one element), and then
write that same element, at the same time, from all n processes (n being the
number of processes)? Why would you need to do that in the first place? If you
need to write that element, you should just call writeMetaDataDataset from rank
0. If you don't need that float, then you should just not write it at all.
I was under the impression that HDF5 (or MPI-IO) managed under the hood
which process actually wrote the data, and that such a small dataset would
end up being written by only one rank. I actually thought that H5Dwrite and
H5*close *needed* to be called by all processes, i.e. that they were
collective.
I guess that at least H5Fclose is collective, since all processes need
to close the file. Are the others not collective?
You essentially called the metadata dataset an empty dataset, so I understand
that you don't need it? If that is the case, then why not create the attributes
on the root group, on a separate subgroup for the current run, or even on the
large dataset?
I did not know that there was a default root group. So you are saying that I
can simply call
H5LTset_attribute_string(file_id, "/", key, value)
without creating a dataset first?
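Concretely, something like this is what I have in mind (just a sketch on my
side; the attribute names and values are placeholders, it assumes the HDF5
Lite (H5LT) high-level library, and I suppose that in a parallel run these
calls would still have to be made from all ranks since they modify file
metadata):

    /* Sketch: run-level metadata as attributes on the root group ("/"),
     * with no intermediate dataset.  Attribute names and values are
     * placeholders; requires the HDF5 Lite (H5LT) high-level library. */
    #include <hdf5.h>
    #include <hdf5_hl.h>

    static void write_run_metadata(hid_t file_id)
    {
        H5LTset_attribute_string(file_id, "/", "code_version", "1.2.3");
        H5LTset_attribute_string(file_id, "/", "run_id",       "run-042");

        double dt = 0.01;   /* placeholder value */
        H5LTset_attribute_double(file_id, "/", "timestep", &dt, 1);
    }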
I do not attach the metadata to the large dataset, because that metadata is
common to the whole run and there may be more than one large dataset in the
same file.
What you are causing is having every process grab a lock on that file-system
OST block, write that element, and then release the lock. This happens 960
times in your case, which I interpret as the cause of this performance
degradation.
This makes sense. I will test and make changes accordingly.
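If I am reading you correctly, the change on my side would be roughly the
following (a sketch only, assuming the file was opened with an MPI-IO file
access property list; the dataset name and the float are placeholders for
what writeMetaDataDataset actually does):

    /* Sketch of the intended fix: every rank still participates in the
     * (collective) dataset creation and close, but only rank 0 issues the
     * actual write.  With the default transfer property list the write is
     * independent, so the other ranks never touch that OST block. */
    #include <hdf5.h>

    static void write_meta_scalar(hid_t file_id, float value, int mpi_rank)
    {
        hid_t space_id = H5Screate(H5S_SCALAR);   /* one-element dataspace */
        hid_t dset_id  = H5Dcreate2(file_id, "meta_scalar", H5T_NATIVE_FLOAT,
                                    space_id, H5P_DEFAULT, H5P_DEFAULT,
                                    H5P_DEFAULT);

        if (mpi_rank == 0) {
            H5Dwrite(dset_id, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL,
                     H5P_DEFAULT, &value);        /* rank 0 only */
        }

        H5Dclose(dset_id);
        H5Sclose(space_id);
    }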
Maxime