The collective calls to create datasets are required because the metadata needs 
to be consistent across all ranks for the datasets to be created correctly. If 
you write a sample program that violates this rule, you'll see the datasets get 
clobbered when creation isn't coordinated, and you won't get the results you 
expect.
The same goes for groups, attributes, and anything else that affects the 
metadata, hence the list of required collective calls.

Mohamad's solution should work, since the collective calls are only required 
when the file is opened in parallel across multiple ranks. Once the metadata 
for all the datasets has been created and written to the file, and the file 
has been closed, you can reopen the file in parallel and each rank can write 
into it without affecting the other ranks' datasets.
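
For example, here is a minimal sketch of that two-phase pattern (untested; it 
assumes an HDF5 build with MPI support, and the file name "out.h5", the 
dataset names, and the sizes are placeholders, not anything from this thread):

/* Phase 1: rank 0 creates the file and all dataset metadata serially.
 * Phase 2: every rank reopens the file in parallel and writes raw data
 * independently into its own dataset. */
#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>

#define N 1024  /* elements per dataset (placeholder size) */

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Phase 1: one rank creates the file and one dataset per rank. */
    if (rank == 0) {
        hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT,
                               H5P_DEFAULT);
        hsize_t dims[1] = { N };
        hid_t space = H5Screate_simple(1, dims, NULL);
        for (int r = 0; r < nranks; r++) {
            char name[32];
            snprintf(name, sizeof(name), "data_rank_%d", r);
            hid_t dset = H5Dcreate(file, name, H5T_NATIVE_DOUBLE, space,
                                   H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
            H5Dclose(dset);
        }
        H5Sclose(space);
        H5Fclose(file);
    }
    /* Make sure the closed file exists before anyone reopens it. */
    MPI_Barrier(MPI_COMM_WORLD);

    /* Phase 2: all ranks open the file with the MPI-IO file driver and
     * write independently, each into its own dataset. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fopen("out.h5", H5F_ACC_RDWR, fapl);

    double buf[N];
    for (int i = 0; i < N; i++) buf[i] = (double)rank;

    char name[32];
    snprintf(name, sizeof(name), "data_rank_%d", rank);
    hid_t dset = H5Dopen(file, name, H5P_DEFAULT);
    /* Default transfer property list means independent (non-collective)
     * raw-data I/O. */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    H5Dclose(dset);
    H5Pclose(fapl);
    H5Fclose(file);
    MPI_Finalize();
    return 0;
}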

Jarom

From: Hdf-forum [mailto:[email protected]] On Behalf Of 
Mohamad Chaarawi
Sent: Wednesday, August 31, 2016 6:19 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Parallel I/O with HDF5

Only the dataset creation has to be called on all ranks, not the actual 
writing of the array data. So all ranks should call H5Dcreate() for all the 
datasets, but then each rank can write to its own dataset independently, as in 
the sketch below.
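
A minimal sketch of that pattern (untested; assumes an HDF5 build with MPI 
support, and the file name, dataset names, and sizes are placeholders):

#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* elements per dataset (placeholder size) */

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* All ranks open one shared file with the MPI-IO file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Collective part: ALL ranks create ALL datasets with identical
     * arguments, so the metadata stays consistent on every rank. */
    hsize_t dims[1] = { N };
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t *dsets = malloc(nranks * sizeof(hid_t));
    for (int r = 0; r < nranks; r++) {
        char name[32];
        snprintf(name, sizeof(name), "data_rank_%d", r);
        dsets[r] = H5Dcreate(file, name, H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    }

    /* Independent part: each rank writes raw data only to its own
     * dataset, using the default (independent) transfer property list. */
    double buf[N];
    for (int i = 0; i < N; i++) buf[i] = (double)rank;
    H5Dwrite(dsets[rank], H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
             H5P_DEFAULT, buf);

    for (int r = 0; r < nranks; r++) H5Dclose(dsets[r]);
    free(dsets);
    H5Sclose(space);
    H5Pclose(fapl);
    H5Fclose(file);
    MPI_Finalize();
    return 0;
}

Note that the H5Dcreate() loop runs with identical arguments on every rank, 
which is what keeps the metadata consistent; only the H5Dwrite() call differs 
per rank.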

Alternatively, you can have one rank create the entire file serially and close 
it, and then have all ranks open the file and write the raw data in parallel.

Thanks,
Mohamad

From: Hdf-forum <[email protected]> on behalf of jaber 
javanshir <[email protected]>
Reply-To: hdf-forum <[email protected]>
Date: Tuesday, August 30, 2016 at 4:21 PM
To: hdf-forum <[email protected]>
Subject: [Hdf-forum] Parallel I/O with HDF5

Hi All,

Hope all is well.

I am trying to use HDF5's parallel features for extreme-scale computing.
I would like each processor to write out a separate dataset.
This question is actually addressed on the HDF5 website: due to the collective 
call requirement, every processor has to make the same dataset calls.

https://www.hdfgroup.org/HDF5/faq/parallel.html
("How do you write to a single file in parallel in which different processes 
write to separate datasets?")

Please advise on this matter.

The answer there is not satisfying for extreme-scale computing, where hundreds 
of thousands of cores are involved.

Is there a better way of overcoming this issue?

Your advice on this issue would be greatly appreciated.

Thanks

Dr J
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5