The collective calls to create datasets are required because the metadata needs to be consistent across all ranks for the datasets to be created correctly. If you write a sample program that violates this rule, you'll see the datasets get clobbered when the calls are not coordinated, and you won't get the results you expect. The same applies to groups, attributes, and anything else that modifies metadata, hence the list of required collective calls.
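For illustration, here is a minimal sketch of that coordinated pattern (not from the thread), assuming parallel HDF5 1.8+ with the MPI-IO driver; the file name, per-rank dataset naming scheme, and sizes are made up for the example:

    /* Every rank participates in the collective metadata calls,
     * but each rank writes raw data only to its own dataset. */
    #include <hdf5.h>
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        /* All ranks open the same file through the MPI-IO driver. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC,
                               H5P_DEFAULT, fapl);

        /* Collective metadata: every rank creates every dataset, in
         * the same order with the same parameters, even the ones it
         * will never write. */
        hsize_t dims[1] = {1024};
        hid_t space = H5Screate_simple(1, dims, NULL);
        hid_t *dsets = malloc(nranks * sizeof *dsets);
        for (int r = 0; r < nranks; r++) {
            char name[32];
            snprintf(name, sizeof name, "data_rank_%d", r);
            dsets[r] = H5Dcreate(file, name, H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        }

        /* Independent raw-data write: each rank fills only its own
         * dataset; no metadata is modified here. */
        double buf[1024];
        for (int i = 0; i < 1024; i++) buf[i] = (double)rank;
        H5Dwrite(dsets[rank], H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                 H5P_DEFAULT, buf);

        /* Close everything collectively, mirroring the creation order. */
        for (int r = 0; r < nranks; r++) H5Dclose(dsets[r]);
        free(dsets);
        H5Sclose(space);
        H5Pclose(fapl);
        H5Fclose(file);
        MPI_Finalize();
        return 0;
    }

The create loop runs in the same order on every rank, which is what keeps the shared metadata consistent.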
Mohamad's solution should work, since collective calls are only required when the file is opened in parallel across multiple ranks. Once the metadata for all the datasets has been created and written to the file, and the file has been closed, you can reopen the file in parallel and each rank can write into its own dataset without affecting the datasets of the other ranks.

Jarom

From: Hdf-forum [mailto:[email protected]] On Behalf Of Mohamad Chaarawi
Sent: Wednesday, August 31, 2016 6:19 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Parallel I/O with HDF5

The dataset creation has to be called on all ranks, not the actual writing of the array data. So all ranks should call H5Dcreate() for all the datasets, but then each rank can write to its corresponding dataset. Alternatively, you can have 1 rank create the entire file serially, then close the file, then all other ranks open and write the raw data in parallel.

Thanks,
Mohamad

From: Hdf-forum <[email protected]> on behalf of jaber javanshir <[email protected]>
Reply-To: hdf-forum <[email protected]>
Date: Tuesday, August 30, 2016 at 4:21 PM
To: hdf-forum <[email protected]>
Subject: [Hdf-forum] Parallel I/O with HDF5

Hi All,

Hope all is well. I am trying to use the HDF5 parallel feature for extreme-scale computing. I would like each processor to write out a separate dataset. This question is actually addressed on the HDF5 website; because of the collective-call requirement, every processor has to participate in creating every dataset.

https://www.hdfgroup.org/HDF5/faq/parallel.html
"How do you write to a single file in parallel in which different processes write to separate datasets?"

Please advise on this matter. The answer is not satisfying for extreme-scale computing where hundreds of thousands of cores are involved. Is there a better way of overcoming this issue? Your advice on this issue is greatly appreciated.

Thanks,
Dr J
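For completeness, here is a minimal sketch of the alternative Mohamad describes, under the same illustrative assumptions as the sketch above (file name, dataset names, and sizes are made up): rank 0 builds all of the metadata serially and closes the file, then every rank reopens it in parallel and writes only raw data.

    /* Phase 1: rank 0 alone creates the file and every dataset.
     * Phase 2: all ranks reopen in parallel and write independently. */
    #include <hdf5.h>
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        hsize_t dims[1] = {1024};

        /* Phase 1: serial metadata creation on rank 0 only. */
        if (rank == 0) {
            hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC,
                                   H5P_DEFAULT, H5P_DEFAULT);
            hid_t space = H5Screate_simple(1, dims, NULL);
            for (int r = 0; r < nranks; r++) {
                char name[32];
                snprintf(name, sizeof name, "data_rank_%d", r);
                hid_t d = H5Dcreate(file, name, H5T_NATIVE_DOUBLE, space,
                                    H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
                H5Dclose(d);
            }
            H5Sclose(space);
            H5Fclose(file);  /* must be fully closed before phase 2 */
        }
        MPI_Barrier(MPI_COMM_WORLD);  /* everyone waits for the metadata */

        /* Phase 2: all ranks reopen through the MPI-IO driver. No
         * metadata changes happen from here on, so each rank can
         * write its own dataset independently. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fopen("output.h5", H5F_ACC_RDWR, fapl);

        char name[32];
        snprintf(name, sizeof name, "data_rank_%d", rank);
        hid_t dset = H5Dopen(file, name, H5P_DEFAULT);

        double buf[1024];
        for (int i = 0; i < 1024; i++) buf[i] = (double)rank;
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                 H5P_DEFAULT, buf);

        H5Dclose(dset);
        H5Pclose(fapl);
        H5Fclose(file);
        MPI_Finalize();
        return 0;
    }

The barrier between the two phases is what guarantees the file's metadata is complete on disk before any rank reopens it.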
