Hi Markus,
On Apr 26, 2012, at 6:55 AM, Markus Bina wrote:
> Hi Quincey,
>
> Thanks for your help!
> I upgraded to HDF5 1.8.8 (src) as you suggested.
> Now H5Pget_mpio_actual_io_mode() is called right after H5Dwrite() and
> returns 0 (= H5D_MPIO_NO_COLLECTIVE; the mode is returned in the second argument).
Ah, OK, for some reason, whatever you are doing is forcing the library
into using independent I/O.
> I do not understand why H5D_MPIO_NO_COLLECTIVE is returned.
> It is true that the data written by each process is non-contiguous; can this
> be the cause?
We've got plans to implement another API routine that would answer
this question, but for now you'll have to add some printf() calls in the
H5D_mpio_opt_possible routine (in src/H5Dmpio.c) to see which of the conditions
that can break collective I/O is getting triggered when your application
calls H5Dwrite(). Let me know which condition is triggered and we can talk
about the underlying reason.
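To make that concrete, here is a minimal sketch of the kind of instrumentation I
have in mind: a small helper you could paste into src/H5Dmpio.c and call from each
condition that disables collective I/O. The helper is only plain MPI and stdio;
the condition shown in the trailing comment is a placeholder, the real checks are
the ones you will find in H5D_mpio_opt_possible().
##################
/* A minimal sketch, not part of HDF5: a rank-tagged diagnostic that can be
 * called from every branch in H5D_mpio_opt_possible() that turns off
 * collective I/O.  "reason" is whatever label you pick for that branch. */
#include <mpi.h>
#include <stdio.h>

static void
debug_no_collective(const char *reason)
{
    int initialized = 0;
    int mpi_rank = -1;

    /* Guard the rank query so the helper is safe even outside MPI runs */
    MPI_Initialized(&initialized);
    if (initialized)
        MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);

    fprintf(stderr, "[rank %d] collective I/O disabled: %s\n", mpi_rank, reason);
}

/* Example use inside H5D_mpio_opt_possible(); the condition below is a
 * placeholder, not the library's actual code:
 *
 *     if ( <condition that breaks collective I/O> ) {
 *         debug_no_collective("datatype conversion needed");
 *         ...
 *     }
 */
##################
Run a small test case with that in place and the stderr lines should tell us
which check each rank is tripping over.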
Quincey
> Best Regards,
>
> Markus
>
> On 2012-04-25 13:23, Quincey Koziol wrote:
>> Hi Markus,
>> Seems like it should be working, but can you upgrade to 1.8.8 (or the
>> 1.8.9 prerelease) and use the H5Pget_mpio_actual_io_mode() routine to see if
>> collective I/O is occurring? Further actions will depend on the results
>> from that routine...
>>
>> Quincey
>>
>> On Apr 24, 2012, at 8:41 AM, Markus Bina wrote:
>>
>>> Dear HDF Forum users,
>>>
>>> A few weeks ago I started using HDF5 1.8.2 and hyperslab selections in my
>>> program to write distributed data to a single output file.
>>>
>>> The data is an Nx3 matrix (N is known at runtime).
>>>
>>> What I am doing (see code below):
>>> I open a group and create one dataset for all the data. Each process
>>> selects via hyperslabs where (at which positions) it will write its data.
>>> The hyperslabs selected by two processes may overlap, and the overlapping
>>> parts contain exactly the same data.
>>> After that I do a collective write call. It is this call that consumes
>>> a lot of time (approx. 10 min for N=70e3).
>>> Making the hyperslabs non-overlapping beforehand (which takes less than a
>>> second) does not change the execution time significantly.
>>>
>>> The program is run with mpirun and ~100 processes on a cluster where GPFS
>>> is available and enabled, but it may also be run on other (non-parallel)
>>> file systems.
>>>
>>> Does anyone have a hint for me?
>>> Maybe I made a simple rookie mistake that I just can't find.
>>>
>>> I would not be writing this post if I had found the answer in the manuals,
>>> tutorials, or via Google.
>>>
>>> Thanks!
>>>
>>> Best Regards,
>>> Markus
>>>
>>> Code-snippet:
>>> ##################
>>> group_id = H5Gopen(file_id, group_name.c_str(), H5P_DEFAULT);
>>>
>>> dataspace_id = H5Screate_simple(2, dims, NULL);
>>> dataset_id = H5Dcreate(group_id, HDFNamingPolicy::coord_data_name.c_str(),
>>>                        H5T_NATIVE_DOUBLE, dataspace_id,
>>>                        H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
>>> status = H5Sclose(dataspace_id);
>>>
>>> hid_t filespace = H5Dget_space(dataset_id);
>>>
>>> slab_dims[0] = N; slab_dims[1] = 3;
>>> dataspace_id = H5Screate_simple(2, slab_dims, NULL);
>>>
>>> slab_dims[0] = 1;
>>> for (int j = 0; j < N; j++)
>>> {
>>>     offset[0] = this->_coord_map[j]; // figure out where to write the slab to
>>>     // write the row as hyperslab to file with partially overlapping data
>>>     // (the overlapping portions contain the same numbers/data)
>>>     if (j == 0)
>>>         status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset,
>>>                                      NULL, slab_dims, NULL);
>>>     else
>>>         status = H5Sselect_hyperslab(filespace, H5S_SELECT_OR, offset,
>>>                                      NULL, slab_dims, NULL);
>>> }
>>> hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
>>> H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);
>>>
>>> // This is the slow part:
>>> status = H5Dwrite(dataset_id, H5T_NATIVE_DOUBLE, dataspace_id,
>>>                   filespace, plist_id, data);
>>>
>>> status = H5Pclose(plist_id);
>>> status = H5Sclose(dataspace_id);
>>>
>>> H5Sclose(filespace);
>>> status = H5Dclose(dataset_id);
>>> status = H5Gclose(group_id);
>>> #####################
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org