Hi Markus,

On Apr 26, 2012, at 6:55 AM, Markus Bina wrote:

> Hi Quincey,
> 
> Thanks for your help!
> I upgraded to HDF5 1.8.8 (built from source) as you suggested.
> Now H5Pget_mpio_actual_io_mode() is called right after H5Dwrite() and
> reports 0 (= H5D_MPIO_NO_COLLECTIVE; the mode is returned through the second argument).

        Ah, OK, so for some reason, something your application is doing is forcing 
the library to fall back to independent I/O.
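
        For reference, here is a minimal sketch of that check as a standalone 
helper; "dxpl_id" is just a placeholder for the transfer property list you passed 
to H5Dwrite():

##################
#include <hdf5.h>
#include <stdio.h>

/* Query which I/O mode the library actually used for the last H5Dwrite()
 * issued with this transfer property list. */
static void report_actual_io_mode(hid_t dxpl_id)
{
    H5D_mpio_actual_io_mode_t mode;

    if (H5Pget_mpio_actual_io_mode(dxpl_id, &mode) < 0) {
        fprintf(stderr, "H5Pget_mpio_actual_io_mode() failed\n");
        return;
    }

    if (mode == H5D_MPIO_NO_COLLECTIVE)
        printf("collective I/O was requested but not performed\n");
    else
        printf("actual I/O mode = %d\n", (int)mode);
}
##################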

> I do not understand why H5D_MPIO_NO_COLLECTIVE is returned.
> It is true that the data written by each process is non-contiguous; can this be 
> the cause?

        We've got plans to implement another API routine that would answer 
this question, but for now you'll have to add some printf() calls to the 
H5D_mpio_opt_possible routine (in src/H5Dmpio.c) to see which of the conditions 
that break collective I/O is being triggered when your application calls 
H5Dwrite().  Let me know which condition gets triggered and we can talk about 
the overall reason.
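
        In case it saves some typing, here is a rough template for those printf() 
calls; the exact conditions checked inside H5D_mpio_opt_possible (and their order) 
vary between releases, so treat this purely as a sketch to paste near the top of 
src/H5Dmpio.c and invoke at each spot where the routine rules out collective I/O:

##################
#include <stdio.h>
#include <mpi.h>

/* Diagnostic template: call REPORT_NO_COLLECTIVE("some reason") at each
 * point in H5D_mpio_opt_possible() that rejects collective I/O. */
#define REPORT_NO_COLLECTIVE(reason)                                        \
    do {                                                                    \
        int dbg_rank_ = -1;                                                 \
        MPI_Comm_rank(MPI_COMM_WORLD, &dbg_rank_);                          \
        fprintf(stderr, "rank %d: collective I/O ruled out: %s (%s:%d)\n",  \
                dbg_rank_, (reason), __FILE__, __LINE__);                   \
    } while (0)
##################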

        Quincey

> Best Regards,
> 
>   Markus
> 
> On 2012-04-25 13:23, Quincey Koziol wrote:
>> Hi Markus,
>>      Seems like it should be working, but can you upgrade to 1.8.8 (or the 
>> 1.8.9 prerelease) and use the H5Pget_mpio_actual_io_mode() routine to see if 
>> collective I/O is occurring?  Further actions will depend on the results 
>> from that routine...
>> 
>>      Quincey
>> 
>> On Apr 24, 2012, at 8:41 AM, Markus Bina wrote:
>> 
>>> Dear HDF Forum users,
>>> 
>>> A few weeks ago I started using HDF5 1.8.2 and hyperslabs in my program to 
>>> write distributed data to a single output file.
>>> 
>>> The data is an Nx3 matrix (N is known at runtime).
>>> 
>>> What I am doing (see the code below):
>>>   I open a group and create one dataset for all the data. Each process 
>>> selects, via hyperslabs, the positions in the dataset it will write to.
>>>   The hyperslabs selected by two processes may contain overlapping portions 
>>> (the overlapping parts hold exactly the same data).
>>>   After that I issue a collective write call. It is this call that consumes 
>>> a lot of time (approx. 10 min for N = 70e3).
>>> Making the hyperslabs non-overlapping beforehand (which takes less than a 
>>> second) does not change the execution time significantly.
>>> 
>>> The program is executed on a cluster with GPFS available and enabled, 
>>> launched with mpirun and ~100 processes,
>>> but it may also be run on other (non-parallel) file systems.
>>> 
>>> Does anyone have a hint for me?
>>> Maybe I made a simple rookie mistake that I just can't find.
>>> 
>>> I would not be writing this post if I had found the answer in the manuals, 
>>> tutorials or via Google.
>>> 
>>> Thanks!
>>> 
>>> Best Regards,
>>>    Markus
>>> 
>>> Code-snippet:
>>> ##################
>>>    group_id = H5Gopen(file_id, group_name.c_str(), H5P_DEFAULT);
>>> 
>>>    dataspace_id = H5Screate_simple(2, dims, NULL);
>>>    dataset_id = H5Dcreate(group_id, HDFNamingPolicy::coord_data_name.c_str(),
>>>                           H5T_NATIVE_DOUBLE, dataspace_id,
>>>                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
>>>    status = H5Sclose(dataspace_id);
>>> 
>>>    hid_t filespace = H5Dget_space(dataset_id);
>>> 
>>>    slab_dims[0] = N; slab_dims[1] = 3;
>>>    dataspace_id = H5Screate_simple(2, slab_dims, NULL);
>>> 
>>>    slab_dims[0] = 1;
>>>    for (int j = 0; j < N; j++)
>>>    {
>>>      offset[0] = this->_coord_map[j]; // figure out where to write the slab to
>>>      // write the row as hyperslab to file, with partially overlapping data
>>>      // (the overlapping portions contain the same numbers/data)
>>>      if (j == 0) status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, slab_dims, NULL);
>>>      else        status = H5Sselect_hyperslab(filespace, H5S_SELECT_OR, offset, NULL, slab_dims, NULL);
>>>    }
>>> 
>>>    hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
>>>    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);
>>> 
>>>    // This is the slow part:
>>>    status = H5Dwrite(dataset_id, H5T_NATIVE_DOUBLE, dataspace_id, filespace, plist_id, data);
>>> 
>>>    status = H5Pclose(plist_id);
>>>    status = H5Sclose(dataspace_id);
>>> 
>>>    H5Sclose(filespace);
>>>    status = H5Dclose(dataset_id);
>>>    status = H5Gclose(group_id);
>>> #####################
>>> 
>>> 
>> 
> 
> 
> 


_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
