Dear HDF Forum users,

A few weeks ago I started using HDF5 1.8.2 and hyperslabs in my program to write distributed data to a single output file.

The data is an Nx3 matrix (N is known at runtime).

What I am doing (see code below):
I open a group and create a dataset for all the data. Each process then selects via hyperslabs where to write its data (position). The hyperslabs selected by two processes contain overlapping portions (with exactly the same data in the overlapping parts). After that I do a collective write call. It is this call that consumes a lot of time (approx. 10 min for N=70e3). Making the hyperslabs non-overlapping beforehand (which takes less than a second) does not change the execution time significantly.
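The selection described above adds one row per `H5Sselect_hyperslab` call. A possible variation, sketched here under assumptions rather than offered as a known fix, is to merge adjacent row indices into contiguous runs first, so that each run needs only one selection call. `coalesce_rows` is a hypothetical helper, not part of HDF5, and the row indices are assumed to be available as a plain vector. Whether this changes the write time in this setup is untested.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of HDF5): collapse row indices into
// contiguous runs, returned as (first_row, row_count) pairs.
// Duplicate indices are dropped; the input order is not preserved.
std::vector<std::pair<std::size_t, std::size_t>>
coalesce_rows(std::vector<std::size_t> rows)
{
    std::sort(rows.begin(), rows.end());
    rows.erase(std::unique(rows.begin(), rows.end()), rows.end());

    std::vector<std::pair<std::size_t, std::size_t>> runs;
    for (std::size_t row : rows) {
        if (!runs.empty() && runs.back().first + runs.back().second == row)
            ++runs.back().second;       // row extends the current run
        else
            runs.emplace_back(row, 1);  // row starts a new run
    }
    return runs;
}
```

Each pair could then supply `offset[0]` and `slab_dims[0]` for a single `H5Sselect_hyperslab` call. Because the helper sorts the indices, the local data buffer would have to be ordered by ascending row for the memory and file selections to line up.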

The program is run with mpirun on ~100 processes on a cluster that has GPFS available and enabled,
but it might also be run on other (non-parallel) file systems.

Does anyone have a hint for me?
Maybe I made a simple rookie mistake, which I just can't find.

I would not write this post if I had found the answer in manuals, tutorials or via Google.

Thanks!

Best Regards,
    Markus

Code-snippet:
##################
    group_id = H5Gopen(file_id, group_name.c_str(), H5P_DEFAULT);

    dataspace_id = H5Screate_simple(2, dims, NULL);
    dataset_id = H5Dcreate(group_id, HDFNamingPolicy::coord_data_name.c_str(), H5T_NATIVE_DOUBLE, dataspace_id, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    status = H5Sclose(dataspace_id);

    hid_t filespace = H5Dget_space(dataset_id);

    slab_dims[0] = N; slab_dims[1] = 3;
    dataspace_id = H5Screate_simple(2, slab_dims, NULL);

    slab_dims[0] = 1 ;
    for (int j = 0; j < N; j++)
    {
      offset[0] = this->_coord_map[j]; // figure out where to write the slab to
      // write the row as hyperslab to file
      // with partially overlapping data (the overlapping portions contain the same numbers/data).
      if (j==0)
        status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, slab_dims, NULL);
      else
        status = H5Sselect_hyperslab(filespace, H5S_SELECT_OR, offset, NULL, slab_dims, NULL);
    }
    hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

    // This is the slow part:
    status = H5Dwrite(dataset_id, H5T_NATIVE_DOUBLE, dataspace_id, filespace, plist_id, data);

    status = H5Pclose(plist_id);
    status = H5Sclose(dataspace_id);

    H5Sclose(filespace);
    status = H5Dclose(dataset_id);
    status = H5Gclose(group_id);
#####################
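
One detail of the snippet above that may matter: as far as the HDF5 documentation describes it, a hyperslab selection is traversed in row-major order, not in the order in which its pieces were OR-ed together, so rows of the memory buffer only land on the intended file rows if `_coord_map` is ascending. A point selection, by contrast, keeps the order in which the points are listed. The sketch below builds such a point list. `build_point_coords` is a hypothetical helper, and `unsigned long long` stands in for `hsize_t` so the block compiles without HDF5. How point selections behave with collective MPI-IO in HDF5 1.8.2 is not something this sketch establishes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HDF5): build a flat list of
// (row, column) coordinates, one pair per element, in the same
// order as the rows appear in coord_map (and thus in the local buffer).
std::vector<unsigned long long>
build_point_coords(const std::vector<unsigned long long>& coord_map,
                   unsigned long long ncols)
{
    std::vector<unsigned long long> coords;
    coords.reserve(coord_map.size() * ncols * 2);
    for (unsigned long long row : coord_map) {
        for (unsigned long long col = 0; col < ncols; ++col) {
            coords.push_back(row);  // first coordinate: file row
            coords.push_back(col);  // second coordinate: file column
        }
    }
    return coords;
}
```

The result could be handed to `H5Sselect_elements(filespace, H5S_SELECT_SET, coords.size() / 2, ...)` in place of the hyperslab loop, assuming the element type matches `hsize_t` on the target platform.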

