Hi Leigh,

Yes, using sub-communicators is another way to break up your writes. I meant that in the case where you want only a subset of your (sub)communicator to write, you can use empty selections. But the solution you are proposing with sub-communicators is a much better fit for your particular I/O pattern, and it will not require empty selections at all.
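In case a concrete example helps, here is a minimal C sketch of the per-node layout you describe. It is only a sketch under assumptions I made up for illustration (32 ranks per node, a file name pattern like dump_node%04d.h5, a 1-D dataset with NX values per rank); none of it comes from your actual code, so adjust freely:

#include <stdio.h>
#include <mpi.h>
#include <hdf5.h>

#define RANKS_PER_NODE 32   /* assumption: 32 MPI ranks per SMP node  */
#define NX 64               /* assumption: elements written per rank  */

int main(int argc, char *argv[])
{
    int world_rank, node_id, node_rank, i;
    MPI_Comm node_comm;
    char fname[64];
    float data[NX];
    hsize_t dims[1], count[1], start[1];
    hid_t fapl, file, filespace, memspace, dset, dxpl;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Group the ranks of each SMP node into their own communicator. */
    node_id = world_rank / RANKS_PER_NODE;
    MPI_Comm_split(MPI_COMM_WORLD, node_id, world_rank, &node_comm);
    MPI_Comm_rank(node_comm, &node_rank);

    for (i = 0; i < NX; i++)
        data[i] = (float)world_rank;

    /* One file per node: the file access property list gets the
       sub-communicator, not MPI_COMM_WORLD. */
    snprintf(fname, sizeof(fname), "dump_node%04d.h5", node_id);
    fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, node_comm, MPI_INFO_NULL);
    file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* The dataset holds NX values from each of the node's ranks. */
    dims[0] = (hsize_t)RANKS_PER_NODE * NX;
    filespace = H5Screate_simple(1, dims, NULL);
    dset = H5Dcreate(file, "somedata", H5T_NATIVE_FLOAT, filespace,
                     H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank writes its own contiguous slab of the dataset. */
    count[0] = NX;
    start[0] = (hsize_t)node_rank * NX;
    memspace = H5Screate_simple(1, count, NULL);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);

    /* Collective transfer: every rank in node_comm must call H5Dwrite.
       A rank with nothing to write would call H5Sselect_none() on both
       memspace and filespace rather than skipping the call. */
    dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_FLOAT, memspace, filespace, dxpl, data);

    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);
    H5Pclose(fapl);
    H5Fclose(file);
    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}

The key point is that "collective" means collective over the communicator you handed to H5Pset_fapl_mpio, so only the 32 ranks sharing a file need to enter H5Dwrite together; ranks on other nodes write their own files without any coordination with this one.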
Mark

On Tue, Jan 18, 2011 at 12:27 PM, Leigh Orf <[email protected]> wrote:
>
> On Tue, Jan 18, 2011 at 5:15 AM, Mark Howison <[email protected]> wrote:
>
>> Hi Leigh,
>>
>> Yes, it is only a small difference in code between collective and
>> independent mode for the MPI-IO VFD. To enable collective I/O, you pass a
>> dataset transfer property list to H5Dwrite like this:
>>
>> dxpl_id = H5Pcreate(H5P_DATASET_XFER);
>> H5Pset_dxpl_mpio(dxpl_id, H5FD_MPIO_COLLECTIVE);
>>
>> H5Dwrite(dset_id, H5T_NATIVE_FLOAT, memspace, filespace, dxpl_id, somedata0);
>>
>> One additional constraint with collective I/O, though, is that all MPI
>> tasks must call H5Dwrite. If not, your program will stall in a barrier. In
>> contrast, with independent I/O you can execute writes with no coordination
>> among MPI tasks.
>>
>> If you do want only a subset of MPI tasks to write in collective mode, you
>> can pass an empty selection to H5Dwrite for the non-writing tasks.
>
> Concerning your second comment: Let's say I wish to have each 32-core SMP
> chip write one HDF5 file, and that each core on an SMP chip is an MPI
> rank/process. I create a new MPI communicator consisting of the 32 ranks on
> each SMP chip. Could I not just pass that communicator as the second
> argument to h5pset_fapl_mpio and do collective communication within that
> communicator? I don't think we want our really big simulations to create
> only one HDF5 file per model dump, so hopefully pHDF5 does not require all
> collective operations to be on MPI_COMM_WORLD.
>
> Leigh
>
>> Mark
>>
>> On Tue, Jan 18, 2011 at 12:45 AM, Leigh Orf <[email protected]> wrote:
>>
>>> Elena,
>>>
>>> That is good news; indeed this was with 1.8.5-patch1.
>>>
>>> Is code written using independent I/O structured significantly
>>> differently from code using collective I/O? I would like to get moving
>>> with pHDF5, and as I am currently not too familiar with it, I want to
>>> make sure that I am not going to have to do a rewrite after the
>>> collective code works. It does seem to all occur behind the scenes with
>>> the h5dwrite command, so I presume I am safe.
>>>
>>> Thanks,
>>>
>>> Leigh
>>>
>>> On Mon, Jan 17, 2011 at 4:59 PM, Elena Pourmal <[email protected]> wrote:
>>>
>>>> Leigh,
>>>>
>>>> I am writing to confirm that the bug you reported does exist in
>>>> 1.8.5-patch1, but is fixed in 1.8.6 (coming soon).
>>>>
>>>> Elena
>>>>
>>>> On Jan 16, 2011, at 3:47 PM, Leigh Orf wrote:
>>>>
>>>> I managed to build pHDF5 on blueprint.ncsa.uiuc.edu (IBM AIX Power 6).
>>>> I compiled the hyperslab_by_chunk.f90 test program found at
>>>> http://www.hdfgroup.org/HDF5/Tutor/phypechk.html without error. When I
>>>> run it, however, I get the following output:
>>>>
>>>> ATTENTION: 0031-408 4 tasks allocated by LoadLeveler, continuing...
>>>> ERROR: 0032-110 Attempt to free a predefined datatype (2) in MPI_Type_free, task 0
>>>> ERROR: 0032-110 Attempt to free a predefined datatype (2) in MPI_Type_free, task 1
>>>> ERROR: 0032-110 Attempt to free a predefined datatype (2) in MPI_Type_free, task 2
>>>> ERROR: 0032-110 Attempt to free a predefined datatype (2) in MPI_Type_free, task 3
>>>> HDF5: infinite loop closing library
>>>> D,S,T,D,S,F,D,G,S,T,F,AC,FD,P,FD,P,FD,P,E,E,SL,FL,FL,FL,FL,FL,...
>>>> HDF5: infinite loop closing library
>>>>
>>>> The line which causes the grief is:
>>>>
>>>> CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, dimsfi, error, &
>>>>      file_space_id = filespace, mem_space_id = memspace, xfer_prp = plist_id)
>>>>
>>>> If I replace that call with the one that is commented out in the
>>>> program, it runs without a problem. That line is:
>>>>
>>>> CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, dimsfi, error, &
>>>>      file_space_id = filespace, mem_space_id = memspace)
>>>>
>>>> Any ideas? I definitely want to take advantage of doing collective I/O
>>>> if possible.
>>>>
>>>> Leigh
>>>>
>>>> --
>>>> Leigh Orf
>>>> Associate Professor of Atmospheric Science
>>>> Department of Geology and Meteorology
>>>> Central Michigan University
>>>> Currently on sabbatical at the National Center for Atmospheric Research
>>>> in Boulder, CO
>>>> NCAR office phone: (303) 497-8200
