And that "neither simple nor scalar" error comes from this logic up in PETSc:

  if (n > 0) {
    PetscStackCallHDF5Return(memspace,H5Screate_simple,(dim, count, NULL));
  } else {
    /* Can't create dataspace with zero for any dimension, so create null dataspace. */
    PetscStackCallHDF5Return(memspace,H5Screate,(H5S_NULL));
  }

where n is, I think, the number of elements in this rank's slice of the
data.  There is a corresponding branch later in the code:

  if (n > 0) {
    PetscStackCallHDF5Return(filespace,H5Dget_space,(dset_id));
    PetscStackCallHDF5(H5Sselect_hyperslab,(filespace, H5S_SELECT_SET, offset, NULL, count, NULL));
  } else {
    /* Create null filespace to match null memspace. */
    PetscStackCallHDF5Return(filespace,H5Screate,(H5S_NULL));
  }

It seems clear that PETSc's use of H5S_NULL dataspaces is what trips
HDF5's "neither simple nor scalar" check, but I'm not sure how to fix it
if the comment (that a dataspace can't be created with zero for any
dimension) is right.  Advice?
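
One idea (a sketch only, untested; the helper and its name are mine, not
PETSc's) is to keep both dataspaces simple on every rank and give the
empty ranks an empty selection via H5Sselect_none(), so the filtered
collective path never sees an H5S_NULL dataspace:

  #include <hdf5.h>

  /* Sketch: ranks with no local elements still pass simple dataspaces,
     just with zero points selected, instead of H5S_NULL. */
  static herr_t build_spaces(hid_t dset_id, int dim, const hsize_t *count,
                             const hsize_t *offset, hsize_t n,
                             hid_t *memspace, hid_t *filespace)
  {
    *filespace = H5Dget_space(dset_id);
    if (n > 0) {
      *memspace = H5Screate_simple(dim, count, NULL);
      if (H5Sselect_hyperslab(*filespace, H5S_SELECT_SET, offset, NULL,
                              count, NULL) < 0) return -1;
    } else {
      hsize_t one = 1;
      /* Simple (not null) memspace with nothing selected. */
      *memspace = H5Screate_simple(1, &one, NULL);
      if (H5Sselect_none(*memspace) < 0) return -1;
      /* Matching empty selection in the file dataspace. */
      if (H5Sselect_none(*filespace) < 0) return -1;
    }
    return 0;
  }

But I don't know whether the parallel compression code is prepared for
ranks that participate with empty selections.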


On Thu, Nov 9, 2017 at 7:43 AM, Michael K. Edwards
<m.k.edwa...@gmail.com> wrote:
> Replacing Intel's build of MVAPICH2 2.2 with a fresh build of MVAPICH2
> 2.3b got me farther along.  The comm mismatch does not seem to be a
> problem.  I am guessing that the root cause was whatever bug is listed
> in 
> http://mvapich.cse.ohio-state.edu/static/media/mvapich/MV2_CHANGELOG-2.3b.txt
> as:
>
>     - Fix hang in MPI_Probe
>         - Thanks to John Westlund@Intel for the report
>
> I fixed the H5D__cmp_filtered_collective_io_info_entry_owner
> comparator, and now I'm back to fixing things about my patch to PETSc.
> I seem to be trying to filter a dataset that I shouldn't be.
>
> HDF5-DIAG: Error detected in HDF5 (1.11.0) MPI-process 0:
>   #000: H5Dio.c line 319 in H5Dwrite(): can't prepare for writing data
>     major: Dataset
>     minor: Write failed
>   #001: H5Dio.c line 395 in H5D__pre_write(): can't write data
>     major: Dataset
>     minor: Write failed
>   #002: H5Dio.c line 831 in H5D__write(): unable to adjust I/O info for parallel I/O
>     major: Dataset
>     minor: Unable to initialize object
>   #003: H5Dio.c line 1264 in H5D__ioinfo_adjust(): Can't perform independent write with filters in pipeline.
>     The following caused a break from collective I/O:
>         Local causes:
>         Global causes: one of the dataspaces was neither simple nor scalar
>     major: Low-level I/O
>     minor: Can't perform independent IO
>
>
> On Wed, Nov 8, 2017 at 11:37 PM, Michael K. Edwards
> <m.k.edwa...@gmail.com> wrote:
>> Oddly enough, it is not the tag that is mismatched between receiver
>> and senders; it is io_info->comm.  Something is decidedly out of whack
>> here.
>>
>> Rank 0, owner 0 probing with tag 0 on comm -1006632942
>> Rank 2, owner 0 sent with tag 0 to comm -1006632952 as request 0
>> Rank 3, owner 0 sent with tag 0 to comm -1006632952 as request 0
>> Rank 1, owner 0 sent with tag 0 to comm -1006632952 as request 0
>>
>>
>> On Wed, Nov 8, 2017 at 2:51 PM, Michael K. Edwards
>> <m.k.edwa...@gmail.com> wrote:
>>>
>>> I see that you're re-sorting by owner using a comparator called
>>> H5D__cmp_filtered_collective_io_info_entry_owner() which does not sort
>>> by a secondary key within items with equal owners.  That, together
>>> with a sort that isn't stable (which HDqsort() probably isn't on most
>>> platforms; quicksort/introsort is not stable), will scramble the order
>>> in which different ranks traverse their local chunk arrays.  That will
>>> cause deadly embraces between ranks that are waiting for each other's
>>> chunks to be sent.  To fix that, it's probably sufficient to use the
>>> chunk offset as a secondary sort key in that comparator.
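>>>
>>> Something along these lines, with made-up field names (the real struct
>>> members in H5Dmpio.c will differ):
>>>
>>>   #include <hdf5.h>
>>>
>>>   typedef struct {          /* stand-in for the real chunk-entry type */
>>>       int     new_owner;
>>>       hsize_t chunk_offset; /* chunk's offset within the dataset */
>>>   } entry_t;
>>>
>>>   /* Compare by owner first, then by chunk offset, so the ordering is
>>>      total and every rank sorts its local array the same way even with
>>>      an unstable qsort(). */
>>>   static int cmp_entry_owner(const void *a, const void *b)
>>>   {
>>>       const entry_t *ea = (const entry_t *)a;
>>>       const entry_t *eb = (const entry_t *)b;
>>>
>>>       if (ea->new_owner != eb->new_owner)
>>>           return (ea->new_owner < eb->new_owner) ? -1 : 1;
>>>       if (ea->chunk_offset != eb->chunk_offset)
>>>           return (ea->chunk_offset < eb->chunk_offset) ? -1 : 1;
>>>       return 0;
>>>   }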
>>>
>>> That's not the root cause of the hang I'm currently experiencing,
>>> though.  Still digging into that.
>>>
>>>
>>> On Wed, Nov 8, 2017 at 1:50 PM, Dana Robinson <derob...@hdfgroup.org> wrote:
>>> > Yes. All outside code that frees, allocates, or reallocates memory created
>>> > inside the library (or that will be passed back into the library, where it
>>> > could be freed or reallocated) should use these functions. This includes
>>> > filters.
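>>> >
>>> > For example (just a sketch, assuming the public H5allocate_memory()/
>>> > H5free_memory() wrappers are the routines in question), a third-party
>>> > filter callback would manage its buffers like this instead of using
>>> > raw malloc/free:
>>> >
>>> >   #include <hdf5.h>
>>> >   #include <string.h>
>>> >
>>> >   /* Pass-through filter sketch: the output buffer is allocated and
>>> >      the input buffer freed through the library, so HDF5 can safely
>>> >      reallocate/free the result later. */
>>> >   static size_t my_filter(unsigned int flags, size_t cd_nelmts,
>>> >                           const unsigned int cd_values[], size_t nbytes,
>>> >                           size_t *buf_size, void **buf)
>>> >   {
>>> >       void *out = H5allocate_memory(nbytes, 0 /* don't zero-fill */);
>>> >       (void)flags; (void)cd_nelmts; (void)cd_values;
>>> >       if (!out) return 0;            /* returning 0 signals failure */
>>> >       memcpy(out, *buf, nbytes);     /* real compression goes here  */
>>> >       H5free_memory(*buf);           /* release the old buffer      */
>>> >       *buf      = out;
>>> >       *buf_size = nbytes;
>>> >       return nbytes;                 /* valid bytes in *buf         */
>>> >   }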
>>> >
>>> >
>>> >
>>> > Dana
>>> >
>>> >
>>> >
>>> > From: Jordan Henderson <jhender...@hdfgroup.org>
>>> > Date: Wednesday, November 8, 2017 at 13:46
>>> > To: Dana Robinson <derob...@hdfgroup.org>, "m.k.edwa...@gmail.com"
>>> > <m.k.edwa...@gmail.com>, HDF List <hdf-forum@lists.hdfgroup.org>
>>> > Subject: Re: [Hdf-forum] Collective IO and filters
>>> >
>>> >
>>> >
>>> > Dana,
>>> >
>>> >
>>> >
>>> > Would it then make sense for all outside filters to use these routines?
>>> > Because of Parallel Compression's internal nature, it uses buffers
>>> > allocated via the H5MM_ routines to collect and scatter data, which
>>> > works fine for internal filters like deflate, since they use those
>>> > routines as well. However, since some outside filters use raw
>>> > malloc/free, which causes problems, I'm wondering whether having all
>>> > outside filters use the H5_ routines is the cleanest solution.
>>> >
>>> >
>>> >
>>> > Michael,
>>> >
>>> >
>>> >
>>> > Based on the "num_writers: 4" field, the NULL "receive_requests_array",
>>> > and the fact that for the same chunk rank 0 shows "original owner: 0,
>>> > new owner: 0" while rank 3 shows "original owner: 3, new_owner: 0", it
>>> > seems as though everyone IS interested in the chunk that rank 0 is now
>>> > working on, but I'm now more confident that at some point either the
>>> > messages failed to send or rank 0 is having trouble finding them.
>>> >
>>> >
>>> >
>>> > Since the unfiltered case won't hit this particular code path, I'm not
>>> > surprised that it succeeds. If I had to make another guess based on
>>> > this, I would be inclined to think that rank 0 is hanging in the
>>> > MPI_Mprobe due to a mismatch in the "tag" field. I use the index of the
>>> > chunk as the tag for the message in order to funnel specific messages
>>> > to the correct rank for the correct chunk during the last part of the
>>> > chunk redistribution, and if rank 0 can't match the tag it of course
>>> > won't find the message. Why this might be happening, I'm not entirely
>>> > certain at the moment.
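>>> >
>>> > In rough terms (a simplified sketch, not the actual H5Dmpio.c code),
>>> > the pattern is:
>>> >
>>> >   #include <mpi.h>
>>> >
>>> >   /* Senders tag each chunk message with the chunk index; the new
>>> >      owner probes for that same tag on the same communicator.  If
>>> >      either the tag or the communicator differs between the two
>>> >      sides, the probe never matches and the owner blocks forever. */
>>> >   void send_chunk(const void *chunk_buf, int nbytes, int new_owner,
>>> >                   int chunk_index, MPI_Comm comm, MPI_Request *req)
>>> >   {
>>> >       MPI_Isend(chunk_buf, nbytes, MPI_BYTE, new_owner,
>>> >                 chunk_index /* tag */, comm, req);
>>> >   }
>>> >
>>> >   void recv_chunk(void *chunk_buf, int nbytes, int chunk_index,
>>> >                   MPI_Comm comm)
>>> >   {
>>> >       MPI_Message msg;
>>> >       MPI_Status  status;
>>> >       MPI_Mprobe(MPI_ANY_SOURCE, chunk_index /* tag */, comm,
>>> >                  &msg, &status);
>>> >       MPI_Mrecv(chunk_buf, nbytes, MPI_BYTE, &msg, &status);
>>> >   }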
