Gilles, I looked at the test and I think the current behavior is indeed correct. What matters for an exclusive lock is that all operations in an epoch (everything surrounded by lock/unlock) are atomically applied to the destination and are not interleaved with other updates. As Nathan stated, MPI_Win_lock may be implemented as non-blocking, in which case it is perfectly legitimate for process 2 to acquire the lock first and update the array before process 0 accesses it. Thus the test can fail.
The test will never deadlock, because even if MPI_Win_lock is implemented as a blocking operation (which is also legitimate), the send and receive match correctly with the lock/unlock. Moreover, I think the behavior described by the comments can only be implemented by enforcing an order between the only conceptually meaningful operations: unlock, send and recv.

if (me == 0) {
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Get(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
    MPI_Win_unlock(1, win);
    MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
}
if (me == 2) {
    /* this should block until 0 releases the lock. */
    MPI_Recv(NULL, 0, MPI_BYTE, 0, 1001, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Put(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
    MPI_Win_unlock(1, win);
}

However, if we relax the code a little and only want to ensure the atomicity of the operations, then we need to change the check to make sure that either no elements of the array have been altered or all of them have been altered (set to zero by process 2). A rough sketch of such a check is appended at the end of this mail, after the quoted messages.

George.

On Mon, Nov 21, 2016 at 8:57 PM, Nathan Hjelm <hje...@me.com> wrote:
> To be safe I would call MPI_Get then MPI_Win_flush. That lock will always
> be acquired before the MPI_Win_flush call returns. As long as it is more
> than 0 bytes. We always short-circuit 0-byte operations in both osc/rdma
> and osc/pt2pt.
>
> -Nathan
>
> > On Nov 21, 2016, at 8:54 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> >
> > Thanks Nathan,
> >
> > any thoughts about my modified version of the test ?
> >
> > do i need to MPI_Win_flush() after the first MPI_Get() in order to
> > ensure the lock was acquired ?
> > (and hence the program will either succeed or hang, but never fail)
> >
> > Cheers,
> >
> > Gilles
> >
> > On 11/22/2016 12:29 PM, Nathan Hjelm wrote:
> >> MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking
> >> in most cases but not others (lock all with on-demand is non-blocking),
> >> but in osc/pt2pt it is almost always non-blocking (it has to be blocking
> >> for proc self). If you really want to ensure the lock is acquired you can
> >> call MPI_Win_flush. I think this should work even if you have not started
> >> any RMA operations inside the epoch.
> >>
> >> -Nathan
> >>
> >>> On Nov 21, 2016, at 7:53 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> >>>
> >>> Nathan,
> >>>
> >>> we briefly discussed the test_lock1 test from the onesided test suite
> >>> using osc/pt2pt
> >>>
> >>> https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70
> >>>
> >>> task 0 does
> >>>
> >>> MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
> >>> MPI_Send(...,dest=2,...)
> >>>
> >>> and task 2 does
> >>>
> >>> MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
> >>> MPI_Recv(...,source=0,...)
> >>>
> >>> hoping to guarantee task 0 will acquire the lock first.
> >>>
> >>> once in a while, the test fails when task 2 acquires the lock first
> >>> /* MPI_Win_lock() only sends a lock request, and returns without owning the lock */
> >>>
> >>> so if task 1 is running on a loaded server, and even if task 2
> >>> requests the lock *after* task 0, the lock request from task 2 can be
> >>> processed first, and hence task 2 is not guaranteed to acquire the
> >>> lock *before* task 0.
> >>>
> >>> can you please confirm MPI_Win_lock() behaves as it is supposed to ?
> >>> if yes, is there a way for task 0 to block until it acquires the lock ?
> >>>
> >>> i modified the test, and inserted in task 0 an MPI_Get of 1 MPI_DOUBLE
> >>> *before* MPI_Send.
> >>> see my patch below (note i increased the message length)
> >>>
> >>> my expectation is that the test would either succeed (e.g. task 0 gets
> >>> the lock first) or hang (if task 1 gets the lock first)
> >>>
> >>> surprisingly, the test never hangs (so far ...) but once in a while,
> >>> it fails (!), which makes me very confused
> >>>
> >>> Any thoughts ?
> >>>
> >>> Cheers,
> >>>
> >>> Gilles
> >>>
> >>> diff --git a/onesided/test_lock1.c b/onesided/test_lock1.c
> >>> index c549093..9fa3f8d 100644
> >>> --- a/onesided/test_lock1.c
> >>> +++ b/onesided/test_lock1.c
> >>> @@ -20,7 +20,7 @@ int
> >>>  test_lock1(void)
> >>>  {
> >>>      double *a = NULL;
> >>> -    size_t len = 10;
> >>> +    size_t len = 1000000;
> >>>      MPI_Win win;
> >>>      int i;
> >>>
> >>> @@ -56,6 +56,7 @@ test_lock1(void)
> >>>       */
> >>>      if (me == 0) {
> >>>          MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
> >>> +        MPI_Get(a,1,MPI_DOUBLE,1,0,1,MPI_DOUBLE,win);
> >>>          MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
> >>>          MPI_Get(a,len,MPI_DOUBLE,1,0,len,MPI_DOUBLE,win);
> >>>          MPI_Win_unlock(1, win);
> >>> @@ -76,6 +77,7 @@ test_lock1(void)
> >>>      /* make sure 0 got the data from 1 */
> >>>      for (i = 0; i < len; i++) {
> >>>          if (a[i] != (double)(10*1+i)) {
> >>> +            if (0 == nfail) fprintf(stderr, "at index %d, expected %lf but got %lf\n", i, (double)10*1+i, a[i]);
> >>>              nfail++;
> >>>          }
> >>>      }
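
P.S. Here is the rough, untested sketch of the relaxed check I described above. It is only an illustration: it reuses the variables of test_lock1.c (a, len, i, nfail) and assumes, as stated above, that process 2 overwrites the whole array with zeros.

    /* relaxed verification on process 0: accept the array either entirely
     * untouched (still the values written by process 1) or entirely
     * overwritten (zeros put by process 2); only a mix of the two means
     * the exclusive-lock epochs were interleaved on process 1 */
    size_t old_cnt = 0, new_cnt = 0;
    for (i = 0; i < len; i++) {
        if (a[i] == (double)(10*1+i)) {
            old_cnt++;                  /* value written by process 1 */
        } else if (a[i] == 0.0) {
            new_cnt++;                  /* value put by process 2 */
        }
    }
    if (old_cnt != len && new_cnt != len) {
        fprintf(stderr, "partial update: %zu old and %zu new elements\n",
                old_cnt, new_cnt);
        nfail++;
    }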
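
And, for completeness, a sketch of the MPI_Get + MPI_Win_flush approach Nathan describes above, applied to task 0 of the test (again just an illustration, not a tested patch):

    if (me == 0) {
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
        /* non-zero-size operation followed by a flush: per Nathan's
         * explanation above, MPI_Win_flush does not return before the
         * exclusive lock on rank 1 has been acquired */
        MPI_Get(a, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
        MPI_Win_flush(1, win);
        /* only now tell process 2 it may request the lock */
        MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
        MPI_Get(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
        MPI_Win_unlock(1, win);
    }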