Gilles,
I looked at the test and I think the current behavior is indeed
correct. What matters for an exclusive lock is that all operations in
an epoch (everything between lock and unlock) are applied atomically
at the destination (i.e., not interleaved with other updates). As
Nathan stated, MPI_Win_lock might be implemented as non-blocking, in
which case it is perfectly legitimate for process 2 to acquire the
lock first and update the array before process 0 accesses it. Thus
the test can fail.
The test will never deadlock: even if MPI_Win_lock is implemented as
a blocking operation (which is also legitimate), the send and receive
match correctly with the lock/unlock.
Moreover, I think the behavior described by the comments can only be
implemented by enforcing an order among the conceptually meaningful
operations: unlock, send, and recv. For example:
if (me == 0) {
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Get(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
    MPI_Win_unlock(1, win);
    MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
}
if (me == 2) {
    /* this blocks until 0 has released the lock */
    MPI_Recv(NULL, 0, MPI_BYTE, 0, 1001, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Put(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
    MPI_Win_unlock(1, win);
}
However, if we relax the code a little and only want to ensure the
atomicity of the operations, then we need to change the check to
verify that either no element of the array has been altered, or all
of them have been altered (set to zero by process 2).
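A minimal sketch of such a check (assuming, as above, that process 2
overwrites the elements with zeroes):

    int all_altered = 1, none_altered = 1;
    for (i = 0; i < len; i++) {
        if (a[i] != 0.0)              all_altered  = 0;  /* not the value put by 2 */
        if (a[i] != (double)(10*1+i)) none_altered = 0;  /* not the original value */
    }
    /* the exclusive lock guarantees all-or-nothing, not a particular order */
    if (!all_altered && !none_altered) nfail++;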
George.
On Mon, Nov 21, 2016 at 8:57 PM, Nathan Hjelm <hje...@me.com> wrote:
To be safe I would call MPI_Get and then MPI_Win_flush. The lock will
always be acquired before the MPI_Win_flush call returns, as long as
the operation is more than 0 bytes; we always short-circuit 0-byte
operations in both osc/rdma and osc/pt2pt.
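In code, the pattern would look something like this (just a sketch;
win and the target rank 1 are taken from the test):

    double tmp;
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    /* a non-zero-size operation (0-byte operations are short-circuited) */
    MPI_Get(&tmp, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    /* returns only once the lock on target 1 has been acquired */
    MPI_Win_flush(1, win);
    /* ... rest of the epoch ... */
    MPI_Win_unlock(1, win);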
-Nathan
> On Nov 21, 2016, at 8:54 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
> Thanks Nathan,
>
>
> any thoughts about my modified version of the test ?
>
> do i need to MPI_Win_flush() after the first MPI_Get() in order to ensure the lock was acquired ?
>
> (and hence the program will either succeed or hang, but never fail)
>
>
> Cheers,
>
>
> Gilles
>
>
> On 11/22/2016 12:29 PM, Nathan Hjelm wrote:
>> MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking in most cases but not all (lock-all with on-demand locking is non-blocking), but in osc/pt2pt it is almost always non-blocking (it has to be blocking only for proc self). If you really want to ensure the lock is acquired you can call MPI_Win_flush. I think this should work even if you have not started any RMA operations inside the epoch.
>>
>> -Nathan
>>
>>> On Nov 21, 2016, at 7:53 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>
>>> Nathan,
>>>
>>>
>>> we briefly discussed the test_lock1 test from the onesided test suite using osc/pt2pt
>>>
>>>
>>> https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70
>>>
>>>
>>> task 0 does
>>>
>>> MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
>>>
>>> MPI_Send(...,dest=2,...)
>>>
>>>
>>> and task 2 does
>>>
>>> MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
>>>
>>> MPI_Recv(...,source=0,...)
>>>
>>>
>>> hoping to guarantee task 0 will acquire the lock first.
>>>
>>>
>>> once in a while, the test fails when task 2 acquires the lock first
>>>
>>> /* MPI_Win_lock() only sends a lock request, and returns without owning the lock */
>>>
>>> so if task 1 is running on a loaded server, then even if task 2 requests the lock *after* task 0, the lock request from task 2 can be processed first, and hence task 0 is not guaranteed to acquire the lock *before* task 2.
>>>
>>>
>>> can you please confirm MPI_Win_lock() behaves as it is supposed to ?
>>>
>>> if yes, is there a way for task 0 to block until it acquires the lock ?
>>>
>>>
>>> i modified the test, and inserted in task 0 an MPI_Get of one MPI_DOUBLE *before* the MPI_Send.
>>>
>>> see my patch below (note i increased the message length)
>>>
>>>
>>> my expectation is that the test would either succeed (if task 0 gets the lock first) or hang (if task 2 gets the lock first)
>>>
>>>
>>>
>>> surprisingly, the test never hangs (so far ...) but once in a while it fails (!), which makes me very confused
>>>
>>>
>>> Any thoughts ?
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>>
>>> diff --git a/onesided/test_lock1.c b/onesided/test_lock1.c
>>> index c549093..9fa3f8d 100644
>>> --- a/onesided/test_lock1.c
>>> +++ b/onesided/test_lock1.c
>>> @@ -20,7 +20,7 @@ int
>>>  test_lock1(void)
>>>  {
>>>      double *a = NULL;
>>> -    size_t len = 10;
>>> +    size_t len = 1000000;
>>>      MPI_Win win;
>>>      int i;
>>>
>>> @@ -56,6 +56,7 @@ test_lock1(void)
>>>       */
>>>      if (me == 0) {
>>>          MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
>>> +        MPI_Get(a,1,MPI_DOUBLE,1,0,1,MPI_DOUBLE,win);
>>>          MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
>>>          MPI_Get(a,len,MPI_DOUBLE,1,0,len,MPI_DOUBLE,win);
>>>          MPI_Win_unlock(1, win);
>>> @@ -76,6 +77,7 @@ test_lock1(void)
>>>      /* make sure 0 got the data from 1 */
>>>      for (i = 0; i < len; i++) {
>>>          if (a[i] != (double)(10*1+i)) {
>>> +            if (0 == nfail) fprintf(stderr, "at index %d, expected %lf but got %lf\n", i, (double)10*1+i, a[i]);
>>>              nfail++;
>>>          }
>>>      }
>>>
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel