Thanks Nathan,

Any thoughts about my modified version of the test?

Do I need to call MPI_Win_flush() after the first MPI_Get() in order to ensure the lock has been acquired?

(and hence the program will either succeed or hang, but never fail)
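
For reference, the task 0 sequence I have in mind is (a sketch; win, a and len as in test_lock1.c, with the flush inserted after the first MPI_Get):

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Get(a, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    /* does this flush guarantee the exclusive lock is now held? */
    MPI_Win_flush(1, win);
    MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
    MPI_Get(a, len, MPI_DOUBLE, 1, 0, len, MPI_DOUBLE, win);
    MPI_Win_unlock(1, win);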


Cheers,


Gilles


On 11/22/2016 12:29 PM, Nathan Hjelm wrote:
MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking in most cases but not all (lock-all with on-demand locking is non-blocking), but in osc/pt2pt it is almost always non-blocking (it only has to be blocking for proc self). If you really want to ensure the lock is acquired you can call MPI_Win_flush. I think this should work even if you have not started any RMA operations inside the epoch.
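
For example, a minimal sketch of that idiom (whether the flush must wait for the lock when no RMA operation is pending is exactly the "I think" above):

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    /* no RMA operation has been started in this epoch; per the
       suggestion above, this flush should nevertheless block until
       the exclusive lock is actually acquired */
    MPI_Win_flush(1, win);
    /* ... RMA operations, now (hopefully) holding the lock ... */
    MPI_Win_unlock(1, win);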

-Nathan

On Nov 21, 2016, at 7:53 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

Nathan,


We briefly discussed the test_lock1 test from the onesided test suite when using osc/pt2pt:

https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70


task 0 does

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1, ...);
    MPI_Send(..., dest=2, ...);

and task 2 does

    MPI_Recv(..., source=0, ...);
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1, ...);

hoping to guarantee that task 0 acquires the lock first.
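
For reference, here is a condensed, self-contained version of that pattern (ranks and tag as in the test; data transfer and checking mostly omitted):

    #include <mpi.h>

    /* condensed test_lock1.c pattern; run with at least 3 ranks */
    int main(int argc, char **argv)
    {
        double a[10] = {0};
        int me;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);
        MPI_Win_create(a, sizeof(a), sizeof(double), MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        if (me == 0) {
            /* lock request is issued first ... */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
            MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
            MPI_Get(a, 10, MPI_DOUBLE, 1, 0, 10, MPI_DOUBLE, win);
            MPI_Win_unlock(1, win);
        } else if (me == 2) {
            MPI_Recv(NULL, 0, MPI_BYTE, 0, 1001, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            /* ... requested after task 0's, but may be granted first */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
            MPI_Win_unlock(1, win);
        }
        /* task 1 is a passive target and only hosts the window */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }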


Once in a while, the test fails because task 2 acquires the lock first

/* MPI_Win_lock() only sends a lock request, and returns without owning the lock */

So if task 1 is running on a loaded server, then even though task 2 requests the lock *after* task 0, the lock request from task 2 can be processed first, and hence task 0 is not guaranteed to acquire the lock *before* task 2.


Can you please confirm MPI_Win_lock() behaves as it is supposed to?

If yes, is there a way for task 0 to block until it acquires the lock?


I modified the test and inserted in task 0 an MPI_Get of one MPI_DOUBLE *before* the MPI_Send.

See my patch below (note I also increased the message length).


My expectation was that the test would either succeed (if task 0 gets the lock first) or hang (if task 2 gets the lock first).



Surprisingly, the test never hangs (so far ...), but once in a while it fails (!), which leaves me very confused.


Any thoughts?


Cheers,


Gilles



diff --git a/onesided/test_lock1.c b/onesided/test_lock1.c
index c549093..9fa3f8d 100644
--- a/onesided/test_lock1.c
+++ b/onesided/test_lock1.c
@@ -20,7 +20,7 @@ int
test_lock1(void)
{
     double *a = NULL;
-    size_t     len = 10;
+    size_t     len = 1000000;
     MPI_Win    win;
     int        i;

@@ -56,6 +56,7 @@ test_lock1(void)
      */
     if (me == 0) {
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
+       MPI_Get(a,1,MPI_DOUBLE,1,0,1,MPI_DOUBLE,win);
         MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
        MPI_Get(a,len,MPI_DOUBLE,1,0,len,MPI_DOUBLE,win);
         MPI_Win_unlock(1, win);
@@ -76,6 +77,7 @@ test_lock1(void)
         /* make sure 0 got the data from 1 */
        for (i = 0; i < len; i++) {
            if (a[i] != (double)(10*1+i)) {
+                if (0 == nfail) fprintf(stderr, "at index %d, expected %lf but got %lf\n", i, (double)10*1+i, a[i]);
                nfail++;
            }
        }



_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
