Why is MPI_Win_flush required to ensure the lock is acquired? According to the standard, MPI_Win_flush "completes all outstanding RMA operations initiated by the calling process to the target rank on the specified window", which can be read as a no-op if no pending operations exist.
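Concretely, the pattern in question would be something like this (a sketch only; whether the flush is guaranteed to block until the lock is granted, even with no pending operation, is exactly the ambiguity):

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1 /* target */, 0 /* assert */, win);
    /* Nathan's suggestion: flush to wait for lock acquisition.
     * A strict reading of the standard allows this to be a no-op here,
     * since no RMA operation has been initiated yet in this epoch. */
    MPI_Win_flush(1, win);
    /* ... RMA operations, now supposedly holding the lock ... */
    MPI_Win_unlock(1, win);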
George.

On Mon, Nov 21, 2016 at 8:29 PM, Nathan Hjelm <hje...@me.com> wrote:
> MPI_Win_lock does not have to be blocking. In osc/rdma it is blocking in
> most cases but not others (lock all with on-demand is non-blocking), but in
> osc/pt2pt it is almost always non-blocking (it has to be blocking for proc
> self). If you really want to ensure the lock is acquired you can call
> MPI_Win_flush. I think this should work even if you have not started any
> RMA operations inside the epoch.
>
> -Nathan
>
> > On Nov 21, 2016, at 7:53 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> >
> > Nathan,
> >
> > we briefly discussed the test_lock1 test from the onesided test suite
> > using osc/pt2pt
> >
> > https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70
> >
> > task 0 does
> >
> > MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1, ...);
> > MPI_Send(..., dest=2, ...)
> >
> > and task 2 does
> >
> > MPI_Recv(..., source=0, ...);
> > MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1, ...);
> >
> > hoping to guarantee task 0 will acquire the lock first.
> >
> > once in a while, the test fails when task 2 acquires the lock first
> > /* MPI_Win_lock() only sends a lock request, and returns without owning the lock */
> >
> > so if task 1 is running on a loaded server, then even if task 2 requests
> > the lock *after* task 0, the lock request from task 2 can be processed
> > first, and hence task 0 is not guaranteed to acquire the lock *before* task 2.
> >
> > can you please confirm MPI_Win_lock() behaves as it is supposed to?
> >
> > if yes, is there a way for task 0 to block until it acquires the lock?
> >
> > i modified the test, and inserted in task 0 an MPI_Get of 1 MPI_DOUBLE
> > *before* MPI_Send.
> > see my patch below (note i increased the message length)
> >
> > my expectation is that the test would either succeed (e.g. task 0 gets
> > the lock first) or hang (if task 2 gets the lock first)
> >
> > surprisingly, the test never hangs (so far ...) but once in a while, it
> > fails (!), which makes me very confused
> >
> > Any thoughts?
> >
> > Cheers,
> >
> > Gilles
> >
> > diff --git a/onesided/test_lock1.c b/onesided/test_lock1.c
> > index c549093..9fa3f8d 100644
> > --- a/onesided/test_lock1.c
> > +++ b/onesided/test_lock1.c
> > @@ -20,7 +20,7 @@ int
> >  test_lock1(void)
> >  {
> >      double *a = NULL;
> > -    size_t len = 10;
> > +    size_t len = 1000000;
> >      MPI_Win win;
> >      int i;
> >  
> > @@ -56,6 +56,7 @@ test_lock1(void)
> >       */
> >      if (me == 0) {
> >          MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
> > +        MPI_Get(a,1,MPI_DOUBLE,1,0,1,MPI_DOUBLE,win);
> >          MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
> >          MPI_Get(a,len,MPI_DOUBLE,1,0,len,MPI_DOUBLE,win);
> >          MPI_Win_unlock(1, win);
> > @@ -76,6 +77,7 @@ test_lock1(void)
> >      /* make sure 0 got the data from 1 */
> >      for (i = 0; i < len; i++) {
> >          if (a[i] != (double)(10*1+i)) {
> > +            if (0 == nfail) fprintf(stderr, "at index %d, expected %lf but got %lf\n", i, (double)10*1+i, a[i]);
> >              nfail++;
> >          }
> >      }
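For the archives, a self-contained sketch combining the two suggestions: rank 0 flushes right after locking and only then releases rank 2. This is a hypothetical reproducer, not part of the test suite, and it assumes MPI_Win_flush really does block until the lock is granted, which is the open question above. Run with at least 3 ranks.

    #include <mpi.h>
    #include <stdio.h>

    /* ranks 0 and 2 compete for an exclusive lock on rank 1's window */
    int main(int argc, char **argv)
    {
        int me, n = 10;
        double *buf;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);

        MPI_Alloc_mem(n * sizeof(double), MPI_INFO_NULL, &buf);
        /* only rank 1 exposes memory */
        MPI_Win_create(buf, (me == 1) ? n * sizeof(double) : 0,
                       sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        if (me == 0) {
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
            MPI_Win_flush(1, win);  /* assumption: blocks until the lock is held */
            MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD); /* release rank 2 */
            MPI_Get(buf, n, MPI_DOUBLE, 1, 0, n, MPI_DOUBLE, win);
            MPI_Win_unlock(1, win);
        } else if (me == 2) {
            MPI_Recv(NULL, 0, MPI_BYTE, 0, 1001, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* this lock request is now ordered after rank 0's acquisition */
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
            MPI_Win_unlock(1, win);
        }

        MPI_Win_free(&win);
        MPI_Free_mem(buf);
        MPI_Finalize();
        return 0;
    }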