Nathan,
we briefly discussed the test_lock1 test from the onesided test suite
using osc/pt2pt
https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70
task 0 does
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
MPI_Send(...,dest=2,...)
and task 2 does
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...);
MPI_Recv(...,source=0,...)
hoping to guarantee task 0 will acquire the lock first.
once in a while, the test fails because task 2 acquires the lock first.
as far as i can tell, MPI_Win_lock() only sends a lock request and
returns without owning the lock.
so if task 1 is running on a loaded server, then even if task 2 requests
the lock *after* task 0,
the lock request from task 2 can be processed first, and hence task 0 is
not guaranteed to acquire the lock *before* task 2.
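to make the ordering argument concrete, here is a tiny stand-alone model
(plain C, no MPI). struct lock_request and grant_order() are made-up
names for illustration, not Open MPI internals. it assumes the target
simply grants the exclusive lock in the order it happens to handle the
requests, which is my reading of the behaviour and not something i have
verified in osc/pt2pt:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a lock request as seen by the target (task 1). */
struct lock_request {
    int origin;    /* rank that called MPI_Win_lock() */
    int sent_seq;  /* order in which the origins issued their requests */
};

/*
 * Fill owners[] with the origin ranks in the order the target grants the
 * exclusive lock. handled[] lists indices into reqs[] in the order the
 * target processes them, which need not match sent_seq.
 */
void grant_order(const struct lock_request *reqs, const size_t *handled,
                 size_t n, int *owners)
{
    for (size_t k = 0; k < n; k++) {
        assert(handled[k] < n);  /* each index must refer to a real request */
        owners[k] = reqs[handled[k]].origin;
    }
}
```

with reqs = { {0, 0}, {2, 1} } and handled = { 1, 0 } (task 1 gets to the
request from task 2 first), owners comes out as { 2, 0 }, which is the
failing case of the test.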
can you please confirm MPI_Win_lock() behaves as it is supposed to ?
if yes, is there a way for task 0 to block until it acquires the lock ?
i modified the test and inserted, in task 0, an MPI_Get of 1 MPI_DOUBLE
*before* MPI_Send.
see my patch below (note i also increased the message length)
my expectation was that the test would either succeed (i.e. task 0 gets
the lock first) or hang
(if task 2 gets the lock first)
surprisingly, the test never hangs (so far ...) but once in a while, it
fails (!), which leaves me very confused
Any thoughts ?
Cheers,
Gilles
diff --git a/onesided/test_lock1.c b/onesided/test_lock1.c
index c549093..9fa3f8d 100644
--- a/onesided/test_lock1.c
+++ b/onesided/test_lock1.c
@@ -20,7 +20,7 @@ int
test_lock1(void)
{
double *a = NULL;
- size_t len = 10;
+ size_t len = 1000000;
MPI_Win win;
int i;
@@ -56,6 +56,7 @@ test_lock1(void)
*/
if (me == 0) {
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
+ MPI_Get(a,1,MPI_DOUBLE,1,0,1,MPI_DOUBLE,win);
MPI_Send(NULL, 0, MPI_BYTE, 2, 1001, MPI_COMM_WORLD);
MPI_Get(a,len,MPI_DOUBLE,1,0,len,MPI_DOUBLE,win);
MPI_Win_unlock(1, win);
@@ -76,6 +77,7 @@ test_lock1(void)
/* make sure 0 got the data from 1 */
for (i = 0; i < len; i++) {
if (a[i] != (double)(10*1+i)) {
+ if (0 == nfail) fprintf(stderr, "at index %d, expected %lf but got %lf\n", i, (double)10*1+i, a[i]);
nfail++;
}
}