Either there is a problem with MPI_Ibarrier or I don't understand the
semantics.
The following example is with openmpi-1.9a1r26747. (Thanks for the fix
in 26757. I tried that revision as well, with the same results.) I get
similar results for different OSes, compilers, bitness, etc.
% cat ibarrier.c
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char** argv) {
  int i, me;
  double t0, t1, t2;
  MPI_Request req;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&me);

  MPI_Barrier(MPI_COMM_WORLD);
  MPI_Barrier(MPI_COMM_WORLD);
  MPI_Barrier(MPI_COMM_WORLD);
  t0 = MPI_Wtime();              /* set "time zero" */

  if ( me < 2 ) sleep(3);        /* two processes delay before hitting barrier */
  t1 = MPI_Wtime() - t0;
  MPI_Barrier(MPI_COMM_WORLD);
  t2 = MPI_Wtime() - t0;
  printf("%d entered at %3.1lf and exited at %3.1lf\n", me, t1, t2);

  if ( me < 2 ) sleep(3);        /* two processes delay before hitting barrier */
  t1 = MPI_Wtime() - t0;
  MPI_Ibarrier(MPI_COMM_WORLD, &req);
  MPI_Wait(&req, MPI_STATUS_IGNORE);
  t2 = MPI_Wtime() - t0;
  printf("%d entered at %3.1lf and exited at %3.1lf\n", me, t1, t2);

  MPI_Finalize();
  return 0;
}
% mpirun -n 4 ./a.out
0 entered at 3.0 and exited at 3.0
1 entered at 3.0 and exited at 3.0
2 entered at 0.0 and exited at 3.0
3 entered at 0.0 and exited at 3.0
0 entered at 6.0 and exited at 6.0
1 entered at 6.0 and exited at 6.0
2 entered at 3.0 and exited at 3.0
3 entered at 3.0 and exited at 3.0
With the blocking MPI_Barrier, no one leaves until the last process has
entered. With the non-blocking barrier (MPI_Ibarrier followed by
MPI_Wait), processes 2 and 3 enter and leave at 3.0, before the two
laggards arrive at 6.0. Is that right?