What's the story about calling MPI_Finalize without first calling MPI_Buffer_detach?

If I do an MPI_Bsend followed by MPI_Finalize, the corresponding MPI_Recv takes a very long time (seconds in the run below, longer on some platforms). In contrast, if I insert an MPI_Buffer_detach before MPI_Finalize, then performance is reasonable. I can imagine the explanation: I suspect that MPI_Bsend leaves the message in a local buffer, so the sender has to make progress before the receive can complete. MPI_Buffer_detach must progress more aggressively than MPI_Finalize.

1) Any guidance from MPI gurus regarding what is proper practice?
2) Any guidance from OMPI developers on what sort of fix makes sense?

I attach a test case. On some platforms, the final delay can be on the order of a minute.

% mpif90          main.F90
% mpirun -np 2 -mca btl sm,self a.out
  1    0.021321
  0    0.066568
  1    0.020978
  0    0.061625
  1    0.021969
  0    0.062380
  1    0.020938
  0    0.064401
  1    0.020759
  0    4.098010          # yipes! last receive takes a long time!
% mpif90 -DDETACH main.F90
% mpirun -np 2 -mca btl sm,self a.out
  1    0.020913
  0    0.064076
  1    0.020746
  0    0.061015
  1    0.020454
  0    0.061780
  1    0.020457
  0    0.060776
  1    0.020619
  0    0.062484

main.F90:

  include "mpif.h"
  integer, parameter :: nbufbytes = 16000000, nsendbytes = 15892480
  real(8) buf(nbufbytes/8), x(nsendbytes/8), t
  real(8) buf2
  integer mbufbytes

  call MPI_Init(ier)
  call MPI_Comm_size(MPI_COMM_WORLD,np,ier)
  call MPI_Comm_rank(MPI_COMM_WORLD,me,ier)
  buf = 0.d0
  x   = 0.d0

  if ( me == 1 ) call MPI_Buffer_attach(buf, nbufbytes, ier)

  do i = 1, 5
    call MPI_Barrier(MPI_COMM_WORLD,ier)
    t = MPI_Wtime()
    if ( me == 0 ) call MPI_Recv (x,nsendbytes,MPI_BYTE,1,343,MPI_COMM_WORLD,MPI_STATUS_IGNORE,ier)
    if ( me == 1 ) call MPI_Bsend(x,nsendbytes,MPI_BYTE,0,343,MPI_COMM_WORLD,ier)
    t = MPI_Wtime() - t
    write(6,'(i4,f12.6,f8.3)') me, t
  end do

#ifdef DETACH
  if ( me == 1 ) call MPI_Buffer_detach(buf2, mbufbytes, ier)
#endif

  call MPI_Finalize(ier)
end
