The proper practice according to the MPI Standard is to call the detach function before finalize. From a pure OMPI perspective, we do the same thing in both cases, i.e. we wait until all pending communications on the buffer are completed before detaching it. I think the difference in performance comes from the fact that in the case of MPI_Finalize we call the poll at every iteration in opal_progress.
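
For reference, the ordering recommended above could be sketched as follows. This is only a minimal illustration, not the test case from the original report: the program name, the message length n, and the variables packsize, oldbuf and oldsize are made up for the example, and it assumes at least two ranks. The buffer is sized with MPI_Pack_size plus MPI_BSEND_OVERHEAD, and the detach on the sending rank comes before MPI_Finalize.

```fortran
program bsend_detach_sketch
  implicit none
  include "mpif.h"
  integer, parameter :: n = 1000        ! hypothetical message length (elements)
  integer :: me, np, ier, packsize, bufbytes, oldsize
  real(8) :: x(n)
  real(8), allocatable :: buf(:)
  real(8) :: oldbuf                     ! receives the detached buffer address; unused here

  call MPI_Init(ier)
  call MPI_Comm_size(MPI_COMM_WORLD, np, ier)
  call MPI_Comm_rank(MPI_COMM_WORLD, me, ier)
  x = 0.d0

  ! Size the attach buffer: packed message size plus the per-message overhead.
  call MPI_Pack_size(n, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, packsize, ier)
  bufbytes = packsize + MPI_BSEND_OVERHEAD
  allocate(buf(bufbytes/8 + 1))         ! at least bufbytes bytes of storage

  if (np >= 2) then
    if (me == 1) then
      call MPI_Buffer_attach(buf, bufbytes, ier)
      call MPI_Bsend(x, n, MPI_DOUBLE_PRECISION, 0, 343, MPI_COMM_WORLD, ier)
      ! Detach before finalize: this call returns only after the buffered
      ! message has been transmitted.
      call MPI_Buffer_detach(oldbuf, oldsize, ier)
    else if (me == 0) then
      call MPI_Recv(x, n, MPI_DOUBLE_PRECISION, 1, 343, MPI_COMM_WORLD, &
                    MPI_STATUS_IGNORE, ier)
    end if
  end if

  call MPI_Finalize(ier)
end program bsend_detach_sketch
```

It should build and run the same way as the test case below (mpif90, then mpirun -np 2).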

  george.

On Nov 18, 2009, at 12:18 , Eugene Loh wrote:

What's the story about calling MPI_Finalize without first calling MPI_Buffer_detach?

If I do an MPI_Bsend followed by MPI_Finalize, the corresponding MPI_Recv takes forever. In contrast, if I insert an MPI_Buffer_detach, then performance is reasonable. I can imagine the explanation. I suspect that MPI_Bsend leaves the message in a local buffer, and so you need to progress the sender in order for the receive to complete. MPI_Buffer_detach must progress more aggressively than MPI_Finalize.

1) Any guidance from MPI gurus regarding what is proper practice?
2) Any guidance from OMPI devels what sort of fix makes sense?

I attach a test case. On some platforms, the final delay can be on the order of a minute.

% mpif90          main.F90
% mpirun -np 2 -mca btl sm,self a.out
 1    0.021321
 0    0.066568
 1    0.020978
 0    0.061625
 1    0.021969
 0    0.062380
 1    0.020938
 0    0.064401
 1    0.020759
 0    4.098010          # yipes! last receive takes a long time!
% mpif90 -DDETACH main.F90
% mpirun -np 2 -mca btl sm,self a.out
 1    0.020913
 0    0.064076
 1    0.020746
 0    0.061015
 1    0.020454
 0    0.061780
 1    0.020457
 0    0.060776
 1    0.020619
 0    0.062484
 include "mpif.h"
 integer, parameter :: nbufbytes = 16000000, nsendbytes = 15892480
 real(8) buf(nbufbytes/8), x(nsendbytes/8), t
 real(8) buf2
 integer mbufbytes

 call MPI_Init(ier)
 call MPI_Comm_size(MPI_COMM_WORLD,np,ier)
 call MPI_Comm_rank(MPI_COMM_WORLD,me,ier)
 buf = 0.d0
 x   = 0.d0

 if ( me == 1 ) call MPI_Buffer_attach(buf, nbufbytes, ier)

 do i = 1, 5
   call MPI_Barrier(MPI_COMM_WORLD,ier)
   t = MPI_Wtime()
   if ( me == 0 ) call MPI_Recv (x,nsendbytes,MPI_BYTE, 1,343,MPI_COMM_WORLD,MPI_STATUS_IGNORE,ier)
   if ( me == 1 ) call MPI_Bsend(x,nsendbytes,MPI_BYTE, 0,343,MPI_COMM_WORLD, ier)
   t = MPI_Wtime() - t
   write(6,'(i4,f12.6,f8.3)') me, t
 end do

#ifdef DETACH
 if ( me == 1 ) call MPI_Buffer_detach(buf2, mbufbytes, ier)
#endif

 call MPI_Finalize(ier)
end
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
