Sean,

You might also want to confirm that openib is (part of) the issue by running 
your app on TCP only.

mpirun --mca pml ob1 --mca btl tcp,self ...
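
If you want to keep the other transports and only rule out openib, excluding 
it explicitly should also do the trick (the "^" prefix tells Open MPI to 
disable a given component):

mpirun --mca pml ob1 --mca btl ^openib ...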

Cheers,

Gilles


----- Original Message -----
> Hi Sean,
> 
> Thanks for the report! I have a few questions/suggestions:
> 
> 1) What version of Open MPI are you using?
> 2) What is your network? It sounds like you are on an IB cluster using 
> btl/openib (which is essentially discontinued). Can you try the Open MPI 
> 4.0.4 release with UCX instead of openib (configure with --without-verbs 
> and --with-ucx)? A rough build/run example is sketched after this list.
> 3) If that does not help, can you boil your code down to a minimum 
> working example? That would make it easier for people to try to 
> reproduce what happens.
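> 
> For 2), the build and run steps could look roughly like the following 
> (the install prefix, UCX path, and application name are placeholders 
> for your own setup):
> 
>   ./configure --prefix=/opt/openmpi-4.0.4 --without-verbs --with-ucx=/path/to/ucx
>   make -j && make install
>   mpirun --mca pml ucx ./your_app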
> 
> Cheers
> Joseph
> 
> On 7/24/20 11:34 PM, Lewis,Sean via users wrote:
> > Hi all,
> > 
> > I am encountering a silent hang involving MPI_Ssend and MPI_Irecv. The 
> > subroutine in question is called by each processor and is structured 
> > similarly to the pseudo code below. The subroutine is successfully 
> > called several thousand times before the silent hang behavior manifests, 
> > and it never resolves. The hang occurs in nearly (but not exactly) the 
> > same spot for bit-wise identical tests. During the hang, all MPI ranks 
> > are at the Line 18 Barrier except for two: one is waiting at Line 17 
> > for its Irecv to complete, and the other is in one of the Ssends at 
> > Line 9 or Line 14. This suggests that an MPI_Irecv never completes and 
> > a processor is indefinitely blocked in its Ssend, unable to complete 
> > the transfer.
> > 
> > I’ve found a similar discussion of this kind of behavior on the Open MPI 
> > mailing list 
> > (https://www.mail-archive.com/users@lists.open-mpi.org/msg19227.html), 
> > which was ultimately resolved by setting the MCA parameter 
> > btl_openib_flags to 304 or 305 (default 310): 
> > https://www.mail-archive.com/users@lists.open-mpi.org/msg19277.html. I 
> > have seen some promising behavior by doing the same. As that thread 
> > suggests, this implies a problem with the InfiniBand RDMA protocols for 
> > large messages.
> > 
> > I wanted to breathe life back into this conversation, as the silent hang 
> > issue is particularly debilitating and confusing to me. Increasing or 
> > decreasing the number of processors does not seem to alleviate the 
> > issue, and using MPI_Send results in the same behavior. Perhaps a 
> > message has exceeded a memory limit? I am running a test now that 
> > reports the individual message sizes, but a check I previously 
> > implemented for buffer size discrepancies is never triggered. In the 
> > meantime, has anyone run into similar issues or have thoughts on 
> > remedies for this behavior?
> > 
> > 1:  call MPI_Barrier(…)
> > 2:  do i = 1,nprocs
> > 3:    if(commatrix_recv(i) .gt. 0) then   ! Identify which procs to receive from via predefined matrix
> > 4:      call MPI_Irecv(…)
> > 5:    endif
> > 6:  enddo
> > 7:  do j = mype+1,nprocs
> > 8:    if(commatrix_send(j) .gt. 0) then   ! Identify which procs to send to via predefined matrix
> > 9:      call MPI_Ssend(…)
> > 10:   endif
> > 11: enddo
> > 12: do j = 1,mype
> > 13:   if(commatrix_send(j) .gt. 0) then   ! Identify which procs to send to via predefined matrix
> > 14:     call MPI_Ssend(…)
> > 15:   endif
> > 16: enddo
> > 17: call MPI_Waitall(…)   ! Wait for all Irecv to complete
> > 18: call MPI_Barrier(…)
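> > 
> > For concreteness, a stripped-down, self-contained toy version of this 
> > pattern looks roughly like the program below (one integer per message 
> > and a full exchange with every other rank, so the buffer names and 
> > sizes are placeholders rather than my real data; note that ranks are 
> > 0-based here, unlike the 1-based loops above):
> > 
> > program exchange_sketch
> >   use mpi
> >   implicit none
> >   integer :: ierr, mype, nprocs, i, j, nreq
> >   integer, parameter :: tag = 100
> >   integer :: sendbuf
> >   integer, allocatable :: recvbuf(:), requests(:)
> > 
> >   call MPI_Init(ierr)
> >   call MPI_Comm_rank(MPI_COMM_WORLD, mype, ierr)
> >   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
> >   allocate(recvbuf(0:nprocs-1), requests(nprocs))
> >   sendbuf = mype
> >   nreq = 0
> > 
> >   call MPI_Barrier(MPI_COMM_WORLD, ierr)                ! line 1
> > 
> >   do i = 0, nprocs-1                                    ! lines 2-6: post all receives
> >     if (i /= mype) then                                 ! stand-in for commatrix_recv(i) .gt. 0
> >       nreq = nreq + 1
> >       call MPI_Irecv(recvbuf(i), 1, MPI_INTEGER, i, tag, &
> >                      MPI_COMM_WORLD, requests(nreq), ierr)
> >     endif
> >   enddo
> > 
> >   do j = mype+1, nprocs-1                               ! lines 7-11: send to higher ranks
> >     call MPI_Ssend(sendbuf, 1, MPI_INTEGER, j, tag, MPI_COMM_WORLD, ierr)
> >   enddo
> >   do j = 0, mype-1                                      ! lines 12-16: send to lower ranks
> >     call MPI_Ssend(sendbuf, 1, MPI_INTEGER, j, tag, MPI_COMM_WORLD, ierr)
> >   enddo
> > 
> >   call MPI_Waitall(nreq, requests, MPI_STATUSES_IGNORE, ierr)   ! line 17
> >   call MPI_Barrier(MPI_COMM_WORLD, ierr)                ! line 18
> > 
> >   call MPI_Finalize(ierr)
> > end program exchange_sketch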
> > 
> > Cluster information:
> > 
> > 30 processors
> > 
> > Managed by Slurm
> > 
> > OS: Red Hat v. 7.7
> > 
> > Thank you for any help/advice you can provide,
> > 
> > Sean
> > 
> > *Sean C. Lewis*
> > 
> > Doctoral Candidate
> > 
> > Department of Physics
> > 
> > Drexel University
> > 
> 
