George,

Thanks, that was it!

Kurt

From: George Bosilca <bosi...@icl.utk.edu>
Sent: Wednesday, March 16, 2022 4:38 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mcc...@nasa.gov>
Subject: [EXTERNAL] Re: [OMPI users] MPI_Intercomm_create error

I see similar issues on platforms with multiple IP addresses, if some of them 
are not fully connected. In general, specifying which interface OMPI can use 
(with --mca btl_tcp_if_include x.y.z.t/s) solves the problem.
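
As a minimal sketch of the suggestion above: the same MCA parameter can also be set through Open MPI's OMPI_MCA_ environment-variable prefix, which applies whether the job is started with mpirun or directly by the resource manager. The subnet 192.168.1.0/24 and the program name ./my_app are placeholders, not values from this thread; substitute the CIDR of a network that every node in the job can actually reach.

```shell
# Hypothetical subnet: replace with the network shared by all nodes
# (for example, as listed by `ip -4 addr show` on each node).
SUBNET="192.168.1.0/24"

# Restrict Open MPI's TCP BTL to interfaces on that subnet.
# Equivalent to: mpirun --mca btl_tcp_if_include "$SUBNET" ...
export OMPI_MCA_btl_tcp_if_include="$SUBNET"

# Then launch as usual, e.g.:
#   mpirun -np 2 ./my_app
```

Setting the variable in the job script keeps the launch line unchanged, at the cost of making the restriction less visible than an explicit --mca argument.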

  George.


On Wed, Mar 16, 2022 at 5:11 PM Mccall, Kurt E. (MSFC-EV41) via users 
<users@lists.open-mpi.org> wrote:
I’m using Open MPI 4.1.2 under Slurm 20.11.8. My two-process job launches 
successfully, but when the main process (rank 0) attempts to create an 
intercommunicator with rank 1 on the other node:

MPI_Comm intercom;
MPI_Intercomm_create(MPI_COMM_SELF, 0, MPI_COMM_WORLD, 1, <tag>,   &intercom);

Open MPI spins deep inside the MPI_Intercomm_create code, and the following 
error is reported:

WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.

This attempted connection will be ignored; your MPI job may or may not
continue properly.

The output resulting from using the mpirun arguments “--mca ras_base_verbose 5 
--display-devel-map --mca rmaps_base_verbose 5” is attached.
Any help would be appreciated.
