I see similar issues on platforms with multiple IP addresses when some of
them are not fully connected. In general, restricting OMPI to a specific
interface or subnet (with --mca btl_tcp_if_include x.y.z.t/s) solves the
problem.
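
As a rough sketch of what that looks like in practice (the subnet
10.0.0.0/24 and the executable name ./my_app are placeholders, not values
taken from this thread):

```shell
# Restrict the TCP BTL to a single subnet, given in CIDR notation.
# 10.0.0.0/24 and ./my_app are placeholders; substitute the network
# that is reachable from every node in the job.
mpirun --mca btl_tcp_if_include 10.0.0.0/24 -np 2 ./my_app

# The same MCA parameter can be set through the environment instead of
# on the mpirun command line.
export OMPI_MCA_btl_tcp_if_include=10.0.0.0/24
```

The parameter also accepts a comma-separated list of interface names
(for example eth0), and it should not be combined with
btl_tcp_if_exclude.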

  George.


On Wed, Mar 16, 2022 at 5:11 PM Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:

> I’m using Open MPI 4.1.2 under Slurm 20.11.8.  My 2-process job is
> successfully launched, but when the main process (rank 0) attempts to
> create an intercommunicator with process rank 1 on the other node:
>
>
>
> MPI_Comm intercom;
>
> MPI_Intercomm_create(MPI_COMM_SELF, 0, MPI_COMM_WORLD, 1, <tag>,
>   &intercom);
>
>
>
> Open MPI spins deep inside the MPI_Intercomm_create code, and the
> following error is reported:
>
>
>
> *WARNING: Open MPI accepted a TCP connection from what appears to be a*
> *another Open MPI process but cannot find a corresponding process*
> *entry for that peer.*
>
> *This attempted connection will be ignored; your MPI job may or may not*
> *continue properly.*
>
>
>
> The output resulting from using the mpirun arguments “--mca
> ras_base_verbose 5 --display-devel-map --mca rmaps_base_verbose 5” is
> attached.
>
> Any help would be appreciated.
>
