Hi,

How many MPI tasks are you running?
Are you running from a terminal? From two different jobs? With two mpirun
invocations within the same job?
What happens next: a hang, an abort, a crash, or does the app run just fine?

FWIW, the message says that rank 3 (running on orw-med-fenway1) received an
unexpected connection from rank 4.
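
To make that reading concrete, here is a minimal sketch that pulls the host,
the receiving rank, and the connecting rank out of such a line. The function
name parse_unexpected_peer and the regular expression are my own and only
match the exact wording quoted in this thread. I am also assuming that a
process name printed as [[A,B],N] means "job identifier A,B, rank N"; this is
not an Open MPI API.

```python
import re

# Hypothetical pattern, written against the single log line quoted in this
# thread:
#   [host][[job,sub],rank][file:line:function] received unexpected
#   process identifier [[job,sub],rank]
LINE = re.compile(
    r"^\[(?P<host>[^\]]+)\]"
    r"\[\[(?P<job>\d+),(?P<sub>\d+)\],(?P<rank>\d+)\]"
    r"\[(?P<where>[^\]]+)\]\s*"
    r"received unexpected process identifier\s+"
    r"\[\[(?P<peer_job>\d+),(?P<peer_sub>\d+)\],(?P<peer_rank>\d+)\]"
)


def parse_unexpected_peer(line):
    """Return a dict describing the warning, or None if the line does not match."""
    # Mail clients often wrap the log line, so collapse all whitespace first.
    normalized = " ".join(line.split())
    match = LINE.search(normalized)
    if match is None:
        return None
    return {
        "host": match["host"],
        "receiver_rank": int(match["rank"]),
        "sender_rank": int(match["peer_rank"]),
        # True when both process names carry the same job identifier
        # (assumption: that means both ranks belong to the same mpirun).
        "same_job": (match["job"], match["sub"])
        == (match["peer_job"], match["peer_sub"]),
    }


if __name__ == "__main__":
    sample = (
        "[orw-med-fenway1][[61362,1],3]"
        "[btl_tcp_endpoint.c:626:mca_btl_tcp_endpoint_recv_connect_ack]\n"
        " received unexpected process identifier [[61362,1],4]"
    )
    print(parse_unexpected_peer(sample))
```

If the two job identifiers differ, that would point towards two separate
mpirun invocations talking to each other; if they match, as they appear to in
the line you posted, both ranks would belong to the same job.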

Cheers,

Gilles

On Tue, May 5, 2020 at 9:08 AM Kulshrestha, Vipul via users
<users@lists.open-mpi.org> wrote:
>
> Hi,
>
>
>
> Could somebody explain what these warnings imply? Are they caused when 2
> distinct Open MPI applications end up running on the same machine?
>
>
>
> I am using version 4.0.1.
>
>
>
> Thanks,
> Vipul
>
>
>
> Message in the stdout of the application
>
>
>
> [orw-med-fenway1][[61362,1],3][btl_tcp_endpoint.c:626:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[61362,1],4]
>
>
>
> Messages from mpirun:
>
> --------------------------------------------------------------------------
> WARNING: Open MPI accepted a TCP connection from what appears to be a
> another Open MPI process but cannot find a corresponding process
> entry for that peer.
>
> This attempted connection will be ignored; your MPI job may or may not
> continue properly.
>
>   Local host: orw-med-fenway2
>   PID:        10748
> --------------------------------------------------------------------------
>
> [orw-med-pats1:30498] 8 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [orw-med-pats1:30498] 4 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 9 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 3 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
