Hi,

How many MPI tasks are you running?
Are you running from a terminal? From two different jobs? Two mpirun invocations within the same job?
What happens next? A hang? An abort? A crash? Or does the app run just fine?

fwiw, the message says that rank 3 received an unexpected connection from rank 4

Cheers,

Gilles

On Tue, May 5, 2020 at 9:08 AM Kulshrestha, Vipul via users <users@lists.open-mpi.org> wrote:
>
> Hi,
>
> Could somebody explain what does these warning imply? Is this caused if 2
> distinct openmpi application end up running on same machine?
>
> I am using 4.0.1 version.
>
> Thanks,
> Vipul
>
> Message in the stdout of the application
>
> [orw-med-fenway1][[61362,1],3][btl_tcp_endpoint.c:626:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[61362,1],4]
>
> Messages from mpirun:
>
> --------------------------------------------------------------------------
> WARNING: Open MPI accepted a TCP connection from what appears to be a
> another Open MPI process but cannot find a corresponding process
> entry for that peer.
>
> This attempted connection will be ignored; your MPI job may or may not
> continue properly.
>
> Local host: orw-med-fenway2
> PID: 10748
> --------------------------------------------------------------------------
>
> [orw-med-pats1:30498] 8 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [orw-med-pats1:30498] 4 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 1 more process has sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 9 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
> [orw-med-pats1:30498] 3 more processes have sent help message help-mpi-btl-tcp.txt / server accept cannot find guid
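
For anyone reading the stdout line above, here is a minimal sketch of how the "rank 3 ... rank 4" reading falls out of it. It assumes the bracketed names have the form [[job family, local job id], rank]; that reading of the fields, the regular expression, and the helper name parse_names are assumptions made for illustration and are not part of Open MPI.

```python
import re

# Assumed layout of a process name as printed in the log: [[job_family,local_job],rank]
NAME = re.compile(r"\[\[(\d+),(\d+)\],(\d+)\]")

def parse_names(line):
    """Return (job_family, local_job, rank) tuples for every process name found in line."""
    return [tuple(int(g) for g in m.groups()) for m in NAME.finditer(line)]

# The stdout line quoted in the original message.
line = ("[orw-med-fenway1][[61362,1],3][btl_tcp_endpoint.c:626:"
        "mca_btl_tcp_endpoint_recv_connect_ack] "
        "received unexpected process identifier [[61362,1],4]")

# First name: the process that logged the message; second: the unexpected peer.
receiver, sender = parse_names(line)
print("receiver rank:", receiver[2], "sender rank:", sender[2])
print("same job identifier:", receiver[:2] == sender[:2])
```

In this particular log both names carry the same [61362,1] prefix, which may help when answering whether two different jobs are involved, though it does not settle the question by itself.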