Hi Gilles,

I'm currently testing and here are some preliminary results:

On Thu, Mar 23, 2017 at 10:33 AM, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com> wrote:
> Can you please try
> mpirun --mca btl tcp,self ...

This failed to produce any program output; there were many errors like this:
[pax11-00][[54124,1],31][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect]
connect() to 192.168.225.202 failed: Connection timed out (110)

I had to terminate the job.

That's why I added the option --mca btl_tcp_if_exclude ib0. In
this case, the program started to produce output, but hung
early on with this error:
 
[pax11-00][[61232,1],31][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect]
connect() to 127.0.0.1 failed: Connection refused (111)
[pax11-01][[61232,1],63][btl_tcp_endpoint.c:649:mca_btl_tcp_endpoint_recv_connect_ack]
received unexpected process identifier [[61232,1],33]

I have aborted that job as well.
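
A possible cause of the 127.0.0.1 error, as far as I understand it:
setting btl_tcp_if_exclude replaces the default exclude list, which
normally contains the loopback interface, so loopback has to be listed
again explicitly. A sketch of what that would look like, assuming the
loopback interface is named "lo" on these nodes (not verified here):

```shell
# Sketch only, not a tested fix.
# Assumption: the loopback interface is called "lo" on all nodes.
# "..." stands for the rest of the original command line, unchanged.
mpirun --mca btl tcp,self --mca btl_tcp_if_exclude lo,ib0 ...
```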


> And if it works
> mpirun --mca btl openib,self ...
This is running fine so far but will take some more time.

Regards, Götz
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
