Spawned tasks cannot use the sm or vader btl, so you need another one
(tcp, openib, ...).
The self btl is only for send/recv with oneself (i.e. it does not work for
inter-process, intra-node communications).

I am pretty sure the lo interface is always discarded by Open MPI, so I have
no solution off the top of my head that involves Open MPI.
Maybe your best bet is to use a "dummy" interface, for example tap or tun or
even a bridge.
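
As a rough, untested sketch of the dummy-interface idea: once such an
interface exists and is up, the tcp and self btl can be selected and tcp
restricted to that interface through the btl and btl_tcp_if_include mca
parameters. The interface name tap0, the program name ./spawner and the
helper build_mpirun_command are assumptions made up for illustration, and
I have not checked that this actually fixes the spawn case.

```python
import shlex


def build_mpirun_command(program, iface="tap0", nprocs=1):
    """Build an mpirun argv that selects the tcp and self btl only.

    iface is a hypothetical dummy interface; on Linux it could be created
    beforehand (as root) with something like:
        ip tuntap add dev tap0 mode tap
        ip addr add 10.0.0.1/24 dev tap0
        ip link set tap0 up
    The address above is an arbitrary example.
    """
    return [
        "mpirun",
        "-np", str(nprocs),
        # tcp for traffic between parent and spawned tasks, self for loopback
        "--mca", "btl", "tcp,self",
        # limit the tcp btl to the dummy interface
        "--mca", "btl_tcp_if_include", iface,
        program,
    ]


if __name__ == "__main__":
    # Print the command line so it can be inspected or pasted into a shell.
    print(shlex.join(build_mpirun_command("./spawner")))
```

The same two parameters can of course be typed directly on the mpirun
command line; the helper only makes the pieces explicit.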

Cheers,

Gilles



On Friday, March 11, 2016, Rémy Grünblatt <remy.grunbl...@ens-lyon.fr>
wrote:

> Hello,
> I'm having a communication problem between two processes (with one being
> spawned by the other, on the *same* physical machine). Everything works
> as expected when I have a network interface such as eth0 or wlo1 up, but
> as soon as they are down, I get errors (such as « At least one pair of
> MPI processes are unable to reach each other for MPI communications […] »).
> I tried to specify a set of mca parameters, including the "self" btl
> and the lo interface in the btl_tcp_if_include list, as
> advised by https://www.open-mpi.org/faq/?category=tcp but I didn't reach
> any working state for this code with the "external" network interfaces down.
>
> Do you have any idea what I might be doing wrong?
>
> Example code that triggers the problem: https://ptpb.pw/YOjr.tar.gz
> Ompi_info:  https://ptpb.pw/Vt_V.txt
> Full log: https://ptpb.pw/JCXn.txt
>
> Rémy
>
>
>
