Folks,

This RFC is a follow-up to
 * issue 585 https://github.com/open-mpi/ompi/issues/585
 * related PR 591 https://github.com/open-mpi/ompi/pull/591

As some of you might have already noticed, Open MPI fails if it is configure'd with --enable-ipv6 and IPv6 interfaces are found on the system.

The root cause is that IPv6 link-local addresses are not (yet) handled correctly.

Wikipedia has a good page about link-local addresses at http://en.wikipedia.org/wiki/Link-local_address

Basically, in IPv4, link-local addresses are in 169.254.0.0/16 and should be used only when zeroconf'ing the IP stack. In IPv6, on the other hand, link-local addresses are in fe80::/10 and are also used when zeroconf'ing, but one must always be present on the interface, in addition to any non-link-local address.
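
To make the two ranges concrete, here is a minimal sketch of a link-local test. It is plain Python using the standard ipaddress module, not Open MPI code, and the helper name is_link_local is only an illustration. It assumes addresses arrive as strings, possibly with a zone suffix such as "%eth0".

```python
import ipaddress

# The two link-local ranges mentioned above.
LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")
LINK_LOCAL_V6 = ipaddress.ip_network("fe80::/10")


def is_link_local(addr: str) -> bool:
    """Return True if addr is an IPv4 or IPv6 link-local address.

    Hypothetical helper for illustration only.
    """
    # Drop an optional zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(addr.split("%", 1)[0])
    net = LINK_LOCAL_V4 if ip.version == 4 else LINK_LOCAL_V6
    return ip in net
```

For example, is_link_local("fe80::1%eth0") and is_link_local("169.254.10.1") are both true, while is_link_local("2001:db8::1") is false.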

Currently, these addresses are treated as regular addresses, but the tcp btl (and probably oob tcp too) does not know how to handle them, and that causes Open MPI to crash.

I can think of three options:
1) it is very unlikely that a user wants Open MPI to use a link-local address, so link-local addresses should simply be skipped
2) each module should decide if/how to handle link-local addresses
3) all modules should correctly handle link-local addresses (that requires some extra development)
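
To illustrate what option 1) amounts to, here is a rough sketch in Python rather than in Open MPI's C code. The function name usable_addresses and the idea of a flat list of address strings are assumptions made for the example. The actual change would live wherever the interfaces are enumerated.

```python
import ipaddress


def usable_addresses(candidates):
    """Option 1 sketch: drop link-local addresses, keep the rest in order.

    Hypothetical helper; candidates is assumed to be a list of address
    strings, possibly with a zone suffix such as "%eth0".
    """
    kept = []
    for addr in candidates:
        # Strip the zone suffix so the address parses on any Python 3.
        ip = ipaddress.ip_address(addr.split("%", 1)[0])
        if ip.is_link_local:
            # 169.254.0.0/16 or fe80::/10: never hand it to the caller.
            continue
        kept.append(addr)
    return kept
```

With this approach the modules never see a link-local address, so none of them needs to learn how to handle one.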

As far as I am concerned, I am fine with 1), because I think it is very unlikely that a user will ever want to use link-local addresses.

Thanks in advance for your feedback so we can move forward.

Cheers,

Gilles

