Thanks Jeff,

according to the FAQ, Open MPI should work on nodes that have different
numbers of IB ports (at least since v1.2)

about IB ports on the same subnet, all i was able to find is an explanation
of why i get this warning :

WARNING: There are more than one active ports on host '%s', but the
default subnet GID prefix was detected on more than one of these
ports.  If these ports are connected to different physical OFA
networks, this configuration will fail in Open MPI.  This version of
Open MPI requires that every physically separate OFA subnet that is
used between connected MPI processes must have different subnet ID
values.
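
for anyone who wants to check which ports this warning refers to, here is a
minimal sketch (not something taken from Open MPI itself) that groups the
active ports of a host by subnet prefix, i.e. the upper 64 bits of GID
index 0. it assumes the usual Linux sysfs layout
(/sys/class/infiniband/<device>/ports/<n>/state and .../gids/0); the
function names are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def subnet_prefix(gid: str) -> str:
    """Return the upper 64 bits (subnet prefix) of a fully expanded GID string."""
    groups = gid.strip().lower().split(":")
    if len(groups) != 8:
        raise ValueError(f"expected 8 colon-separated groups, got: {gid!r}")
    return ":".join(groups[:4])


def group_by_prefix(port_gids: dict) -> dict:
    """Map each subnet prefix to the sorted list of ports that use it."""
    grouped = defaultdict(list)
    for port, gid in port_gids.items():
        grouped[subnet_prefix(gid)].append(port)
    return {prefix: sorted(ports) for prefix, ports in grouped.items()}


def active_port_gids(root: str = "/sys/class/infiniband") -> dict:
    """Collect GID index 0 of every port whose state file contains ACTIVE.

    Assumption: the sysfs layout described above; returns {} if it is absent.
    """
    gids = {}
    for state_file in Path(root).glob("*/ports/*/state"):
        if "ACTIVE" not in state_file.read_text():
            continue
        port_dir = state_file.parent
        # Label each port as "<device>:<port number>".
        name = f"{port_dir.parent.parent.name}:{port_dir.name}"
        gids[name] = (port_dir / "gids" / "0").read_text()
    return gids


if __name__ == "__main__":
    for prefix, ports in group_by_prefix(active_port_gids()).items():
        note = "  <-- shared by several ports" if len(ports) > 1 else ""
        print(f"{prefix}: {', '.join(ports)}{note}")
```

if two active ports show up under fe80:0000:0000:0000 (the default prefix),
that is the situation the warning describes.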


i really had to read between the lines (and your email helped) in order
to figure out that putting IB ports on the same subnet is not optimal.

the following sentence is even more confusing :

"All this being said, note that there are valid network configurations
where multiple ports on the same host can share the same subnet ID value.
For example, two ports from a single host can be connected to the *same*
network as a bandwidth multiplier or a high-availability configuration."


from a practical standpoint (and this is not Open MPI specific), the two IB
ports of the servers are physically connected to the same IB switch.

/* i would guess the NVIDIA Ivy cluster is similar in that sense */

a few years ago (i.e. the last time i checked), using different subnets was
possible by partitioning the switch via OpenSM. IMHO this was not an easy
solution to maintain (e.g. if a switch was replaced, the OpenSM config had
to be changed as well).

is there a simple and free way today to put ports physically connected to
the same switch in different subnets ?

/* such as tagged VLANs in Ethernet => simple switch configuration, and the
host can decide by itself which VLAN a port belongs to */

Cheers,

Gilles

On Mon, Jun 2, 2014 at 8:50 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

>  I'm AFK but let me reply about the IB thing: double ports/multi rail is
> a good thing. It's not a good thing if they're on the same subnet.
>
>  Check the FAQ - http://www.open-mpi.org/faq/?category=openfabrics - I
> can't see it well enough on the small screen of my phone, but I think
> there's a q on there about how multi rail destinations are chosen.
>
>  Spoiler: put your ports in different subnets so that OMPI makes
> deterministic choices.
>
> Sent from my phone. No type good.
>
