Hi Gilles

Thanks for responding quickly; however, I am afraid I did not explain my
question clearly enough; my apologies.

What I am trying to understand is this:

My cluster has (say) 7 nodes. I use IP-over-Ethernet for orted (job launch
and control traffic); it is not used for MPI messaging. Let's say
that the IP addresses are 192.168.1.2-192.168.1.9. They are all in the same
IP subnet.

MPI messaging is carried over some other interconnect, such as
InfiniBand. All 7 nodes are connected to the same InfiniBand switch and
hence are in the same (InfiniBand) subnet as well.

In my host file, I list (say) 4 IP addresses: 192.168.1.3-192.168.1.7

My question is: how does Open MPI pick the 4 InfiniBand interfaces that
match those IP addresses? Put another way, the ranks of each launched job
are (I presume) set up by orted through some mechanism. When I do an
MPI_Send() to a given rank, the message goes to the InfiniBand interface
with a particular LID. How does this IP-to-InfiniBand-LID mapping happen?

Thanks
Durga

We learn from history that we never learn from history.

On Fri, Apr 8, 2016 at 12:12 AM, Gilles Gouaillardet <gil...@rist.or.jp>
wrote:

> Hi,
>
> the hostnames (or their IPs) are only used to ssh orted.
>
>
> if you use only the tcp btl :
>
> TCP *MPI* communications (vs OOB management communications) are handled by
> btl/tcp.
> by default, all usable interfaces are used: messages are split (iirc, by
> the ob1 pml) into "fragments", which are then sent over all interfaces.
>
> each interface has a latency and a bandwidth that are used to split a
> message into fragments.
> (assuming it is correctly configured, 90% of a large message is sent over
> the 10GbE interface, and 10% is sent over the GbE interface)
>
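
A rough sketch of the bandwidth-proportional split described above. It is
an illustration only, not Open MPI's actual code: the function name, the
interface names and the bandwidth figures are all made up, and the latency
weighting and eager/rendezvous handling of the real ob1 pml are ignored.

```python
def split_by_bandwidth(message_bytes, bandwidths):
    """Divide a message across interfaces in proportion to their bandwidth.

    bandwidths maps an interface name to its bandwidth (any consistent unit).
    Returns a dict mapping each interface name to the bytes assigned to it.
    """
    total = sum(bandwidths.values())
    names = list(bandwidths)
    shares = {}
    assigned = 0
    for name in names[:-1]:
        share = message_bytes * bandwidths[name] // total
        shares[name] = share
        assigned += share
    # The last interface takes the remainder so the shares sum exactly.
    shares[names[-1]] = message_bytes - assigned
    return shares


# Hypothetical interfaces: one 10GbE and one GbE link, bandwidth in Mbit/s.
links = {"eth_10g": 10_000, "eth_1g": 1_000}
shares = split_by_bandwidth(1_100_000, links)
print(shares)
```

With a 10:1 bandwidth ratio the faster link carries about ten elevenths of
the bytes, which is the "roughly 90% / 10%" behaviour mentioned above.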
> you can explicitly include or exclude (blacklist) interfaces:
> mpirun --mca btl_tcp_if_include ...
> or
> mpirun --mca btl_tcp_if_exclude ...
>
> (see ompi_info --all for the syntax)
>
>
> but if you use several btls (for example tcp and openib), the btl(s) with
> the lower exclusivity are not used.
> (for example, a large message is *not* split and sent using native ib,
> IPoIB and GbE because the openib btl
> has a higher exclusivity than the tcp btl)
>
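
The exclusivity rule above can be pictured with a small sketch. This is a
toy model under stated assumptions, not the Open MPI selection code: the
function name is hypothetical and the exclusivity numbers are illustrative
values chosen only so that openib ranks above tcp.

```python
def usable_btls(reachable):
    """Return the transports that would carry traffic to a peer.

    reachable maps a btl name to its exclusivity for one peer.
    Only the btl(s) with the highest exclusivity are kept; transports
    that tie at the top value are all kept and can share the traffic.
    """
    top = max(reachable.values())
    return sorted(name for name, excl in reachable.items() if excl == top)


# Illustrative exclusivity values (assumed, not read from any installation).
peer = {"openib": 1024, "tcp": 100}
print(usable_btls(peer))
```

In this model a peer reachable over both openib and tcp is served by openib
alone, so nothing is sent over IPoIB or GbE for that peer.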
>
> did this answer your question ?
>
> Cheers,
>
> Gilles
>
>
>
> On 4/8/2016 12:24 PM, dpchoudh . wrote:
>
> Hello all
>
> (Newbie warning! Sorry :-(  )
>
> Let's say my cluster has 7 nodes, connected via IP-over-Ethernet for
> control traffic and some kind of raw verbs (or anything else such as SRIO)
> interface for data transfer. Let's say my host file chooses 4 out of the 7
> nodes for an MPI job, based on the IP address, which are assigned to the
> Ethernet interfaces.
>
> My question is: where in the code is the mapping from an IP address to
> whatever interface is used for MPI_Send/Recv determined, such that only
> the chosen nodes receive traffic over the verbs interface?
>
> Thanks in advance
> Durga
>
> We learn from history that we never learn from history.
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2016/04/18746.php
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/04/18747.php
>
