Hi Gilles,

Thanks for responding quickly. I am afraid I did not explain my question clearly enough; my apologies.

What I am trying to understand is this:

My cluster has (say) 7 nodes. I use IP-over-Ethernet for orted (job launch and control traffic); it is not used for MPI messaging. Let's say the IP addresses are 192.168.1.2-192.168.1.9. They are all in the same IP subnet.

MPI messaging is carried over some other interconnect, such as InfiniBand. All 7 nodes are connected to the same InfiniBand switch and hence are in the same (InfiniBand) subnet as well.

In my host file, I list (say) 4 IP addresses: 192.168.1.3-192.168.1.7

My question is: how does Open MPI pick the 4 InfiniBand interfaces that match those IP addresses? Put another way, the ranks of each launched job are (I presume) set up by orted by some mechanism. When I do an MPI_Send() to a given rank, the message goes to the InfiniBand interface with a particular LID. How does this IP-to-InfiniBand-LID mapping happen?

Thanks
Durga

We learn from history that we never learn from history.

On Fri, Apr 8, 2016 at 12:12 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> Hi,
>
> the hostnames (or their IPs) are only used to ssh orted.
>
> if you use only the tcp btl :
>
> TCP *MPI* communications (vs OOB management communications) are
> handled by btl/tcp
> by default, all usable interfaces are used, then messages are split
> (iirc, by ob1 pml) and then "fragments" are sent using all interfaces.
>
> each interface has a latency and bandwidth that is used to split
> message into fragments.
> (assuming it is correctly configured, 90% of a large message is sent
> over the 10GbE interface, and 10% is sent over the GbE interface)
>
> you can explicitly list/blacklist interfaces
> mpirun --mca btl_tcp_if_include ...
> or
> mpirun --mca btl_tcp_if_exclude ...
>
> (see ompi_info --all for the syntax)
>
> but if you use several btls (for example tcp and openib), the btl(s)
> with the lower exclusivity are not used.
> (for example, a large message is *not* split and sent using native ib,
> IPoIB and GbE because the openib btl has a higher exclusivity than the
> tcp btl)
>
> did this answer your question ?
>
> Cheers,
>
> Gilles
>
> On 4/8/2016 12:24 PM, dpchoudh . wrote:
> > Hello all
> >
> > (Newbie warning! Sorry :-( )
> >
> > Let's say my cluster has 7 nodes, connected via IP-over-Ethernet for
> > control traffic and some kind of raw verbs (or anything else such as
> > SRIO) interface for data transfer. Let's say my host file chooses 4
> > out of the 7 nodes for an MPI job, based on the IP addresses, which
> > are assigned to the Ethernet interfaces.
> >
> > My question is: where in the code is this
> > IP-to-whatever_interface_is_used_for_MPI_Send/Recv mapping
> > determined, such that only the chosen nodes receive traffic over the
> > verbs interface?
> >
> > Thanks in advance
> > Durga
> >
> > We learn from history that we never learn from history.
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2016/04/18746.php
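
For reference, the bandwidth-weighted splitting mentioned in the quoted reply (roughly 90% of a large message over 10GbE and 10% over GbE) can be illustrated with a small sketch. This is not Open MPI's actual code: the function name split_by_bandwidth and the bandwidth figures are hypothetical, and the sketch assumes shares depend only on bandwidth, ignoring the latency term the reply also mentions.

```python
def split_by_bandwidth(message_bytes, bandwidths):
    """Split a message across interfaces in proportion to bandwidth.

    Illustrative only (not Open MPI code). `bandwidths` maps a
    hypothetical interface name to an integer bandwidth in any
    consistent unit (Mbit/s here). Returns a dict mapping each
    interface name to the number of bytes assigned to it.
    """
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    total = sum(bandwidths.values())
    if total <= 0:
        raise ValueError("at least one interface with positive bandwidth is required")

    names = list(bandwidths)
    shares = {}
    assigned = 0
    # Every interface but the last gets its proportional share, rounded down.
    for name in names[:-1]:
        share = message_bytes * bandwidths[name] // total
        shares[name] = share
        assigned += share
    # The last interface takes the remainder so that no byte is lost to rounding.
    shares[names[-1]] = message_bytes - assigned
    return shares


if __name__ == "__main__":
    # Assumed figures: one 10GbE link and one GbE link, in Mbit/s.
    result = split_by_bandwidth(1_100_000, {"10GbE": 10_000, "GbE": 1_000})
    for name, nbytes in result.items():
        print(name, nbytes)
```

With these assumed figures the 10GbE link carries 1,000,000 of the 1,100,000 bytes (about 91%) and the GbE link carries the remaining 100,000 (about 9%), which is close to the 90/10 split described in the reply.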