On Mar 13, 2006, at 8:38 AM, Michael Kluskens wrote:
On Mar 11, 2006, at 1:00 PM, Jayabrata Chakrabarty wrote:
Hi, I have been looking for information on how to use multiple
Gigabit Ethernet interfaces for MPI communication.
So far, what I have found out is that I have to use mca_btl_tcp.
But what I wish to know is what IP address to assign to each
network interface. I also wish to know if there will be any change
in the format of the "hostfile".
I have two Gigabit Ethernet interfaces on a cluster of 5 nodes at
present.
It seems to me that an easier approach would be to bond the ethernet
interfaces together at the Unix/Linux level and then you have only
one ethernet interface to worry about in MPI. Our Opteron-based
cluster shipped with that setup in SUSE Linux. When I rebuilt it
with Debian Linux, I configured the ethernet interface bonding
myself using references I found via Google. My master node has
three physical interfaces and two IP addresses; all the rest have
two physical interfaces and one IP address.
I have not tested throughput to see if I chose the best type of
bonding, but the choices were clear enough.
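
Roughly what that OS-level bonding looks like (a minimal sketch,
assuming the Linux bonding driver and the ifenslave utility; the
interface names, addresses, and bonding mode below are illustrative
rather than taken from the setup described above):

    # load the bonding driver in round-robin mode with link monitoring
    modprobe bonding mode=balance-rr miimon=100

    # give the bond a single IP address and enslave both physical NICs
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1

After that, MPI (and NFS, etc.) only ever sees the one bond0
interface.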
That is one option, yes. However, channel bonding can result in
much lower performance than letting Open MPI do the striping and
fragmenting. This is true for a couple of reasons. First, channel
bonding requires that packet delivery be in order, so it cannot
round-robin short message delivery. While we may have to queue a
message temporarily, we can effectively use both NICs for short
messages.
Second, our effective bandwidth for large messages should be nearly
N times the effective bandwidth of one NIC. This is rarely the case
for channel bonding, again because of ordering issues. We don't
even have to queue long message fragments internally in the
multi-NIC case, as we can immediately write that part of the
message into user space (even if it falls after a fragment we
haven't received yet).
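
To make that concrete, here is a hedged sketch of running over both
GigE interfaces with the TCP BTL (hostnames, slot counts, and
interface names are illustrative). The hostfile format does not
change for multiple NICs; each interface normally just gets its own
address on a separate subnet, and by default the TCP BTL will find
and use every usable interface, so the explicit btl_tcp_if_include
below only shows how to name the interfaces by hand:

    # hostfile: one line per node, optionally with a slot count
    node01 slots=2
    node02 slots=2
    node03 slots=2
    node04 slots=2
    node05 slots=2

    # run over TCP, striping across both GigE interfaces
    mpirun --hostfile myhosts -np 10 \
        --mca btl tcp,self \
        --mca btl_tcp_if_include eth0,eth1 \
        ./my_mpi_app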
Of course, if you also need more bandwidth for NFS or for MPI
implementations that don't support multi-NIC usage, you might not
have an option other than channel bonding.
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/