Re: [OMPI users] OpenMP + OpenMPI

Jeff Squyres Thu, 6 Dec 2007 10:08:48 -0500

On Dec 6, 2007, at 9:54 AM, Durga Choudhury wrote:

Automatically striping large messages across multiple NICs is certainly a very nice feature; I was not aware that OpenMPI does this transparently. (I wonder if other MPI implementations do this or not). However, I have the following concern: Since the communication over an ethernet NIC is most likely over IP, does it take into account the route cost when striping messages? For example, host A and B in the MPD ring might be connected via two NICs, one direct and one via an intermediate router, or one with a large bandwidth and another with a small bandwidth. Does OpenMPI send a smaller chunk of data over a route with a higher cost?


Not unless you tell it.

In IB networks, the network API exposes bandwidth differences of the NIC and Open MPI takes that into account by deciding how much data to send down each endpoint. Open MPI does not currently know anything about, or try to optimize based on, the costs of different routes.
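To make the bandwidth-weighted idea concrete, here is a minimal Python sketch of splitting one large message across endpoints in proportion to their advertised bandwidth. This is only an illustration of the general technique, not Open MPI's actual scheduling code; the function name `stripe_by_bandwidth` and the example bandwidth figures are made up for this example.

```python
def stripe_by_bandwidth(message_bytes, bandwidths_mbps):
    """Split a message across links in proportion to each link's bandwidth.

    Hypothetical helper for illustration only; it is not part of Open MPI.
    Returns one byte count per link, summing to message_bytes.
    """
    total = sum(bandwidths_mbps)
    if total <= 0:
        raise ValueError("at least one link must have positive bandwidth")

    # Proportional share per link, rounded down to whole bytes.
    chunks = [message_bytes * bw // total for bw in bandwidths_mbps]

    # Rounding down can leave a few bytes unassigned; give them to the
    # fastest link so the chunks always add up to the full message.
    leftover = message_bytes - sum(chunks)
    fastest = max(range(len(bandwidths_mbps)), key=lambda i: bandwidths_mbps[i])
    chunks[fastest] += leftover
    return chunks


if __name__ == "__main__":
    # Assumed example: one 1000 Mbps link and one 100 Mbps link.
    print(stripe_by_bandwidth(1_100_000, [1000, 100]))
```

Note that this weights only by raw bandwidth. Nothing in it knows about route cost or hop count, which matches the limitation described above.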

On a TCP network, whether you go through 2 or 3 switches -- does it really matter? The latency is so high that adding another switch (or 2 or 3 or ...) may not make much of a difference anyway. Raw bandwidth differences between two networks will make a difference, but number of hops -- as long as they're not *too* different -- might not.

Also consider: if you're combining 100Mbps and 1Gbps ethernet networks -- is it really worth it? If your goal is simple bandwidth addition, note that you're adding a fraction of the capability to the 1Gbps network at the cost of additional complexity in your software and/or fragmentation reassembly penalties. Will you really see more delivered bandwidth? It's probably dependent upon your application (e.g., are you continually sending very large messages?). You might get much more bang for your buck if you combine like networks (e.g., 2x100Mbps or 2x1Gbps) because you'll be [potentially] doubling your bandwidth.
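The arithmetic behind that comparison can be sketched in a few lines of Python. This is a best-case estimate only: it assumes perfect striping with zero software or reassembly overhead, which real networks will not deliver, and the helper name `ideal_speedup` is invented for this example.

```python
def ideal_speedup(primary_mbps, extra_mbps):
    """Best-case aggregate bandwidth relative to the primary link alone.

    Assumes perfect striping and no overhead (an idealization).
    """
    if primary_mbps <= 0:
        raise ValueError("primary link bandwidth must be positive")
    return (primary_mbps + extra_mbps) / primary_mbps


if __name__ == "__main__":
    # (primary, extra) link speeds in Mbps, taken from the cases above.
    for primary, extra in [(1000, 100), (100, 100), (1000, 1000)]:
        gain = ideal_speedup(primary, extra) - 1.0
        print(f"{primary} + {extra} Mbps: at most {gain:.0%} more bandwidth")
```

Adding a 100Mbps link to a 1Gbps link can gain about 10% at the very best, while pairing like networks can gain up to 100%, before any overhead is subtracted.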

Because of this concern, I think the channel bonding approach someone else suggested is preferable; all these details will be taken care of at the hardware level instead of at the IP level.

That's not quite true. Both approaches are handled in software; one is in the kernel, the other is in the middleware. The hardware is unaware that you are striping large messages.


--
Jeff Squyres
Cisco Systems
