Re: [OMPI devel] Multi-rail on openib

2009-06-14 Thread Pavel Shamis (Pasha)
Nifty Tom Mitchell wrote: On Tue, Jun 09, 2009 at 04:33:51PM +0300, Pavel Shamis (Pasha) wrote: Open MPI currently needs to have connected fabrics, but maybe that's something we will like to change in the future, having two separate rails. (Btw Pasha, will your current work enable this ?)

Re: [OMPI devel] Multi-rail on openib

2009-06-12 Thread Nifty Tom Mitchell
On Tue, Jun 09, 2009 at 04:33:51PM +0300, Pavel Shamis (Pasha) wrote:
>> Open MPI currently needs to have connected fabrics, but maybe that's
>> something we will like to change in the future, having two separate
>> rails. (Btw Pasha, will your current work enable this ?)
> I do not completel

Re: [OMPI devel] Multi-rail on openib

2009-06-09 Thread Pavel Shamis (Pasha)
Open MPI currently needs to have connected fabrics, but maybe that's something we will like to change in the future, having two separate rails. (Btw Pasha, will your current work enable this ?) I do not completely understand what you mean here by two separate rails ... Already today you

Re: [OMPI devel] Multi-rail on openib

2009-06-09 Thread Sylvain Jeaugey
On Mon, 8 Jun 2009, NiftyOMPI Tom Mitchell wrote: dual rail does double the number of switch ports. If you want to address switch failure each rail must connect to a different switch. If you do not want to have isolated fabrics you must have some additional ports on all switches to connect

Re: [OMPI devel] Multi-rail on openib

2009-06-09 Thread Pavel Shamis (Pasha)
Most of the IB protocols used by MPI target a LID. There is no existing notification path I know of that can replace LID-xyz with LID-123. The subnet manager might be able to do this but begs security issues. Interesting problem. It is not exactly correct. For migration between port

Re: [OMPI devel] Multi-rail on openib

2009-06-08 Thread NiftyOMPI Tom Mitchell
On 6/8/09, Sylvain Jeaugey wrote:
> Hi Tom,
>
> Yes, there is a goal in mind, and definitely not performance: we are
> working on device failover, i.e. when a network adapter or switch fails,
> use the remaining one. We don't intend to improve performance with
> multi-rail (which as you said, will

Re: [OMPI devel] Multi-rail on openib

2009-06-08 Thread Sylvain Jeaugey
Hi Tom, Yes, there is a goal in mind, and definitely not performance: we are working on device failover, i.e. when a network adapter or switch fails, use the remaining one. We don't intend to improve performance with multi-rail (which as you said, will not happen unless you have a DDR card wit

Re: [OMPI devel] Multi-rail on openib

2009-06-05 Thread Nifty Tom Mitchell
On Fri, Jun 05, 2009 at 09:52:39AM -0400, Jeff Squyres wrote:
> See this FAQ entry for a description:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
>
> Right now, there's no way to force a particular connection pattern on
> the openib btl at run-time. The startup s

Re: [OMPI devel] Multi-rail on openib

2009-06-05 Thread Jeff Squyres
See this FAQ entry for a description: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup Right now, there's no way to force a particular connection pattern on the openib btl at run-time. The startup sequence has gotten sufficiently complicated / muddied over the years tha

[OMPI devel] Multi-rail on openib

2009-06-05 Thread Mouhamed Gueye
Hi all, I am working on multi-rail IB and I was wondering how connections are established between ports. I have two hosts, each with 2 ports on the same IB card, connected to the same switch. My question is: how are the ports connected to each other? Is there a queue pair between all ports or o