Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-06 Thread Scott Shaw
amis (Pasha) > Sent: Wednesday, June 04, 2008 5:18 PM > To: Open MPI Users > Subject: Re: [OMPI users] OpenMPI scaling > 512 cores > > Scott Shaw wrote: > > Hi, I hope this is the right forum for my questions. I am running into a problem when scaling >512 core

Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-04 Thread Pavel Shamis (Pasha)
Scott Shaw wrote: Hi, I hope this is the right forum for my questions. I am running into a problem when scaling >512 cores on an InfiniBand cluster which has 14,336 cores. I am new to Open MPI and trying to figure out the right -mca options to pass to avoid the "mca_oob_tcp_peer_complete_connect:

Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-04 Thread Jeff Squyres
One other parameter that I neglected to mention (and Scott pointed out to me is *not* documented in the FAQ) is the mpi_preconnect_oob MCA parameter. This parameter will cause all the OOB connections to be created during MPI_INIT, and *may* help with these kinds of issues. You *do* need to have

Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-04 Thread Åke Sandgren
On Wed, 2008-06-04 at 11:43 -0700, Scott Shaw wrote: > Hi, I was wondering if anyone had any comments regarding my posting of questions. Am I off base with my questions or is this the wrong forum for these types of questions? > > Hi, I hope this is the right forum for my

Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-04 Thread Jeff Squyres
First and foremost: is it possible to upgrade your version of Open MPI? The version you are using (1.2.2) is rather ancient -- many bug fixes have occurred since then (including TCP wireup issues). Note that oob_tcp_in|exclude were renamed to oob_tcp_if_in|exclude in 1.2.3 to be

[OMPI users] OpenMPI scaling > 512 cores

2008-06-03 Thread Scott Shaw
Hi, I hope this is the right forum for my questions. I am running into a problem when scaling >512 cores on an InfiniBand cluster which has 14,336 cores. I am new to Open MPI and trying to figure out the right -mca options to pass to avoid the "mca_oob_tcp_peer_complete_connect: connection failed:"