Re: [OMPI users] How are the Open MPI processes spawned?

2011-12-06 Thread Ralph Castain
I'll take a look at having the rsh launcher forward MCA params up to the cmd line limit, and warn if there are too many to fit. Shouldn't be too hard, I would think. On Dec 6, 2011, at 1:28 PM, Paul Kapinos wrote: > Hello Jeff, Ralph, all! > Meaning that per my output from above, what
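A minimal sketch of the idea described above, not the actual Open MPI rsh launcher code: collect `OMPI_MCA_*` environment variables, append each one as `-mca <name> <value>` to the daemon command line while it still fits under a length limit, and warn about any that do not fit. The function name `forward_mca_params`, the `max_len` parameter, and the way length is counted are all assumptions made for illustration.

```python
import warnings

MCA_PREFIX = "OMPI_MCA_"  # environment prefix Open MPI uses for MCA params


def forward_mca_params(environ, base_cmd, max_len):
    """Append '-mca name value' per OMPI_MCA_* variable while the command fits.

    Hypothetical helper: returns (command argv, list of skipped variable names).
    Length is counted as the space-joined argv; shell quoting is ignored here.
    """
    cmd = list(base_cmd)
    length = len(" ".join(cmd))
    skipped = []
    for key in sorted(environ):
        if not key.startswith(MCA_PREFIX):
            continue
        extra = ["-mca", key[len(MCA_PREFIX):], environ[key]]
        # +1 per part accounts for the separating space on the command line.
        extra_len = sum(len(part) + 1 for part in extra)
        if length + extra_len > max_len:
            skipped.append(key)
            continue
        cmd.extend(extra)
        length += extra_len
    if skipped:
        warnings.warn(
            "MCA params not forwarded (command line full): " + ", ".join(skipped)
        )
    return cmd, skipped
```

A caller would pass something like `os.environ`, the base daemon command (for example `["orted"]`), and whatever limit applies on the target system. The limit itself is platform dependent and is left to the caller in this sketch.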

Re: [OMPI users] How are the Open MPI processes spawned?

2011-12-06 Thread Paul Kapinos
Hello Jeff, Ralph, all! Meaning that per my output from above, what Paul was trying should have worked, no? I.e., setenv'ing OMPI_, and those env vars should magically show up in the launched process. In the -launched process- yes. However, his problem was that they do not show up for the

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Jeff Squyres
On Nov 28, 2011, at 7:39 PM, Ralph Castain wrote: >> Meaning that per my output from above, what Paul was trying should have >> worked, no? I.e., setenv'ing OMPI_, and those env vars should >> magically show up in the launched process. > > In the -launched process- yes. However, his problem

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Ralph Castain
On Nov 28, 2011, at 5:32 PM, Jeff Squyres wrote: > On Nov 28, 2011, at 6:56 PM, Ralph Castain wrote: > Right-o. Knew there was something I forgot... > >> So on rsh, we do not put envar mca params onto the orted cmd line. This has >> been noted repeatedly on the user and devel lists, so it

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Jeff Squyres
On Nov 28, 2011, at 6:56 PM, Ralph Castain wrote: > I'm afraid that example is incorrect - you were running under slurm on your > cluster, not rsh. Ummm... right. Duh. > If you look at the actual code, you will see that we slurp up the envars into > the environment of each app_context, and

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Ralph Castain
I'm afraid that example is incorrect - you were running under slurm on your cluster, not rsh. If you look at the actual code, you will see that we slurp up the envars into the environment of each app_context, and then send that to the backend. In environments like slurm, we can also apply

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Jeff Squyres
On Nov 28, 2011, at 5:39 PM, Jeff Squyres wrote: > (off list) Hah! So much for me discreetly asking off-list before coming back with a definitive answer... :-\

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-28 Thread Jeff Squyres
(off list) Are you sure about OMPI_MCA_* params not being treated specially? I know for a fact that they *used* to be. I.e., we bundled up all env variables that began with OMPI_MCA_* and sent them with the job to back-end nodes. It allowed sysadmins to set global MCA param values without

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-25 Thread Ralph Castain
On Nov 25, 2011, at 12:29 PM, Paul Kapinos wrote: > Hello again, > >>> Ralph Castain wrote: Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees.

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-25 Thread Paul Kapinos
Hello again, Ralph Castain wrote: Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-( Yahhh!! This behaviour - catch a random interface and hang forever if

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-24 Thread Ralph Castain
On Nov 24, 2011, at 11:49 AM, Paul Kapinos wrote: > Hello Ralph, Terry, all! > > Again, two pieces of news: the good one and the second one. > > Ralph Castain wrote: >> Yes, that would indeed break things. The 1.5 series isn't correctly checking >> connections across multiple interfaces until it finds

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-24 Thread Paul Kapinos
Hello Ralph, Terry, all! Again, two pieces of news: the good one and the second one. Ralph Castain wrote: Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-( Yahhh!!

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-23 Thread Ralph Castain
Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-( The solution is to specify -mca oob_tcp_if_include ib0. This will direct the run-time wireup across the IP
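As a small illustration of the workaround mentioned above, the following sketch builds an `mpirun` command line that passes `-mca oob_tcp_if_include ib0`. The helper name `mpirun_command`, the program `./a.out`, and the process count are placeholders, and `ib0` is only correct if that is the interface name on the nodes in question.

```python
import shlex


def mpirun_command(program, nprocs, oob_interface="ib0"):
    """Build an mpirun argv that restricts run-time (OOB) wireup to one interface.

    Hypothetical helper for illustration; the interface name is site specific.
    """
    return [
        "mpirun",
        "-mca", "oob_tcp_if_include", oob_interface,
        "-np", str(nprocs),
        program,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without MPI.
    print(shlex.join(mpirun_command("./a.out", 4)))
```

The list form can be handed directly to `subprocess.run` on a machine where Open MPI is installed; printing it first makes it easy to check the flags before launching.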

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-23 Thread TERRY DONTJE
On 11/23/2011 2:02 PM, Paul Kapinos wrote: Hello Ralph, hello all, Two pieces of news, as usual a good one and a bad one. The good: we believe we have found out *why* it hangs. The bad: it seems to me that this is a bug or at least an undocumented feature of Open MPI 1.5.x. In detail: As said, we see mysterious hang-ups when starting on some nodes using some

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-23 Thread Paul Kapinos
Hello Ralph, hello all, Two pieces of news, as usual a good one and a bad one. The good: we believe we have found out *why* it hangs. The bad: it seems to me that this is a bug or at least an undocumented feature of Open MPI 1.5.x. In detail: As said, we see mysterious hang-ups when starting on some nodes using some

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-22 Thread Ralph Castain
On Nov 22, 2011, at 10:10 AM, Paul Kapinos wrote: > Hello Ralph, hello all. > >> No real ideas, I'm afraid. We regularly launch much larger jobs than that >> using ssh without problem, > I was also able to run a 288-node-job yesterday - the size alone is not the > problem... > > > >> so it

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-22 Thread Paul Kapinos
Hello Ralph, hello all. No real ideas, I'm afraid. We regularly launch much larger jobs than that using ssh without problem, I was also able to run a 288-node-job yesterday - the size alone is not the problem... so it is likely something about the local setup of that node that is causing

Re: [OMPI users] How are the Open MPI processes spawned?

2011-11-21 Thread Ralph Castain
No real ideas, I'm afraid. We regularly launch much larger jobs than that using ssh without problem, so it is likely something about the local setup of that node that is causing the problem. Offhand, it sounds like either the mapper isn't getting things right, or for some reason the daemon on