I'll take a look at having the rsh launcher forward MCA params up to the cmd
line limit, and warn if there are too many to fit. Shouldn't be too hard, I
would think.
On Dec 6, 2011, at 1:28 PM, Paul Kapinos wrote:
> Hello Jeff, Ralph, all!
>
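The forwarding Jeff describes above could look roughly like this - a hypothetical sketch in plain shell, not the actual Open MPI launcher code; the 4096-character limit and the sample OMPI_MCA_* variables are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sketch of forwarding OMPI_MCA_* env vars as "-mca" args
# on the orted command line, warning when they no longer fit. Not Open
# MPI source; values containing spaces would need more careful quoting.
LIMIT=4096   # illustrative limit, not a real system constant

export OMPI_MCA_btl=tcp,self
export OMPI_MCA_mpi_show_handle_leaks=1

args=""
for kv in $(env | grep '^OMPI_MCA_'); do
    name=${kv%%=*}      # part before the first '='
    value=${kv#*=}      # part after the first '='
    args="$args -mca ${name#OMPI_MCA_} $value"
done

if [ ${#args} -gt "$LIMIT" ]; then
    echo "warning: too many MCA params to fit on the cmd line" >&2
fi
echo "orted$args"
```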
Hello Jeff, Ralph, all!
Meaning that per my output from above, what Paul was trying should have worked, no?
I.e., setenv'ing OMPI_, and those env vars should magically show up
in the launched process.
In the -launched process- yes. However, his problem was that they do not show
up for the
On Nov 28, 2011, at 5:32 PM, Jeff Squyres wrote:
> On Nov 28, 2011, at 6:56 PM, Ralph Castain wrote:
> Right-o. Knew there was something I forgot...
>
>> So on rsh, we do not put envar mca params onto the orted cmd line. This has
>> been noted repeatedly on the user and devel lists, so it
On Nov 28, 2011, at 6:56 PM, Ralph Castain wrote:
> I'm afraid that example is incorrect - you were running under slurm on your
> cluster, not rsh.
Ummm... right. Duh.
> If you look at the actual code, you will see that we slurp up the envars into
> the environment of each app_context, and
I'm afraid that example is incorrect - you were running under slurm on your
cluster, not rsh. If you look at the actual code, you will see that we slurp up
the envars into the environment of each app_context, and then send that to the
backend.
In environments like slurm, we can also apply
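The mechanism Ralph describes rests on ordinary environment inheritance. A toy demonstration in plain shell, with no Open MPI involved - the child `sh -c` stands in for the launched app, and `OMPI_MCA_btl` is just a sample name:

```shell
# Variables placed in a process's environment automatically appear in
# any process it launches - no command-line forwarding needed.
export OMPI_MCA_btl=tcp,self

# The "launched process" (a child shell) sees the variable:
sh -c 'echo "child sees OMPI_MCA_btl=$OMPI_MCA_btl"'
# prints: child sees OMPI_MCA_btl=tcp,self
```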
On Nov 28, 2011, at 5:39 PM, Jeff Squyres wrote:
> (off list)
Hah! So much for me discreetly asking off-list before coming back with a
definitive answer... :-\
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
(off list)
Are you sure about OMPI_MCA_* params not being treated specially? I know for a
fact that they *used* to be. I.e., we bundled up all env variables that began
with OMPI_MCA_* and sent them with the job to back-end nodes. It allowed
sysadmins to set global MCA param values without
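For reference, the two usual ways such a global default can be set - a sketch assuming the standard `mca-params.conf` lookup applies; a temp directory stands in for a real install's `etc/` so nothing is actually touched:

```shell
# 1. Per-environment: any OMPI_MCA_<param> variable becomes an MCA param.
export OMPI_MCA_btl=tcp,self

# 2. Site-wide file: Open MPI also reads "name = value" lines from
#    mca-params.conf (system-wide: $prefix/etc/openmpi-mca-params.conf,
#    per-user: $HOME/.openmpi/mca-params.conf). Temp dir used here:
conf_dir=$(mktemp -d)
cat > "$conf_dir/mca-params.conf" <<'EOF'
# default transports for all users
btl = tcp,self
EOF
cat "$conf_dir/mca-params.conf"
```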
Hello again,
Ralph Castain wrote:
Yes, that would indeed break things. The 1.5 series isn't correctly checking
connections across multiple interfaces until it finds one that works - it just
uses the first one it sees. :-(
Yahhh!!
This behaviour - catching a random interface and hanging forever if
Hello Ralph, Terry, all!
again, two pieces of news: the good one and the second one.
Ralph Castain wrote:
Yes, that would indeed break things. The 1.5 series isn't correctly
checking connections across multiple interfaces until it finds one that
works - it just uses the first one it sees. :-(
Yahhh!!
Yes, that would indeed break things. The 1.5 series isn't correctly checking
connections across multiple interfaces until it finds one that works - it just
uses the first one it sees. :-(
The solution is to specify -mca oob_tcp_if_include ib0. This will direct the
run-time wireup across the IP
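As a usage sketch of Ralph's suggested fix (an Open MPI install and an `ib0` IPoIB interface are assumed, so the mpirun line is only echoed here, not executed; `-np 4` and `./a.out` are placeholders):

```shell
# Pin the run-time (OOB) wireup to the ib0 interface:
cmd="mpirun -mca oob_tcp_if_include ib0 -np 4 ./a.out"
echo "$cmd"

# Equivalent via the environment, since OMPI_MCA_* maps to MCA params:
export OMPI_MCA_oob_tcp_if_include=ib0
```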
Hello Ralph, hello all,
Two pieces of news, as usual a good one and a bad one.
The good: we believe we found out *why* it hangs.
The bad: it seems to me this is a bug, or at least an undocumented feature,
of Open MPI /1.5.x.
In detail:
As said, we see mystery hang-ups if starting on some nodes using some
Hello Ralph, hello all.
No real ideas, I'm afraid. We regularly launch much larger jobs than that using
ssh without problem,
I was also able to run a 288-node-job yesterday - the size alone is not
the problem...
so it is likely something about the local setup of that node that is causing
No real ideas, I'm afraid. We regularly launch much larger jobs than that using
ssh without problem, so it is likely something about the local setup of that
node that is causing the problem. Offhand, it sounds like either the mapper
isn't getting things right, or for some reason the daemon on