Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-15 Thread Reuti
On 15.06.2010 at 14:52, Jeff Squyres wrote:

> On Jun 14, 2010, at 3:13 PM, Reuti wrote:
> 
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
>>> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
>>> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
>>> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
>>> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
> 
> The problem is that "(null)" in the middle.  We'll have to dig into how that 
> got there...  Reuti's probably right that something is somehow NULL in there, 
> and glibc is snprintf'ing (null) instead of SEGV'ing.

I think the problem is not only the (null) itself, but also that
"prefix_dir" and "bin_base" are emitted at all (unless the launch agent
ignores or interprets $1 and $2 properly). The (null) is then the content of
"orted_cmd".

-- Reuti


> 
> Ralph and I are talking about this issue, but we're hindered by the fact that 
> I'm at the MPI Forum this week (i.e., meetings are taking up all my days).  I 
> haven't had a chance to look at the code in depth yet.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-15 Thread Jeff Squyres
On Jun 14, 2010, at 3:13 PM, Reuti wrote:

> > bash: -c: line 0: syntax error near unexpected token `('
> > bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
> > LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
> > LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
> > --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
> > orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> > "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'

The problem is that "(null)" in the middle.  We'll have to dig into how that 
got there...  Reuti's probably right that something is somehow NULL in there, 
and glibc is snprintf'ing (null) instead of SEGV'ing.
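
For illustration: passing a NULL pointer to a %s conversion is undefined
behavior in C, and glibc happens to print "(null)" rather than crash. A
minimal sketch of a defensive helper follows. The name build_daemon_path and
its parameters are hypothetical and chosen for this example only; this is
not Open MPI's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<prefix_dir>/<bin_base>/<cmd>" into buf.
 * Returns 0 on success. Returns -1 (and leaves buf empty) if a component is
 * NULL, if cmd is empty, or if the result does not fit, so that a NULL
 * pointer is never handed to %s. */
static int build_daemon_path(char *buf, size_t len, const char *prefix_dir,
                             const char *bin_base, const char *cmd)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    if (prefix_dir == NULL || bin_base == NULL ||
        cmd == NULL || strlen(cmd) == 0) {
        return -1;
    }

    int n = snprintf(buf, len, "%s/%s/%s", prefix_dir, bin_base, cmd);
    if (n < 0 || (size_t)n >= len) {
        buf[0] = '\0';   /* encoding error or truncated output */
        return -1;
    }
    return 0;
}
```

With a check like this, a missing command name would surface as an error
return at the call site instead of as "(null)" inside the generated shell
command.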

Ralph and I are talking about this issue, but we're hindered by the fact that 
I'm at the MPI Forum this week (i.e., meetings are taking up all my days).  I 
haven't had a chance to look at the code in depth yet.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-15 Thread Jeff Squyres
On Jun 14, 2010, at 5:24 PM, Terry Frankcombe wrote:

> Speaking as no more than an uneducated user, having the behaviour change
> depending on invoking by an absolute path or invoking by some
> unspecified (potentially shell-dependent) path magic seems like a bad
> idea.

FWIW, this specific feature was copied (at the request of multiple users) from 
another MPI implementation.  

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-14 Thread Ralph Castain
It isn't our intention either - still looking at this to see what is going on.


On Jun 14, 2010, at 6:24 PM, Terry Frankcombe wrote:

> On Tue, 2010-06-15 at 00:13 +0200, Reuti wrote:
>> Hi,
>> 
>> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>> 
>>> Hi,
>>> 
>>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>> 
>>> Since I need to set up the execution environment for OpenMPI programs
>>> on the execution nodes, I use the following command line to launch an
>>> OMPI program:
>>> 
>>>  mpiexec -launch-agent /some_path/myscript 
>>> 
>>> The problem is: the above command works fine if I invoke 'mpiexec'
>>> without an absolute path just like above (assuming the PATH variable
>>> is properly set), but if I prepend an absolute path to 'mpiexec',  
>>> e.g.:
>>> 
>>>  /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript 
>> 
>> using an absolute path is equivalent to using the --prefix option to
>> `mpiexec`. Both ways evidently lead to the erroneous behavior you
>> encounter.
> 
> Hi folks
> 
> Speaking as no more than an uneducated user, having the behaviour change
> depending on invoking by an absolute path or invoking by some
> unspecified (potentially shell-dependent) path magic seems like a bad
> idea.
> 
> As a long-time *nix user, this just rubs me the wrong way.
> 
> Ciao
> Terry
> 
> 
> -- 
> Dr. Terry Frankcombe
> Research School of Chemistry, Australian National University
> Ph: (+61) 0417 163 509    Skype: terry.frankcombe




Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-14 Thread Terry Frankcombe
On Tue, 2010-06-15 at 00:13 +0200, Reuti wrote:
> Hi,
> 
> On 13.06.2010 at 09:02, Zhang Linbo wrote:
> 
> > Hi,
> >
> > I'm new to OpenMPI and have encountered a problem with mpiexec.
> >
> > Since I need to set up the execution environment for OpenMPI programs
> > on the execution nodes, I use the following command line to launch an
> > OMPI program:
> >
> >   mpiexec -launch-agent /some_path/myscript 
> >
> > The problem is: the above command works fine if I invoke 'mpiexec'
> > without an absolute path just like above (assuming the PATH variable
> > is properly set), but if I prepend an absolute path to 'mpiexec',  
> > e.g.:
> >
> >   /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript 
> 
> using an absolute path is equivalent to using the --prefix option to
> `mpiexec`. Both ways evidently lead to the erroneous behavior you
> encounter.

Hi folks

Speaking as no more than an uneducated user, having the behaviour change
depending on invoking by an absolute path or invoking by some
unspecified (potentially shell-dependent) path magic seems like a bad
idea.

As a long-time *nix user, this just rubs me the wrong way.

Ciao
Terry


-- 
Dr. Terry Frankcombe
Research School of Chemistry, Australian National University
Ph: (+61) 0417 163 509    Skype: terry.frankcombe



Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-14 Thread Reuti

On 15.06.2010 at 00:26, Ralph Castain wrote:

> Jeff and I are taking a look at the logic in that code now - I know
> we thought we understood it back when we wrote it, but somehow it
> just doesn't look right any more...

To avoid confusion: I meant "orted_prefix" below, which holds the name
of the launch agent and can't be used with this demonstration fix.

Sorry for the typo.

-- Reuti




> On Jun 14, 2010, at 4:13 PM, Reuti wrote:
>
>> Hi,
>>
>> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>>
>>> Hi,
>>>
>>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>>
>>> Since I need to set up the execution environment for OpenMPI programs
>>> on the execution nodes, I use the following command line to launch an
>>> OMPI program:
>>>
>>>   mpiexec -launch-agent /some_path/myscript
>>>
>>> The problem is: the above command works fine if I invoke 'mpiexec'
>>> without an absolute path just like above (assuming the PATH variable
>>> is properly set), but if I prepend an absolute path to 'mpiexec',
>>> e.g.:
>>>
>>>   /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript
>>
>> using an absolute path is equivalent to using the --prefix option to
>> `mpiexec`. Both ways evidently lead to the erroneous behavior you
>> encounter.
>>
>>> then I get the following error message:
>>>
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
>>> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
>>> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
>>> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
>>> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>>
>> The reason seems to be that, when a prefix is given, the assembled
>> command line contains some superfluous elements. I tried to circumvent
>> this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":
>>
>>    if (orted_prefix != NULL) {
>>        asprintf (_cmd,
>>                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>>                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>>                  "%s",
>>                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>>                  (opal_prefix != NULL ? opal_prefix : ""),
>>                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>>                  prefix_dir, bin_base,
>>                  prefix_dir, lib_base,
>>                  orted_prefix );
>>    }
>>    else {
>>        asprintf (_cmd,
>>                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>>                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>>                  "%s %s/%s/%s",
>>                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>>                  (opal_prefix != NULL ? opal_prefix : ""),
>>                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>>                  prefix_dir, bin_base,
>>                  prefix_dir, lib_base,
>>                  (orted_prefix != NULL ? orted_prefix : ""),
>>                  prefix_dir, bin_base,
>>                  orted_cmd);
>>    }
>>
>> For simplicity, the name of the agent is stored in "opal_prefix",
>> AFAICS.
>>
>> This is of course not a clean solution (as "opal_prefix" can't be
>> used any more) but rather a proof of concept, as only sh-like shells
>> are handled. There are surely better ways to solve it. Anyway, it's a
>> bug and should be filed.
>>
>> -- Reuti
>>
>>> I'd like to know what causes the above problem and how I should
>>> deal with it.
>>> I want to use the absolute pathname of mpiexec to avoid possible
>>> interference with other MPI installations. Thanks in advance.
>>>
>>> LB





Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-14 Thread Ralph Castain
Jeff and I are taking a look at the logic in that code now - I know we thought 
we understood it back when we wrote it, but somehow it just doesn't look right 
any more...


On Jun 14, 2010, at 4:13 PM, Reuti wrote:

> Hi,
> 
> On 13.06.2010 at 09:02, Zhang Linbo wrote:
> 
>> Hi,
>> 
>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>> 
>> Since I need to set up the execution environment for OpenMPI programs
>> on the execution nodes, I use the following command line to launch an
>> OMPI program:
>> 
>>  mpiexec -launch-agent /some_path/myscript 
>> 
>> The problem is: the above command works fine if I invoke 'mpiexec'
>> without an absolute path just like above (assuming the PATH variable
>> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>> 
>>  /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript 
> 
> using an absolute path is equivalent to using the --prefix option to `mpiexec`.
> Both ways evidently lead to the erroneous behavior you encounter.
> 
> 
>> then I get the following error message:
>> 
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ; 
>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; 
>> /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca 
>> orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 
>> --hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
> 
> The reason seems to be that, when a prefix is given, the assembled command
> line contains some superfluous elements. I tried to circumvent this with a
> new case in "orte/mca/plm/rsh/plm_rsh_module.c":
> 
>    if (orted_prefix != NULL) {
>        asprintf (_cmd,
>                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                  "%s",
>                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                  (opal_prefix != NULL ? opal_prefix : ""),
>                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                  prefix_dir, bin_base,
>                  prefix_dir, lib_base,
>                  orted_prefix );
>    }
>    else {
>        asprintf (_cmd,
>                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                  "%s %s/%s/%s",
>                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                  (opal_prefix != NULL ? opal_prefix : ""),
>                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                  prefix_dir, bin_base,
>                  prefix_dir, lib_base,
>                  (orted_prefix != NULL ? orted_prefix : ""),
>                  prefix_dir, bin_base,
>                  orted_cmd);
>    }
> 
> For simplicity, the name of the agent is stored in "opal_prefix", AFAICS.
> 
> This is of course not a clean solution (as "opal_prefix" can't be used any
> more) but rather a proof of concept, as only sh-like shells are handled.
> There are surely better ways to solve it. Anyway, it's a bug and should be
> filed.
> 
> -- Reuti
> 
> 
>> I'd like to know what causes the above problem and how I should deal with it.
>> I want to use the absolute pathname of mpiexec to avoid possible interference
>> with other MPI installations. Thanks in advance.
>> 
>> LB




Re: [OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-14 Thread Reuti

Hi,

On 13.06.2010 at 09:02, Zhang Linbo wrote:


> Hi,
>
> I'm new to OpenMPI and have encountered a problem with mpiexec.
>
> Since I need to set up the execution environment for OpenMPI programs
> on the execution nodes, I use the following command line to launch an
> OMPI program:
>
>   mpiexec -launch-agent /some_path/myscript
>
> The problem is: the above command works fine if I invoke 'mpiexec'
> without an absolute path just like above (assuming the PATH variable
> is properly set), but if I prepend an absolute path to 'mpiexec',
> e.g.:
>
>   /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript


using an absolute path is equivalent to using the --prefix option to
`mpiexec`. Both ways evidently lead to the erroneous behavior you
encounter.
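
As an illustration of why an absolute invocation path can imply a prefix,
here is a minimal sketch of how a launcher could derive an install prefix
from argv[0]. The function name prefix_from_argv0 and the rule "strip
/bin/<program>" are assumptions made for this example; this is not the
actual Open MPI logic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if argv0 is an absolute path of the form
 * "<prefix>/bin/<program>", return a newly allocated copy of <prefix>
 * (e.g. "/OMPI_dir/bin/mpiexec" -> "/OMPI_dir"). Return NULL otherwise,
 * including when the program was found via PATH. The caller frees the
 * result. */
static char *prefix_from_argv0(const char *argv0)
{
    assert(argv0 != NULL);

    if (argv0[0] != '/') {
        return NULL;                        /* relative name: no implied prefix */
    }

    const char *last = strrchr(argv0, '/'); /* slash before the program name */
    size_t dir_len = (size_t)(last - argv0);

    static const char suffix[] = "/bin";
    size_t suf_len = sizeof suffix - 1;
    if (dir_len <= suf_len ||
        strncmp(argv0 + dir_len - suf_len, suffix, suf_len) != 0) {
        return NULL;                        /* directory does not end in /bin */
    }

    size_t prefix_len = dir_len - suf_len;
    char *prefix = malloc(prefix_len + 1);
    if (prefix == NULL) {
        return NULL;
    }
    memcpy(prefix, argv0, prefix_len);
    prefix[prefix_len] = '\0';
    return prefix;
}
```

Under this assumption, "mpiexec" yields no prefix, while
"/OMPI_dir/bin/mpiexec" yields "/OMPI_dir", which would explain why the two
invocations take different code paths.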




> then I get the following error message:
>
> bash: -c: line 0: syntax error near unexpected token `('
> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'


The reason seems to be that, when a prefix is given, the assembled
command line contains some superfluous elements. I tried to circumvent
this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":


    if (orted_prefix != NULL) {
        asprintf (_cmd,
                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                  "%s",
                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
                  (opal_prefix != NULL ? opal_prefix : ""),
                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
                  prefix_dir, bin_base,
                  prefix_dir, lib_base,
                  orted_prefix );
    }
    else {
        asprintf (_cmd,
                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                  "%s %s/%s/%s",
                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
                  (opal_prefix != NULL ? opal_prefix : ""),
                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
                  prefix_dir, bin_base,
                  prefix_dir, lib_base,
                  (orted_prefix != NULL ? orted_prefix : ""),
                  prefix_dir, bin_base,
                  orted_cmd);
    }

For simplicity, the name of the agent is stored in "opal_prefix",
AFAICS.

This is of course not a clean solution (as "opal_prefix" can't be used
any more) but rather a proof of concept, as only sh-like shells are
handled. There are surely better ways to solve it. Anyway, it's a bug
and should be filed.
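
For readers who want to experiment outside the Open MPI tree, the two
branches can be mimicked in a self-contained sketch. It leaves out the
OPAL_PREFIX part, and the names build_remote_cmd, format_cmd and ENV_FMT are
invented for this example. It mirrors the shape of the proof of concept
only and is not a proposed fix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Environment setup shared by both branches (OPAL_PREFIX handling omitted). */
#define ENV_FMT \
    "PATH=%s/%s:$PATH ; export PATH ; " \
    "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "

/* Formats the command into buf with snprintf semantics, i.e. it returns the
 * length the full string needs, not counting the terminating NUL. */
static int format_cmd(char *buf, size_t len,
                      const char *prefix_dir, const char *bin_base,
                      const char *lib_base, const char *agent,
                      const char *orted_cmd)
{
    if (agent != NULL && strlen(agent) > 0) {
        /* A launch agent is set: emit only the agent after the environment. */
        return snprintf(buf, len, ENV_FMT "%s",
                        prefix_dir, bin_base, prefix_dir, lib_base, agent);
    }
    /* No agent: emit the daemon's full path below the prefix. */
    return snprintf(buf, len, ENV_FMT "%s/%s/%s",
                    prefix_dir, bin_base, prefix_dir, lib_base,
                    prefix_dir, bin_base, orted_cmd);
}

/* Returns a malloc'ed command string, or NULL on error. The caller frees it. */
static char *build_remote_cmd(const char *prefix_dir, const char *bin_base,
                              const char *lib_base, const char *agent,
                              const char *orted_cmd)
{
    assert(prefix_dir != NULL && bin_base != NULL && lib_base != NULL);

    if ((agent == NULL || strlen(agent) == 0) && orted_cmd == NULL) {
        return NULL;    /* nothing to launch; never pass NULL to %s */
    }

    /* First pass measures the string, second pass writes it. */
    int n = format_cmd(NULL, 0, prefix_dir, bin_base, lib_base,
                       agent, orted_cmd);
    if (n < 0) {
        return NULL;
    }
    char *cmd = malloc((size_t)n + 1);
    if (cmd != NULL) {
        format_cmd(cmd, (size_t)n + 1, prefix_dir, bin_base, lib_base,
                   agent, orted_cmd);
    }
    return cmd;
}
```

Called with the agent "/some_path/myscript", the sketch ends the command
with the agent alone; called without an agent, it ends with
<prefix_dir>/<bin_base>/<orted_cmd>. Either way no "(null)" can reach the
shell.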


-- Reuti


> I'd like to know what causes the above problem and how I should deal
> with it.
> I want to use the absolute pathname of mpiexec to avoid possible
> interference with other MPI installations. Thanks in advance.
>
> LB






[OMPI users] A problem with 'mpiexec -launch-agent'

2010-06-13 Thread Zhang Linbo

Hi,

I'm new to OpenMPI and have encountered a problem with mpiexec.

Since I need to set up the execution environment for OpenMPI programs
on the execution nodes, I use the following command line to launch an
OMPI program:

   mpiexec -launch-agent /some_path/myscript 

The problem is: the above command works fine if I invoke 'mpiexec'
without an absolute path just like above (assuming the PATH variable
is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:

   /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript 

then I get the following error message:

bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ; 
LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH 
; /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca 
orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 
--hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'


I'd like to know what causes the above problem and how I should deal
with it.

I want to use the absolute pathname of mpiexec to avoid possible interference
with other MPI installations. Thanks in advance.

LB