Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On 15.06.2010 at 14:52, Jeff Squyres wrote:

> On Jun 14, 2010, at 3:13 PM, Reuti wrote:
>
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
>>> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
>>> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
>>> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
>>> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>
> The problem is that "(null)" in the middle. We'll have to dig into how that
> got there... Reuti's probably right that something is somehow NULL in there,
> and glibc is snprintf'ing (null) instead of SEGV'ing.

I think the problem is not only the (null) itself, but also that "prefix_dir" and "bin_base" are output at all (unless the launch agent were to ignore or interpret $1 and $2 properly). The (null) is then the content of "orted_cmd".

-- Reuti

> Ralph and I are talking about this issue, but we're hindered by the fact that
> I'm at the MPI Forum this week (i.e., meetings are taking up all my days). I
> haven't had a chance to look at the code in depth yet.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On Jun 14, 2010, at 3:13 PM, Reuti wrote:

> > bash: -c: line 0: syntax error near unexpected token `('
> > bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
> > LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
> > LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
> > --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
> > orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> > "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'

The problem is that "(null)" in the middle. We'll have to dig into how that got there... Reuti's probably right that something is somehow NULL in there, and glibc is snprintf'ing (null) instead of SEGV'ing.

Ralph and I are talking about this issue, but we're hindered by the fact that I'm at the MPI Forum this week (i.e., meetings are taking up all my days). I haven't had a chance to look at the code in depth yet.

--
Jeff Squyres
jsquy...@cisco.com
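The "(null)" behaviour described above can be avoided by checking pointers before they reach a "%s" conversion. Passing a NULL pointer to "%s" is undefined behaviour in C; glibc happens to print "(null)" rather than crash, which is why the broken command line reached the shell at all. The following is only a minimal sketch of such a guard. The function names str_or_empty and build_daemon_path are hypothetical and do not come from the Open MPI sources; the three path pieces mirror the "%s/%s/%s" pattern visible in the error message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: map a NULL string to "" so it is safe to pass to %s. */
const char *str_or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* Hypothetical helper: format "<prefix_dir>/<bin_base>/<cmd>" into buf.
 * Returns 0 on success, -1 if any piece is NULL or the buffer is too small,
 * so a missing daemon name is reported instead of being printed as "(null)". */
int build_daemon_path(char *buf, size_t len, const char *prefix_dir,
                      const char *bin_base, const char *cmd)
{
    int n;

    if (prefix_dir == NULL || bin_base == NULL || cmd == NULL) {
        return -1;
    }
    n = snprintf(buf, len, "%s/%s/%s", prefix_dir, bin_base, cmd);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with "/OMPI_dir", "bin" and a NULL command, build_daemon_path returns -1 and the caller can report the missing piece instead of sending a malformed line to the remote shell.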
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On Jun 14, 2010, at 5:24 PM, Terry Frankcombe wrote:

> Speaking as no more than an uneducated user, having the behaviour change
> depending on invoking by an absolute path or invoking by some
> unspecified (potentially shell-dependent) path magic seems like a bad
> idea.

FWIW, this specific feature was copied (at the request of multiple users) from another MPI implementation.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
It isn't our intention either - still looking at this to see what is going on.

On Jun 14, 2010, at 6:24 PM, Terry Frankcombe wrote:

> On Tue, 2010-06-15 at 00:13 +0200, Reuti wrote:
>> Hi,
>>
>> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>>
>>> Hi,
>>>
>>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>>
>>> Since I need to set up the execution environment for OpenMPI programs
>>> on the execution nodes, I use the following command line to launch an
>>> OMPI program:
>>>
>>>     mpiexec -launch-agent /some_path/myscript
>>>
>>> The problem is: the above command works fine if I invoke 'mpiexec'
>>> without an absolute path just like above (assuming the PATH variable
>>> is properly set), but if I prepend an absolute path to 'mpiexec',
>>> e.g.:
>>>
>>>     /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript
>>
>> Using an absolute path is equivalent to using the --prefix option of
>> `mpiexec`. Both ways obviously lead to the erroneous behavior you
>> encounter.
>
> Hi folks
>
> Speaking as no more than an uneducated user, having the behaviour change
> depending on invoking by an absolute path or invoking by some
> unspecified (potentially shell-dependent) path magic seems like a bad
> idea.
>
> As a long-time *nix user, this just rubs me the wrong way.
>
> Ciao
> Terry
>
> --
> Dr. Terry Frankcombe
> Research School of Chemistry, Australian National University
> Ph: (+61) 0417 163 509    Skype: terry.frankcombe
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On Tue, 2010-06-15 at 00:13 +0200, Reuti wrote:

> Hi,
>
> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>
> > Hi,
> >
> > I'm new to OpenMPI and have encountered a problem with mpiexec.
> >
> > Since I need to set up the execution environment for OpenMPI programs
> > on the execution nodes, I use the following command line to launch an
> > OMPI program:
> >
> >     mpiexec -launch-agent /some_path/myscript
> >
> > The problem is: the above command works fine if I invoke 'mpiexec'
> > without an absolute path just like above (assuming the PATH variable
> > is properly set), but if I prepend an absolute path to 'mpiexec',
> > e.g.:
> >
> >     /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript
>
> Using an absolute path is equivalent to using the --prefix option of
> `mpiexec`. Both ways obviously lead to the erroneous behavior you
> encounter.

Hi folks

Speaking as no more than an uneducated user, having the behaviour change depending on invoking by an absolute path or invoking by some unspecified (potentially shell-dependent) path magic seems like a bad idea.

As a long-time *nix user, this just rubs me the wrong way.

Ciao
Terry

--
Dr. Terry Frankcombe
Research School of Chemistry, Australian National University
Ph: (+61) 0417 163 509    Skype: terry.frankcombe
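For readers wondering how an absolute path can act like --prefix: one plausible mechanism, assumed here for illustration and not taken from the Open MPI sources, is that the launcher inspects how it was invoked and, if the name is absolute, treats the directory two levels up as the installation prefix. The sketch below shows that idea. The function name prefix_from_argv0 is hypothetical, and the code assumes a normalized path without doubled slashes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: derive an installation prefix from the way the
 * launcher was invoked. "/OMPI_dir/bin/mpiexec" yields "/OMPI_dir"; a bare
 * "mpiexec" (found via PATH) implies no prefix and returns -1.
 * Returns 0 on success, -1 if no prefix is implied or buf is too small. */
int prefix_from_argv0(const char *argv0, char *buf, size_t len)
{
    const char *last;
    const char *prev;
    size_t n;

    if (argv0 == NULL || argv0[0] != '/') {
        return -1;                      /* not an absolute path */
    }
    last = strrchr(argv0, '/');         /* slash before the program name */
    if (last == argv0) {
        return -1;                      /* "/mpiexec": no bin directory */
    }
    prev = last - 1;
    while (prev > argv0 && *prev != '/') {
        prev--;                         /* walk back over the bin directory */
    }
    n = (size_t)(prev - argv0);
    if (n == 0) {
        n = 1;                          /* "/bin/mpiexec": prefix is "/" */
    }
    if (n >= len) {
        return -1;                      /* result would not fit */
    }
    memcpy(buf, argv0, n);
    buf[n] = '\0';
    return 0;
}
```

Under this assumption the two invocations in the thread differ only in whether a prefix is derived, which would explain why the absolute-path form takes the same code path as an explicit --prefix.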
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On 15.06.2010 at 00:26, Ralph Castain wrote:

> Jeff and I are taking a look at the logic in that code now - I know we
> thought we understood it back when we wrote it, but somehow it just
> doesn't look right any more...

To avoid confusion: I meant "orted_prefix" below, which holds the name of the launch agent and can't be used with this demonstration fix. Sorry for the typo.

-- Reuti

> On Jun 14, 2010, at 4:13 PM, Reuti wrote:
>
>> Hi,
>>
>> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>>
>>> Hi,
>>>
>>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>>
>>> Since I need to set up the execution environment for OpenMPI programs
>>> on the execution nodes, I use the following command line to launch an
>>> OMPI program:
>>>
>>>     mpiexec -launch-agent /some_path/myscript
>>>
>>> The problem is: the above command works fine if I invoke 'mpiexec'
>>> without an absolute path just like above (assuming the PATH variable
>>> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>>>
>>>     /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript
>>
>> Using an absolute path is equivalent to using the --prefix option of
>> `mpiexec`. Both ways obviously lead to the erroneous behavior you
>> encounter.
>>
>>> then I get the following error message:
>>>
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
>>> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
>>> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
>>> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
>>> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>>
>> The reason seems to be that, when a prefix is given, the assembled
>> command line contains a few elements too many. I tried to circumvent
>> this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":
>>
>>     if (orted_prefix != NULL) {
>>         asprintf (_cmd,
>>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>>                   "%s",
>>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>>                   (opal_prefix != NULL ? opal_prefix : ""),
>>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>>                   prefix_dir, bin_base,
>>                   prefix_dir, lib_base,
>>                   orted_prefix);
>>     }
>>     else {
>>         asprintf (_cmd,
>>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>>                   "%s %s/%s/%s",
>>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>>                   (opal_prefix != NULL ? opal_prefix : ""),
>>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>>                   prefix_dir, bin_base,
>>                   prefix_dir, lib_base,
>>                   (orted_prefix != NULL ? orted_prefix : ""),
>>                   prefix_dir, bin_base,
>>                   orted_cmd);
>>     }
>>
>> The name of the agent is, for simplicity's sake, stored in "opal_prefix"
>> AFAICS.
>>
>> This is of course not a clean solution (as "opal_prefix" can't be used
>> any more), but more a proof of concept, as only sh-like shells are
>> handled. Surely there are better ways to solve it. Anyway, it's a bug
>> and should be filed.
>>
>> -- Reuti
>>
>>> I'd like to know what causes the above problem and how I should deal
>>> with it. I want to use the absolute pathname of mpiexec to avoid
>>> possible interference with other MPI installations. Thanks in advance.
>>>
>>> LB
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
Jeff and I are taking a look at the logic in that code now - I know we thought we understood it back when we wrote it, but somehow it just doesn't look right any more...

On Jun 14, 2010, at 4:13 PM, Reuti wrote:

> Hi,
>
> On 13.06.2010 at 09:02, Zhang Linbo wrote:
>
>> Hi,
>>
>> I'm new to OpenMPI and have encountered a problem with mpiexec.
>>
>> Since I need to set up the execution environment for OpenMPI programs
>> on the execution nodes, I use the following command line to launch an
>> OMPI program:
>>
>>     mpiexec -launch-agent /some_path/myscript
>>
>> The problem is: the above command works fine if I invoke 'mpiexec'
>> without an absolute path just like above (assuming the PATH variable
>> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>>
>>     /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript
>
> Using an absolute path is equivalent to using the --prefix option of
> `mpiexec`. Both ways obviously lead to the erroneous behavior you encounter.
>
>> then I get the following error message:
>>
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ;
>> /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca
>> orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2
>> --hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>
> The reason seems to be that, when a prefix is given, the assembled
> command line contains a few elements too many. I tried to circumvent
> this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":
>
>     if (orted_prefix != NULL) {
>         asprintf (_cmd,
>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                   "%s",
>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                   (opal_prefix != NULL ? opal_prefix : ""),
>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                   prefix_dir, bin_base,
>                   prefix_dir, lib_base,
>                   orted_prefix);
>     }
>     else {
>         asprintf (_cmd,
>                   "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
>                   "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
>                   "%s %s/%s/%s",
>                   (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
>                   (opal_prefix != NULL ? opal_prefix : ""),
>                   (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
>                   prefix_dir, bin_base,
>                   prefix_dir, lib_base,
>                   (orted_prefix != NULL ? orted_prefix : ""),
>                   prefix_dir, bin_base,
>                   orted_cmd);
>     }
>
> The name of the agent is, for simplicity's sake, stored in "opal_prefix"
> AFAICS.
>
> This is of course not a clean solution (as "opal_prefix" can't be used any
> more), but more a proof of concept, as only sh-like shells are handled.
> Surely there are better ways to solve it. Anyway, it's a bug and should be
> filed.
>
> -- Reuti
>
>> I'd like to know what causes the above problem and how I should deal
>> with it. I want to use the absolute pathname of mpiexec to avoid
>> possible interference with other MPI installations. Thanks in advance.
>>
>> LB
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
Hi,

On 13.06.2010 at 09:02, Zhang Linbo wrote:

> Hi,
>
> I'm new to OpenMPI and have encountered a problem with mpiexec.
>
> Since I need to set up the execution environment for OpenMPI programs
> on the execution nodes, I use the following command line to launch an
> OMPI program:
>
>     mpiexec -launch-agent /some_path/myscript
>
> The problem is: the above command works fine if I invoke 'mpiexec'
> without an absolute path just like above (assuming the PATH variable
> is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:
>
>     /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript

Using an absolute path is equivalent to using the --prefix option of `mpiexec`. Both ways obviously lead to the erroneous behavior you encounter.

> then I get the following error message:
>
> bash: -c: line 0: syntax error near unexpected token `('
> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null)
> --daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'

The reason seems to be that, when a prefix is given, the assembled command line contains a few elements too many. I tried to circumvent this with a new case in "orte/mca/plm/rsh/plm_rsh_module.c":

    if (orted_prefix != NULL) {
        asprintf (_cmd,
                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                  "%s",
                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
                  (opal_prefix != NULL ? opal_prefix : ""),
                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
                  prefix_dir, bin_base,
                  prefix_dir, lib_base,
                  orted_prefix);
    }
    else {
        asprintf (_cmd,
                  "%s%s%s PATH=%s/%s:$PATH ; export PATH ; "
                  "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                  "%s %s/%s/%s",
                  (opal_prefix != NULL ? "OPAL_PREFIX=" : ""),
                  (opal_prefix != NULL ? opal_prefix : ""),
                  (opal_prefix != NULL ? " ; export OPAL_PREFIX;" : ""),
                  prefix_dir, bin_base,
                  prefix_dir, lib_base,
                  (orted_prefix != NULL ? orted_prefix : ""),
                  prefix_dir, bin_base,
                  orted_cmd);
    }

The name of the agent is, for simplicity's sake, stored in "opal_prefix" AFAICS.

This is of course not a clean solution (as "opal_prefix" can't be used any more), but more a proof of concept, as only sh-like shells are handled. Surely there are better ways to solve it. Anyway, it's a bug and should be filed.

-- Reuti

> I'd like to know what causes the above problem and how I should deal
> with it. I want to use the absolute pathname of mpiexec to avoid
> possible interference with other MPI installations. Thanks in advance.
>
> LB
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
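The two-branch idea in Reuti's proof of concept can be sketched in a standalone form that is easier to test outside the Open MPI tree. This is an illustration only, not the fix that went into Open MPI. The function name build_remote_cmd and its parameters are hypothetical, snprintf into a caller-supplied buffer replaces asprintf for portability, and the OPAL_PREFIX handling is left out to keep the sketch short. Only sh-like shells are covered, as in the original.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: build the sh-style command prefix for a remote daemon.
 * If launch_agent is set, it is used as-is and no "<prefix>/<bin>/<daemon>"
 * element is appended; otherwise the daemon is addressed through the prefix.
 * Returns 0 on success, -1 on a missing input or a too-small buffer. */
int build_remote_cmd(char *buf, size_t len,
                     const char *prefix_dir, const char *bin_base,
                     const char *lib_base, const char *launch_agent,
                     const char *daemon_cmd)
{
    int n;

    if (prefix_dir == NULL || bin_base == NULL || lib_base == NULL) {
        return -1;
    }
    if (launch_agent != NULL) {
        /* Launch agent given: export the paths, then run the agent itself. */
        n = snprintf(buf, len,
                     "PATH=%s/%s:$PATH ; export PATH ; "
                     "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                     "%s",
                     prefix_dir, bin_base, prefix_dir, lib_base, launch_agent);
    } else if (daemon_cmd != NULL) {
        /* No agent: run the daemon from the prefixed bin directory. */
        n = snprintf(buf, len,
                     "PATH=%s/%s:$PATH ; export PATH ; "
                     "LD_LIBRARY_PATH=%s/%s:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; "
                     "%s/%s/%s",
                     prefix_dir, bin_base, prefix_dir, lib_base,
                     prefix_dir, bin_base, daemon_cmd);
    } else {
        return -1;  /* neither an agent nor a daemon name: refuse to guess */
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With prefix "/OMPI_dir", "bin", "lib" and the agent "/some_path/myscript", the result ends in "/some_path/myscript" with no trailing "/OMPI_dir/bin/(null)" element, which is the behaviour the proof of concept aims for.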
[OMPI users] A problem with 'mpiexec -launch-agent'
Hi,

I'm new to OpenMPI and have encountered a problem with mpiexec.

Since I need to set up the execution environment for OpenMPI programs on the execution nodes, I use the following command line to launch an OMPI program:

    mpiexec -launch-agent /some_path/myscript

The problem is: the above command works fine if I invoke 'mpiexec' without an absolute path just like above (assuming the PATH variable is properly set), but if I prepend an absolute path to 'mpiexec', e.g.:

    /OMPI_dir/bin/mpiexec -launch-agent /some_path/myscript

then I get the following error message:

    bash: -c: line 0: syntax error near unexpected token `('
    bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
    LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ;
    /some_path/myscript /OMPI_dir/bin/(null) --daemonize -mca ess env -mca
    orte_ess_jobid 1978662912 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2
    --hnp-uri "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'

I'd like to know what causes the above problem and how I should deal with it. I want to use the absolute pathname of mpiexec to avoid possible interference with other MPI installations. Thanks in advance.

LB