This is strange. I assume that you what to use rsh or ssh to launch the
processes?
If you want to use ssh, does "which ssh" find ssh? Similarly, if you
want to use rsh, does "which rsh" find rsh?
Thanks,
Tim
Adam C Powell IV wrote:
On Wed, 2007-07-18 at 09:50 -0400, Tim Prins wrote:
Adam C Powell IV wrote:
Greetings,
I'm running the Debian package of OpenMPI in a chroot (with /proc
mounted properly), and orte_init is failing as follows:
[snip]
What could be wrong? Does orterun not run in a chroot environment?
What more can I do to investigate further?
Try running mpirun with the added options:
-mca orte_debug 1 -mca pls_base_verbose 20
Then send the output to the list.
Thanks! Here's the output:
$ orterun -mca orte_debug 1 -mca pls_base_verbose 20 -np 1 uptime
[new-host-3:19201] mca: base: components_open: Looking for pls components
[new-host-3:19201] mca: base: components_open: distilling pls components
[new-host-3:19201] mca: base: components_open: accepting all pls components
[new-host-3:19201] mca: base: components_open: opening pls components
[new-host-3:19201] mca: base: components_open: found loaded component
gridengine[new-host-3:19201] mca: base: components_open: component gridengine
open function successful
[new-host-3:19201] mca: base: components_open: found loaded component proxy
[new-host-3:19201] mca: base: components_open: component proxy open function
successful
[new-host-3:19201] mca: base: components_open: found loaded component rsh
[new-host-3:19201] mca: base: components_open: component rsh open function
successful
[new-host-3:19201] mca: base: components_open: found loaded component slurm
[new-host-3:19201] mca: base: components_open: component slurm open function
successful
[new-host-3:19201] orte:base:select: querying component gridengine
[new-host-3:19201] pls:gridengine: NOT available for selection
[new-host-3:19201] orte:base:select: querying component proxy
[new-host-3:19201] orte:base:select: querying component rsh
[new-host-3:19201] orte:base:select: querying component slurm
[new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file
runtime/orte_init_stage1.c at line 312
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_pls_base_select failed
--> Returned value -1 instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file
runtime/orte_system_init.c at line 42
[new-host-3:19201] [0,0,0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at
line 52
--------------------------------------------------------------------------
Open RTE was unable to initialize properly. The error occured while
attempting to orte_init(). Returned value -1 instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
-Adam