Could you rerun that with -mca plm_base_verbose 1? What environment are you in - I assume rsh/ssh?

I would like to see the cmd line being used to launch the orted. What this indicates is that we are not getting the cmd line correct. Could just be that some patch in the trunk didn't get completely applied to the 1.3 branch.
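
Since the parent process in this case isn't started with mpirun, there is no mpirun command line to put -mca on. One way to set the parameter anyway is through the environment: Open MPI reads MCA parameters from variables named OMPI_MCA_<param>. A rough sketch follows; the program name ./spawn_parent and the log file name are placeholders, not taken from this thread.

```shell
# Sketch only: pass the MCA parameter through the environment, because the
# parent process is launched directly rather than by mpirun.
export OMPI_MCA_plm_base_verbose=1

# Hypothetical program name; replace with the binary that calls MPI_Comm_spawn.
APP=./spawn_parent

# Run it only if it exists, capturing stdout and stderr together in a log
# (plm_verbose.log is an arbitrary file name).
if [ -x "$APP" ]; then
    "$APP" 2>&1 | tee plm_verbose.log
fi
```

The resulting log should include the cmd line used to launch the orted on the remote host, which is the part worth posting back to the list.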

BTW: you probably can't run orted directly off of the cmd line. It likely needs some cmd line params to get critical info.

Ralph

On Sep 24, 2008, at 9:47 AM, Will Portnoy wrote:

I'm trying to use MPI_Comm_spawn with MPI_Info's host key to spawn
processes from a process not started with mpirun.  This works with the
host key set to the localhost's hostname, but it does not work when I
use other hosts.

I'm using version 1.3a1r19602.  I need to use orte_launch_agent to set
up my environment a bit before orted is started, but it fails with
errors listed below.

When I try to run orted directly on the command line with some of the
verbosity flags turned to "11", I receive the same messages.

Does anybody have any suggestions?

Thank you,

Will


[fqdn:24761] mca: base: components_open: Looking for ess components
[fqdn:24761] mca: base: components_open: opening ess components
[fqdn:24761] mca: base: components_open: found loaded component env
[fqdn:24761] mca: base: components_open: component env has no register function
[fqdn:24761] mca: base: components_open: component env open function successful
[fqdn:24761] mca: base: components_open: found loaded component hnp
[fqdn:24761] mca: base: components_open: component hnp has no register function
[fqdn:24761] mca: base: components_open: component hnp open function successful
[fqdn:24761] mca: base: components_open: found loaded component singleton
[fqdn:24761] mca: base: components_open: component singleton has no register function
[fqdn:24761] mca: base: components_open: component singleton open function successful
[fqdn:24761] mca: base: components_open: found loaded component slurm
[fqdn:24761] mca: base: components_open: component slurm has no register function
[fqdn:24761] mca: base: components_open: component slurm open function successful
[fqdn:24761] mca: base: components_open: found loaded component tool
[fqdn:24761] mca: base: components_open: component tool has no register function
[fqdn:24761] mca: base: components_open: component tool open function successful
[fqdn:24761] mca:base:select: Auto-selecting ess components
[fqdn:24761] mca:base:select:(  ess) Querying component [env]
[fqdn:24761] mca:base:select:(  ess) Skipping component [env]. Query failed to return a module
[fqdn:24761] mca:base:select:(  ess) Querying component [hnp]
[fqdn:24761] mca:base:select:(  ess) Skipping component [hnp]. Query failed to return a module
[fqdn:24761] mca:base:select:(  ess) Querying component [singleton]
[fqdn:24761] mca:base:select:(  ess) Skipping component [singleton]. Query failed to return a module
[fqdn:24761] mca:base:select:(  ess) Querying component [slurm]
[fqdn:24761] mca:base:select:(  ess) Skipping component [slurm]. Query failed to return a module
[fqdn:24761] mca:base:select:(  ess) Querying component [tool]
[fqdn:24761] mca:base:select:(  ess) Skipping component [tool]. Query failed to return a module
[fqdn:24761] mca:base:select:(  ess) No component selected!
[fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 125
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

 orte_ess_base_select failed
 --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orted/orted_main.c at line 315
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
