Isn't this related to https://svn.open-mpi.org/trac/ompi/ticket/1469 ?

On 6/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> I am not familiar with the IBM spawn test, but maybe this is the right
> behavior: if the spawn test allocates 3 ranks on the node and then
> allocates another 3, then the test is supposed to fail due to max_slots=4.
>
> But it fails with the following hostfile as well, BUT WITH A DIFFERENT
> ERROR.
>
> #cat hostfile2
> witch2 slots=4 max_slots=4
> witch3 slots=4 max_slots=4
> witch1:/home/BENCHMARKS/IBM # /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun -np 3 -hostfile hostfile2 dynamic/spawn
> bash: orted: command not found
> [witch1:22789]
> --------------------------------------------------------------------------
> A daemon (pid 22791) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [witch1:22789]
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --------------------------------------------------------------------------
> witch3 - daemon did not report back when launched
>
> On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> Hi,
>> While trying to run MTT, I failed to run the IBM spawn test. It fails
>> only when using a hostfile, and not when using a host list.
>> ( OMPI from TRUNK )
>>
>> This works:
>> #mpirun -np 3 -H witch2 dynamic/spawn
>>
>> This fails:
>> # cat hostfile
>> witch2 slots=4 max_slots=4
>>
>> #mpirun -np 3 -hostfile hostfile dynamic/spawn
>> [witch1:12392]
>> --------------------------------------------------------------------------
>> There are not enough slots available in the system to satisfy the 3 slots
>> that were requested by the application:
>> dynamic/spawn
>>
>> Either request fewer slots for your application, or make more slots
>> available
>> for use.
>> --------------------------------------------------------------------------
>> [witch1:12392]
>> --------------------------------------------------------------------------
>> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>>
>>
>> Using hostfile1 also works:
>> #cat hostfile1
>> witch2
>> witch2
>> witch2
>>
>>
>> Best Regards
>> Lenny.
>>
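
For reference, the slot arithmetic guessed at in the quoted message can be sketched in a few lines of Python. This is only an illustration, not how Open MPI's mapper works: the helper names max_slot_caps and fits are made up, the "3 ranks, then 3 more spawned" numbers are the guess from the quoted message rather than something confirmed about the IBM test, and treating a host with no max_slots entry as uncapped is an assumption of the sketch.

```python
import math


def max_slot_caps(hostfile_text):
    """Return {hostname: max_slots cap} for hostfile-style lines.

    Hypothetical helper: a host with no max_slots option is treated
    as uncapped (math.inf), which is an assumption of this sketch.
    """
    caps = {}
    for line in hostfile_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        host, options = fields[0], fields[1:]
        cap = math.inf
        for opt in options:
            key, _, value = opt.partition("=")
            if key == "max_slots":
                cap = int(value)
        caps[host] = cap
    return caps


def fits(hostfile_text, initial_ranks, spawned_ranks):
    """True if initial plus spawned ranks stay within the summed caps."""
    total_cap = sum(max_slot_caps(hostfile_text).values())
    return initial_ranks + spawned_ranks <= total_cap


hostfile = "witch2 slots=4 max_slots=4\n"
hostfile2 = "witch2 slots=4 max_slots=4\nwitch3 slots=4 max_slots=4\n"

print(fits(hostfile, 3, 3))   # one node capped at 4: 6 ranks do not fit
print(fits(hostfile2, 3, 3))  # two nodes capped at 4 each: 6 ranks fit
```

Under those assumptions the single-node hostfile cannot hold 6 ranks while hostfile2 can, which would be consistent with the hostfile2 run failing for a different reason (orted not being found on the remote node).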