I saw it, but I think it is something else, since it works if I run it with a host list:
#mpirun -np 3 -H witch2,witch3 dynamic/spawn
#

On Mon, Jun 30, 2008 at 4:03 PM, Ralph H Castain <r...@lanl.gov> wrote:

> Well, that error indicates that it was unable to launch the daemon on witch3
> for some reason. If you look at the error reported by bash, you will see
> that the "orted" binary wasn't found!
>
> Sounds like a path error; you might check to see if witch3 has the binaries
> installed, and if they are where you told the system to look...
>
> Ralph
>
>
> On 6/30/08 5:21 AM, "Lenny Verkhovsky" <lenny.verkhov...@gmail.com> wrote:
>
> > I am not familiar with the IBM spawn test, but maybe this is the right
> > behavior: if the spawn test allocates 3 ranks on the node, and then
> > allocates another 3, then this test is supposed to fail due to max_slots=4.
> >
> > But it fails with the following hostfile as well, BUT WITH A DIFFERENT ERROR.
> >
> > #cat hostfile2
> > witch2 slots=4 max_slots=4
> > witch3 slots=4 max_slots=4
> > witch1:/home/BENCHMARKS/IBM # /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun -np 3 -hostfile hostfile2 dynamic/spawn
> > bash: orted: command not found
> > [witch1:22789]
> > --------------------------------------------------------------------------
> > A daemon (pid 22791) died unexpectedly with status 127 while attempting
> > to launch so we are aborting.
> > There may be more information reported by the environment (see above).
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > --------------------------------------------------------------------------
> > [witch1:22789]
> > --------------------------------------------------------------------------
> > mpirun was unable to cleanly terminate the daemons on the nodes shown
> > below. Additional manual cleanup may be required - please refer to
> > the "orte-clean" tool for assistance.
> > --------------------------------------------------------------------------
> > witch3 - daemon did not report back when launched
> >
> > On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <lenny.verkhov...@gmail.com>
> > wrote:
> >> Hi,
> >> trying to run MTT, I failed to run the IBM spawn test. It fails only when
> >> using a hostfile, and not when using a host list.
> >> ( OMPI from TRUNK )
> >>
> >> This is working:
> >> #mpirun -np 3 -H witch2 dynamic/spawn
> >>
> >> This fails:
> >> # cat hostfile
> >> witch2 slots=4 max_slots=4
> >> #mpirun -np 3 -hostfile hostfile dynamic/spawn
> >> [witch1:12392]
> >> --------------------------------------------------------------------------
> >> There are not enough slots available in the system to satisfy the 3 slots
> >> that were requested by the application:
> >> dynamic/spawn
> >>
> >> Either request fewer slots for your application, or make more slots available
> >> for use.
> >> --------------------------------------------------------------------------
> >> [witch1:12392]
> >> --------------------------------------------------------------------------
> >> A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
> >> launch so we are aborting.
> >>
> >> There may be more information reported by the environment (see above).
> >>
> >> This may be because the daemon was unable to find all the needed shared
> >> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> >> location of the shared libraries on the remote nodes and this will
> >> automatically be forwarded to the remote nodes.
> >> --------------------------------------------------------------------------
> >> mpirun: clean termination accomplished
> >>
> >>
> >> Using hostfile1 also works:
> >> #cat hostfile1
> >> witch2
> >> witch2
> >> witch2
> >>
> >>
> >> Best Regards
> >> Lenny.