Well, that error indicates that it was unable to launch the daemon on witch3
for some reason. If you look at the error reported by bash, you will see
that the "orted" binary wasn't found!

Sounds like a path error - you might check to see if witch3 has the binaries
installed, and if they are where you told the system to look...
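
As a rough sketch of that check (the helper name check_cmd is made up for
illustration and is not part of Open MPI), something like the following
reports whether a command resolves on the PATH of the shell it runs in. It
returns 127 on failure, which is the same status bash reports for "command
not found" and the status the daemon died with in the output below.

```shell
# Hypothetical helper, not an Open MPI tool: report whether a command
# resolves on the PATH of the current shell.
# Returns 0 and prints the resolved location if found, 127 otherwise.
check_cmd() {
    if resolved=$(command -v "$1" 2>/dev/null); then
        echo "$1 found at $resolved"
        return 0
    fi
    echo "$1 not found in PATH=$PATH"
    return 127
}
```

The remote side is what matters here, so the same test would have to run
through a non-interactive login, e.g. ssh witch3 'command -v orted', since
that shell may not read the same startup files as an interactive one. If the
binaries are installed but simply not on the remote PATH, mpirun's --prefix
option pointing at the install directory may be worth trying.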

Ralph



On 6/30/08 5:21 AM, "Lenny Verkhovsky" <lenny.verkhov...@gmail.com> wrote:

> I am not familiar with the IBM spawn test, but maybe this is the right
> behavior: if the spawn test allocates 3 ranks on the node and then allocates
> another 3, then the test is supposed to fail due to max_slots=4.
>  
> But it fails with the following hostfile as well, BUT WITH A DIFFERENT ERROR.
>  
> #cat hostfile2 
> witch2 slots=4 max_slots=4
> witch3 slots=4 max_slots=4
> witch1:/home/BENCHMARKS/IBM # /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun -np
> 3 -hostfile hostfile2 dynamic/spawn
> bash: orted: command not found
> [witch1:22789] 
> --------------------------------------------------------------------------
> A daemon (pid 22791) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [witch1:22789] 
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --------------------------------------------------------------------------
>         witch3 - daemon did not report back when launched
>                  
> On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <lenny.verkhov...@gmail.com>
> wrote:
>> Hi, 
>> While trying to run MTT, I failed to run the IBM spawn test. It fails only
>> when using a hostfile, and not when using a host list.
>> ( OMPI from TRUNK )
>>  
>> This is working :
>> #mpirun -np 3 -H witch2 dynamic/spawn
>>  
>> This Fails:
>> # cat hostfile
>> witch2 slots=4 max_slots=4
>> #mpirun -np 3 -hostfile hostfile dynamic/spawn
>> [witch1:12392] 
>> --------------------------------------------------------------------------
>> There are not enough slots available in the system to satisfy the 3 slots
>> that were requested by the application:
>>   dynamic/spawn
>> 
>> Either request fewer slots for your application, or make more slots available
>> for use.
>> --------------------------------------------------------------------------
>> [witch1:12392] 
>> --------------------------------------------------------------------------
>> A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
>> launch so we are aborting.
>> 
>> There may be more information reported by the environment (see above).
>> 
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --------------------------------------------------------------------------
>> mpirun: clean termination accomplished
>>  
>>  
>> Using hostfile1 also works
>> #cat hostfile1
>> witch2
>> witch2
>> witch2
>>  
>>  
>> Best Regards
>> Lenny.
>> 
> 



