Thanks for the quick response.

>> I'm using OMPI 1.6.4 in a Torque-like environment.
>> However, since there are modifications in Torque that prevent OMPI from 
>> spawning processes the way it does with MPI_COMM_SPAWN, 
> 
> That hasn't been true in the past - did you folks locally modify Torque to 
> prevent it?

Plain Torque still supports the TM-based spawning as before.
The problem is that the resource manager (RM) on the system I'm using is 
based on Torque, but with local modifications.

>> I want to circumvent Torque and use plain ssh only.
>> 
>> So, I configured --without-tm and can successfully run mpiexec with 
>> -hostfile.
>> 
>> Now I want to MPI_COMM_SPAWN using the hostfile info argument.
>> 
>> I start with
>> 
>> $ mpiexec -np 1 -hostfile hostfile_all ./spawn_parent
>> 
>> where hostfile_all is a superset of hostfile_spawn which is provided in the 
>> info argument to MPI_COMM_SPAWN.
>> 
>> The message I get is:
>> 
>> --------------------------------------------------------------------------
>> All nodes which are allocated for this job are already filled.
>> --------------------------------------------------------------------------
> 
> I'll take a look in the morning when my cluster comes back up - sounds like 
> we have a bug. However, note that there are no current plans for a 1.6.5 
> release, so I don't know how long it will be before any fix shows up.
> 
> Meantime, I'll check the 1.7 series to ensure it works correctly there as 
> well.

If it works with 1.7, that would already be fine for me.
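
In case it helps with reproducing, the spawn described above boils down to 
roughly the sketch below. This is not the exact spawn_parent source: the 
child binary name ./spawn_child and the process count of 2 are placeholders, 
and error handling is left out. The "hostfile" info key is the one documented 
for MPI_Comm_spawn in Open MPI.

```c
/* spawn_parent.c: minimal sketch, not the original program.
 * Assumptions: "./spawn_child" and NUM_CHILDREN are placeholders;
 * hostfile_spawn lists a subset of the hosts in hostfile_all. */
#include <mpi.h>
#include <stdio.h>

#define NUM_CHILDREN 2

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    MPI_Info info;
    int errcodes[NUM_CHILDREN];

    MPI_Init(&argc, &argv);

    /* Ask Open MPI to place the children on the hosts in hostfile_spawn. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "hostfile", "hostfile_spawn");

    /* Spawn from this single parent process (root 0 of MPI_COMM_SELF). */
    MPI_Comm_spawn("./spawn_child", MPI_ARGV_NULL, NUM_CHILDREN, info,
                   0, MPI_COMM_SELF, &intercomm, errcodes);
    printf("parent: spawn returned\n");

    MPI_Info_free(&info);

    /* Collective over the intercommunicator: the children are assumed to
     * disconnect from the communicator they get via MPI_Comm_get_parent. */
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```

Built with mpicc and started with the mpiexec line quoted above, this is the 
kind of call that produces the "already filled" message for me.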
