Terry is correct - not guaranteed, but that is the typical behavior.

However, you -can- guarantee that rank=0 will be on a particular host. Just run 
your job:

mpirun -n 1 -host <target> my_app : -n (N-1) my_app

This guarantees that rank=0 is on host <target>. All other ranks will be 
distributed according to the selected mapping algorithm, including -loadbalance.
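
For example, a minimal sketch of a 16-rank job that pins rank 0 to a host named
"mastermachine" (the hostname, binary name, and rank counts here are just
placeholders):

# rank 0 goes on mastermachine; the other 15 ranks map per the normal algorithm
mpirun -n 1 -host mastermachine ./my_app : -n 15 ./my_app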

Ralph

On Dec 10, 2010, at 12:08 PM, Terry Dontje wrote:

> On 12/10/2010 01:46 PM, David Mathog wrote:
>> 
>> The master is commonly very different from the workers, so I expected
>> there would be something like
>> 
>>   --rank0-on <hostname>
>> 
>> but there doesn't seem to be a single switch on mpirun to do that.
>> 
>> If "mastermachine" is the first entry in the hostfile, or the first
>> machine in a -hosts list, will rank 0 always run there?  If so, will it
>> always run in the first slot on the first machine listed?  That seems to
>> be the case in practice, but is it guaranteed?  Even if -loadbalance is
>> used?  
>> 
> For Open MPI the above is correct, though I am hesitant to call it guaranteed.
>> Otherwise, there is the rankfile method.  In the situation where the
>> master must run on a specific node, but there is no preference for the
>> workers, would a rank file like this be sufficient?
>> 
>> rank 0=mastermachine slot=0
> I thought you might have had to give all the ranks, but empirically it looks 
> like you can get away with specifying only rank 0.
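
A minimal sketch of that (hostnames and the binary name are placeholders, and it
relies on the fallback behavior Terry describes below) would be a one-line
rankfile plus an ordinary launch:

cat myrankfile
rank 0=mastermachine slot=0
mpirun -H mastermachine,worker1,worker2 -np 8 -rf myrankfile ./my_app

Ranks 1-7 would then fall back onto the listed hosts.
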
>> The mpirun man page gives an example where all nodes/slots are
>> specified, but it doesn't say explicitly what happens if the
>> configuration is only partially specified, or how it interacts with the
>> -np parameter.  Modifying the man page example:
>> 
>> cat myrankfile
>> rank 0=aa slot=1:0-2
>> rank 1=bb slot=0:0,1
>> rank 2=cc slot=1-2
>> mpirun -H aa,bb,cc,dd -np 4 -rf myrankfile ./a.out
>> 
>> Rank 0 runs on node aa, bound to socket 1, cores 0-2.
>> Rank 1 runs on node bb, bound to socket 0, cores 0 and 1.
>> Rank 2 runs on node cc, bound to cores 1 and 2.
>> 
>> Rank 3 runs where?  Not at all, or on dd, or aa:slot=0, or ...? 
> From my empirical runs it looks to me like rank 3 would end up on aa, possibly 
> slot=0.
> In other words, once you run out of entries in the rankfile, it looks like the 
> processes then start again from the beginning of the host list.
> 
> --td
>> Also, in my limited testing --host and -hostfile seem to be mutually
>> exclusive.  That is reasonable, but it isn't clear whether it is intended.
>> For example, with a hostfile containing one entry, "monkey02.cluster
>> slots=1":
>> 
>> mpirun  --host monkey01   --mca plm_rsh_agent rsh  hostname
>> monkey01.cluster
>> mpirun  --host monkey02   --mca plm_rsh_agent rsh  hostname
>> monkey02.cluster
>> mpirun  -hostfile /usr/common/etc/openmpi.machines.test1 \
>>    --mca plm_rsh_agent rsh  hostname
>> monkey02.cluster
>> mpirun  --host monkey01  \
>>   -hostfile /usr/common/etc/openmpi.machines.test1 \
>>   --mca plm_rsh_agent rsh  hostname
>> --------------------------------------------------------------------------
>> There are no allocated resources for the application 
>>   hostname
>> that match the requested mapping:
>>   
>> 
>> Verify that you have mapped the allocated resources properly using the 
>> --host or --hostfile specification.
>> --------------------------------------------------------------------------
>> 
>> 
>> 
>> 
>> Thanks,
>> 
>> David Mathog
>> mat...@caltech.edu
>> Manager, Sequence Analysis Facility, Biology Division, Caltech
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
> 
> 
> 
