Terry is correct - not guaranteed, but that is the typical behavior. However, you -can- guarantee that rank 0 will be on a particular host. Just run your job as:

  mpirun -n 1 -host <target> my_app : -n <N-1> my_app
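As a concrete sketch - the host name "mastermachine" and the total of 8 ranks are only placeholders here - an 8-rank job with rank 0 forced onto the master would be launched as:

  mpirun -n 1 -host mastermachine my_app : -n 7 my_app

The colon separates the two app contexts, and MPI_COMM_WORLD ranks are assigned across the contexts in order, so the single process in the first context becomes rank 0 and the remaining 7 are placed by whatever mapping policy is in effect.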
This guarantees that rank 0 is on host <target>. All other ranks will be distributed according to the selected mapping algorithm, including -loadbalance.

Ralph

On Dec 10, 2010, at 12:08 PM, Terry Dontje wrote:

> On 12/10/2010 01:46 PM, David Mathog wrote:
>>
>> The master is commonly very different from the workers, so I expected
>> there would be something like
>>
>> --rank0-on <hostname>
>>
>> but there doesn't seem to be a single switch on mpirun to do that.
>>
>> If "mastermachine" is the first entry in the hostfile, or the first
>> machine in a -host list, will rank 0 always run there? If so, will it
>> always run in the first slot on the first machine listed? That seems to
>> be the case in practice, but is it guaranteed? Even if -loadbalance is
>> used?
>>
> For Open MPI the above is correct; I am hesitant to say it is guaranteed, though.
>> Otherwise, there is the rankfile method. In the situation where the
>> master must run on a specific node, but there is no preference for the
>> workers, would a rank file like this be sufficient?
>>
>> rank 0=mastermachine slot=0
> I thought you might have had to give all ranks, but empirically it looks like
> you can give only a subset.
>> The mpirun man page gives an example where all nodes/slots are
>> specified, but it doesn't say explicitly what happens if the
>> configuration is only partially specified, or how it interacts with the
>> -np parameter. Modifying the man page example:
>>
>> cat myrankfile
>> rank 0=aa slot=1:0-2
>> rank 1=bb slot=0:0,1
>> rank 2=cc slot=1-2
>> mpirun -H aa,bb,cc,dd -np 4 -rf myrankfile ./a.out
>>
>> Rank 0 runs on node aa, bound to socket 1, cores 0-2.
>> Rank 1 runs on node bb, bound to socket 0, cores 0 and 1.
>> Rank 2 runs on node cc, bound to cores 1 and 2.
>>
>> Rank 3 runs where? Not at all, or on dd, or on aa slot=0, or ...?
> From my empirical runs it looks to me like rank 3 would end up on aa, possibly
> slot=0.
> In other words, once you run out of entries in the rankfile, the remaining
> processes appear to start again from the beginning of the host list.
>
> --td
>> Also, in my limited testing --host and -hostfile seem to be mutually
>> exclusive. That is reasonable, but it isn't clear that it is intended.
>> Example, with a hostfile containing one entry for "monkey02.cluster
>> slots=1":
>>
>> mpirun --host monkey01 --mca plm_rsh_agent rsh hostname
>> monkey01.cluster
>> mpirun --host monkey02 --mca plm_rsh_agent rsh hostname
>> monkey02.cluster
>> mpirun -hostfile /usr/common/etc/openmpi.machines.test1 \
>>   --mca plm_rsh_agent rsh hostname
>> monkey02.cluster
>> mpirun --host monkey01 \
>>   -hostfile /usr/commom/etc/openmpi.machines.test1 \
>>   --mca plm_rsh_agent rsh hostname
>> --------------------------------------------------------------------------
>> There are no allocated resources for the application
>>   hostname
>> that match the requested mapping:
>>
>> Verify that you have mapped the allocated resources properly using the
>> --host or --hostfile specification.
>> --------------------------------------------------------------------------
>>
>> Thanks,
>>
>> David Mathog
>> mat...@caltech.edu
>> Manager, Sequence Analysis Facility, Biology Division, Caltech
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
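A sketch of the partial-rankfile alternative discussed above (the host names and the -np value are made up): pin only rank 0 in the rankfile and let the mapper place the rest, e.g.

  cat myrankfile
  rank 0=mastermachine slot=0

  mpirun -H mastermachine,worker1,worker2 -np 4 -rf myrankfile ./my_app

Per Terry's empirical observation, the ranks not listed in the rankfile are then placed starting again from the beginning of the host list.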