The master is commonly very different from the workers, so I expected there would be something like --rank0-on <hostname>, but there doesn't seem to be a single switch on mpirun to do that.

If "mastermachine" is the first entry in the hostfile, or the first machine in a -hosts list, will rank 0 always run there? If so, will it always run in the first slot on the first machine listed? That seems to be the case in practice, but is it guaranteed? Even if -loadbalance is used?

Otherwise, there is the rankfile method. In the situation where the master must run on a specific node, but there is no preference for the workers, would a rank file like this be sufficient?

    rank 0=mastermachine slot=0

The mpirun man page gives an example where all nodes/slots are specified, but it doesn't say explicitly what happens if the configuration is only partially specified, or how it interacts with the -np parameter. Modifying the man page example:

    cat myrankfile
    rank 0=aa slot=1:0-2
    rank 1=bb slot=0:0,1
    rank 2=cc slot=1-2
    mpirun -H aa,bb,cc,dd -np 4 -rf myrankfile ./a.out

Rank 0 runs on node aa, bound to socket 1, cores 0-2.
Rank 1 runs on node bb, bound to socket 0, cores 0 and 1.
Rank 2 runs on node cc, bound to cores 1 and 2.
Rank 3 runs where? Not at all, or on dd, aa:slot=0, or ...?

Also, in my limited testing --host and -hostfile seem to be mutually exclusive. That is reasonable, but it isn't clear that it is intended.
Example, with a hostfile containing one entry for "monkey02.cluster slots=1":

    mpirun --host monkey01 --mca plm_rsh_agent rsh hostname
    monkey01.cluster
    mpirun --host monkey02 --mca plm_rsh_agent rsh hostname
    monkey02.cluster
    mpirun -hostfile /usr/common/etc/openmpi.machines.test1 \
        --mca plm_rsh_agent rsh hostname
    monkey02.cluster
    mpirun --host monkey01 \
        -hostfile /usr/commom/etc/openmpi.machines.test1 \
        --mca plm_rsh_agent rsh hostname
    --------------------------------------------------------------------------
    There are no allocated resources for the application
      hostname
    that match the requested mapping:

    Verify that you have mapped the allocated resources properly using the
    --host or --hostfile specification.
    --------------------------------------------------------------------------

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
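
P.S. Until the behavior of a partially specified rankfile is clarified, one way to sidestep the question might be to specify every rank explicitly. The following is only a minimal sketch, assuming one rank per host and that binding each rank to slot 0 is acceptable. make_rankfile is a hypothetical helper (not part of Open MPI), and the worker names aa, bb, cc are placeholders borrowed from the example above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open MPI: print a fully specified
# rankfile with rank 0 on the master and one rank on each worker.
# Usage: make_rankfile MASTER WORKER...
make_rankfile() {
    master=$1
    shift
    rank=0
    # Rank 0 is pinned to the master host (slot 0 is an assumption).
    printf 'rank %d=%s slot=0\n' "$rank" "$master"
    # Each remaining rank gets one worker host, in the order given.
    for host in "$@"; do
        rank=$((rank + 1))
        printf 'rank %d=%s slot=0\n' "$rank" "$host"
    done
}

# Placeholder host names; substitute the real master and workers.
make_rankfile mastermachine aa bb cc > myrankfile
cat myrankfile
```

The resulting file would then be passed the same way as in the man page example, e.g. mpirun -H mastermachine,aa,bb,cc -np 4 -rf myrankfile ./a.out. Whether -loadbalance changes anything in that case is still one of the open questions above.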