On Fri, 5 Sep 2014, Ralph Castain wrote:

On Sep 5, 2014, at 3:34 PM, Allin Cottrell <cottr...@wfu.edu> wrote:

I suspect there is a new (to openmpi 1.8.N?) warning with respect to requesting a number of MPI processes greater than the number of "real" cores on a given machine. I can provide a good deal more information if that's required, but can I just pose it as a question for now? Does anyone know of a relevant change in the code?

The reason I'm asking is that I've been experimenting, on a couple of machines and with more than one computational problem, to see if I'm better off restricting the number of MPI processes to the number of "real" or "physical" cores available, or if it's better to allow a larger number of processes up to the number of hyperthreads available (which is twice the number of cores on the machines I'm working on).

If you are going to treat hyperthreads as independent processors, then you should probably set the --use-hwthreads-as-cpus flag so OMPI knows to treat them that way.

Hmm, where would I set that? (For reference) mpiexec --version gives

mpiexec (OpenRTE) 1.8.2

and if I append --use-hwthreads-as-cpus to my mpiexec command I get

mpiexec: Error: unknown option "--use-hwthreads-as-cpus"

However, via trial and error I've found that these options work: either

--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)
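
For reference, here is a minimal sketch of how those two options would sit in a full command line. The helper name show_cmd and the program and data names are hypothetical placeholders; the function only prints the command rather than running it, so it can be tried without an MPI installation.

```shell
# Hypothetical helper: prints (does not run) an mpiexec command line.
# $1 = extra option(s), $2 = number of processes. The program and data
# names are placeholders, not real files.
show_cmd() {
    printf 'mpiexec %s -np %s ./myprogram myprogram-data\n' "$1" "$2"
}

show_cmd "--map-by hwthread" 4
show_cmd "--oversubscribe" 4
```

Either printed line can be pasted into a shell once the placeholder names are replaced with a real program and data file.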

What's puzzling me, though, is that the use of these flags was not necessary when, earlier this year, I was running ompi 1.6.5. Neither is it necessary when running ompi 1.7.3 on a different machine. The warning that's printed without these flags seems to be new.

It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and user-unfriendly) warning whenever I specify to mpiexec a number of processes > the number of "real" cores [...]

Could you pass along the warning? It should only give you a warning if #procs > #slots, as you are then oversubscribed. You can turn that warning off by just adding the oversubscribe flag to your mapping directive.

Here's what I'm seeing:

<quote>
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        waverley
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
</quote>

The machine in question has two cores and four threads. The thing that's confusing here is that I'm not aware of supplying any "binding directive": my command line (for running on a single host) is just this:

mpiexec -np <N> <myprogram> <myprogram-data>
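
To make the numbers in the warning concrete: when processes are bound to cores, the check appears to be whether more than one process would land on a single core. The sketch below is a simplified model of that arithmetic under that assumption, not Open MPI's actual logic, and the function names are made up for illustration.

```shell
# Simplified model only; not how Open MPI implements the check.
# procs_per_core NP CORES: largest number of processes that end up on
# one core if NP processes are spread evenly over CORES cores.
procs_per_core() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# would_overload NP CORES: prints "yes" if some core gets more than
# one process (the situation the warning describes), else "no".
would_overload() {
    if [ "$(procs_per_core "$1" "$2")" -gt 1 ]; then
        echo yes
    else
        echo no
    fi
}

would_overload 4 2   # two cores, four processes
would_overload 2 2   # two cores, two processes
```

On a two-core machine, for example, -np 4 gives two processes per core, which matches the "#processes: 2, #cpus: 1" lines in the message, while -np 2 stays within one process per core.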

In fact it seems that current ompi "does the right thing" with respect to the division of labor even without the extra flags: depending on the nature of the computation, I can get faster times with -np 4 than with -np 2 (and no degradation). It just insists on printing this warning, which I'd like to be able to turn off "globally" if possible.
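
One possible way to apply a setting "globally" is through Open MPI's MCA parameters, which can be set from the environment with an OMPI_MCA_ prefix. The sketch below assumes that the binding policy parameter in this release is named hwloc_base_binding_policy and that switching binding off is acceptable; both are assumptions worth checking against the output of ompi_info before relying on them.

```shell
# Assumption: the binding policy MCA parameter is called
# hwloc_base_binding_policy in this release (verify with ompi_info).
# "none" turns binding off, so the bind-to-core check would not apply.
export OMPI_MCA_hwloc_base_binding_policy=none

# Per-command alternative: the same policy given on the command line.
# Printed only, not run; the program and data names are placeholders.
echo "mpiexec --bind-to none -np 4 ./myprogram myprogram-data"
```

Placing the export line in a shell startup file would apply it to every run. The warning's own suggestion would presumably be written as --bind-to core:overload-allowed, which keeps binding to cores but permits more than one process per core.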

Allin Cottrell
