On Fri, 5 Sep 2014, Ralph Castain wrote:
On Sep 5, 2014, at 3:34 PM, Allin Cottrell <cottr...@wfu.edu> wrote:
I suspect there is a new (to openmpi 1.8.N?) warning with respect to
requesting a number of MPI processes greater than the number of "real"
cores on a given machine. I can provide a good deal more information if
that's required, but can I just pose it as a question for now? Does
anyone know of a relevant change in the code?
The reason I'm asking is that I've been experimenting, on a couple of
machines and with more than one computational problem, to see if I'm
better off restricting the number of MPI processes to the number of
"real" or "physical" cores available, or if it's better to allow a
larger number of processes up to the number of hyperthreads available
(which is twice the number of cores on the machines I'm working on).
If you are going to treat hyperthreads as independent processors, then
you should probably set the --use-hwthreads-as-cpus flag so OMPI knows
to treat them that way.
Hmm, where would I set that? (For reference) mpiexec --version gives
mpiexec (OpenRTE) 1.8.2
and if I append --use-hwthreads-as-cpus to my mpiexec command I get
mpiexec: Error: unknown option "--use-hwthreads-as-cpus"
However, via trial and error I've found that these options work: either
--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)
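For concreteness, here is how the two variants look as full command lines. This is only a sketch: ./myprogram, data.inp and the process count of 4 are hypothetical placeholders, and the commands are assembled and printed rather than executed.

```shell
# Hypothetical placeholders for the real program and its data file.
NP=4
PROG=./myprogram
DATA=data.inp

# Variant 1: map processes onto hardware threads rather than cores.
CMD_HWTHREAD="mpiexec --map-by hwthread -np $NP $PROG $DATA"

# Variant 2: allow more processes than available slots.
CMD_OVERSUB="mpiexec --oversubscribe -np $NP $PROG $DATA"

# Print the command lines; paste the one you want into the shell to run it.
printf '%s\n' "$CMD_HWTHREAD" "$CMD_OVERSUB"
```

Either flag was enough on its own in my trials; I have not tried combining them.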
What's puzzling me, though, is that the use of these flags was not
necessary when, earlier this year, I was running ompi 1.6.5. Neither is it
necessary when running ompi 1.7.3 on a different machine. The warning
that's printed without these flags seems to be new.
It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and
non-user friendly) warning whenever I specify to mpiexec a number of
processes > the number of "real" cores [...]
Could you pass along the warning? It should only give you a warning if
the #procs > #slots, as you are then oversubscribed. You can turn that
warning off by just adding the oversubscribe flag to your mapping directive.
Here's what I'm seeing:
<quote>
A request was made to bind to that would result in binding more
processes than cpus on a resource:
Bind to: CORE
Node: waverley
#processes: 2
#cpus: 1
You can override this protection by adding the "overload-allowed"
option to your binding directive.
</quote>
The machine in question has two cores and four threads. The thing that's
confusing here is that I'm not aware of supplying any "binding directive":
my command line (for running on a single host) is just this:
mpiexec -np <N> <myprogram> <myprogram-data>
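Since that command line carries no binding directive, the "Bind to: CORE" in the message presumably reflects a default. A binding directive can be given explicitly with --bind-to, and the "overload-allowed" qualifier the message mentions is attached to its value with a colon. The sketch below only assembles and prints two such command lines; ./myprogram, data.inp and the process count are hypothetical placeholders, and whether either form silences the message on a given installation is something to check rather than assume.

```shell
# Hypothetical placeholders for the real program and its data file.
NP=4
PROG=./myprogram
DATA=data.inp

# Keep binding to cores, but permit more than one process per core.
CMD_OVERLOAD="mpiexec --bind-to core:overload-allowed -np $NP $PROG $DATA"

# Alternatively, switch process binding off altogether.
CMD_UNBOUND="mpiexec --bind-to none -np $NP $PROG $DATA"

# Print the command lines; paste the one you want into the shell to run it.
printf '%s\n' "$CMD_OVERLOAD" "$CMD_UNBOUND"
```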
In fact it seems that current ompi "does the right thing" in respect of
the division of labor even without the extra flags: depending on the
nature of the computation, I can get faster times with -np 4 than with -np 2
(and no degradation). It just insists on printing this warning, which I'd
like to be able to turn off "globally" if possible.
Allin Cottrell
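P.S. On the "globally" part: Open MPI reads MCA parameters from environment variables named OMPI_MCA_<name> and from $HOME/.openmpi/mca-params.conf, so a binding policy set there should apply to every mpiexec run without extra flags. The sketch below rests on an assumption I have not verified: that hwloc_base_binding_policy is the parameter behind --bind-to in the 1.8 series and accepts the same value syntax. The parameter name should be confirmed with ompi_info before relying on it.

```shell
# Assumption: hwloc_base_binding_policy is the MCA parameter behind --bind-to
# in the 1.8 series; confirm with: ompi_info --param hwloc all --level 9
PARAM=hwloc_base_binding_policy
VALUE=core:overload-allowed

# Per-session: Open MPI reads MCA parameters from OMPI_MCA_<name> variables.
export "OMPI_MCA_${PARAM}=${VALUE}"

# Per-user: the same setting as a line for $HOME/.openmpi/mca-params.conf.
# It is only printed here; append it to that file to make it persistent.
CONF_LINE="${PARAM} = ${VALUE}"
echo "$CONF_LINE"
```

Setting VALUE to none instead would correspond to turning binding off rather than allowing overload.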