Ok, thanks :)

2016-01-06 22:03 GMT+01:00 Ralph Castain <r...@open-mpi.org>:
Not really - just consistent with the other cmd line options.

On Jan 6, 2016, at 12:58 PM, Nick Papior <nickpap...@gmail.com> wrote:

It was just that when I started using map-by, I didn't get why it is

  ppr:2

but

  PE=2

I would at least have expected

  ppr=2:PE=2

or

  ppr:2:PE:2

Does this have a reason?

2016-01-06 21:54 GMT+01:00 Ralph Castain <r...@open-mpi.org>:

<LOL> ah yes, "r" = "resource"!! Thanks for the reminder :-)

The difference in delimiter is just to simplify parsing - we can "split" the string on colons to separate out the options, and then use "=" to set the value. Nothing particularly significant about the choice.

On Jan 6, 2016, at 12:48 PM, Nick Papior <nickpap...@gmail.com> wrote:

You are correct. "socket" means that the resource is the socket, and "ppr:2" means 2 processes per resource. PE=<n> is the number of processing elements per process.

Perhaps the devs can shed some light on why PE uses "=" while ppr uses ":" as the delimiter for the resource request?

This "old" slide deck from Jeff shows the usage (although the input has changed since 1.7):
http://www.slideshare.net/jsquyres/open-mpi-explorations-in-process-affinity-eurompi13-presentation
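Pulling those pieces together, the option in Matt's working command below can be read roughly as follows (an annotated sketch based on the replies, not the man-page wording; --report-bindings, borrowed from Nick's flags further down, is included here only as a way to verify the resulting layout):

# Target layout from the thread: 2 x 14-core sockets, 4 MPI ranks, 7 threads per rank.
#   ppr:2:socket  ->  "processes per resource": place 2 MPI processes on each socket
#   pe=7          ->  bind each of those processes to 7 processing elements (cores here)
env OMP_NUM_THREADS=7 mpirun -np 4 -map-by ppr:2:socket:pe=7 --report-bindings ./hello-hybrid.x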
2016-01-06 21:33 GMT+01:00 Matt Thompson <fort...@gmail.com>:

A ha! The Gurus know all. The map-by was the magic sauce:

(1176) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 -map-by ppr:2:socket:pe=7 ./hello-hybrid.x | sort -g -k 18
Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2
Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 3
Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 4
Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 5
Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 6
Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 7
Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 8
Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 9
Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 10
Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 11
Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 12
Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 13
Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 14
Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 15
Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 16
Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 17
Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 18
Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 19
Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 20
Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 21
Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 22
Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 23
Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 24
Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 25
Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 26
Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 27

So, a question: what does "ppr" mean? The man page seems to accept it as an axiom of Open MPI:

--map-by <foo>
    Map to the specified object, defaults to socket. Supported options include slot, hwthread, core, L1cache, L2cache, L3cache, socket, numa, board, node, sequential, distance, and ppr. Any object can include modifiers by adding a : and any combination of PE=n (bind n processing elements to each proc), SPAN (load balance the processes across the allocation), OVERSUBSCRIBE (allow more processes on a node than processing elements), and NOOVERSUBSCRIBE. This includes PPR, where the pattern would be terminated by another colon to separate it from the modifiers.

Is it an acronym/initialism? From some experimenting, it seems that ppr:2:socket means 2 processes per socket? And pe=7 means leave 7 processes between them? Is that about right?

Matt

On Wed, Jan 6, 2016 at 3:19 PM, Ralph Castain <r...@open-mpi.org> wrote:

I believe he wants two procs/socket, so you'd need ppr:2:socket:pe=7

On Jan 6, 2016, at 12:14 PM, Nick Papior <nickpap...@gmail.com> wrote:

I do not think KMP_AFFINITY should affect anything in Open MPI; it is an MKL env setting? Or am I wrong?

Note that these are used in an environment where Open MPI automatically gets the host file, hence no host-file options are present. With Intel MKL and Open MPI I got the best performance using these rather long flags:

export KMP_AFFINITY=verbose,compact,granularity=core
export KMP_STACKSIZE=62M
export KMP_SETTINGS=1

def_flags="--bind-to core -x OMP_PROC_BIND=true --report-bindings"
def_flags="$def_flags -x KMP_AFFINITY=$KMP_AFFINITY"

# in your case 7:
ONP=7
flags="$def_flags -x MKL_NUM_THREADS=$ONP -x MKL_DYNAMIC=FALSE"
flags="$flags -x OMP_NUM_THREADS=$ONP -x OMP_DYNAMIC=FALSE"
flags="$flags -x KMP_STACKSIZE=$KMP_STACKSIZE"
flags="$flags --map-by ppr:1:socket:pe=7"

then run your program:

mpirun $flags <app>

A lot of the option flags are duplicated (and strictly not needed), but I provide them for easy testing of changes. Surely this is application dependent, but for my case it was performing really well.
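Side note, not from the thread itself: the KMP_* variables above are specific to Intel's OpenMP runtime. If the application were instead built against an OpenMP 4.0 runtime, the portable equivalents would look roughly like the sketch below; the close/cores values are assumptions chosen to mimic KMP_AFFINITY=compact,granularity=core:

export OMP_NUM_THREADS=7
export OMP_PROC_BIND=close   # keep each rank's threads on neighbouring places
export OMP_PLACES=cores      # one place per physical core
mpirun -np 4 -map-by ppr:2:socket:pe=7 ./hello-hybrid.x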
Using a "hybrid Hello >>>>> World" >>>>> > program, I can achieve this with Intel MPI (after a lot of testing): >>>>> > >>>>> > (1097) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 >>>>> > ./hello-hybrid.x | sort -g -k 18 >>>>> > srun.slurm: cluster configuration lacks support for cpu binding >>>>> > Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 0 >>>>> > Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 1 >>>>> > Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 2 >>>>> > Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 3 >>>>> > Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 4 >>>>> > Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 5 >>>>> > Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on >>>>> CPU 6 >>>>> > Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 7 >>>>> > Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 8 >>>>> > Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 9 >>>>> > Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 10 >>>>> > Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 11 >>>>> > Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 12 >>>>> > Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on >>>>> CPU 13 >>>>> > Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 14 >>>>> > Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 15 >>>>> > Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 16 >>>>> > Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 17 >>>>> > Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 18 >>>>> > Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 19 >>>>> > Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on >>>>> CPU 20 >>>>> > Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 21 >>>>> > Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 22 >>>>> > Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 23 >>>>> > Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 24 >>>>> > Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 25 >>>>> > Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 26 >>>>> > Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on >>>>> CPU 27 >>>>> > >>>>> > Other than the odd fact that Process #0 seemed to start on Socket #1 >>>>> (this >>>>> > might be an artifact of how I'm trying to detect the CPU I'm on), >>>>> this looks >>>>> > reasonable. 14 threads on each socket and each process is laying out >>>>> its >>>>> > threads in a nice orderly fashion. >>>>> > >>>>> > I'm trying to figure out how to do this with Open MPI (version >>>>> 1.10.0) and >>>>> > apparently I am just not quite good enough to figure it out. 
I'm trying to figure out how to do this with Open MPI (version 1.10.0), and apparently I am just not quite good enough to figure it out. The closest I've gotten is:

(1155) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 -map-by ppr:2:socket ./hello-hybrid.x | sort -g -k 18
Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 0
Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 1
Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2
Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 2
Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 3
Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 3
Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 4
Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 4
Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 5
Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 5
Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 6
Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 6
Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 14
Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 14
Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 15
Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 16
Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 16
Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 17
Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 17
Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 18
Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 18
Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 19
Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 19
Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 20
Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 20

Obviously not right. Any ideas on how to help me learn? The man mpirun page is a bit formidable in the pinning part, so maybe I've missed an obvious answer.
Matt