On Tue, 29 Aug 2017 17:55:22 +0000
"Barrett, Brian via devel" <devel@lists.open-mpi.org> wrote:

> The fourth release candidate for Open MPI 3.0.0 is now available for
> download.  Changes since rc2 include:
...
>     https://www.open-mpi.org/software/ompi/v3.0/

TLDR: worked fine on a few different systems for me.


I took the time to give it a spin on three (purposefully different)
systems:

1) AMD Zen + Fedora 25 + gcc-7.2 + mlx4(inbox)/verbs + ssh/rsh
2) Intel HSW ht:on + CentOS 6.current + icc/ifort 2015 +
truescale(inbox)/psm + slurm(14.11.11)
3) Intel SNB ht:off + CentOS 6.current + icc/ifort 2017 +
mlx4(inbox)/verbs + slurm(14.11.11)

I built pretty vanilla (just setting CC/CXX/FC, enabling the orterun
prefix by default, and disabling CMA). No issues with configure and make.
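
For reference, the configure line was essentially the following (compilers
and install prefix are just placeholders, not my exact values; the option
names are the standard Open MPI configure switches as I recall them):

  $ ./configure CC=icc CXX=icpc FC=ifort \
        --prefix=$HOME/opt/openmpi-3.0.0rc4 \
        --enable-orterun-prefix-by-default \
        --without-cma
  $ make -j && make install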



On 1, 2 and 3 I ran HPGMG-fv (a modern C proxy app) in MPI+OpenMP mode

On 1 I also compiled and ran a 200+ KLOC unstructured CFD solver
(Fortran 77-2008)

I verified that the default launch picked the correct mtl/btl, and that
forcing tcp worked.
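
To be concrete, the checks were along these lines (the MCA parameter names
are the usual framework knobs; the benchmark binary is just a placeholder):

  # see which pml/mtl/btl is selected by default
  $ mpirun -np 2 --mca pml_base_verbose 10 --mca btl_base_verbose 10 ./osu_bw

  # force plain tcp (ob1 needed on the psm system to bypass the cm pml)
  $ mpirun -np 2 --mca pml ob1 --mca btl tcp,self,vader ./osu_bw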



General comments (relating mostly to MPI+OpenMP):

* I tried to launch my hpgmg hybrid run with --cpus-per-rank N (for
  OMP_NUM_THREADS=N) and got a deprecation warning (but it worked).
  The suggested replacement was non-obvious:

  Command line options:
    Deprecated:  --cpus-per-proc, -cpus-per-proc, --cpus-per-rank,
    -cpus-per-rank
    Replacement: --map-by <obj>:PE=N, default <obj>=NUMA

  What I think is the correct replacement, "--map-by NUMA:PE=4" (with
  OMP_NUM_THREADS=4), wasn't completely obvious to me (a full launch line
  is sketched after these comments), and the help output from mpirun
  certainly didn't help:

  $ mpirun -h mapping
   -cpus-per-rank|--cpus-per-rank <arg0> ...
   --map-by <arg0>       Mapping Policy [slot | hwthread | core | socket
                         (default) | numa | board | node]
  1) cpus-per-rank does not show as deprecated
  2) --map-by does not include info about the PE=x syntax

* --display-map does not pick up on cpus-per-task>1 from slurm.
  A slurm geometry of "-n 8 -c 4" on 16-core nodes correctly shows up as 4
  slots per node, but it also says "1 procs"; shouldn't that be 4 procs?

   Data for node: n2    Num slots: 4    Max slots: 0    Num procs: 4
   Process OMPI jobid: [7395,1] App: 0 Process rank: 4 Bound: N/A
   Process OMPI jobid: [7395,1] App: 0 Process rank: 5 Bound: N/A
   Process OMPI jobid: [7395,1] App: 0 Process rank: 6 Bound: N/A
   Process OMPI jobid: [7395,1] App: 0 Process rank: 7 Bound: N/A

* On the slurm systems OMP_NUM_THREADS is forwarded (default slurm
  behavior), but with ssh/rsh one must add "-x OMP_NUM_THREADS" or suffer
  an asymmetric situation (see the example launch line below). Maybe it
  should be forwarded by default if set, or a warning emitted?
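
To make the first and last points above concrete, the full hybrid launch
ends up looking roughly like this (rank count and binary name are only for
illustration):

  # 4 cores (and OpenMP threads) per rank, mapped by NUMA domain;
  # -x is needed on the ssh/rsh system to forward the variable
  $ export OMP_NUM_THREADS=4
  $ mpirun -np 8 --map-by NUMA:PE=4 -x OMP_NUM_THREADS ./hpgmg-fv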


/end-of-rc-feedback

Keep up the great work
 Peter K
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
