Re: [OMPI devel] Seeing message failures in OpenMPI 4.0.1 on UCX

2019-06-03 Thread Dave Turner via devel
  I've rerun my NetPIPE tests using --mca btl ^uct as Yossi suggested, and that
does indeed get rid of the message failures.  I don't see any difference in
performance, but I wanted to check whether there is any downside to doing the
build without uct as suggested.

 Dave Turner


Work: davetur...@ksu.edu  (785) 532-7791
      2219 Engineering Hall, Manhattan KS  66506
Home: drdavetur...@gmail.com
      cell: (785) 770-5929
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel

Re: [OMPI devel] Seeing message failures in OpenMPI 4.0.1 on UCX

2019-05-15 Thread Yossi Itigin via devel
Hi,

(resending this edited email due to 150kB limit on email message size)

The issue below is caused by btl_uct – it mistakenly calls
ucm_set_external_event() without checking opal_mem_hooks_support_level().
This leads UCX to believe that memory hooks will be provided by OMPI, but in
fact they are not, so pinned physical pages become out of sync with the
process's virtual address space.

  *   btl_uct wrong call: 
https://github.com/open-mpi/ompi/blob/master/opal/mca/btl/uct/btl_uct_component.c#L132
  *   Correct way: 
https://github.com/open-mpi/ompi/blob/master/opal/mca/common/ucx/common_ucx.c#L104
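
For reference, here is a minimal sketch of the guarded pattern, paraphrased
from the common_ucx.c approach linked above; the exact flag names, header
paths, and callback signature are recalled from memory and may differ from
the current sources:

/* Sketch: only tell UCX that OMPI will deliver memory events when OPAL's
 * memory hooks can actually intercept free()/munmap(). */
#include <stdbool.h>
#include <stddef.h>
#include "opal/memoryhooks/memory.h"   /* opal_mem_hooks_* */
#include <ucm/api/ucm.h>               /* ucm_set_external_event(), ucm_vm_munmap() */

static void mem_release_cb(void *buf, size_t length, void *cbdata, bool from_alloc)
{
    /* Forward the unmap event so UCX can invalidate its registration
     * cache before the pinned pages disappear. */
    ucm_vm_munmap(buf, length);
}

static void register_ucx_mem_hooks(void)
{
    int level = opal_mem_hooks_support_level();

    if ((level & (OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_MUNMAP_SUPPORT)) ==
        (OPAL_MEMORY_FREE_SUPPORT | OPAL_MEMORY_MUNMAP_SUPPORT)) {
        /* OPAL can report releases, so UCX may rely on external events. */
        ucm_set_external_event(UCM_EVENT_VM_UNMAPPED);
        opal_mem_hooks_register_release(mem_release_cb, NULL);
    }
    /* Otherwise leave UCX's own hooks in place.  The btl_uct code at the
     * first link makes the ucm_set_external_event() call unconditionally,
     * which is exactly the case this guard is meant to prevent. */
}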

Since the btl_uct component currently does not have a maintainer, my best
suggestion is to either disable it at OMPI configure time (as described in
https://github.com/open-mpi/ompi/issues/6640#issuecomment-490465625) or at
runtime: “mpirun -mca btl ^uct …”

UCX reference issue: https://github.com/openucx/ucx/issues/3581

--Yossi

From: Dave Turner <drdavetur...@gmail.com>
Sent: Monday, April 22, 2019 10:26 PM
To: Yossi Itigin <yos...@mellanox.com>
Cc: Pavel Shamis <pasharesea...@gmail.com>; devel@lists.open-mpi.org; Sergey Oblomov <serg...@mellanox.com>
Subject: Re: [OMPI devel] Seeing message failures in OpenMPI 4.0.1 on UCX

Yossi,

I reran the base NetPIPE test, then added --mca opal_common_ucx_opal_mem_hooks 1
as you suggested, but got the same failures and no warning messages.  Let me know
if there is anything else you'd like me to try.

  Dave Turner

Elf22 module purge
Elf22 /homes/daveturner/libs/openmpi-4.0.1-ucx/bin/mpirun -np 2 --hostfile hf.elf NPmpi-4.0.1-ucx -o np.elf.mpi-4.0.1-ucx-ib --printhostnames
Saving output to np.elf.mpi-4.0.1-ucx-ib

Proc 0 is on host elf22

Proc 1 is on host elf23

  Clock resolution ~   1.000 nsecs  Clock accuracy ~  33.000 nsecs

Start testing with 7 trials for each message size
  1:       1  B  24999 times -->  2.754 Mbps  in    2.904 usecs
  2:       2  B  86077 times -->  5.536 Mbps  in    2.890 usecs
  <… cut …>
 91: 196.611 KB   1465 times -->  9.222 Gbps  in  170.561 usecs
 92: 262.141 KB   1465 times -->  9.352 Gbps  in  224.245 usecs
 93: 262.144 KB   1114 times -->  9.246 Gbps  in  226.826 usecs
 94: 262.147 KB   1102 times -->  9.243 Gbps  in  226.883 usecs
 95: 393.213 KB   1101 times -->  9.413 Gbps  in  334.177 usecs      1 failures
 96: 393.216 KB    748 times -->  9.418 Gbps  in  334.005 usecs  10472 failures
 97: 393.219 KB    748 times -->  9.413 Gbps  in  334.201 usecs      1 failures
 98: 524.285 KB    748 times -->  9.498 Gbps  in  441.601 usecs      1 failures
  <… cut …>
120:   6.291 MB     48 times -->  9.744 Gbps  in    5.166 msecs    672 failures
121:   6.291 MB     48 times -->  9.736 Gbps  in    5.170 msecs      1 failures
122:   8.389 MB     48 times -->  9.744 Gbps  in    6.887 msecs      1 failures
123:   8.389 MB     36 times -->  9.750 Gbps  in    6.883 msecs    504 failures
124:   8.389 MB     36 times -->  9.739 Gbps  in    6.891 msecs      1 failures

Completed with  max bandwidth  9.737 Gbps  2.908 usecs latency


Elf22 /homes/daveturner/libs/openmpi-4.0.1-ucx/bin/mpirun -np 2 --mca opal_common_ucx_opal_mem_hooks 1 --hostfile hf.elf NPmpi-4.0.1-ucx -o np.elf.mpi-4.0.1-ucx-ib --printhostnames
Saving output to np.elf.mpi-4.0.1-ucx-ib

Proc 0 is on host elf22

Proc 1 is on host elf23

  Clock resolution ~   1.000 nsecs  Clock accuracy ~  34.000 nsecs

Start testing with 7 trials for each message size
  1:       1  B  24999 times -->  2.750 Mbps  in    2.909 usecs
  2:       2  B  85939 times -->  5.527 Mbps  in    2.895 usecs
  <… cut …>
 87: 131.072 KB   2142 times -->  8.986 Gbps  in  116.693 usecs
 88: 131.075 KB   2142 times -->  8.987 Gbps  in  116.683 usecs
 89: 196.605 KB   2142 times -->  9.220 Gbps  in  170.584 usecs
 90: 196.608 KB   1465 times -->  9.221 Gbps  in  170.577 usecs
 91: 196.611 KB   1465 times -->  9.222 Gbps  in  170.550 usecs
 92: 262.141 KB   1465 times -->  9.352 Gbps  in  224.250 usecs
 93: 262.144 KB   1114 times -->  9.246 Gbps  in  226.805 usecs
 94: 262.147 KB   1102 times -->  9.244 Gbps  in  226.860 usecs
 95: 393.213 KB   1102 times -->  9.413 Gbps  in  334.172 usecs      1 failures
 96: 393.216 KB    748 times -->  9.419 Gbps  in  333.994 usecs  10472 failures
 97: 393.219 KB    748 times -->  9.413 Gbps  in  334.200 usecs      1 failures
 98: 524.285 KB    748 times -->  9.499 Gbps  in  441.562 usecs      1 failures
 99: 524.288 KB    566 times -->  9.504 Gbps  in  441.339 usecs   7924 failures
100: 524.291 KB    566 times -->  9.498 Gbps  in  441.590 usecs      1 failures
101: 786.429 KB