Gilles,
I'm using NetPIPE which is available at http://netpipe.cs.ksu.edu
My base test is uni-directional with 1 process on a node communicating
with a process on a second node.
make mpi
mpirun -np 2 --hostfile=hf.2p2n NPmpi
cat hf.2p2n
node0 slots=1
node1 slots=1
NetPIPE does not do any MPI_
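One way to make sure the run above actually exercises the openib btl rather than silently falling back to TCP is to restrict the btl list; this is only a sketch and only applies when MXM/UCX are not in play:

mpirun -np 2 --hostfile=hf.2p2n --mca btl openib,self NPmpi
# with the btl list restricted, the job aborts if openib cannot be used
# between the two nodes instead of quietly running over tcp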
Thanks Paul,
unfortunately, that did not help :-(
performance is just as bad even with --mca mpi_leave_pinned 1
and, surprisingly, when patcher/overwrite is used, performance is not
worse with --mca mpi_leave_pinned 0
Cheers,
Gilles
On Wed, Jan 24, 2018 at 2:28 PM, Paul Hargrove wrote:
>
Ah, this sounds familiar.
I believe that the issue Dave sees is that without patcher/overwrite the
"leave pinned" protocol is OFF by default.
Use of '-mca mpi_leave_pinned 1' may help if my guess is right.
HOWEVER, w/o the memory management hooks provided using patcher/overwrite,
leave pinned can
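With Dave's NetPIPE setup (mpirun -np 2 --hostfile=hf.2p2n NPmpi), Paul's suggestion amounts to something like the following; the environment-variable form is equivalent:

mpirun --mca mpi_leave_pinned 1 -np 2 --hostfile=hf.2p2n NPmpi
# or set it once in the environment before launching:
export OMPI_MCA_mpi_leave_pinned=1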
Dave,
here is what I found:
- MPI_THREAD_MULTIPLE is not part of the equation (I just found it is
no longer required by IMB by default)
- patcher/overwrite is not built when Open MPI is configure'd with
--disable-dlopen (see the quick check after this list)
- when configure'd without --disable-dlopen, performance is way
worse for t
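The quick check referenced above: whether patcher/overwrite made it into a given build can be read from ompi_info (assuming the ompi_info from that same installation is used):

ompi_info | grep -i patcher
# a build that includes the component lists an "MCA patcher: overwrite" entry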
Dave,
I can reproduce the issue with btl/openib and the IMB benchmark, which
is known to call MPI_Init_thread(MPI_THREAD_MULTIPLE).
Note that performance is OK with the OSU benchmark, which does not
require MPI_THREAD_MULTIPLE.
Cheers,
Gilles
On Wed, Jan 24, 2018 at 1:16 PM, Gilles Gouaillardet wrote:
> Dave,
>
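For reference, the two comparison runs look roughly like this, assuming IMB-MPI1 and osu_bw are built against this Open MPI and available on both nodes, and reusing the hostfile from the NetPIPE runs:

mpirun -np 2 --hostfile=hf.2p2n IMB-MPI1 PingPong
# IMB requests MPI_THREAD_MULTIPLE in the affected versions
mpirun -np 2 --hostfile=hf.2p2n osu_bw
# OSU does not require MPI_THREAD_MULTIPLE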
Dave,
one more question: are you running the openib btl, or other libraries
such as MXM or UCX?
Cheers,
Gilles
On 1/24/2018 12:55 PM, Dave Turner wrote:
We compiled OpenMPI 2.1.1 using the EasyBuild configuration
for CentOS as below and tested on Mellanox QDR hardware.
./configur
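One way to answer the openib / MXM / UCX question from the installed build itself (the ompi_info must come from the installation under test):

ompi_info | grep -E "btl|pml|mtl"
# openib appears as an "MCA btl:" entry; MXM and UCX, when built in,
# appear under the mtl/pml frameworks instead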
Dave,
At first glance, that looks pretty odd, and I'll have a look at it.
Which benchmark are you using to measure the bandwidth?
Does your benchmark call MPI_Init_thread(MPI_THREAD_MULTIPLE)?
Have you tried without --enable-mpi-thread-multiple?
Cheers,
Gilles
On Wed, Jan 24, 2018 at 12:55 PM,
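On the last of the questions above, the comparison is simply a rebuild with the same options minus --enable-mpi-thread-multiple; a sketch with a placeholder prefix:

./configure --prefix=$HOME/openmpi-2.1.1-nomt --with-verbs
# reuse the original configure options here, dropping --enable-mpi-thread-multiple
make -j8 && make install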