Hi Devendar,
Thank you again for your answer.
I searched a little and found that UD stands for "Unreliable Datagram"
while RC stands for the "Reliable Connected" transport mechanism. I found
another one called DC, for "Dynamically Connected", which is not supported on our HCA.
Do you know what is
Hi Martin
MXM's default transport is UD (MXM_TLS=ud,shm,self), which is scalable when
running large applications. RC (MXM_TLS=rc,shm,self) is recommended
for microbenchmarks and very small-scale applications.
Yes, the max seg size setting is too small.
Did you check any message rate
Hi Devendar,
Thank you for your answer.
Setting MXM_TLS=rc,shm,self does improve the speed of MXM (both latency and
bandwidth):
without MXM_TLS
comm   lat_min    bw_max     bw_max
       pingpong   pingpong   sendrecv
       (us)       (MB/s)     (MB/s)
Hi Martin,
Can you check whether it is any better with "-x MXM_TLS=rc,shm,self"?
-Devendar
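
For reference, a minimal sketch of how the two transport settings discussed in this thread (the default ud and the suggested rc) could be lined up for a back-to-back comparison. It only prints the command lines and launches nothing. The process count and the benchmark invocation `./IMB-MPI1 PingPong` are assumptions for illustration, not values taken from the thread.

```shell
#!/bin/sh
# Print one mpirun command line per MXM transport setting so the two
# can be benchmarked back to back. Dry run only: nothing is launched.
# NP and BENCH are placeholders (assumptions); adjust for your cluster.
NP=2
BENCH="./IMB-MPI1 PingPong"

mxm_cmd() {
    # $1 is the transport name: "ud" (MXM default) or "rc"
    echo "mpirun -np $NP -x MXM_TLS=$1,shm,self $BENCH"
}

for tl in ud rc; do
    mxm_cmd "$tl"
done
```

Running the printed lines one after the other on the same pair of nodes keeps everything but the transport fixed.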
On Tue, Aug 16, 2016 at 11:28 AM, Audet, Martin wrote:
> Hi Josh,
>
> Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all
> my MPI processes
> and it did
"Audet, Martin" writes:
> Hi Josh,
>
> Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all my
> MPI processes
> and it did improve performance but the performance I obtain isn't completely
> satisfying.
I raised the issue of MXM hurting p2p
Hi Josh,
Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all my MPI
processes
and it did improve performance but the performance I obtain isn't completely
satisfying.
When I run the IMB 4.1 pingpong and sendrecv benchmarks between two nodes
using Open MPI 1.10.3, I get:
Hi, Martin
The environment variable:
MXM_RDMA_PORTS=device:port
is what you're looking for. You can specify a device/port pair on your OMPI
command line like:
mpirun -np 2 ... -x MXM_RDMA_PORTS=mlx4_0:1 ...
Best,
Josh
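
A small sketch of the device:port form described above, wrapped in a helper that checks the port is numeric before building the `-x` option. The helper name `rdma_ports_opt` and the program name `./a.out` are hypothetical; the command line is printed, not executed.

```shell
#!/bin/sh
# Build the -x option for MXM_RDMA_PORTS from a device name and a port number.
# Dry run only: the resulting mpirun line is printed, not executed.
# rdma_ports_opt and ./a.out are hypothetical names used for illustration.
rdma_ports_opt() {
    dev=$1
    port=$2
    case $port in
        # Reject an empty port or one containing a non-digit character.
        ''|*[!0-9]*) echo "port must be a number: $port" >&2; return 1 ;;
    esac
    echo "-x MXM_RDMA_PORTS=$dev:$port"
}

echo "mpirun -np 2 $(rdma_ports_opt mlx4_0 1) ./a.out"
```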
On Fri, Aug 12, 2016 at 5:03 PM, Audet, Martin
Hi OMPI_Users && OMPI_Developers,
Is there an equivalent to the MCA parameter btl_openib_include_if when using
MXM over InfiniBand (i.e. with either pml=cm mtl=mxm or pml=yalla)?
I ask this question because I'm working on a cluster where LXC containers are
used on compute nodes (with SR-IOV I