Hi Martin,

Can you check if it is any better with "-x MXM_TLS=rc,shm,self"?
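For example, something like the following should export it to all ranks together with MXM_RDMA_PORTS (this is just an illustration: the benchmark binary, rank count, and mapping are assumptions you will want to adjust for your setup):

    # illustrative only: adjust ranks, mapping, and benchmark binary for your cluster
    mpirun -np 2 --map-by node -mca pml yalla \
           -x MXM_TLS=rc,shm,self -x MXM_RDMA_PORTS=mlx4_0:1 \
           ./IMB-MPI1 PingPong Sendrecv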
-Devendar

On Tue, Aug 16, 2016 at 11:28 AM, Audet, Martin <martin.au...@cnrc-nrc.gc.ca> wrote:
> Hi Josh,
>
> Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all
> my MPI processes and it did improve performance, but the performance I
> obtain isn't completely satisfying.
>
> When I use the IMB 4.1 pingpong and sendrecv benchmarks between two nodes
> with Open MPI 1.10.3, I get:
>
> without MXM_RDMA_PORTS
>
> comm       lat_min     bw_max      bw_max
>            pingpong    pingpong    sendrecv
>            (us)        (MB/s)      (MB/s)
> -------------------------------------------
> openib     1.79        5947.07     11534
> mxm        2.51        5166.96     8079.18
> yalla      2.47        5167.29     8278.15
>
> with MXM_RDMA_PORTS=mlx4_0:1
>
> comm       lat_min     bw_max      bw_max
>            pingpong    pingpong    sendrecv
>            (us)        (MB/s)      (MB/s)
> -------------------------------------------
> openib     1.79        5827.93     11552.4
> mxm        2.23        5191.77     8201.76
> yalla      2.18        5200.55     8109.48
>
> openib means: pml=ob1       btl=openib,vader,self   btl_openib_include_if=mlx4_0
> mxm    means: pml=cm,ob1    mtl=mxm   btl=vader,self
> yalla  means: pml=yalla,ob1 btl=vader,self
>
> lspci reports for our FDR InfiniBand HCA:
>   Infiniband controller: Mellanox Technologies MT27500 Family [ConnectX-3]
>
> and 16 lines like:
>   Infiniband controller: Mellanox Technologies MT27500/MT27520 Family
>   [ConnectX-3/ConnectX-3 Pro Virtual Function]
>
> The nodes use two octa-core Xeon E5-2650v2 Ivy Bridge-EP 2.67 GHz sockets.
>
> ofed_info reports that the mxm version is 3.4.3cce223-0.32200.
>
> As you can see, the results are not very good. I would expect mxm and
> yalla to perform better than openib, both in terms of latency and
> bandwidth (note: the sendrecv bandwidth is full duplex). I would expect
> the yalla latency to be around 1.1 us, as shown here
> https://www.open-mpi.org/papers/sc-2014/Open-MPI-SC14-BOF.pdf (page 33).
>
> I also ran mxm_perftest (located in /opt/mellanox/bin) and it reports the
> following latency between two nodes:
>
> without MXM_RDMA_PORTS          1.92 us
> with MXM_RDMA_PORTS=mlx4_0:1    1.65 us
>
> Again, I think we can expect a better latency with our configuration;
> 1.65 us is not a very good result.
>
> Note however that the 0.27 us reduction in raw mxm latency
> (1.92 - 1.65 = 0.27) corresponds to the reduction in the Open MPI
> latencies observed above with mxm (2.51 - 2.23 = 0.28) and
> yalla (2.47 - 2.18 = 0.29).
>
> Another detail: everything is run inside LXC containers. Also, SR-IOV is
> probably used.
>
> Does anyone have any idea what's wrong with our cluster?
>
> Martin Audet
>
> > Hi, Martin
> >
> > The environment variable:
> >
> >     MXM_RDMA_PORTS=device:port
> >
> > is what you're looking for. You can specify a device/port pair on your
> > OMPI command line like:
> >
> >     mpirun -np 2 ... -x MXM_RDMA_PORTS=mlx4_0:1 ...
> >
> > Best,
> >
> > Josh

--
-Devendar
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users