Several users have noticed poor latency with Open MPI when using the
new Mellanox ConnectX HCA hardware. Open MPI was getting about 1.9us
latency with 0-byte ping-pong benchmarks (e.g., NetPIPE or
osu_latency). This has been fixed in OMPI v1.2.4.
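(For reference, the kind of measurement these benchmarks perform looks
roughly like the minimal C/MPI sketch below.  This is only an
illustration of a 0-byte ping-pong timing loop, not the actual NetPIPE
or osu_latency code; run it with exactly two processes.)

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i;
        const int iters = 10000;
        double start, elapsed;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Warm up the connection before timing. */
        MPI_Barrier(MPI_COMM_WORLD);

        start = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                /* Send a 0-byte message and wait for a 0-byte reply. */
                MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        elapsed = MPI_Wtime() - start;

        if (rank == 0) {
            /* One-way latency is half the average round-trip time. */
            printf("0-byte latency: %.2f us\n",
                   elapsed / iters / 2.0 * 1e6);
        }

        MPI_Finalize();
        return 0;
    }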
Short version:
--------------
Open MPI v1.2.4 (and newer) will get around 1.5us latency with 0-byte
ping-pong benchmarks on Mellanox ConnectX HCAs. Prior versions of
Open MPI can also achieve this low latency by setting the
btl_openib_use_eager_rdma MCA parameter to 1.
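For example, on a pre-1.2.4 installation you could run something like:

    shell$ mpirun --mca btl_openib_use_eager_rdma 1 -np 2 ./osu_latency

(osu_latency is just a stand-in for whatever benchmark or application
you are running.)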
Longer version:
---------------
Prior to OMPI v1.2.4, Open MPI did not include specific configuration
information for ConnectX hardware, which forced Open MPI to choose
the conservative/safe configuration of not using RDMA for short
messages (using send/receive semantics instead). This increases
point-to-point latency in benchmarks.
OMPI v1.2.4 (and newer) includes the relevant configuration
information that enables short message RDMA by default on Mellanox
ConnectX hardware. This significantly improves Open MPI's latency on
popular MPI benchmark applications.
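(If you want to see which value your installation uses by default,
ompi_info will show the openib BTL parameters; for example:

    shell$ ompi_info --param btl openib | grep eager_rdma

The exact output format differs between Open MPI versions.)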
The same performance can be achieved on prior versions of Open MPI by
setting the btl_openib_use_eager_rdma MCA parameter to 1. The main
difference between v1.2.4 and prior versions is that the prior
versions do not set this MCA parameter value by default for ConnectX
hardware (because ConnectX did not exist when prior versions of Open
MPI were released).
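Note that the mpirun command line shown above is not the only way to
set the parameter; like any MCA parameter, it can also be set through
the environment, e.g.:

    shell$ export OMPI_MCA_btl_openib_use_eager_rdma=1

or by adding this line to $HOME/.openmpi/mca-params.conf:

    btl_openib_use_eager_rdma = 1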
This information is also now described on the FAQ:
http://www.open-mpi.org/faq/?category=openfabrics#mellanox-connectx-poor-latency
--
Jeff Squyres
Cisco Systems