Howard,

Did you bump both btl_tcp_rndv_eager_limit and btl_tcp_eager_limit?

You might also need to bump btl_tcp_sndbuf, btl_tcp_rcvbuf, and btl_tcp_max_send_size to get the maximum performance out of your 100 Gb Ethernet cards.

Last but not least, you might also need to bump btl_tcp_links to saturate your network (that is likely a good thing when running one task per node, but it can decrease performance when running several tasks per node).
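
For example, all of these knobs can be passed on the mpirun command line. Here is a rough sketch (the values are just illustrative starting points I made up for this mail, not recommendations, and it assumes a two node allocation with osu_bw in the PATH):

    mpirun -np 2 --map-by node \
        --mca btl tcp,self \
        --mca btl_tcp_eager_limit 524288 \
        --mca btl_tcp_rndv_eager_limit 524288 \
        --mca btl_tcp_sndbuf 4194304 \
        --mca btl_tcp_rcvbuf 4194304 \
        --mca btl_tcp_max_send_size 524288 \
        --mca btl_tcp_links 2 \
        osu_bw

ompi_info --param btl tcp --level 9 should list all of these parameters with their current defaults and descriptions, so you can double check the spelling and values on your build.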

Cheers,


Gilles


On 7/19/2016 6:57 AM, Howard Pritchard wrote:
Hi Folks,

I have a cluster with some 100 Gb Ethernet cards installed.
What we are noticing is that if we force Open MPI 1.10.3
to go through the TCP BTL (rather than yalla), the
performance of osu_bw falls off a cliff once the TCP BTL
switches from eager to rendezvous (> 32 KB), going from
about 1.6 GB/sec to 233 MB/sec, and it stays that way out
to 4 MB message lengths.

There's nothing wrong with the IP stack (iperf -P4 gives
63 Gb/sec).

So, my questions are:

1) Is this performance expected for the TCP BTL when in
rendezvous mode?
2) Is there some way to get closer to the single-socket
performance obtained with iperf for large messages (~16 Gb/sec)?

We tried adjusting the btl_tcp rendezvous threshold, but it
doesn't appear to actually be adjustable from the mpirun
command line.

Thanks for any suggestions,

Howard




