Hi Howard,

Was this issue resolved? If so, what was the solution?
Please let me know; we are also experimenting with these limits.

Thanks,
- Sreenidhi.
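
For anyone else following this thread: the btl_tcp_* parameters Gilles
mentions below can be passed to mpirun via --mca. A sketch with
hypothetical values (the sizes are illustrative starting points, not
recommendations; defaults differ between Open MPI releases):

```shell
# Illustrative only -- tune these for your own fabric.
# btl_tcp_eager_limit / btl_tcp_rndv_eager_limit raise the eager->rendezvous
# switchover; sndbuf/rcvbuf and max_send_size size the socket buffers and
# per-fragment sends; btl_tcp_links opens multiple TCP streams per peer.
mpirun --mca btl tcp,self \
       --mca btl_tcp_eager_limit 524288 \
       --mca btl_tcp_rndv_eager_limit 524288 \
       --mca btl_tcp_sndbuf 8388608 \
       --mca btl_tcp_rcvbuf 8388608 \
       --mca btl_tcp_max_send_size 524288 \
       --mca btl_tcp_links 4 \
       -np 2 --map-by node ./osu_bw
```

`ompi_info --all` will list the current default for each of these
parameters on your installation.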


On Tue, Jul 19, 2016 at 10:50 AM, Gilles Gouaillardet <gil...@rist.or.jp>
wrote:

> Howard,
>
>
> Did you bump both btl_tcp_rndv_eager_limit and btl_tcp_eager_limit?
>
> You might also need to bump btl_tcp_sndbuf, btl_tcp_rcvbuf, and
> btl_tcp_max_send_size to get the maximum performance out of your 100 Gb
> Ethernet cards.
>
> Last but not least, you might also need to bump btl_tcp_links to saturate
> your network (that is likely a good thing when running 1 task per node, but
> it can lead to decreased performance when running several tasks per node).
>
> Cheers,
>
>
> Gilles
>
> On 7/19/2016 6:57 AM, Howard Pritchard wrote:
>
> Hi Folks,
>
> I have a cluster with some 100 Gb ethernet cards
> installed.  What we are noticing if we force Open MPI 1.10.3
> to go through the TCP BTL (rather than yalla)  is that
> the performance of osu_bw once the TCP BTL switches
> from eager to rendezvous (> 32KB)
> falls off a cliff, going from about 1.6 GB/sec to 233 MB/sec
> and stays that way out to 4 MB message lengths.
>
> There's nothing wrong with the IP stack (iperf -P4 gives
> 63 Gb/sec).
>
> So, my questions are
>
> 1) is this performance expected for the TCP BTL when in
> rendezvous mode?
> 2) is there some way to get closer to the single-socket
> performance obtained with iperf for large messages (~16 Gb/sec)?
>
> We tried adjusting the TCP BTL rendezvous threshold, but it doesn't
> appear to actually be adjustable from the mpirun command line.
>
> Thanks for any suggestions,
>
> Howard
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/07/19237.php
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/07/19240.php
>
