On Jun 5, 2014, at 9:16 PM, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:
> i work on a 4k+ nodes cluster with a very decent gigabit ethernet
> network (reasonable oversubscription + switches
> from a reputable vendor you are familiar with ;-) )
> my experience is that IPoIB can be very slow at establishing a
> connection, especially if the arp table is not populated
> (as far as i understand, this involves the subnet manager and
> performance can be very random especially if all nodes issue
> arp requests at the same time)
> on the other hand, performance is much more stable when using the
> subnetted IP network.

Got it.

>> As a simple solution, there could be an TCP oob MCA param that says
>> "regardless of peer IP address, I can connect to them" (i.e., assume IP
>> routing will make everything work out ok).
> +1 and/or an option to tell oob mca "do not discard the interface simply
> because the peer IP is not in the same subnet"

Looks like Ralph's simpler solution fit the bill.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
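
For readers following the thread, the behaviour under discussion can be illustrated with a minimal Python sketch. This is not Open MPI code and does not reflect how the oob component is actually implemented. The function names and the `assume_routable` flag are hypothetical, standing in for the proposed "regardless of peer IP address, I can connect to them" parameter. It only shows the difference between discarding interfaces that are not on the peer's subnet and trusting IP routing to sort it out.

```python
import ipaddress


def same_subnet(local_if, peer_ip):
    """Return True if peer_ip lies inside the network of local_if.

    local_if is an address with prefix length, e.g. "10.0.1.5/24".
    """
    iface = ipaddress.ip_interface(local_if)
    return ipaddress.ip_address(peer_ip) in iface.network


def usable_interfaces(local_ifs, peer_ip, assume_routable=False):
    """Pick the local interfaces considered usable to reach peer_ip.

    assume_routable is a hypothetical switch (an assumption for this
    sketch, not a real MCA parameter name): when set, no interface is
    discarded just because the peer is on a different subnet.
    """
    if assume_routable:
        return list(local_ifs)
    return [i for i in local_ifs if same_subnet(i, peer_ip)]


if __name__ == "__main__":
    local = ["10.0.1.5/24", "192.168.7.5/24"]
    # Peer on another subnet: strict matching leaves nothing to use.
    print(usable_interfaces(local, "10.0.2.9"))
    # With the override, every interface stays a candidate.
    print(usable_interfaces(local, "10.0.2.9", assume_routable=True))
```

With strict matching, a peer on a routed but different subnet ends up with no candidate interface, which is the situation the proposed option is meant to avoid.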