Jeff, as Ralph pointed out, I do want to use eth0 for oob messages.
I work on a 4k+ node cluster with a very decent gigabit Ethernet network (reasonable oversubscription + switches from a reputable vendor you are familiar with ;-) ).

My experience is that IPoIB can be very slow at establishing a connection, especially if the ARP table is not populated (as far as I understand, this involves the subnet manager, and performance can be very random, especially if all nodes issue ARP requests at the same time).
On the other hand, performance is much more stable when using the subnetted IP network.

As Ralph also pointed out, I can imagine some architects neglect their Ethernet network (e.g. highly oversubscribed + low-end switches), and in that case ib0 is the best fit for oob messages.

> As a simple solution, there could be an TCP oob MCA param that says
> "regardless of peer IP address, I can connect to them" (i.e., assume IP
> routing will make everything work out ok).

+1, and/or an option to tell the oob MCA "do not discard the interface simply because the peer IP is not in the same subnet".

Cheers,

Gilles

On 2014/06/05 23:01, Ralph Castain wrote:
> Because Gilles wants to avoid using IB for TCP messages, and using eth0 also
> solves the problem (the messages just route)
>
> On Jun 5, 2014, at 5:00 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
>
>> Another random thought for Gilles situation: why not oob-TCP-if-include ib0?
>> (And not eth0)
>>
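
To make the proposed option concrete, here is a rough Python sketch of the behaviour being discussed: interfaces are normally kept only if the peer IP is in the same subnet, and a flag turns that check off. This is not Open MPI code. The function names, the `assume_routable` flag, and the addresses are all made up for illustration; only the standard `ipaddress` module is real.

```python
import ipaddress


def same_subnet(local_cidr, peer_ip):
    """Return True if peer_ip lies inside the network of local_cidr.

    local_cidr is an interface address in CIDR form, e.g. "10.1.2.3/24".
    """
    iface = ipaddress.ip_interface(local_cidr)
    return ipaddress.ip_address(peer_ip) in iface.network


def usable_interfaces(local_ifs, peer_ip, assume_routable=False):
    """Return the names of local interfaces considered usable to reach peer_ip.

    local_ifs maps interface name -> CIDR address.
    assume_routable is a hypothetical stand-in for the proposed option:
    when set, no interface is discarded just because the peer is in
    another subnet (IP routing is assumed to work).
    """
    if assume_routable:
        return list(local_ifs)
    return [name for name, cidr in local_ifs.items()
            if same_subnet(cidr, peer_ip)]


# Made-up example: eth0 lives in 10.1.2.0/24, the peer in 10.1.7.0/24.
local = {"eth0": "10.1.2.3/24"}
print(usable_interfaces(local, "10.1.7.9"))                        # strict subnet check
print(usable_interfaces(local, "10.1.7.9", assume_routable=True))  # proposed override
```

With the strict check, eth0 is dropped even though the peer would be reachable through routing; with the override it is kept, which is the case I care about on our cluster.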