Jeff,

As Ralph pointed out, I do want to use eth0 for oob messages.

I work on a 4k+ node cluster with a very decent gigabit Ethernet
network (reasonable oversubscription + switches
from a reputable vendor you are familiar with ;-) ).
My experience is that IPoIB can be very slow at establishing a
connection, especially if the ARP table is not populated
(as far as I understand, this involves the subnet manager, and
performance can be very erratic, especially if all nodes issue
ARP requests at the same time).
On the other hand, performance is much more stable when using the
subnetted IP network.

As Ralph also pointed out, I can imagine some architects neglect their
Ethernet network (e.g. highly oversubscribed + low-end switches),
and in that case ib0 is the best fit for oob messages.

> As a simple solution, there could be an TCP oob MCA param that says 
> "regardless of peer IP address, I can connect to them" (i.e., assume IP 
> routing will make everything work out ok).
+1, and/or an option to tell the oob MCA "do not discard the interface simply
because the peer IP is not in the same subnet".
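
To make the two behaviours concrete, here is a rough sketch in Python (not
Open MPI code). The function names and the assume_routable flag are
hypothetical, made up for illustration only; the real oob/tcp logic may
differ. It only shows the difference between "drop any interface whose subnet
does not contain the peer" and "keep it and let IP routing sort it out".

```python
import ipaddress

def same_subnet(local_if, peer_ip):
    """True if peer_ip falls inside the subnet of local_if (e.g. '10.0.1.5/24')."""
    iface = ipaddress.ip_interface(local_if)
    return ipaddress.ip_address(peer_ip) in iface.network

def usable_interfaces(local_ifs, peer_ip, assume_routable=False):
    """Return the local interfaces considered usable to reach peer_ip.

    assume_routable is a hypothetical switch: when set, no interface is
    discarded just because the peer lives in another subnet.
    """
    if assume_routable:
        return list(local_ifs)
    return [i for i in local_ifs if same_subnet(i, peer_ip)]

if __name__ == "__main__":
    eth0 = ["192.168.1.10/24"]          # made-up address for illustration
    peer = "192.168.2.20"               # peer sits in a different subnet
    print(usable_interfaces(eth0, peer))
    print(usable_interfaces(eth0, peer, assume_routable=True))
```

With the strict check, the eth0 address above is discarded for a peer in
another subnet; with the hypothetical flag set, it is kept.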

Cheers,

Gilles

On 2014/06/05 23:01, Ralph Castain wrote:
> Because Gilles wants to avoid using IB for TCP messages, and using eth0 also 
> solves the problem (the messages just route)
>
> On Jun 5, 2014, at 5:00 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
>
>> Another random thought for Gilles situation: why not oob-TCP-if-include ib0? 
>>  (And not eth0)
>>
