On Fri, Feb 10, 2006 at 11:05:34AM -0500, Hal Rosenstock wrote:
> > Hi, Roland!
> > One issue we have with IPoIB is that IPoIB may cache a remote node path
> > for a long time. Remote LID may get changed e.g. if the SM is changed,
> > and IPoIB might lose connectivity.
I wonder if this is why when I reload the IB drivers on one node
I sometimes have to reload them on other nodes too. Otherwise
ping over IPoIB doesn't work.
If endnodes neither periodically refresh their caches nor subscribe to event management to learn that a refresh is in order, they will fall out of sync and need to be restarted to re-establish communication. This is a classic problem that showed up in various early router protocols, and it is why today's protocols often implement a two-prong approach: limited cache lifetime and proactive cache event updates.
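To make the two-prong idea concrete, here is a minimal sketch in Python. It is not the IPoIB driver code; `PathCache`, `resolve`, and `on_fabric_event` are hypothetical names, and `resolve` stands in for whatever path query the endnode would really issue.

```python
import time


class PathCache:
    """Toy path cache: entries age out after `ttl` seconds (prong 1)
    and can be flushed early when an event arrives (prong 2)."""

    def __init__(self, resolve, ttl=30.0, clock=time.monotonic):
        self._resolve = resolve   # hypothetical callback: destination -> LID
        self._ttl = ttl           # assumed lifetime; a real value is a policy choice
        self._clock = clock       # injectable so the behaviour can be tested
        self._entries = {}        # destination -> (lid, expiry time)

    def lookup(self, dest):
        now = self._clock()
        entry = self._entries.get(dest)
        if entry is not None and now < entry[1]:
            return entry[0]       # still fresh, use the cached LID
        lid = self._resolve(dest) # expired or missing: ask again
        self._entries[dest] = (lid, now + self._ttl)
        return lid

    def on_fabric_event(self, dest=None):
        """Drop one entry, or all of them if no destination is given."""
        if dest is None:
            self._entries.clear()
        else:
            self._entries.pop(dest, None)
```

With only the lifetime, a stale LID survives for up to `ttl` seconds; with only the event hook, a missed event leaves the entry stale forever. Using both bounds the damage of either failure.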
> The remote LID may get changed for other reasons too without an SM
> change (SM merge of 2 separate subnets). How can this be handled ?
Isn't this just another case of the SM changing for one of the subnets?
An SM merge that involves updating LIDs is a non-trivial event. It requires connections to be effectively restarted, since otherwise one cannot ascertain whether all packets have been flushed from the fabric, and stale packets can cause silent data corruption. For a subsystem such as IPoIB, a LID update should result in an unsolicited ARP / ND exchange, which will cause all remote endnodes to receive the new information.
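A rough Python model of that exchange, under loud assumptions: `Endnode`, `fabric`, and the method names are hypothetical, the shared `fabric` dict stands in for the SM/SA view of LIDs, and a direct method call stands in for the unsolicited ARP / ND message. It only shows one way a receiver could react, namely by dropping its cached path for the sender.

```python
class Endnode:
    def __init__(self, name, fabric):
        self.name = name
        self.fabric = fabric      # shared dict: node name -> current LID (stand-in for the SM/SA)
        self.path_cache = {}      # peer name -> cached LID
        self.peers = []           # endnodes that would hear our announcement

    def lid_for(self, peer):
        # Resolve once, then keep using the cached LID.
        if peer not in self.path_cache:
            self.path_cache[peer] = self.fabric[peer]
        return self.path_cache[peer]

    def local_lid_changed(self, new_lid):
        self.fabric[self.name] = new_lid
        self.path_cache.clear()   # our own cached paths may be stale as well
        for peer in self.peers:
            peer.on_unsolicited_announce(self.name)

    def on_unsolicited_announce(self, sender):
        # Forget the old path so the next send re-resolves it.
        self.path_cache.pop(sender, None)
```

If the LID changes without `local_lid_changed` being called, peers keep returning the old value from `lid_for`, which is the loss of connectivity described earlier in the thread.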
Mike
_______________________________________________ openib-general mailing list [email protected] http://openib.org/mailman/listinfo/openib-general
