 > With respect to ire_t's and ill_t's, how does (if it does) the IPMP
 > refactoring project change the way these can be used?

It relies extensively on the split between ire_ipif and ire_stq/ire_rfq.
As with the current bits, ire_ipif (and the ipif's associated ill) refers
to the IRE's source address, and ire_stq/ire_rfq refer to the queues
packets will be sent on and received from, respectively.  So, in the
common case of creating
an IRE_CACHE entry to a particular destination on a system using IPMP,
ire_ipif will refer to an ipif on the IPMP ill, and ire_stq will refer
to the send queue of an IP interface in the group, chosen at random.
If the IP
interface subsequently fails, those IRE_CACHE entries will be purged and a
new one will be selected the next time a packet is sent to that
destination address.
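
To make that split concrete, here is a pared-down sketch in C.  The
field names match the ones above, but the structures are illustrative
stand-ins stripped to just the members under discussion; the real ip.h
definitions carry far more state:

    /* Illustrative stand-ins only, not the actual ip.h structures. */
    typedef struct queue queue_t;    /* STREAMS queue stand-in */
    typedef struct ill   ill_t;      /* an IP interface (ipmp0, bge0) */
    typedef struct ipif  ipif_t;     /* a logical interface/address */

    struct queue { ill_t *q_ill; };
    struct ill   { const char *ill_name; };
    struct ipif  { ill_t *ipif_ill; };

    typedef struct ire {
            ipif_t  *ire_ipif; /* source address: ipif on the IPMP ill */
            queue_t *ire_stq;  /* send queue of an ill in the group */
            queue_t *ire_rfq;  /* receive queue, likewise */
    } ire_t;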

The IRE broadcast logic also makes use of this split.  In particular, the
existing IRE_MARK_NORECV logic (which prevents receiving duplicate
broadcast packets) is gone.  Instead, a single set of IRE_BROADCAST
entries is shared across the group (in the same way that such entries
are currently shared across an ill not using IPMP), and the
ire_rfq/ire_stq pointers reference the queues of the ill that's been
nominated to send and receive broadcast traffic.  Thus, when a
broadcast packet comes in, we simply
check whether the incoming queue matches the ire_rfq of the IRE_BROADCAST
entry; if not, the packet is dropped, eliminating the duplicates.
Further, if there are no functioning interfaces in the group,
ire_stq/ire_rfq will refer to the IPMP stub interface itself (see below),
which will simply discard the packets.
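
For reference, a minimal sketch of that receive-side test, reusing the
stand-in types from the sketch above (ip_broadcast_accept() is a
hypothetical name for illustration, not an actual ip.c function):

    /*
     * Accept an inbound broadcast only if it arrived on the queue the
     * shared IRE_BROADCAST entry is nominated to receive on; copies
     * arriving on any other ill in the group are dropped.
     */
    static int
    ip_broadcast_accept(const ire_t *ire, const queue_t *inq)
    {
            return (inq == ire->ire_rfq);
    }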

As an aside, combined with the broadcast cleanup work I putback to Nevada
yesterday, the IPMP-specific broadcast code in IP will be only about 75
lines (in comparison, it's around 1000 lines in Nevada today).

There are some other interesting cases (such as IRE_INTERFACE), but the
point is that the split between ire_ipif and ire_stq/ire_rfq is quite
fundamental to the new IPMP design -- and is also fundamental to other
technologies like VNI.

 > For example, now that the IPMP IP interface is a virtual interface,
 > presumably ill_wq will be NULL for an ill_t that describes an IPMP
 > interface?

No, it will refer to the stub DLPI driver, just like VNI.  (In fact, VNI
and IPMP will both share a common driver called "dlpistub".)  This stub
driver allows the normal IP DLPI sequences (such as DL_ATTACH_REQ) to be
handled without any special code in IP.  The stub driver advertises itself
with a special pseudo-DLPI type (SUNW_DL_IPMP for the IPMP stub device),
which causes IP to enable the IFF_IPMP flag via ip_ll_subnet_defaults(),
which is in turn used to enforce semantics that are IPMP-interface
specific.  The pseudo-DLPI type also enables some other things to be
easily handled -- e.g., IPv6 Interface IDs can be generated for the
IPMP interface via the existing ip_m_tbl[].
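
As a rough illustration of that flow (only the names SUNW_DL_IPMP,
IFF_IPMP, ip_ll_subnet_defaults(), and ip_m_tbl[] are real; the constant
values, structure layout, and lookup logic below are assumptions, not
the actual Nevada source):

    #define SUNW_DL_IPMP    0x8001          /* placeholder value */
    #define IFF_IPMP        0x8000          /* placeholder value */

    typedef struct ip_m {
            int ip_m_mac_type;   /* DLPI MAC type from DL_INFO_ACK */
            int ip_m_flags;      /* interface flags the type implies */
    } ip_m_t;

    static const ip_m_t ip_m_tbl[] = {
            { SUNW_DL_IPMP, IFF_IPMP },
            /* ... entries for ethernet, IB, and the other media ... */
    };

    /* Map the MAC type a driver advertises to its default flags. */
    static int
    ip_ll_subnet_defaults(int mac_type)
    {
            unsigned int i;

            for (i = 0; i < sizeof (ip_m_tbl) / sizeof (ip_m_tbl[0]); i++)
                    if (ip_m_tbl[i].ip_m_mac_type == mac_type)
                            return (ip_m_tbl[i].ip_m_flags);

            return (0);
    }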

 > Or is IPMP going to become more like loopback, with NULL pointers for
 > queues and specific function calls in certain places to take care of things?

No, we don't do that.  As noted above, the IPMP ill's queues point at
the dlpistub driver, so no loopback-style special-casing is needed.

 > Do you have any detailed notes on what the IPMP refactoring project means
 > for how ire_t's and ill_t's work together before and after?

The new source has quite a number of comments that explain these issues,
but I haven't written a low-level kernel design document on it yet (such a
document is planned, however).  A webrev of my ongoing IPMP changes is at:

   http://cr.opensolaris.org/~meem/clearview-ipmp/

Please note that the above webrev is based on a version of Nevada with no
kernel IPMP support.  The corresponding "IPMP-less" webrev is at:

   http://cr.opensolaris.org/~meem/clearview-noipmp/

Both of the above are linked from the project page at:

   http://opensolaris.org/os/project/clearview/ipmp/

Hope this helps,
-- 
meem