> With respect to ire_t's and ill_t's, how does (if it does) the IPMP
> refactoring project change the way these can be used?
It relies extensively on the split between ire_ipif and ire_stq/ire_rfq.
As with the current bits, ire_ipif (and the ipif's associated ill) refers
to the IRE's source address, and ire_stq/ire_rfq refer to the queues on
which packets will be sent and received. So, in the common case of
creating an IRE_CACHE entry to a particular destination on a system using
IPMP, ire_ipif will refer to an ipif on the IPMP ill, and ire_stq will
refer to an IP interface in the group that was chosen at random. If that
IP interface subsequently fails, those IRE_CACHE entries will be purged
and a new interface will be selected the next time a packet is sent to
that destination address. (The first sketch below models this selection.)

The IRE broadcast logic also makes use of this split. In particular, the
existing IRE_MARK_NORECV logic (which prevents receiving duplicate
broadcast packets) is gone. Instead, a single set of IRE_BROADCAST
entries is shared across the group (in the same way that they are
currently shared across an ill not using IPMP), and the ire_rfq/ire_stq
pointers reference the ill that's been nominated to send and receive
broadcast traffic. Thus, when a broadcast packet comes in, we simply
check whether the incoming queue matches the ire_rfq of the IRE_BROADCAST
entry; if not, the packet is dropped, eliminating the duplicates. (The
second sketch below models this check.) Further, if there are no
functioning interfaces in the group, ire_stq/ire_rfq will refer to the
IPMP stub interface itself (see below), which will simply discard the
packets. As an aside, combined with the broadcast cleanup work I putback
to Nevada yesterday, the IPMP-specific broadcast code in IP will be only
about 75 lines (in comparison, it's around 1000 lines in Nevada today).

There are some other interesting cases (such as IRE_INTERFACE), but the
point is that the split between ire_ipif and ire_stq/ire_rfq is quite
fundamental to the new IPMP design -- and is also fundamental to other
technologies like VNI.

> For example, now that the IPMP IP interface is a virtual interface,
> presumably ill_wq will be NULL for an ill_t that describes an IPMP
> interface?

No, it will refer to the stub DLPI driver, just like VNI. (In fact, VNI
and IPMP will both share a common driver called "dlpistub".) This stub
driver allows the normal IP DLPI sequences (such as DL_ATTACH_REQ) to be
handled without any special code in IP. The stub driver advertises itself
with a special pseudo-DLPI type (SUNW_DL_IPMP for the IPMP stub device),
which causes IP to enable the IFF_IPMP flag via ip_ll_subnet_defaults();
that flag is in turn used to enforce semantics that are specific to IPMP
interfaces. The pseudo-DLPI type also allows some other things to be
handled cleanly -- e.g., IPv6 Interface IDs can be generated for the IPMP
interface via the existing ip_m_tbl[]. (The third sketch below models
this.)
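To make the ire_ipif/ire_stq split concrete, here's a toy userland model
of how an IRE_CACHE entry might be populated on an IPMP group. The names
loosely mirror the real fields, but the types and selection logic are
simplified illustrations, not the actual Nevada source:

/*
 * Toy model of the ire_ipif / ire_stq split. Field and function names
 * are illustrative stand-ins for the real kernel structures.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct ill_s {
	const char	*ill_name;	/* e.g. "ipmp0", "bge0", "bge1" */
	int		ill_usable;	/* 1 if the interface is functioning */
} ill_t;

typedef struct ire_s {
	ill_t	*ire_ipif_ill;	/* source address side: the IPMP ill */
	ill_t	*ire_stq_ill;	/* output side: a group member */
} ire_t;

/* Pick a usable interface in the group at random for ire_stq. */
static ill_t *
group_pick_output(ill_t *group[], int ngroup, ill_t *ipmp_ill)
{
	int i, nusable = 0, pick;

	for (i = 0; i < ngroup; i++)
		if (group[i]->ill_usable)
			nusable++;

	if (nusable == 0)
		return (ipmp_ill);	/* no functioning interfaces: fall
					   back to the IPMP stub itself */
	pick = rand() % nusable;
	for (i = 0; i < ngroup; i++)
		if (group[i]->ill_usable && pick-- == 0)
			return (group[i]);
	return (ipmp_ill);		/* not reached */
}

int
main(void)
{
	ill_t ipmp0 = { "ipmp0", 1 };
	ill_t bge0 = { "bge0", 1 }, bge1 = { "bge1", 1 };
	ill_t *group[] = { &bge0, &bge1 };
	ire_t ire;

	srand((unsigned)time(NULL));

	/* Creating an IRE_CACHE entry on an IPMP group: */
	ire.ire_ipif_ill = &ipmp0;
	ire.ire_stq_ill = group_pick_output(group, 2, &ipmp0);

	printf("source from %s, output via %s\n",
	    ire.ire_ipif_ill->ill_name, ire.ire_stq_ill->ill_name);

	/*
	 * If the chosen interface later fails, its IRE_CACHE entries are
	 * purged and the next transmit repeats the selection over
	 * whatever usable interfaces remain.
	 */
	return (0);
}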
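Similarly, here's a toy model of the broadcast duplicate-elimination
check. The types are again stand-ins, but the comparison against ire_rfq
is the essence of the real test:

/*
 * Toy model of broadcast duplicate elimination: only the nominated
 * receive queue passes the packet up.
 */
#include <stdio.h>
#include <stdbool.h>

typedef struct queue_s { const char *q_name; } queue_t;

typedef struct ire_s {
	queue_t	*ire_rfq;	/* nominated receive queue for broadcast */
} ire_t;

/* Return true if an inbound broadcast on `rq' should be accepted. */
static bool
broadcast_accept(const ire_t *bcast_ire, const queue_t *rq)
{
	/*
	 * Copies of the same broadcast arriving on the other group
	 * members fail this check and are dropped, which eliminates
	 * the duplicates without any IRE_MARK_NORECV-style marking.
	 */
	return (rq == bcast_ire->ire_rfq);
}

int
main(void)
{
	queue_t bge0_rq = { "bge0" }, bge1_rq = { "bge1" };
	ire_t bcast = { .ire_rfq = &bge0_rq };	/* bge0 is nominated */

	printf("from bge0: %s\n",
	    broadcast_accept(&bcast, &bge0_rq) ? "accept" : "drop");
	printf("from bge1: %s\n",
	    broadcast_accept(&bcast, &bge1_rq) ? "accept" : "drop");
	return (0);
}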
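And here's a rough model of how the stub driver's pseudo-DLPI type
drives interface setup. The constants and table contents are made-up
placeholders standing in for ip_ll_subnet_defaults() and ip_m_tbl[], not
the actual values or code:

/*
 * Toy model: the mac type a driver advertises selects per-media
 * defaults, so a driver advertising SUNW_DL_IPMP gets IFF_IPMP set
 * without special-case code in IP.
 */
#include <stdio.h>

#define	DL_ETHER	0x4	/* real DLPI value */
#define	SUNW_DL_IPMP	0x1001	/* hypothetical value for illustration */

#define	IFF_BROADCAST	0x0002
#define	IFF_IPMP	0x8000	/* hypothetical flag value */

typedef struct ip_m_s {
	int	ip_m_mac_type;	/* DLPI mac type advertised by the driver */
	int	ip_m_flags;	/* interface flags implied by that type */
} ip_m_t;

/* Analogue of the per-media defaults table consulted at plumb time. */
static const ip_m_t ip_m_tbl[] = {
	{ DL_ETHER,	IFF_BROADCAST },
	{ SUNW_DL_IPMP,	IFF_IPMP },
};

static int
subnet_default_flags(int mac_type)
{
	size_t i;

	for (i = 0; i < sizeof (ip_m_tbl) / sizeof (ip_m_tbl[0]); i++)
		if (ip_m_tbl[i].ip_m_mac_type == mac_type)
			return (ip_m_tbl[i].ip_m_flags);
	return (0);
}

int
main(void)
{
	/* Plumbing the IPMP interface picks up IFF_IPMP from the table. */
	printf("ipmp0 flags: 0x%x\n", subnet_default_flags(SUNW_DL_IPMP));
	printf("bge0 flags:  0x%x\n", subnet_default_flags(DL_ETHER));
	return (0);
}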
> Or is IPMP going to become more like loopback, with NULL pointers for
> queues and specific function calls in certain places to take care of
> things?

No, we don't do that.

> Do you have any detailed notes on what the IPMP refactoring project
> means for how ire_t's and ill_t's work together before and after?

The new source has quite a number of comments that explain these issues,
but I haven't yet written a low-level kernel design document (such a
document is planned, however). A webrev of my ongoing IPMP changes is at:

        http://cr.opensolaris.org/~meem/clearview-ipmp/

Please note that the above webrev is based on a version of Nevada with no
kernel IPMP support. The corresponding "IPMP-less" webrev is at:

        http://cr.opensolaris.org/~meem/clearview-noipmp/

Both of the above are linked from the project page at:

        http://opensolaris.org/os/project/clearview/ipmp/

Hope this helps,
--
meem
