Re: [networking-discuss] mblk extensions

Erik Nordmark Fri, 24 Mar 2006 08:29:41 -0800


[Catching up on email]


Sunay Tripathi wrote:

The first step is to understand why M_CTL is bad. Mostly to do with
overload, performance and code cleanliness/readability. Code paths
which don't care about the content of the M_CTL have to deal with it
causing performance and cleanliness impact and then everyone using
M_CTL for their private stuff causes overload and loss of readability.

I don't have a problem using M_CTL across IPsec waiting for IKE to set
up an SA; when that part of IP needs to wait for an application, it
needs to save a bunch of state with the packet.

But we also use M_CTL essentially as "argument passing" from TCP and the
top of IP to pass the results of IPsec policy lookup. As I understand
it, the only reason we need to attach those arguments to the message is
the asynchrony introduced by ip_newroute in the middle of the stack.

For those unfamiliar with the Solaris TCP/IP stack, ip_newroute* is the
set of functions that

 - do a routing table lookup
 - do ARP/ND resolution if this is needed
 - when this is done, create an IRE_CACHE which contains the
   routing+ARP information for the destination address.

With Surya we have a direction (currently only applied to the IPv4
forwarding path) to do the ARP/ND resolution at the bottom of IP
(ip_xmit_v*) instead of in the middle. Once we've applied this approach
to the IPv* packet origination paths (ip_output), then I think we can
replace that broad use of M_CTL by arguments to the relevant functions.

One way to do this is to define an ip_xmit_attr_t structure containing
things like

 - the results from the IPsec policy lookup
 - the derived result from socket options that affect how ip_output
   works

By "derived" I mean that ip_wput_v6 today has lots of code to determine
on which interface (ill_t) to send a packet, since there is a slew of
ways this could be specified:

 * The order to determine the outgoing interface is as follows:
 * 1. IPV6_BOUND_PIF is set, use that ill (conn_outgoing_pill)
 * 2. If conn_nofailover_ill is set then use that ill.
 * 3. If an ip6i_t with IP6I_IFINDEX set then use that ill.
 * 4. If q is an ill queue and (link local or multicast destination) then
 *    use that ill.
 * 5. If IPV6_BOUND_IF has been set use that ill.
 * 6. For multicast: if IPV6_MULTICAST_IF has been set use it. Otherwise
 *    look for the best IRE match for the unspecified group to determine
 *    the ill.
 * 7. For unicast: Just do an IRE lookup for the best match.

With an ip_xmit_attr_t we can have a single ill_t (or ifindex), which
the upper layer would set based on the combination of socket options and
ancillary data. This has the benefit that the information can be cached
in the upper layer (for TCP as well as UDP) since this rarely changes.
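
To make the idea concrete, here is a rough sketch of what such a
structure and the "derive once in the upper layer" step might look like.
Everything in it is hypothetical: the type names, the field names, and
the convention that an ifindex of 0 means "not set" are my own
placeholders, not existing Solaris definitions. It only mirrors the
selection order quoted above, and it leaves the IRE lookups (the
fallback in step 6 and all of step 7) to the lower layer by returning 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch only; none of these are real Solaris definitions. */

/* Opaque placeholder for whatever the IPsec policy lookup returns. */
struct ipsec_policy_sketch;

/*
 * Inputs the upper layer gathers from socket options and ancillary data.
 * Assumption: an ifindex of 0 means "not set".
 */
typedef struct xmit_inputs_sketch {
	uint32_t xi_bound_pif;		/* IPV6_BOUND_PIF */
	uint32_t xi_nofailover;		/* conn_nofailover_ill */
	uint32_t xi_ip6i_ifindex;	/* ip6i_t with IP6I_IFINDEX */
	uint32_t xi_queue_ifindex;	/* nonzero when q is an ill queue */
	uint32_t xi_bound_if;		/* IPV6_BOUND_IF */
	uint32_t xi_multicast_if;	/* IPV6_MULTICAST_IF */
	bool xi_dst_linklocal;		/* destination is link local */
	bool xi_dst_multicast;		/* destination is multicast */
} xmit_inputs_sketch_t;

/* The arguments handed down the transmit path instead of an M_CTL. */
typedef struct xmit_attr_sketch {
	const struct ipsec_policy_sketch *xa_policy;	/* policy lookup result */
	uint32_t xa_ifindex;	/* derived outgoing interface, 0 = IRE lookup */
} xmit_attr_sketch_t;

/*
 * Apply the selection order once, in the upper layer. Returns 0 when the
 * choice has to come from an IRE lookup further down.
 */
static uint32_t
derive_ifindex(const xmit_inputs_sketch_t *in)
{
	if (in->xi_bound_pif != 0)			/* step 1 */
		return (in->xi_bound_pif);
	if (in->xi_nofailover != 0)			/* step 2 */
		return (in->xi_nofailover);
	if (in->xi_ip6i_ifindex != 0)			/* step 3 */
		return (in->xi_ip6i_ifindex);
	if (in->xi_queue_ifindex != 0 &&		/* step 4 */
	    (in->xi_dst_linklocal || in->xi_dst_multicast))
		return (in->xi_queue_ifindex);
	if (in->xi_bound_if != 0)			/* step 5 */
		return (in->xi_bound_if);
	if (in->xi_dst_multicast && in->xi_multicast_if != 0)	/* step 6 */
		return (in->xi_multicast_if);
	return (0);					/* steps 6/7: IRE lookup */
}

/* Fill the attributes; the result can be cached until an option changes. */
static void
xmit_attr_fill(xmit_attr_sketch_t *xa, const xmit_inputs_sketch_t *in,
    const struct ipsec_policy_sketch *policy)
{
	xa->xa_policy = policy;
	xa->xa_ifindex = derive_ifindex(in);
}
```

The point of the sketch is only that the per-packet path sees one
already-derived ifindex, and the upper layer redoes the derivation when
a socket option or ancillary data item changes rather than per packet.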

This would localize any use of M_CTL (or any other mechanism to stash
information with the packet) to the small parts of the code such as
where IPsec needs to wait for iked to set up a security association.


   Erik

_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org
