[Catching up on email]

Sunay Tripathi wrote:

The first step is to understand why M_CTL is bad. Mostly to do with
overload, performance and code cleanliness/readability. Code paths
which don't care about the content of the M_CTL have to deal with it
causing performance and cleanliness impact and then everyone using
M_CTL for their private stuff causes overload and loss of readability.

I don't have a problem with using M_CTL where IPsec waits for IKE to set up an SA; when that part of IP needs to wait for an application, it needs to save a bunch of state with the packet.

But we also use M_CTL essentially as "argument passing" from TCP and the top of IP to pass the results of the IPsec policy lookup. As I understand it, the only reason we need to attach those arguments to the message is the asynchrony introduced by ip_newroute in the middle of the stack.

For those unfamiliar with the Solaris TCP/IP stack, ip_newroute* is the set of functions that
 - do a routing table lookup
 - do ARP/ND resolution if this is needed
 - when this is done, create an IRE_CACHE which contains the
   routing+ARP information for the destination address.

With Surya we have a direction (currently only applied to the IPv4 forwarding path) to do the ARP/ND resolution at the bottom of IP (ip_xmit_v*) instead of in the middle. Once we've applied this approach to the IPv* packet origination paths (ip_output), then I think we can replace that broad use of M_CTL by arguments to the relevant functions.

One way to do this is to define an ip_xmit_attr_t structure containing things like
 - the results from the IPsec policy lookup
 - the derived result from socket options that affect how ip_output
   works
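
As a rough sketch of what such a structure and its use as a plain function argument might look like (the field names, the ipsec_action_t enum and ip_output_sketch() are made up for illustration; only the two kinds of content come from the list above):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the result of an IPsec policy lookup. */
typedef enum {
	POLICY_BYPASS,
	POLICY_PROTECT,
	POLICY_DROP
} ipsec_action_t;

/* Hypothetical layout; all field names are assumptions. */
typedef struct ip_xmit_attr {
	ipsec_action_t	ixa_ipsec_action;	/* IPsec policy lookup result */
	uint32_t	ixa_ifindex;		/* derived outgoing interface, 0 = unset */
	uint8_t		ixa_hoplimit;		/* derived from socket options */
} ip_xmit_attr_t;

/*
 * Stand-in for an output routine that receives the attributes as an
 * argument instead of finding them in an M_CTL prepended to the packet.
 * Returns 0 if the packet would be sent, -1 if it would be dropped.
 */
static int
ip_output_sketch(const void *pkt, size_t len, const ip_xmit_attr_t *ixa)
{
	(void) pkt;	/* packet contents are not inspected in this sketch */
	if (len == 0 || ixa->ixa_ipsec_action == POLICY_DROP)
		return (-1);
	return (0);
}
```

The caller owns the structure and can reuse it across packets, which is what makes caching in the upper layer possible.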

By "derived" I mean that ip_wput_v6 today has lots of code to determine on which interface (ill_t) to send a packet, since there is a slew of ways this could be specified:
 * The order to determine the outgoing interface is as follows:
 * 1. IPV6_BOUND_PIF is set, use that ill (conn_outgoing_pill)
 * 2. If conn_nofailover_ill is set then use that ill.
 * 3. If an ip6i_t with IP6I_IFINDEX set then use that ill.
 * 4. If q is an ill queue and (link local or multicast destination) then
 *    use that ill.
 * 5. If IPV6_BOUND_IF has been set use that ill.
 * 6. For multicast: if IPV6_MULTICAST_IF has been set use it. Otherwise
 *    look for the best IRE match for the unspecified group to determine
 *    the ill.
 * 7. For unicast: Just do an IRE lookup for the best match.


With an ip_xmit_attr_t we can have a single ill_t (or ifindex), which the upper layer would set based on the combination of socket options and ancillary data. This has the benefit that the information can be cached in the upper layer (for TCP as well as UDP) since this rarely changes.
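
To illustrate, the upper layer could collapse the precedence order quoted above into one ifindex roughly like this. This is only a sketch: out_if_inputs_t, its fields and derive_out_ifindex() are hypothetical, 0 is assumed to mean "not set", and the IRE lookups in steps 6 and 7 are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the inputs; 0 means "not set". */
typedef struct out_if_inputs {
	uint32_t	bound_pif;	/* 1. IPV6_BOUND_PIF */
	uint32_t	nofailover_if;	/* 2. conn_nofailover_ill */
	uint32_t	ip6i_ifindex;	/* 3. ip6i_t with IP6I_IFINDEX */
	uint32_t	queue_if;	/* 4. ill owning q, if q is an ill queue */
	uint32_t	bound_if;	/* 5. IPV6_BOUND_IF */
	uint32_t	multicast_if;	/* 6. IPV6_MULTICAST_IF */
	bool		dst_linklocal;	/* destination is link local */
	bool		dst_multicast;	/* destination is multicast */
} out_if_inputs_t;

/*
 * Returns the ifindex chosen by steps 1-6, or 0 when no explicit
 * interface applies and an IRE lookup has to decide (rest of 6, and 7).
 */
static uint32_t
derive_out_ifindex(const out_if_inputs_t *in)
{
	if (in->bound_pif != 0)
		return (in->bound_pif);
	if (in->nofailover_if != 0)
		return (in->nofailover_if);
	if (in->ip6i_ifindex != 0)
		return (in->ip6i_ifindex);
	if (in->queue_if != 0 && (in->dst_linklocal || in->dst_multicast))
		return (in->queue_if);
	if (in->bound_if != 0)
		return (in->bound_if);
	if (in->dst_multicast && in->multicast_if != 0)
		return (in->multicast_if);
	return (0);
}
```

The result only needs to be recomputed when a socket option or the ancillary data changes, so it can sit in the cached attributes between packets.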

This would localize any use of M_CTL (or any other mechanism to stash information with the packet) to small parts of the code, such as where IPsec needs to wait for iked to set up a security association.

   Erik

_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org
