[Catching up on email]
Sunay Tripathi wrote:
> The first step is to understand why M_CTL is bad. Mostly to do with
> overload, performance and code cleanliness/readability. Code paths
> which don't care about the content of the M_CTL have to deal with it
> causing performance and cleanliness impact and then everyone using
> M_CTL for their private stuff causes overload and loss of readability.
I don't have a problem using M_CTL across IPsec waiting for IKE to set up
an SA; when that part of IP needs to wait for an application, it needs to
save a bunch of state with the packet.
But we also use M_CTL essentially as "argument passing" from TCP and the
top of IP to pass the results of IPsec policy lookup.
As I understand it, the only reason we need to attach those arguments to
the message is because of the asynchrony introduced by ip_newroute in
the middle of the stack.
For those unfamiliar with the Solaris TCP/IP stack, ip_newroute* is the
set of functions that
- do a routing table lookup
- do ARP/ND resolution if this is needed
- when this is done, create an IRE_CACHE which contains the
  routing+ARP information for the destination address.
With Surya we have a direction (currently only applied to the IPv4
forwarding path) to do the ARP/ND resolution at the bottom of IP
(ip_xmit_v*) instead of in the middle. Once we've applied this approach
to the IPv* packet origination paths (ip_output), then I think we can
replace that broad use of M_CTL by arguments to the relevant functions.
One way to do this is to define an ip_xmit_attr_t structure containing
things like
- the results from the IPsec policy lookup
- the derived result from socket options that affect how ip_output
  works
By "derived" I mean that ip_wput_v6 today has lots of code to determine
on which interface (ill_t) to send a packet, since there is a slew of
ways this could be specified:
 * The order to determine the outgoing interface is as follows:
 * 1. IPV6_BOUND_PIF is set, use that ill (conn_outgoing_pill)
 * 2. If conn_nofailover_ill is set then use that ill.
 * 3. If an ip6i_t with IP6I_IFINDEX set then use that ill.
 * 4. If q is an ill queue and (link local or multicast destination) then
 *    use that ill.
 * 5. If IPV6_BOUND_IF has been set use that ill.
 * 6. For multicast: if IPV6_MULTICAST_IF has been set use it. Otherwise
 *    look for the best IRE match for the unspecified group to determine
 *    the ill.
 * 7. For unicast: Just do an IRE lookup for the best match.
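To make the precedence in that comment concrete, here is a rough sketch
of it as a single function. This is not the ip_wput_v6 code; ill_t is
reduced to a stub, and out_sel_t, its fields, and select_out_ill() are
names I made up for illustration. A NULL return stands for the cases
(the "otherwise" half of 6, and 7) where the ill comes from an IRE
lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; the real ill_t carries far more state. */
typedef struct ill { int ifindex; } ill_t;

/* Hypothetical bundle of the inputs the comment enumerates. */
typedef struct out_sel {
    ill_t *bound_pif;       /* 1. IPV6_BOUND_PIF (conn_outgoing_pill) */
    ill_t *nofailover_ill;  /* 2. conn_nofailover_ill */
    ill_t *ip6i_ill;        /* 3. ill named by an ip6i_t with IP6I_IFINDEX */
    ill_t *queue_ill;       /* 4. non-NULL if q is an ill queue */
    ill_t *bound_if;        /* 5. IPV6_BOUND_IF */
    ill_t *multicast_if;    /* 6. IPV6_MULTICAST_IF */
    bool dst_linklocal;     /* destination is link-local */
    bool dst_multicast;     /* destination is multicast */
} out_sel_t;

/*
 * Walk the sources in the documented order and return the first match.
 * NULL means no option pinned the interface, so the caller would fall
 * back to an IRE lookup (not modelled here).
 */
static ill_t *
select_out_ill(const out_sel_t *s)
{
    if (s->bound_pif != NULL)
        return (s->bound_pif);
    if (s->nofailover_ill != NULL)
        return (s->nofailover_ill);
    if (s->ip6i_ill != NULL)
        return (s->ip6i_ill);
    if (s->queue_ill != NULL && (s->dst_linklocal || s->dst_multicast))
        return (s->queue_ill);
    if (s->bound_if != NULL)
        return (s->bound_if);
    if (s->dst_multicast && s->multicast_if != NULL)
        return (s->multicast_if);
    return (NULL);
}
```

The point is that this whole decision depends only on per-connection
and per-packet options, so its result is something the upper layer
could compute once and hand down.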
With an ip_xmit_attr_t we can have a single ill_t (or ifindex), which
the upper layer would set based on the combination of socket options and
ancillary data. This has the benefit that the information can be cached
in the upper layer (for TCP as well as UDP) since this rarely changes.
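As a rough illustration of what I have in mind (a sketch only; apart
from the ip_xmit_attr_t name, every field and function below is made up
for this example and none of it is existing code), the upper layer
could cache the attributes per connection, invalidate them when a
socket option changes, and pass them as a plain argument:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of an IPsec policy lookup. */
typedef struct ipsec_policy_result {
    bool needs_protection;
} ipsec_policy_result_t;

/* Sketch of the proposed structure; the field names are assumptions. */
typedef struct ip_xmit_attr {
    unsigned int out_ifindex;       /* derived interface, 0 = unspecified */
    ipsec_policy_result_t ipsec;    /* result of the policy lookup */
} ip_xmit_attr_t;

/* Hypothetical per-connection state kept by TCP or UDP. */
typedef struct conn_sketch {
    unsigned int bound_ifindex;     /* set through a socket option */
    bool secure;                    /* stand-in for the policy inputs */
    ip_xmit_attr_t cached;          /* cached transmit attributes */
    bool cache_valid;
    unsigned int recompute_count;   /* only here to show the caching */
} conn_sketch_t;

/* A socket option change invalidates the cached attributes. */
static void
conn_set_bound_if(conn_sketch_t *c, unsigned int ifindex)
{
    c->bound_ifindex = ifindex;
    c->cache_valid = false;
}

/* Recompute the attributes only when something has changed. */
static const ip_xmit_attr_t *
conn_get_xmit_attr(conn_sketch_t *c)
{
    if (!c->cache_valid) {
        c->cached.out_ifindex = c->bound_ifindex;
        c->cached.ipsec.needs_protection = c->secure;
        c->cache_valid = true;
        c->recompute_count++;
    }
    return (&c->cached);
}

/*
 * Stand-in for ip_output: the attributes arrive as an argument rather
 * than in an M_CTL attached to the packet. Returns the interface index
 * the packet would be sent on.
 */
static unsigned int
ip_output_sketch(const void *pkt, const ip_xmit_attr_t *xa)
{
    (void) pkt;
    return (xa->out_ifindex);
}
```

Sending two packets on the same connection recomputes the attributes
once; only a later conn_set_bound_if() call forces a second
computation.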
This would localize any use of M_CTL (or any other mechanism to stash
information with the packet) to the small parts of the code such as
where IPsec needs to wait for iked to set up a security association.
Erik