[ There's a performance issue buried in here and I'd very much value
  input on this matter from the performance heads.  Also, sorry for
  the length of this email. ]

Folks,

As part of the Clearview IPMP work, I'm looking at converting the DHCP
client to strictly use sockets (rather than use DLPI as it does today).
By and large, this conversion has gone well -- minimal changes have been
required to the stack[1], and the DHCP client has shed about 600 lines of
code -- not to mention that it no longer needs its own private UDP and IP
checksum algorithms.

However, there is one nasty issue I've run into: as per RFC2131, by
default, the DHCP server *IP unicasts* its DHCPOFFER and DHCPACK packets
to the client at the IP address that it's *offering* to the client.
Since the client's stack has not yet been configured with that address, it
(rightfully) discards these packets and the lease negotiation fails.
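
For reference, here's my reading of the rule in RFC 2131 section 4.1 for
where a server sends its DHCPOFFER and DHCPACK, as a small C sketch.  The
enum and function names are made up for illustration and don't come from
any real server; all values are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* BROADCAST is the leftmost bit of the 16-bit 'flags' field (RFC 2131). */
#define DHCP_FLAG_BROADCAST 0x8000u

/* Hypothetical names for the four possible reply destinations. */
typedef enum {
	REPLY_TO_RELAY,		/* send to the relay agent in 'giaddr' */
	REPLY_UNICAST_CIADDR,	/* client already owns 'ciaddr' */
	REPLY_BROADCAST,	/* send to 255.255.255.255 */
	REPLY_UNICAST_YIADDR	/* unicast to the address being offered */
} reply_dest_t;

/* Pick the destination for a DHCPOFFER/DHCPACK from the request fields. */
static reply_dest_t
dhcp_reply_dest(uint32_t giaddr, uint32_t ciaddr, uint16_t flags)
{
	if (giaddr != 0)
		return (REPLY_TO_RELAY);
	if (ciaddr != 0)
		return (REPLY_UNICAST_CIADDR);
	if (flags & DHCP_FLAG_BROADCAST)
		return (REPLY_BROADCAST);
	return (REPLY_UNICAST_YIADDR);
}

/* Usage: a fresh client (no relay, no address) with and without the bit. */
static void
reply_dest_examples(void)
{
	assert(dhcp_reply_dest(0, 0, 0) == REPLY_UNICAST_YIADDR);
	assert(dhcp_reply_dest(0, 0, DHCP_FLAG_BROADCAST) == REPLY_BROADCAST);
}
```

The last case in dhcp_reply_dest() is the one that bites us: the reply is
addressed to 'yiaddr', which the client's stack doesn't own yet.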

Jim Carlson and I spent a while discussing possible ways to handle this,
none of which I'm in love with:

  1. Have the DHCP client use the "BROADCAST bit" (as per RFC2131) to tell
     the DHCP server to broadcast the DHCPOFFER and DHCPACK.  This is
     certainly the most architecturally pleasing option, and works fine
     in the testing I've done, but the RFC allows an alarming amount of
     wiggle-room in the use of this bit.  For instance, it states
     (emphasis mine):

        The BROADCAST bit will provide a _hint_ to the DHCP server and
        BOOTP relay agent to broadcast any messages to the client on the
        client's subnet.
        
     ... and indeed, articles such as:

        http://blogs.technet.com/teamdhcp/archive/2006/11/08/use-of-broadcast-b-flag-in-dhcp.aspx

     ... make it clear that there are products on the market that do not
     respect the BROADCAST bit, including Cisco products.  (Though I take
     some comfort in the fact that it's enabled by default in Vista.)

  2. Leave the old DLPI-based code in the DHCP client, and only use the
     sockets-based approach when obtaining leases on IPMP IP interfaces
     (which can't use DLPI).  This would limit the impact of the use of
     the BROADCAST bit to IPMP use cases, but means that we have two
     distinct codepaths to maintain and test in the DHCP client, rather
     than simplifying the code -- and all to handle a corner case.

  3. (Please sit down for this one.)  Have ip_input() itself recognize
     that it's received one of the special IP unicast DHCP packets, and
     have it rewrite the packet to have the IP broadcast address in the
     slow path.  To do this in a manner that doesn't massively interfere
     with forwarding performance (since the packet isn't for us, we need
     to decide whether to forward it or rewrite it to be for us), it
     seems like we'd need an SIOCSLIFDHCPINIT ioctl that the client would
     use to enable this behavior when necessary on an interface[2] (which
     would e.g. set ill->ill_dhcp_init), and then we'd have to add one
     additional check to the fastpath stanza in ip_input() -- e.g.:
     
                if (!is_system_labeled() &&
                    !ipst->ips_ip_cgtp_filter && ipp_action_count == 0 &&
                    opt_len == 0 && ipha->ipha_protocol != IPPROTO_RSVP &&
                    !ll_multicast && !CLASSD(dst) && !ill->ill_dhcp_init) {
                                                     ^^^^^^^^^^^^^^^^^^^

     Unfortunately, even this single check will impact general networking
     performance, especially for small packets, which I know is something
     we're already working hard to improve -- so this seems bad too.
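
To make option 1 concrete: the client-side change is tiny.  Below is a
minimal sketch, assuming the outgoing packet is a raw buffer laid out per
the RFC 2131 message format (the 16-bit 'flags' field sits at byte offset
10, in network byte order, and BROADCAST is its leftmost bit).  The helper
names and constants are hypothetical, not the actual client code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* op, htype, hlen, hops (4 bytes) + xid (4) + secs (2) precede 'flags'. */
#define BOOTP_FLAGS_OFFSET	10
/* Fixed-format part of the message, before the options field. */
#define BOOTP_FIXED_LEN		236
/* BROADCAST is the leftmost bit of the 16-bit 'flags' field. */
#define DHCP_FLAG_BROADCAST	0x8000u

/* Ask the server (and any relay) to broadcast its replies to us. */
static void
dhcp_set_broadcast(uint8_t *pkt, size_t len)
{
	assert(pkt != NULL && len >= BOOTP_FIXED_LEN);
	/* Network byte order: the high-order byte of 'flags' comes first. */
	pkt[BOOTP_FLAGS_OFFSET] |= (uint8_t)(DHCP_FLAG_BROADCAST >> 8);
}

/* Return nonzero if the BROADCAST bit is set in the packet. */
static int
dhcp_broadcast_is_set(const uint8_t *pkt, size_t len)
{
	uint16_t flags;

	assert(pkt != NULL && len >= BOOTP_FIXED_LEN);
	flags = (uint16_t)((pkt[BOOTP_FLAGS_OFFSET] << 8) |
	    pkt[BOOTP_FLAGS_OFFSET + 1]);
	return ((flags & DHCP_FLAG_BROADCAST) != 0);
}
```

Of course, this only helps if the server and any relay in the path
actually honor the bit, which is the whole worry with option 1.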

So it seems "interoperability, performance, maintainability" -- you can
only pick two :-(

Thoughts?  Ideas?  (And no, adding DLPI access for IPMP is an
architectural syntax error and thus not an option.)

Thanks!

[1] The only notable modifications to IP were to complete the IP_BOUND_IF
    codepath for unicast and have it add IRE_BROADCAST entries for 0.0.0.0
    and 255.255.255.255 when SIOCSLIFFLAGS brings 0.0.0.0 IFF_UP.

[2] DHCP can be configured on logical interfaces too, so it's insufficient
    to check whether ill->ill_ipif->ipif_lcl_addr is INADDR_ANY.
     
-- 
meem
