On 01/28/2015 09:50 AM, Kevin Benton wrote:
Hi,

Approximately a year and a half ago, the default DHCP lease time in Neutron was increased from 120 seconds to 86400 seconds.[1] This was done to reduce DHCP traffic, with very little discussion (based on what I can see in the review and bug report). While it does indeed reduce DHCP traffic, I don't think any bug reports were filed showing that a 120 second lease time resulted in too much traffic, or that a jump all the way to 86400 seconds was required instead of a value in the same order of magnitude.

I guess that would be a good case for the FORCERENEW DHCP extension [1], though after digging through the dnsmasq code a bit, I doubt it supports the extension (though e.g. the systemd DHCP client/server from the networkd module do). Le sigh.

[1]: https://tools.ietf.org/html/rfc3203


Why does this matter?

Neutron ports can be updated with a new IP address from the same subnet or another subnet on the same network. The port update will result in anti-spoofing iptables rule changes that immediately stop the old IP address from working on the host. This means the host can be unreachable for anywhere from 0 to 12 hours with the current default lease time unless someone intervenes manually[2] (assuming clients attempt renewal at half the lease length).
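
To make the 0-12 hour figure concrete, here is a rough sketch of the arithmetic. It assumes the client does nothing until its half-lease renewal and that the port update can land anywhere in that renewal cycle. The function name is made up for illustration and is not part of Neutron.

```python
def outage_window_seconds(lease_seconds):
    """Return (best, worst) outage in seconds after a port IP change.

    Assumption: the client only contacts the DHCP server at the
    half-lease mark and keeps using the old address until then.
    """
    half_lease = lease_seconds / 2
    # Best case: the update lands just before a renewal (no real outage).
    # Worst case: the update lands just after one (a full half lease).
    return 0, half_lease


# 120 s was the old default, 86400 s is the current one, 480 s is the proposal.
for lease in (120, 480, 86400):
    best, worst = outage_window_seconds(lease)
    print(f"lease={lease}s worst-case outage={worst / 60:g} min")
```

With these assumptions the 86400 second default gives a worst case of 720 minutes (12 hours), while a 480 second lease gives 4 minutes.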

Why is this on the mailing list?

In an attempt to make the VMs usable in a much shorter timeframe following a Neutron port address change, I submitted a patch to reduce the default DHCP lease time to 8 minutes.[3] However, this was upsetting to several people,[4] so it was suggested I bring this discussion to the mailing list. The following are the high-level concerns followed by my responses:

  * 8 minutes is arbitrary
      o Yes, but it's no more arbitrary than 1440 minutes. I picked it
        as an interval because it is still 4 times as long as the
        previous 2-minute default, but it still allows VMs to regain
        connectivity in <5 minutes in the event their IP is changed.
        If someone has a good suggestion for another interval based on
        known dnsmasq QPS limits or some other quantitative reason,
        please chime in here.
  * other datacenters use long lease times
      o This is true, but it's not really a valid comparison. In most
        regular datacenters, updating a static DHCP lease has no
        effect on the data plane so it doesn't matter that the client
        doesn't react for hours/days (even with DHCP snooping
        enabled). However, in Neutron's case, the security groups are
        immediately updated so all traffic using the old address is
        blocked.
  * DHCP traffic is scary because it's broadcast
      o ARP traffic is also broadcast and many clients will expire
        entries every 5-10 minutes and re-ARP. L2population may be
        used to prevent ARP propagation, so the comparison between
        DHCP and ARP isn't always relevant here.
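
For anyone who wants to put a rough number on the DHCP load side of this, a back-of-the-envelope sketch follows. It assumes every port renews once per half lease and that renewals are spread evenly over time. The port count is invented for illustration, and this says nothing about what dnsmasq can actually sustain.

```python
def renewals_per_second(num_ports, lease_seconds):
    """Estimate steady-state DHCP renewals per second on one network.

    Assumption: each port renews once every half lease, and renewals
    are evenly distributed rather than arriving in bursts.
    """
    return num_ports / (lease_seconds / 2)


# Hypothetical network with 1000 ports, at the three lease times discussed.
for lease in (120, 480, 86400):
    rate = renewals_per_second(1000, lease)
    print(f"lease={lease}s -> {rate:.3f} renewals/s")
```

Comparing the printed rates against a measured dnsmasq limit would be one quantitative way to pick an interval.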


Please reply back with your opinions/anecdotes/data related to short DHCP lease times.

Cheers

1. https://github.com/openstack/neutron/commit/d9832282cf656b162c51afdefb830dacab72defe
2. Manual intervention could be an instance reboot, a dhcp client invocation via the console, or a delayed invocation right before the update. (all significantly more difficult to script than a simple update of a port's IP via the API).
3. https://review.openstack.org/#/c/150595/
4. http://i.imgur.com/xtvatkP.jpg

--
Kevin Benton


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
