[Openstack] Dhcp lease errors in vlan mode

2012-05-14 Thread Vishvananda Ishaya
TL;DR

To fix issues with failed dhcp leases in vlan mode, upgrade to dnsmasq 2.61[1]

THE LONG VERSION

There is an issue with the way nova uses dnsmasq in VLAN mode. It starts up a 
single copy of dnsmasq for each vlan on the network host (or on every host in 
multi_host mode). The problem is in the way that dnsmasq binds to an ip address 
and port[2]. All copies can respond to broadcast packets, but unicast packets 
can only be answered by one of the copies.
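
As a rough illustration of that behaviour (not nova or dnsmasq code), the Python sketch below binds two UDP sockets to the same address and port on Linux and sends one unicast datagram to it. The loopback address, the SO_REUSEADDR setup and the function names are assumptions made for the demo; the point is only that a unicast datagram is delivered to a single socket.

```python
import select
import socket


def open_shared_udp_socket(port):
    """Bind a UDP socket on loopback that tolerates other sockets on the same port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # On Linux, setting SO_REUSEADDR on every socket lets several UDP
    # sockets bind the same address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    return sock


def count_unicast_receivers(copies=2, timeout=1.0):
    """Send one unicast datagram and count how many listeners can read it."""
    first = open_shared_udp_socket(0)  # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    listeners = [first]
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for _ in range(copies - 1):
            listeners.append(open_shared_udp_socket(port))
        sender.sendto(b"renew", ("127.0.0.1", port))
        # Wait until at least one listener has data, then count them.
        readable, _, _ = select.select(listeners, [], [], timeout)
        return len(readable)
    finally:
        sender.close()
        for sock in listeners:
            sock.close()
```

On a Linux host, count_unicast_receivers() should report that only one of the listeners saw the datagram, which mirrors one dnsmasq copy answering a renew while the others never see it.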

In nova this means that guests from only one project will get responses to 
their unicast dhcp renew requests. Unicast requests from guests in other 
projects get ignored. What happens next depends on the guest os. Linux 
generally sends a broadcast packet after the unicast fails, so the only 
effect is a small (tens of ms) hiccup while the interface is reconfigured. 
It can be much worse than that, however. I have seen cases where Windows 
just gives up and ends up with a non-configured interface.

This bug was first noticed by some users of openstack who rolled their own fix. 
Basically, on linux, if you set the SO_BINDTODEVICE socket option, it will 
allow different daemons to share the port and respond to unicast packets, as 
long as they listen on different interfaces. I managed to communicate with 
Simon Kelley, the maintainer of dnsmasq, and he has integrated a fix[3] for 
the issue in the current version[1] of dnsmasq.
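
For anyone curious what that looks like at the socket level, here is a minimal Python sketch. It is not the actual dnsmasq patch[3], which is written in C. The helper names and the extra SO_REUSEADDR/SO_BROADCAST options are assumptions for illustration, and setting SO_BINDTODEVICE and binding to port 67 normally need root.

```python
import socket

DHCP_SERVER_PORT = 67  # well-known DHCP server port


def device_option_value(interface):
    """Return the NUL-terminated interface name that SO_BINDTODEVICE expects."""
    return interface.encode("ascii") + b"\0"


def open_dhcp_socket(interface, port=DHCP_SERVER_PORT):
    """Open a UDP socket that only sees traffic arriving on one interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        # Linux only: tie the socket to a single device, so that daemons
        # bound to different interfaces can share the same port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                        device_option_value(interface))
        sock.bind(("", port))
    except OSError:
        sock.close()
        raise
    return sock
```

Calling open_dhcp_socket("br100") and open_dhcp_socket("br101") as root (both bridge names are hypothetical) would give two sockets on port 67, each tied to its own interface.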

I don't know how many users out there are using vlan mode, but you should be 
able to deal with this issue by upgrading dnsmasq. It would be great if the 
various distributions could upgrade as well, or at least try to patch in the 
fix[3]. If upgrading dnsmasq is out of the question, a possible workaround is 
to minimize lease renewals with something like the following combination of 
config options.

# release leases immediately on terminate
force_dhcp_release=true
# one week lease time
dhcp_lease_time=604800
# two week disassociate timeout
fixed_ip_disassociate_timeout=1209600
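
For reference, the values above are in seconds, and the sketch below shows the arithmetic in Python. The 0.5 factor is the default renewal point (T1) from RFC 2131; the function name is illustrative, not nova code, and a given client may use a different timer.

```python
SECONDS_PER_DAY = 24 * 60 * 60

dhcp_lease_time = 7 * SECONDS_PER_DAY                 # one week
fixed_ip_disassociate_timeout = 14 * SECONDS_PER_DAY  # two weeks


def renewals_per_30_days(lease_time, t1_fraction=0.5):
    """Estimate renewals in 30 days, assuming the client renews at T1."""
    renew_interval = lease_time * t1_fraction
    return (30 * SECONDS_PER_DAY) / renew_interval
```

With a one week lease, a client that follows the default timer attempts a unicast renew about every 3.5 days, so the failure window comes up far less often.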

Vish

[1] http://www.thekelleys.org.uk/dnsmasq/dnsmasq-2.61.tar.gz

[2] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2011q3/005233.html

[3] http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commitdiff;h=9380ba70d67db6b69f817d8e318de5ba1e990b12

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Dhcp lease errors in vlan mode

2012-05-14 Thread Lorin Hochstein

On May 14, 2012, at 1:46 PM, Vishvananda Ishaya wrote:

> TL;DR
>
> To fix issues with failed dhcp leases in vlan mode, upgrade to dnsmasq 
> 2.61[1]

I attempted to document this issue in the docs: 
https://review.openstack.org/7403

(As an aside, we're using VLAN mode at Nimbis).


Take care,

Lorin
--
Lorin Hochstein
Lead Architect - Cloud Services
Nimbis Services, Inc.
www.nimbisservices.com






Re: [Openstack] Dhcp lease errors in vlan mode

2012-05-14 Thread Brian Haley
On 05/14/2012 01:46 PM, Vishvananda Ishaya wrote:
> TL;DR
>
> To fix issues with failed dhcp leases in vlan mode, upgrade to dnsmasq 
> 2.61[1]

+1 to upgrading (being the one that was bitten by the problem last year).

-Brian



Re: [Openstack] Dhcp lease errors in vlan mode

2012-05-14 Thread Vishvananda Ishaya
Thanks, Lorin!

Vish

On May 14, 2012, at 12:59 PM, Lorin Hochstein wrote:

 
> On May 14, 2012, at 1:46 PM, Vishvananda Ishaya wrote:
>
>> TL;DR
>>
>> To fix issues with failed dhcp leases in vlan mode, upgrade to dnsmasq 
>> 2.61[1]
>
> I attempted to document this issue in the docs: 
> https://review.openstack.org/7403
>
> (As an aside, we're using VLAN mode at Nimbis).
>
> Take care,
>
> Lorin
> --
> Lorin Hochstein
> Lead Architect - Cloud Services
> Nimbis Services, Inc.
> www.nimbisservices.com
