Ah. I think this is a pretty big issue, especially in VPC: people could fail to get an address if the router's service goes out. It may be the root cause of CLOUDSTACK-2110. We are looking into it.

On May 1, 2013 9:10 AM, "Dennis Lawler" <dlaw...@gmail.com> wrote:
> It does reconfigure the available leases for new IP allocations. It just
> doesn't expire the leases it has already handed out.
>
> If you replace the "service dnsmasq restart" in edithosts.sh with "kill -s 1"
> on the router VM, you'll start seeing these log messages when a VM is
> destroyed and re-allocated:
>
> dnsmasq-dhcp[pid]: not using configured address 192.168.1.100 because it is leased to aa:bb:cc:11:22:33
> dnsmasq-dhcp[pid]: DHCPDISCOVER(eth0) aa:bb:cc:22:33:44 no address available
>
> On Tue, Apr 30, 2013 at 10:10 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
>
> > that's strange, because the dnsmasq man page explicitly calls out the
> > SIGHUP as a way to reconfigure DHCP hosts entries from a --dhcp-hostsfile
> > parameter. Or are these not the same thing?
> >
> > On Tue, Apr 30, 2013 at 5:52 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote:
> >
> > > On 4/30/13 3:26 PM, "Dennis Lawler" <dlaw...@gmail.com> wrote:
> > >
> > > >Every time a new VM is started up, there is a 2 second outage in DNS
> > > >services that can cause problems in guest VMs that use the router VM for
> > > >DNS.
> > > >
> > > >For Cloudstack configurations using both DHCP and DNS services on the
> > > >router VM (both implemented with dnsmasq), there is currently a 2 second
> > > >DNS service outage every time a new VM is instantiated
> > > >
> > > >The source of this outage is in edithosts.sh, which uses "service dnsmasq
> > > >restart" to pick up the freshly added DNS and DHCP entries.
> > > >
> > > >Restarting the dnsmasq service triggers a sleep for 2 seconds after
> > > >killing dnsmasq before starting it back up again.
> > > >
> > > >An obvious solution would be to replace "service dnsmasq restart" with
> > > >"kill -s 1 $pid" (SIGHUP) so that dnsmasq reads the new DHCP entries
> > > >without restarting, as in dnsmasq_edithosts.sh (external dhcp).
> > > >
> > > >Unfortunately, this solution is flawed because dnsmasq SIGHUP handling
> > > >does not expire in-memory DHCP leases in dnsmasq and all leases are
> > > >infinite by default.
> > >
> > > Aha! That's why SIGHUP didn't work consistently. This has been bugging me
> > > for a long time.
> > >
> > > >Thus, this will only work if the guest VM performs a DHCP release on
> > > >shutdown, which cannot always be guaranteed.
> > > >
> > > >A few possible solutions off the top of my head:
> > > >
> > > >1. Separate DNS and DHCP services. While DHCP services still
> > > >experience an outage during VM, DNS will not necessarily be impacted if
> > > >implemented correctly.
> > > >
> > > >2. Use SIGHUP with dnsmasq and implement a removeDhcpEntry interface
> > > >for network appliances to force a DHCP release whenever a NIC / IP is
> > > >deallocated. This can use dhcp_release to simulate a DHCP release on the
> > > >router VM.
> > > >Catch: dhcp_release is not available for Debian 6.0. The System VM needs
> > > >to be updated to at least Debian 7.0, or the dnsmasq-tools .deb from 7.0
> > > >would need to be included in the System VM image.
> > >
> > > There is going to be a new system vm based on 7.0 for the upcoming
> > > release. This should work with earlier releases as well.
> > > https://cwiki.apache.org/confluence/x/UlHVAQ
> > >
> > > >3. Change DHCP to have a shorter lease, track de-allocation of IPs
> > > >separately from VM destruction.
> > > >Catch: This may cause occasional IP pool exhaustion depending on
> > > >allocation of the guest IP range and the rate of VM destruction /
> > > >instantiation in the network.
> > > >
> > > >Thoughts?
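
For reference, option 2 above (SIGHUP plus a forced release) could look roughly like the following on the router VM. This is only a sketch, not what edithosts.sh does today. The helper names, the lease file path, and the pid file path are assumptions made for illustration and may differ on the System VM. It relies on the dnsmasq lease file having one lease per line with the columns expiry, MAC, IP, hostname, client-id, and on dhcp_release taking interface, address, and MAC as arguments. The release helper only prints the command so it can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only. Paths and function names are illustrative assumptions.
LEASE_FILE="${LEASE_FILE:-/var/lib/misc/dnsmasq.leases}"   # assumed lease file location
PID_FILE="${PID_FILE:-/var/run/dnsmasq/dnsmasq.pid}"       # assumed pid file location

# Print the MAC that currently holds a lease on IP $1 (nothing if no lease).
# Lease file columns: expiry, MAC, IP, hostname, client-id.
lease_mac_for_ip() {
    awk -v ip="$1" '$3 == ip { print $2; exit }' "$LEASE_FILE"
}

# Print the dhcp_release command that would free IP $2 on interface $1.
# Prints nothing when dnsmasq holds no lease for that IP.
release_cmd_for_ip() {
    mac=$(lease_mac_for_ip "$2")
    if [ -n "$mac" ]; then
        echo "dhcp_release $1 $2 $mac"
    fi
}

# Ask dnsmasq to re-read its hosts files without a restart
# (equivalent to the "kill -s 1 $pid" mentioned in the thread).
reload_dnsmasq() {
    kill -HUP "$(cat "$PID_FILE")"
}
```

When a NIC / IP is deallocated, a removeDhcpEntry-style hook could run the command printed by `release_cmd_for_ip eth0 192.168.1.100` and then call `reload_dnsmasq`, so the old lease no longer blocks the configured address for the next VM.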