I vaguely recall Vish mentioning a dnsmasq bug that caused a somewhat similar problem (it had to do with lease renewal on IP aliases, or something like that).
This issue was particularly pronounced with Windows VMs, apparently.
 -nld

On Thu, Jun 14, 2012 at 6:02 PM, Christian Parpart <[email protected]> wrote:
> Hey,
>
> thanks for your reply. Unfortunately there was no process restart in
> nova-network nor in dnsmasq; both processes seem to have been up for
> about 2 and 3 days.
>
> However, why is the default dhcp_lease_time value equal to 120s? Not
> having this one overridden causes the clients to actually re-acquire a
> new DHCP lease every 42 seconds (at least on my nodes), which is
> completely ridiculous.
> OTOH, I took a look at the sources (linux_net.py) and found out why the
> max_lease_time is set to 2048: because that is the size of my network.
> So why is the max lease time the size of my network?
> I've written a tiny patch to allow overriding this value in nova.conf,
> and will submit it to Launchpad soon - and hope it'll be accepted and
> then also applied to Essex, since this is a very straightforward,
> helpful few-liner.
>
> Nevertheless, that does not clarify why I have now had 2 (well, 3
> actually) instances getting no DHCP replies/offers anymore after some
> hours/days.
>
> The one host that caused issues today (a few hours ago) I fixed by hard
> rebooting the instance; however, just about 40 minutes later, it again
> forgot its IP, so one might say that it maybe did not get any reply from
> the dhcp server (dnsmasq) almost right after it got a lease on instance
> boot.
>
> So long,
> Christian.
>
> On Thu, Jun 14, 2012 at 10:55 PM, Nathanael Burton
> <[email protected]> wrote:
>>
>> Has nova-network been restarted? There was an issue where nova-network
>> was signalling dnsmasq, which would cause dnsmasq to stop responding to
>> requests yet appear to be running fine.
>>
>> You can see if killing dnsmasq, restarting nova-network, and rebooting
>> an instance allows it to get a dhcp address again ...
>>
>> Nate
>>
>> On Jun 14, 2012 4:46 PM, "Christian Parpart" <[email protected]> wrote:
>>>
>>> Hey all,
>>>
>>> I feel really sad saying this now that we have had quite a few
>>> instances in production for about 5 days at least: I have now
>>> encountered the second instance losing its IP address due to
>>> "No DHCPOFFER" (as of syslog in the instance).
>>>
>>> I checked the logs on the central nova-network and gateway node and
>>> found dnsmasq still replying to requests from all the other instances.
>>> It even got the request from the instance in question and even sent an
>>> OFFER, as far as I can tell by now (I'm investigating / posting logs
>>> asap), but while it seemed that dnsmasq sends an offer, the instance
>>> says it didn't receive one - wtf?
>>>
>>> Please tell me what I can do to actually *fix* this issue, since this
>>> is quite fatal.
>>>
>>> One chance I'd see (as a workaround) is to let created instances
>>> retrieve their IP via dhcp, but then reconfigure
>>> /etc/network/interfaces to continue with a static networking setup.
>>> However, I'd just like the dhcp thingy to get fixed.
>>>
>>> I'm very open to any kind of helping comments, :)
>>>
>>> So long,
>>> Christian.
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~openstack
>>> Post to     : [email protected]
>>> Unsubscribe : https://launchpad.net/~openstack
>>> More help   : https://help.launchpad.net/ListHelp

