I've been able to track down what I believe is the root problem. If ovsdb-server (run by the openvswitch-switch service) restarts, the neutron-openvswitch-agent loses its connection and needs to be manually restarted in order to reconnect.
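
For anyone hitting the same thing, here is a rough sketch of the manual check and workaround. It is not a fix. The log path is the one from this report; the service name `neutron-openvswitch-agent` is an assumption based on that log file's name, and both helper functions are hypothetical names made up for illustration.

```shell
#!/bin/sh
# Sketch only. The log path comes from this report; the agent's service
# name is an assumption based on that log file's name.
AGENT_LOG="${AGENT_LOG:-/var/log/upstart/neutron-openvswitch-agent.log}"
AGENT_SERVICE="${AGENT_SERVICE:-neutron-openvswitch-agent}"

# Print how many ovsdb monitor errors the agent has logged.
# Prints 0 when the log file is missing or unreadable.
count_ovsdb_monitor_errors() {
    [ -r "$1" ] || { echo 0; return 0; }
    # grep -c exits non-zero on zero matches, so mask the exit status.
    grep -c 'Error received from ovsdb monitor' "$1" || true
}

# Hypothetical helper: restart the agent so it reconnects to ovsdb-server.
restart_agent() {
    service "$AGENT_SERVICE" restart
}
```

After ovsdb-server has gone away for any reason, running `restart_agent` as root should bring the agent back. `count_ovsdb_monitor_errors "$AGENT_LOG"` only gives a rough indication, since it counts every matching line in the log and cannot tell whether the agent has been restarted since the last error.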

Causes of this bug that I've seen include ovsdb-server segfaulting, being killed with kill -9, and being gracefully restarted with "service openvswitch-switch restart". The errors recorded in /var/log/upstart/neutron-openvswitch-agent.log vary depending on why ovsdb-server went away:

2014-03-23 20:10:01.883 20375 ERROR neutron.agent.linux.ovsdb_monitor [req-a776b981-b86b-4437-ab65-0c6be6070094 None] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
2014-03-24 01:40:17.617 20375 ERROR neutron.agent.linux.ovsdb_monitor [req-a776b981-b86b-4437-ab65-0c6be6070094 None] Error received from ovsdb monitor: 2014-03-24T01:40:17Z|00001|fatal_signal|WARN|terminating with signal 15 (Terminated)
2014-03-24 04:08:59.718 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
2014-03-24 22:44:22.174 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
2014-03-24 22:44:52.220 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (Connection refused)
2014-03-24 22:45:22.266 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (Connection refused)
2014-03-24 22:45:52.310 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (Connection refused)
2014-03-24 22:46:22.355 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: ovsdb-client: failed to connect to "unix:/var/run/openvswitch/db.sock" (Connection refused)
2014-03-24 22:49:27.179 8455 ERROR neutron.agent.linux.ovsdb_monitor [req-d2c2cbd5-a77a-4455-84ac-0a8ec69b41e8 None] Error received from ovsdb monitor: 2014-03-24T22:49:27Z|00001|fatal_signal|WARN|terminating with signal 15 (Terminated)
2014-03-24 22:55:45.441 16033 ERROR neutron.agent.linux.ovsdb_monitor [req-5fe682ce-138e-46d6-aa7e-f0d43ab576ee None] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)

In all cases, the result is the same: until neutron-openvswitch-agent is restarted, no traffic is passed on to the tapXXXXX interface inside the dhcp-XXXXX netns.

** Also affects: neutron
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1290486

Title:
  dhcp agent not serving responses

Status in OpenStack Neutron (virtual network service):
  New
Status in tripleo - openstack on openstack:
  In Progress

Bug description:
  The DHCP requests were not being responded to after they were seen on the undercloud network interface. The neutron services were restarted in an attempt to ensure they had the newest configuration and knew they were supposed to respond to the requests.

  Rather than using the heat stack create (called in devtest_overcloud.sh) to test, it was simpler to use the following to directly boot a baremetal node.

    nova boot --flavor $(nova flavor-list | grep "|[[:space:]]*baremetal[[:space:]]*|" | awk '{print $2}') \
          --image $(nova image-list | grep "|[[:space:]]*overcloud-control[[:space:]]*|" | awk '{print $2}') \
          bm-test1

  Whilst the baremetal node was attempting to PXE boot, a restart of the neutron services was performed. This allowed the baremetal node to boot.
  It has been observed that a neutron restart was needed for each subsequent reboot of the baremetal nodes to succeed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1290486/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp