Forgive my short reply, but do you have a lot of VMs on these machines by any chance? We've seen libvirt take a long time to come up on hosts with a large number of VMs, since it has to set up all of the network filters.
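
If you want to sanity-check that on the affected hypervisors, something along these lines should give a rough idea of how many domains and per-instance filters libvirt is tracking (assuming virsh is available locally and the filters follow the nova-instance-instance-* naming from your message):

    # rough count of domains defined on this hypervisor (includes a couple of header lines)
    virsh list --all | wc -l

    # rough count of nova's per-instance nwfilters
    virsh nwfilter-list | grep -c nova-instance

If those numbers are in the hundreds, the slow startup may just be libvirt rebuilding all of those filters rather than an outright hang.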
Sent from my iPhone

> On Feb 22, 2017, at 10:33 AM, Edmund Rhudy (BLOOMBERG/ 120 PARK)
> <[email protected]> wrote:
>
> I recently witnessed a strange issue with libvirt when upgrading one of our
> clusters from Kilo to Liberty. I'm not really looking for a specific
> diagnosis here because of the large number of confounding factors and the
> relative ease of remediating it, but I'm interested to hear if anyone else
> has witnessed this particular problem.
>
> Background is we had a number of Kilo-based clusters, all running Ubuntu
> 14.04.4 with OpenStack installed from the Ubuntu cloud archive. The upgrade
> process to Liberty involved upgrading the OpenStack components and their
> dependencies (including libvirt), then afterward upgrading all remaining
> packages via dist-upgrade (and staging a kernel upgrade from 3.13 to 4.4, to
> take effect on the next reboot). 7 clusters had all been upgraded
> successfully using this strategy.
>
> One cluster, however, decided to get a bit weird. After the upgrade, 4
> hypervisors showed that nova-compute was refusing to come up properly and was
> showing as enabled/down in nova service-list. Upon further investigation,
> nova-compute was starting up but was getting jammed on loading nwfilters.
> When I ran "virsh nwfilter-list", the command stalled indefinitely. Killing
> nova-compute and restarting libvirt-bin service allowed the command to work
> again, but it did not list any of the nova-instance-instance-* nwfilters.
> Once nova-compute was started, it tried to start loading the
> instance-specific filters and libvirt would wedge. I spent a while tinkering
> with the affected systems but could not find any way of correcting the issue
> other than rebooting the hypervisor, after which everything was fine.
>
> Has anyone ever seen anything like this? libvirt was upgraded from 1.2.12 to
> 1.2.16. Hundreds of hypervisors had already received this exact same upgrade
> without showing this problem, and I have no idea how I could reproduce it.
> I'm interested to hear if anyone else has ever run into this and if they
> figured out what the root cause was, though I've already braced myself for
> tumbleweeds.
>
> _______________________________________________
> OpenStack-operators mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
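
(For anyone who hits the same wedge later: the remediation Edmund describes boils down to roughly the following on the Ubuntu 14.04 hosts. This is just a restatement of his steps, not something I've verified myself.)

    # stop the jammed nova-compute, then restart libvirt
    service nova-compute stop
    service libvirt-bin restart

    # nwfilter-list should respond again, though the nova-instance-instance-*
    # filters will be missing until nova-compute recreates them
    virsh nwfilter-list

    # if libvirt wedges again once nova-compute starts reloading the filters,
    # the only fix he found was rebooting the hypervisor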
