** Tags added: libvirt queens-backport-potential

** Changed in: nova
       Status: New => Confirmed

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Changed in: nova/queens
   Importance: Undecided => High

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova/queens
       Status: New => Confirmed

** Summary changed:

- Cannot hard reboot an instance in error state
+ Cannot hard reboot a libvirt instance in error state (mdev query fails)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1764460

Title:
  Cannot hard reboot a libvirt instance in error state (mdev query
  fails)

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed

Bug description:
  Nova version: stable/queens fda768b304e05821f7479f9698c59d18bf3d3516
  Hypervisor: Libvirt + KVM

  If an instance doesn't exist in libvirt (failed live migration,
  compute container rebuilt, etc.), a hard reboot or start can no
  longer recreate it. We see this problem occasionally for various
  reasons, and in the past a hard reboot would revive the instance.

  A recent commit is responsible ("libvirt: pass the mdevs when
  rebooting the guest"): _get_all_assigned_mediated_devices() raises
  an InstanceNotFound exception when trying to start such an instance.
  Adding an instance_exists() check solves the issue.

  --- driver.py.orig      2018-04-16 16:11:42.865555972 +0000
  +++ driver.py   2018-04-16 16:11:55.901773724 +0000
  @@ -5966,6 +5966,8 @@
           """
           allocated_mdevs = {}
           if instance:
  +            if not self.instance_exists(instance):
  +                return {}
               guest = self._host.get_guest(instance)
               guests = [guest]
           else:

  Steps to recreate:
  1. Stop an instance
  2. Delete the instance-XXXXXXX.xml file from /etc/libvirt/qemu/
  3. Start the instance

  Expected result: instance running
  Actual result: error: InstanceNotFound from nova-compute

  Logs:
  2018-04-16 15:41:09.756 2030272 INFO nova.compute.manager [req-ce2e1036-ab7b-4a98-b343-6ab748326963 32bab887a38f4b6cbcaf83297d4b7812 29e87d21ad14403bb789543e8bc0dab7 - default default] [instance: 0130afdf-f5aa-4ec9-8d0a-71080c70f276] Successfully reverted task state from powering-on on failure for instance.
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server [req-ce2e1036-ab7b-4a98-b343-6ab748326963 32bab887a38f4b6cbcaf83297d4b7812 29e87d21ad14403bb789543e8bc0dab7 - default default] Exception during message handling: InstanceNotFound: Instance 0130afdf-f5aa-4ec9-8d0a-71080c70f276 could not be found.
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/exception_wrapper.py", line 76, in wrapped
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/exception_wrapper.py", line 67, in wrapped
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 186, in decorated_function
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     "Error: %s", e, instance=instance)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 156, in decorated_function
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/utils.py", line 976, in decorated_function
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 202, in decorated_function
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2665, in start_instance
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     self._power_on(context, instance)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2635, in _power_on
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     block_device_info)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2908, in power_on
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     self._hard_reboot(context, instance, network_info, block_device_info)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2745, in _hard_reboot
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     mdevs = self._get_all_assigned_mediated_devices(instance)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5969, in _get_all_assigned_mediated_devices
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     guest = self._host.get_guest(instance)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 526, in get_guest
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     return libvirt_guest.Guest(self._get_domain(instance))
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 546, in _get_domain
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server     raise exception.InstanceNotFound(instance_id=instance.uuid)
  2018-04-16 15:41:09.790 2030272 ERROR oslo_messaging.rpc.server InstanceNotFound: Instance 0130afdf-f5aa-4ec9-8d0a-71080c70f276 could not be found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1764460/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
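
For illustration only, the guard proposed in the bug description can be reproduced in isolation. This is a minimal sketch, not nova's code: FakeHost, FakeDriver, has_guest() and the two get_assigned_mdevs_*() methods are hypothetical stand-ins, and the local InstanceNotFound class only mimics the exception seen in the log. It assumes that a domain missing from libvirt should simply report no assigned mediated devices, so that the reboot path can go on to recreate the guest.

```python
class InstanceNotFound(Exception):
    """Local stand-in for the exception raised in the traceback."""


class FakeHost:
    """Hypothetical host wrapper that only knows which domains are defined."""

    def __init__(self, defined_uuids):
        self._defined = set(defined_uuids)

    def has_guest(self, uuid):
        return uuid in self._defined

    def get_guest(self, uuid):
        # Mirrors the failure mode in the log: a lookup of a missing
        # domain raises instead of returning None.
        if uuid not in self._defined:
            raise InstanceNotFound(
                "Instance %s could not be found." % uuid)
        return {"uuid": uuid, "mdevs": []}


class FakeDriver:
    """Hypothetical driver showing the lookup with and without the guard."""

    def __init__(self, host):
        self._host = host

    def instance_exists(self, uuid):
        return self._host.has_guest(uuid)

    def get_assigned_mdevs_unguarded(self, uuid):
        # Behaviour before the proposed patch: the lookup raises when
        # the domain is gone.
        guest = self._host.get_guest(uuid)
        return {mdev: guest["uuid"] for mdev in guest["mdevs"]}

    def get_assigned_mdevs_guarded(self, uuid):
        # Behaviour with the proposed check: a missing domain has no
        # mdevs assigned, so return an empty mapping.
        if not self.instance_exists(uuid):
            return {}
        return self.get_assigned_mdevs_unguarded(uuid)


if __name__ == "__main__":
    driver = FakeDriver(FakeHost(defined_uuids=[]))
    missing = "0130afdf-f5aa-4ec9-8d0a-71080c70f276"
    print(driver.get_assigned_mdevs_guarded(missing))
    try:
        driver.get_assigned_mdevs_unguarded(missing)
    except InstanceNotFound as exc:
        print("unguarded lookup failed: %s" % exc)
```

An alternative with the same effect would be to catch the exception around the lookup rather than checking existence first; the sketch follows the check-first form because that is what the patch in the description does.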