Public bug reported:

After an unexpected host reboot, all the guests went away. I added '--start_guests_on_host_boot=true' to /etc/nova/nova.conf and started up nova-compute. It started some instances but then died on:

2012-12-19 11:11:47 CRITICAL nova [-] Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova Traceback (most recent call last):
2012-12-19 11:11:47 TRACE nova   File "/usr/bin/nova-compute", line 49, in <module>
2012-12-19 11:11:47 TRACE nova     service.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 413, in wait
2012-12-19 11:11:47 TRACE nova     _launcher.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 131, in wait
2012-12-19 11:11:47 TRACE nova     service.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
2012-12-19 11:11:47 TRACE nova     return self._exit_event.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2012-12-19 11:11:47 TRACE nova     return hubs.get_hub().switch()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
2012-12-19 11:11:47 TRACE nova     return self.greenlet.switch()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
2012-12-19 11:11:47 TRACE nova     result = function(*args, **kwargs)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 101, in run_server
2012-12-19 11:11:47 TRACE nova     server.start()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 162, in start
2012-12-19 11:11:47 TRACE nova     self.manager.init_host()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 269, in init_host
2012-12-19 11:11:47 TRACE nova     block_device_info)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-12-19 11:11:47 TRACE nova     return f(*args, **kw)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 852, in resume_state_on_host_boot
2012-12-19 11:11:47 TRACE nova     block_device_info=block_device_info)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 790, in _hard_reboot
2012-12-19 11:11:47 TRACE nova     virt_dom = self._conn.lookupByName(instance['name'])
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 2370, in lookupByName
2012-12-19 11:11:47 TRACE nova     if ret is None:raise libvirtError('virDomainLookupByName() failed', conn=self)
2012-12-19 11:11:47 TRACE nova libvirtError: Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova

This instance is in an error state:

RESERVATION r-n1d0t747 c519923c921a404c96ebc8210a4ec67a juju-canonistack2, juju-canonistack2-10
INSTANCE i-000000bb ami-000000bf server-187 server-187 error None (c519923c921a404c96ebc8210a4ec67a, alce) 0 m1.small 2012-07-02T02:12:56.000Z nova monitoring-disabled instance-store

It no longer exists on alce. I couldn't find any reasonable way to kill the instance entirely (ec2-terminate-instances as an admin user had no effect) or to trivially remove it from the database.

I ended up modifying the nova libvirt driver to skip instances it can't find, using the attached patch. (FAOD, I'm attaching the patch mostly to illustrate the problem and our workaround, not necessarily for use as is in the packages or upstream.)

This is all with current Ubuntu 12.04 packages (including precise-proposed).

** Affects: nova (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: canonistack

** Patch added: "Skip instances which can't be found in hard_reboot"
   https://bugs.launchpad.net/bugs/1092108/+attachment/3463885/+files/diff.txt

** Tags added: canonistack

--
You received this bug notification because you are a member of Ubuntu Server Team, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1092108

Title:
  resume_state_on_host_boot fails on instances in error state
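
For illustration only, the workaround described in the report (skipping instances that libvirt cannot find instead of letting the lookup error abort startup) might look roughly like the sketch below. This is not the attached patch and not nova's actual code: DomainNotFound, FakeConnection and resume_instances are hypothetical stand-ins, and the instance names other than 'instance-000000bb' are made up. In the real driver the exception would be libvirt's libvirtError raised by lookupByName.

```python
class DomainNotFound(Exception):
    """Hypothetical stand-in for the libvirtError raised on a missing domain."""


class FakeConnection:
    """Hypothetical stand-in for a libvirt connection with known domains."""

    def __init__(self, domains):
        self._domains = set(domains)

    def lookupByName(self, name):
        # Mirrors the failure mode in the traceback: unknown names raise.
        if name not in self._domains:
            raise DomainNotFound(
                "Domain not found: no domain with matching name '%s'" % name)
        return name


def resume_instances(conn, instance_names):
    """Look up each instance; skip (rather than crash on) missing domains."""
    resumed, skipped = [], []
    for name in instance_names:
        try:
            conn.lookupByName(name)
        except DomainNotFound:
            # The instance is in the database but not on this host: skip it
            # so the remaining instances can still be started.
            skipped.append(name)
            continue
        resumed.append(name)
    return resumed, skipped


if __name__ == "__main__":
    conn = FakeConnection(["instance-000000ba", "instance-000000bc"])
    names = ["instance-000000ba", "instance-000000bb", "instance-000000bc"]
    resumed, skipped = resume_instances(conn, names)
    print("resumed:", resumed)
    print("skipped:", skipped)
```

The point of the sketch is only that one unresolvable instance should not prevent the service from resuming the others; whether skipping silently is the right behaviour for packages or upstream is left open, as the report itself notes.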