Public bug reported:

After an unexpected host reboot, all the guests went away.  I added
'--start_guests_on_host_boot=true' to /etc/nova/nova.conf and started
up nova-compute.  It started some instances but then died on:

2012-12-19 11:11:47 CRITICAL nova [-] Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova Traceback (most recent call last):
2012-12-19 11:11:47 TRACE nova   File "/usr/bin/nova-compute", line 49, in <module>
2012-12-19 11:11:47 TRACE nova     service.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 413, in wait
2012-12-19 11:11:47 TRACE nova     _launcher.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 131, in wait
2012-12-19 11:11:47 TRACE nova     service.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
2012-12-19 11:11:47 TRACE nova     return self._exit_event.wait()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2012-12-19 11:11:47 TRACE nova     return hubs.get_hub().switch()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
2012-12-19 11:11:47 TRACE nova     return self.greenlet.switch()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
2012-12-19 11:11:47 TRACE nova     result = function(*args, **kwargs)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 101, in run_server
2012-12-19 11:11:47 TRACE nova     server.start()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 162, in start
2012-12-19 11:11:47 TRACE nova     self.manager.init_host()
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 269, in init_host
2012-12-19 11:11:47 TRACE nova     block_device_info)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-12-19 11:11:47 TRACE nova     return f(*args, **kw)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 852, in resume_state_on_host_boot
2012-12-19 11:11:47 TRACE nova     block_device_info=block_device_info)
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 790, in _hard_reboot
2012-12-19 11:11:47 TRACE nova     virt_dom = self._conn.lookupByName(instance['name'])
2012-12-19 11:11:47 TRACE nova   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 2370, in lookupByName
2012-12-19 11:11:47 TRACE nova     if ret is None:raise libvirtError('virDomainLookupByName() failed', conn=self)
2012-12-19 11:11:47 TRACE nova libvirtError: Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova 

This instance is in an error state:

RESERVATION     r-n1d0t747      c519923c921a404c96ebc8210a4ec67a        juju-canonistack2, juju-canonistack2-10
INSTANCE        i-000000bb      ami-000000bf    server-187      server-187      error   None (c519923c921a404c96ebc8210a4ec67a, alce)   0               m1.small        2012-07-02T02:12:56.000Z        nova                            monitoring-disabled                                    instance-store

and it no longer exists on alce.  I couldn't find any reasonable way to
kill the instance entirely (ec2-terminate-instances as an admin user
had no effect) or to trivially remove it from the database.  I ended up
modifying the nova libvirt driver to skip instances it can't find, using
the attached patch.

(FAOD, I'm attaching the patch mostly to illustrate the problem and
 our workaround, not necessarily for use as is in the packages or
 upstream.)
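
For illustration only, the skip-on-missing-domain idea looks roughly
like the sketch below.  This is not the attached diff.txt and not nova's
code: DomainNotFound, FakeConn and resume_guests are hypothetical names
standing in for libvirtError, the libvirt connection and the driver's
resume loop, so that the example runs without libvirt installed.

```python
# Illustrative sketch only; names below are assumptions, not nova or libvirt APIs.


class DomainNotFound(Exception):
    """Hypothetical stand-in for libvirtError('Domain not found: ...')."""


class FakeConn(object):
    """Hypothetical stand-in for a libvirt connection."""

    def __init__(self, defined_names):
        self._defined = set(defined_names)

    def lookupByName(self, name):
        # Mirrors the failure in the traceback: unknown names raise.
        if name not in self._defined:
            raise DomainNotFound(
                "no domain with matching name '%s'" % name)
        return name


def resume_guests(conn, instance_names):
    """Look up each instance; skip missing domains instead of aborting."""
    resumed, skipped = [], []
    for name in instance_names:
        try:
            conn.lookupByName(name)
        except DomainNotFound:
            # One stale database record must not stop the remaining guests.
            skipped.append(name)
            continue
        resumed.append(name)
    return resumed, skipped


conn = FakeConn(['instance-000000ba', 'instance-000000bc'])
resumed, skipped = resume_guests(
    conn,
    ['instance-000000ba', 'instance-000000bb', 'instance-000000bc'])
```

With the lookup failure caught per instance, the guests after
instance-000000bb are still processed rather than nova-compute dying
partway through init_host.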

This is all with current Ubuntu 12.04 packages (including
precise-proposed).

** Affects: nova (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: canonistack

** Patch added: "Skip instances which can't be found in hard_reboot"
   https://bugs.launchpad.net/bugs/1092108/+attachment/3463885/+files/diff.txt

** Tags added: canonistack

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1092108

Title:
  resume_state_on_host_boot fails on instances in error state
