Hello, I am still working on this issue. I can not apply the disaster recovery as describe here: http://docs.openstack.org/trunk/openstack-compute/admin/content/nova-disaster-recovery-process.html
Thanks to livemoon, I can get the instances back running following his tips. By the way, I have put that in a shell script below, to make it easy to run the procedure. To run, you should type: restore-instance <instance name> After that, we have the instance running and the database status update. The volume attachment still is missing. I can get it to work after a reboot in the host. As I said in a previous email, the attach was reporting and error that device /dev/vdc was already in use (what is not the case.. it must be a bug or something in the code). I changed the device to /dev/vde and it accepts and submit the command, but does not attach the device. Logs are below. Hope someone - including you livemoon :) - still has something else to say about this. Have anyone of you guys tested this before? Does it work for you? Cheers. nova volume-attach c5cf37e2-9e96-45a2-a739-638ac9877128 1f590986-995e-4cf6-bbe7-d8ced6672990 /dev/vde The compute.log: 2012-11-18 20:28:54 AUDIT nova.compute.manager [req-4d0e4f88-be19-4788-9c21-c280a77173fc cb302c58bb4245cebc61e132c79c1111 768bd68a0ac149eb8e300665eb3d3950] [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Attaching volume 1f590986-995e-4cf6-bbe7-d8ced6672990 to /dev/vde 2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 13061 2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 253 2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -21 2012-11-18 20:29:43 INFO nova.compute.resource_tracker [-] Compute_service record updated for nova-controller 2012-11-18 20:29:54 ERROR nova.openstack.common.rpc.impl_qpid [-] Timed out waiting for RPC response: None 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid Traceback (most recent call last): 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 376, in ensure 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid return method(*args, **kwargs) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 425, in _consume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid nxt_receiver = self.session.next_receiver(timeout=timeout) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid File "<string>", line 6, in next_receiver 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 663, in next_receiver 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid raise Empty 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid Empty: None 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid 2012-11-18 20:29:54 ERROR nova.compute.manager [req-4d0e4f88-be19-4788-9c21-c280a77173fc cb302c58bb4245cebc61e132c79c1111 768bd68a0ac149eb8e300665eb3d3950] [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Failed to connect to volume 1f590986-995e-4cf6-bbe7-d8ced6672990 while attaching at /dev/vde 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Traceback (most recent call last): 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1956, in _attach_volume 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] connector) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 60, in wrapped 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] return func(self, context, target_obj, *args, **kwargs) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 378, in initialize_connection 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] "connector": connector}}) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 102, in call 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] return _get_impl().call(cfg.CONF, context, topic, msg, timeout) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 561, in call 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] rpc_amqp.get_connection_pool(conf, Connection)) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 368, in call 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] rv = list(rv) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 329, in __iter__ 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] self.done() 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] self.gen.next() 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 326, in __iter__ 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] self._iterator.next() 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 434, in iterconsume 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] yield self.ensure(_error_callback, _consume) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 380, in ensure 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] error_callback(e) 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 419, in _error_callback 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] raise rpc_common.Timeout() 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Timeout: Timeout while waiting on RPC response. 2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] 2012-11-18 20:29:54 ERROR nova.openstack.common.rpc.amqp [-] Exception during message handling 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last): 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 275, in _process_data 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp rval = self.proxy.dispatch(ctxt, version, method, **args) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 145, in dispatch 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return getattr(proxyobj, method)(ctxt, **kwargs) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 117, in wrapped 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp temp_level, payload) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 92, in wrapped 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return f(*args, **kw) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 181, in decorated_function 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp pass 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 167, in decorated_function 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 202, in decorated_function 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp kwargs['instance']['uuid'], e, sys.exc_info()) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 196, in decorated_function 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1945, in attach_volume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp context, instance.get('uuid'), mountpoint) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1941, in attach_volume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp mountpoint, instance) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1963, in _attach_volume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.volume_api.unreserve_volume(context, volume) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1956, in _attach_volume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp connector) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 60, in wrapped 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return func(self, context, target_obj, *args, **kwargs) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 378, in initialize_connection 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp "connector": connector}}) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 102, in call 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp return _get_impl().call(cfg.CONF, context, topic, msg, timeout) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 561, in call 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp rpc_amqp.get_connection_pool(conf, Connection)) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 368, in call 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp rv = list(rv) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 329, in __iter__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.done() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self.gen.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 326, in __iter__ 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp self._iterator.next() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 434, in iterconsume 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp yield self.ensure(_error_callback, _consume) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 380, in ensure 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp error_callback(e) 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 419, in _error_callback 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp raise rpc_common.Timeout() 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp Timeout: Timeout while waiting on RPC response. 2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.amqp ------- xx ------ restore-instance -------- xx ------ #!/bin/sh NOVAINSTDIR=/var/lib/nova/instances VIRXML=libvirt.xml get_instanceid_in_xml() { INSTANCEDIR=$1 if [ -f $INSTANCEDIR/$VIRXML ];then UUID=$(grep "<uuid>" $INSTANCEDIR/$VIRXML | /bin/cut -f2 -d">" | /bin/cut -f1 -d"<" ) echo $UUID fi } search_instanceid_in_xml() { UUID=$1 FOUND=false set $(ls -d $NOVAINSTDIR/instance*) while [ $# -gt 0 -a $FOUND == false ] do UUIDXML=$(get_instanceid_in_xml $1) if [ "$UUID" == "$UUIDXML" ];then FOUND=true else shift fi done if [ $FOUND == true ];then echo $1 else echo "1" fi } get_instanceid_in_db() { INSTANCENAME=$1 UUID=$(/usr/bin/nova list | /bin/grep -w "$INSTANCENAME" | /bin/cut -f2 -d"|") echo $UUID } reset_instance_status_db() { INSTUUID=$1 echo "MySQL Password:" read PW mysql -unova -p$PW nova -e "update instances set power_state='1', vm_state='active', task_state=NULL where uuid='$INSTUUID'" } INSTUID=$(get_instanceid_in_db $1) if [ "$INSTUID" == "" ];then echo "Instance not found" else INSTDIR=$(search_instanceid_in_xml $INSTUID) if [ "$INSTDIR" == "" ];then echo "Instance not found" else INSTIDNAME=$(echo $INSTDIR | /bin/awk -F "/" '{print $NF}') /usr/bin/sudo /usr/bin/virsh undefine $INSTIDNAME --managed-save /usr/bin/sudo /usr/bin/virsh define $INSTDIR/$VIRXML /usr/bin/sudo /usr/bin/virsh start $INSTIDNAME reset_instance_status_db "$INSTUID" fi fi ------- xx ------ restore-instance -------- xx ------ On 10 November 2012 17:25, Ronivon Costa <ronivon.co...@gmail.com> wrote: > Hi, > > Had some improvement with this issue. > > Could boot the instance using virsh, following livemoon advice with small > adaptation. > However the problem still is not fixed. > > The volume table was update: > mysql -unova -p$PW nova -e "update volumes set mountpoint=NULL, > attach_status='detached', instance_uuid=0" > mysql -unova -p$PW nova -e "update volumes set status='available' where > status <> 'error_deleting'" > > > Restarted the instance: > # virsh undefine instance-00000038 > error: Refusing to undefine while domain managed save image exists > > # virsh undefine instance-00000038 --managed-save > Domain instance-00000038 has been undefined > > # virsh define libvirt.xml > Domain instance-00000038 defined from libvirt.xml > > # virsh start instance-00000038 > Domain instance-00000038 started > > Then I update the database with the new instances status: > # mysql -unova -p nova -e "update instances set > power_state='1',vm_state='active',task_state=NULL where uuid = > '7e732b31-2ff8-4cf2-a7ac-f1562070cfb3'" > > I can now connect to the instance. That is a great improvement from my > original problem. > > But there still some serius issues to fix: > > The instance can not be rebooted (hard reboot). It will not start, with > the same errors as before. > Also, we can not attach the volume back to the instance: > # nova volume-attach 7e732b31-2ff8-4cf2-a7ac-f1562070cfb3 > 647db677-aa48-4d1e-b875-80be73469cb5 /dev/vdc > ERROR: The supplied device path (/dev/vdb) is in use. > ... > The error is: DevicePathInUse: The supplied device path (/dev/vdb) is in > use. > > > /dev/vdb is one ephemeral disk. Why nova is trying to use /dev/vdb when I > specified /dev/vdc ? > -- -- Ronivon C. Costa IBM Certified for Tivoli Software ITIL V3 Certified Tlm: (+351) 96 676 4458 Skype: ronivon.costa BLog ((hosted in my own personal cloud infrastructure): http://cloud0.dyndns-web.com/blog/
_______________________________________________ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp