Re: [Openstack] nova-volumes problem after host reboot
On 11/10/2012 03:17 PM, Ronivon Costa wrote:
> Hi there,
>
> I have been dealing with this issue for a while, but could not figure out what is going on. After a reboot of the OpenStack server, I am not able to restart ANY instance that had a nova-volume attached. I tried the DR procedure here without any improvement: http://docs.openstack.org/trunk/openstack-compute/admin/content/nova-disaster-recovery-process.html
>
> The error in compute.log is:
>
> ERROR nova.compute.manager [req-adacca25-ede8-4c6d-be92-9e8bd8578469 cb302c58bb4245cebc61e132c79c 768bd68a0ac149eb8e300665eb3d3950] [instance: 3cd109e4-addf-4aa8-bf66-b69df6573cea] Cannot reboot instance: iSCSI device not found at /dev/disk/by-path/ip-10.100.200.120:3260-iscsi-iqn.2010-10.org.openstack:volume-20db45cc-c97f-4589-9c9f-ed283b0bc16e-lun-1
>
> This is a very restrictive issue, because I cannot simply attach volumes to instances knowing that after a power failure or a maintenance reboot my instances will be unavailable. Below is some info about my setup. Any idea? Anything! :)
>
> Linux nova-controller 2.6.32-279.11.1.el6.x86_64 #1 SMP Tue Oct 16 15:57:10 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> rpm -qa | grep openstack
> openstack-nova-api-2012.2-2.el6.noarch

If you follow:

https://fedoraproject.org/wiki/Getting_started_with_OpenStack_EPEL
https://fedoraproject.org/wiki/Test_Day:2012-09-18_OpenStack#Setup_OpenStack_volumes
https://fedoraproject.org/wiki/QA:Testcase_Create_Cinder_Volumes

you get this note: on RHEL-based systems the config files in /etc/tgt/conf.d/ don't currently honor globbing; only the main /etc/tgt/targets.conf seems to. So, to avoid that issue stopping tgtd from starting:

sed -i '1iinclude /etc/nova/volumes/*' /etc/tgt/targets.conf
sed -i '1iinclude /etc/cinder/volumes/*' /etc/tgt/targets.conf

then restart tgtd:

service tgtd restart

Hopefully that will set everything up on boot.

thanks,
Pádraig.
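One caveat with the sed commands above: `sed -i '1i...'` prepends unconditionally, so running the fix twice leaves duplicate include lines in targets.conf. A minimal idempotent variant is sketched below; the `ensure_include` helper name is my own, not part of any package.

```shell
#!/bin/sh
# Prepend an include directive to tgt's targets.conf only if it is not
# already present, so re-running the fix does not duplicate the line.
# ensure_include is a hypothetical helper name, not a tgt command.
ensure_include() {
    line="include $1"
    conf="$2"
    # -x: match the whole line, -F: treat the glob as a fixed string
    if ! grep -qxF "$line" "$conf"; then
        sed -i "1i$line" "$conf"
    fi
}

# Usage, as in Pádraig's note:
# ensure_include '/etc/nova/volumes/*' /etc/tgt/targets.conf
# ensure_include '/etc/cinder/volumes/*' /etc/tgt/targets.conf
# service tgtd restart
```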
___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] nova-volumes problem after host reboot
Hello,

I am still working on this issue. I cannot apply the disaster recovery procedure as described here: http://docs.openstack.org/trunk/openstack-compute/admin/content/nova-disaster-recovery-process.html

Thanks to livemoon, I can get the instances back running by following his tips. By the way, I have put that in a shell script (below) to make the procedure easy to run. To run it, you type: restore-instance <instance name>. After that, the instance is running and the database status is updated.

The volume attachment is still missing, though, and I cannot get it to work after a reboot of the host. As I said in a previous email, the attach was reporting an error that device /dev/vdc was already in use (which is not the case; it must be a bug or something in the code). I changed the device to /dev/vde and it accepts and submits the command, but does not attach the device. Logs are below.

I hope someone (including you, livemoon :) ) still has something else to say about this. Has anyone of you tested this before? Does it work for you?

Cheers.
nova volume-attach c5cf37e2-9e96-45a2-a739-638ac9877128 1f590986-995e-4cf6-bbe7-d8ced6672990 /dev/vde

The compute.log:

2012-11-18 20:28:54 AUDIT nova.compute.manager [req-4d0e4f88-be19-4788-9c21-c280a77173fc cb302c58bb4245cebc61e132c79c 768bd68a0ac149eb8e300665eb3d3950] [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Attaching volume 1f590986-995e-4cf6-bbe7-d8ced6672990 to /dev/vde
2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 13061
2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 253
2012-11-18 20:29:43 AUDIT nova.compute.resource_tracker [-] Free VCPUS: -21
2012-11-18 20:29:43 INFO nova.compute.resource_tracker [-] Compute_service record updated for nova-controller
2012-11-18 20:29:54 ERROR nova.openstack.common.rpc.impl_qpid [-] Timed out waiting for RPC response: None
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid Traceback (most recent call last):
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 376, in ensure
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid     return method(*args, **kwargs)
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 425, in _consume
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid     nxt_receiver = self.session.next_receiver(timeout=timeout)
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid   File "<string>", line 6, in next_receiver
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 663, in next_receiver
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid     raise Empty
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid Empty: None
2012-11-18 20:29:54 TRACE nova.openstack.common.rpc.impl_qpid
2012-11-18 20:29:54 ERROR nova.compute.manager [req-4d0e4f88-be19-4788-9c21-c280a77173fc cb302c58bb4245cebc61e132c79c 768bd68a0ac149eb8e300665eb3d3950] [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Failed to connect to volume 1f590986-995e-4cf6-bbe7-d8ced6672990 while attaching at /dev/vde
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128] Traceback (most recent call last):
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1956, in _attach_volume
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]     connector)
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]   File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 60, in wrapped
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]     return func(self, context, target_obj, *args, **kwargs)
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]   File "/usr/lib/python2.6/site-packages/nova/volume/api.py", line 378, in initialize_connection
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]     'connector': connector}})
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 102, in call
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]     return _get_impl().call(cfg.CONF, context, topic, msg, timeout)
2012-11-18 20:29:54 TRACE nova.compute.manager [instance: c5cf37e2-9e96-45a2-a739-638ac9877128]   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_qpid.py", line 561, in call
2012-11-18 20:29:54 TRACE nova.compute.manager [instance:
Re: [Openstack] nova-volumes problem after host reboot
Hi Ronivon Costa,

Besides updating the volumes table, you should update the block_device_mapping table at the same time, which manages the mapping between volumes and instances. Use the commands below to update these two tables, and then you can reboot your instances and reattach your volume as normal with the nova command line or the dashboard:

mysql -unova -p$PW nova -e "update volumes set status = 'available', attach_status = 'detached', mountpoint = NULL where id = $ID"
mysql -unova -p$PW nova -e "update block_device_mapping set deleted_at = now(), deleted = 1 where volume_id = $ID and deleted = 0"

Good luck.

2012/11/19 Ronivon Costa ronivon.co...@gmail.com
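The two UPDATE statements livemoon gives lose their shell quoting in the archived mail, which makes them easy to mis-paste. A small sketch that regenerates both statements for a given volume id (wrapping them in a function is my own convenience; the SQL itself is livemoon's):

```shell
#!/bin/sh
# Emit livemoon's two cleanup statements for one volume id, with the
# quoting restored. volume_cleanup_sql is a hypothetical helper name.
volume_cleanup_sql() {
    id="$1"
    printf "update volumes set status = 'available', attach_status = 'detached', mountpoint = NULL where id = %s;\n" "$id"
    printf "update block_device_mapping set deleted_at = now(), deleted = 1 where volume_id = %s and deleted = 0;\n" "$id"
}

# Usage (review the output, then pipe it into mysql):
# volume_cleanup_sql 42 | mysql -unova -p"$PW" nova
```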
[Openstack] nova-volumes problem after host reboot
Hi there,

I have been dealing with this issue for a while, but could not figure out what is going on. After a reboot of the OpenStack server, I am not able to restart ANY instance that had a nova-volume attached. I tried the DR procedure here without any improvement: http://docs.openstack.org/trunk/openstack-compute/admin/content/nova-disaster-recovery-process.html

The error in compute.log is:

ERROR nova.compute.manager [req-adacca25-ede8-4c6d-be92-9e8bd8578469 cb302c58bb4245cebc61e132c79c 768bd68a0ac149eb8e300665eb3d3950] [instance: 3cd109e4-addf-4aa8-bf66-b69df6573cea] Cannot reboot instance: iSCSI device not found at /dev/disk/by-path/ip-10.100.200.120:3260-iscsi-iqn.2010-10.org.openstack:volume-20db45cc-c97f-4589-9c9f-ed283b0bc16e-lun-1

This is a very restrictive issue, because I cannot simply attach volumes to instances knowing that after a power failure or a maintenance reboot my instances will be unavailable. Below is some info about my setup. Any idea? Anything! :)

Linux nova-controller 2.6.32-279.11.1.el6.x86_64 #1 SMP Tue Oct 16 15:57:10 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

rpm -qa | grep openstack
openstack-nova-api-2012.2-2.el6.noarch
openstack-dashboard-2012.2-3.el6.noarch
openstack-utils-2012.2-5.el6.noarch
openstack-nova-volume-2012.2-2.el6.noarch
openstack-nova-novncproxy-0.4-2.el6.noarch
openstack-nova-common-2012.2-2.el6.noarch
openstack-nova-console-2012.2-2.el6.noarch
openstack-nova-network-2012.2-2.el6.noarch
openstack-nova-compute-2012.2-2.el6.noarch
openstack-nova-cert-2012.2-2.el6.noarch
openstack-nova-2012.2-2.el6.noarch
openstack-glance-2012.2-2.el6.noarch
python-django-openstack-auth-1.0.2-3.el6.noarch
openstack-nova-objectstore-2012.2-2.el6.noarch
openstack-nova-scheduler-2012.2-2.el6.noarch
openstack-keystone-2012.2-1.el6.noarch

--
Ronivon C. Costa
IBM Certified for Tivoli Software
ITIL V3 Certified
Tlm: (+351) 96 676 4458
Skype: ronivon.costa
Blog (hosted in my own personal cloud infrastructure): http://cloud0.dyndns-web.com/blog/
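The "iSCSI device not found" path in the error above is a udev by-path symlink whose name is assembled from the portal address, the target IQN and the LUN. A sketch that rebuilds that name (format taken from the error message itself; the helper name is hypothetical), useful for checking whether the target is actually logged in before retrying a reboot:

```shell
#!/bin/sh
# Reassemble the /dev/disk/by-path symlink name that nova looks for.
# Format: /dev/disk/by-path/ip-<portal>-iscsi-<iqn>-lun-<lun>
# iscsi_by_path is a hypothetical helper name, not part of nova.
iscsi_by_path() {
    portal="$1"   # e.g. 10.100.200.120:3260
    iqn="$2"      # e.g. iqn.2010-10.org.openstack:volume-<uuid>
    lun="$3"      # e.g. 1
    printf '/dev/disk/by-path/ip-%s-iscsi-%s-lun-%s\n' "$portal" "$iqn" "$lun"
}

# Usage: does the symlink from the error message exist on this host?
# test -e "$(iscsi_by_path 10.100.200.120:3260 iqn.2010-10.org.openstack:volume-20db45cc-c97f-4589-9c9f-ed283b0bc16e 1)" \
#     || echo "iSCSI target not logged in"
```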
Re: [Openstack] nova-volumes problem after host reboot
Hi, Ronivon Costa

If you use KVM (libvirt), you can log on to the compute node and use virsh list --all to list all your non-running VMs. For example, say there is an instance named instance-0001 that you cannot reboot with the nova command because it had a block disk attached. You need to do:

1. virsh undefine instance-0001
2. Go to the instance dir (default /var/lib/nova/instances/instance-0001) and run: virsh define libvirt.xml
3. Then virsh start instance-0001; it can be started now.
4. Then update the information about the instance in the instances and volumes tables in the nova DB. I think you have already done that.
5. Then you can reboot using nova-client or the dashboard.
6. Then attach the volume to the instance with nova-client or the dashboard.

I hope it can help you.

--
Blog Site: livemoon.org
Twitter: mwjpiero
非淡薄无以明志，非宁静无以致远 (Without simplicity of desires there is no clarifying of purpose; without serenity there is no reaching far.)
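livemoon's steps 1-3 can be collected into a small script. This is a sketch only: the restore_instance function name and the VIRSH override (so you can dry-run with VIRSH=echo) are my additions; the default instance directory is the one named in the mail.

```shell
#!/bin/sh
# Recovery steps 1-3 from livemoon's mail as one function.
# Set VIRSH=echo to dry-run and just print the virsh commands.
VIRSH="${VIRSH:-virsh}"

restore_instance() {
    name="$1"
    dir="${2:-/var/lib/nova/instances/$name}"
    $VIRSH undefine "$name" || return 1           # step 1: drop the stale domain
    $VIRSH define "$dir/libvirt.xml" || return 1  # step 2: re-define from the saved XML
    $VIRSH start "$name"                          # step 3: boot it again
}

# Usage:
# restore_instance instance-0001
# Afterwards, update the instances/volumes tables (step 4) and reattach
# volumes from nova-client or the dashboard (steps 5-6).
```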
Re: [Openstack] nova-volumes problem after host reboot
Hi,

Had some improvement with this issue. I could boot the instance using virsh, following livemoon's advice with a small adaptation. However, the problem is still not fixed.

The volumes table was updated:

mysql -unova -p$PW nova -e "update volumes set mountpoint = NULL, attach_status = 'detached', instance_uuid = 0"
mysql -unova -p$PW nova -e "update volumes set status = 'available' where status = 'error_deleting'"

Restarted the instance:

# virsh undefine instance-0038
error: Refusing to undefine while domain managed save image exists
# virsh undefine instance-0038 --managed-save
Domain instance-0038 has been undefined
# virsh define libvirt.xml
Domain instance-0038 defined from libvirt.xml
# virsh start instance-0038
Domain instance-0038 started

Then I updated the database with the new instance status:

# mysql -unova -p nova -e "update instances set power_state = '1', vm_state = 'active', task_state = NULL where uuid = '7e732b31-2ff8-4cf2-a7ac-f1562070cfb3'"

I can now connect to the instance. That is a great improvement over my original problem, but there are still some serious issues to fix. The instance cannot be rebooted (hard reboot); it will not start, with the same errors as before. Also, we cannot attach the volume back to the instance:

# nova volume-attach 7e732b31-2ff8-4cf2-a7ac-f1562070cfb3 647db677-aa48-4d1e-b875-80be73469cb5 /dev/vdc
ERROR: The supplied device path (/dev/vdb) is in use.
...

The error is: DevicePathInUse: The supplied device path (/dev/vdb) is in use. /dev/vdb is an ephemeral disk. Why is nova trying to use /dev/vdb when I specified /dev/vdc?
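One way to sidestep the DevicePathInUse error is to pick a device name that no disk in the domain currently uses, based on the guest's actual block list rather than on what the nova DB believes. A sketch (the next_free_vd helper is my own, not a nova command; the used-device list would come from something like virsh domblklist):

```shell
#!/bin/sh
# Given the guest device names already in use (one per line, e.g. the
# first column of "virsh domblklist <instance>"), print the first free
# /dev/vdX name. next_free_vd is a hypothetical helper.
next_free_vd() {
    used="$1"
    for letter in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        if ! printf '%s\n' "$used" | grep -qx "vd$letter"; then
            printf '/dev/vd%s\n' "$letter"
            return 0
        fi
    done
    return 1   # all 26 names taken
}

# Usage:
# used=$(virsh domblklist instance-0038 | awk 'NR>2 {print $1}')
# nova volume-attach <instance-uuid> <volume-uuid> "$(next_free_vd "$used")"
```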