Re: [ovirt-users] Storage domain issue
Hi Jonas,

Your problem seems similar to mine ([ovirt-users] Storage domain not in pool issue): after a crash the hosts lost the link to the iSCSI LUNs. For the moment I still have no solution to bring back my 2 storage domains and the VMs they contain.

Alain

Alain VONDRA
Chargé d'exploitation des Systèmes d'Information
Direction Administrative et Financière
+33 1 44 39 77 76
UNICEF France
3 rue Duguay Trouin 75006 PARIS
www.unicef.fr

-----Original Message-----
From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On behalf of Jonas Israelsson
Sent: Tuesday 7 April 2015 17:00
To: users@ovirt.org
Subject: Re: [ovirt-users] Storage domain issue

No one ?

On 23/03/15 16:54, Jonas Israelsson wrote:
> Greetings.
>
> Running oVirt 3.5 with a mix of NFS and FC storage.
>
> Engine running on a separate KVM VM and Node installed with a pre-3.5
> ovirt-node "ovirt-node-iso-3.5.0.ovirt35.20140912.el6 (Edited)"
>
> I had some problems with my FC storage where the LUNs for a while
> became unavailable to my oVirt host. Everything is now up and running
> and those LUNs are again accessible by the host. The NFS domains go
> back online but the FC ones do not.
> [quoted log output and vgs details snipped; see the original message below]
Re: [ovirt-users] Storage domain issue
No one ?

On 23/03/15 16:54, Jonas Israelsson wrote:
[original message and log output snipped; see the original message below]
Re: [ovirt-users] Storage domain issue
Thank you for reporting this issue, because I hit exactly the same thing: an FC storage domain, and from time to time many of my hosts (15) become unavailable without any apparent action on them. The error message is: storage domain is unavailable. It is a disaster when power management is activated, because the hosts reboot at the same time and all the VMs go down without migrating. It happened to me twice, and the second time was less painful because I had deactivated power management. It may be a serious issue, because the hosts stay reachable and the LUN is still fine when running an lvs command. The workaround in this case is to restart the engine (restarting vdsm achieves nothing), after which all the hosts come back up.

* el6 engine on a separate KVM
* both el7 and el6 hosts involved
* oVirt 3.5.1 and vdsm 4.16.10-8
* 2 FC datacenters on two remote sites with the same engine, and both are impacted

On 23/03/2015 16:54, Jonas Israelsson wrote:
Greetings.

Running oVirt 3.5 with a mix of NFS and FC storage. Engine running on a separate KVM VM and Node installed with a pre-3.5 ovirt-node "ovirt-node-iso-3.5.0.ovirt35.20140912.el6 (Edited)". I had some problems with my FC storage where the LUNs for a while became unavailable to my oVirt host. Everything is now up and running and those LUNs are again accessible by the host. The NFS domains go back online but the FC does not.
[quoted log output snipped; see the original message below]
[ovirt-users] Storage domain issue
Greetings.

Running oVirt 3.5 with a mix of NFS and FC storage.

Engine running on a separate KVM VM and Node installed with a pre-3.5 ovirt-node "ovirt-node-iso-3.5.0.ovirt35.20140912.el6 (Edited)".

I had some problems with my FC storage where the LUNs for a while became unavailable to my oVirt host. Everything is now up and running and those LUNs are again accessible by the host. The NFS domains go back online but the FC ones do not.

Thread-22::DEBUG::2015-03-23 14:53:02,706::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56 (cwd None)

Thread-24::DEBUG::2015-03-23 14:53:02,981::lvm::290::Storage.Misc.excCmd::(cmd) FAILED: <err> = ' Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found\n Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56\n'; <rc> = 5

Thread-24::WARNING::2015-03-23 14:53:02,986::lvm::372::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found', ' Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56']

Running the command above manually does indeed give the same output:

# /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56
  Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found
  Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56

What puzzles me is that those volume groups do exist:

lvm vgs
  VG                                   #PV #LV #SN Attr   VSize   VFree
  22cf06d1-faca-4e17-ac78-d38b7fc300b1   1  13   0 wz--n- 999.62g 986.50g
  29f9b165-3674-4384-a1d4-7aa87d923d56   1   8   0 wz--n-  99.62g  95.50g
  HostVG                                 1   4   0 wz--n-  13.77g  52.00m

--- Volume group ---
  VG Name               29f9b165-3674-4384-a1d4-7aa87d923d56
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  20
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                8
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               99.62 GiB
  PE Size               128.00 MiB
  Total PE              797
  Alloc PE / Size       33 / 4.12 GiB
  Free PE / Size        764 / 95.50 GiB
  VG UUID               aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk

lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56
aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk|29f9b165-3674-4384-a1d4-7aa87d923d56|wz--n-|106971529216|102542344192|134217728|797|764|MDT_LEASETIMESEC=60,MDT_CLASS=Data,MDT_VERSION=3,MDT_SDUUID=29f9b165-3674-4384-a1d4-7aa87d923d56,MDT_PV0=pv:36001405c94d80be2ed0482c91a1841b8&44&uuid:muHcYl-sobG-3LyY-jjfg-3fGf-1cHO-uDk7da&44&pestart:0&44&pecount:797&44&mapoffset:0,MDT_LEASERETRIES=3,MDT_VGUUID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk,MDT_IOOPTIMEOUTSEC=10,MDT_LOCKRENEWALINTERVALSEC=5,MDT_PHYBLKSIZE=512,MDT_LOGBLKSIZE=512,MDT_TYPE=FCP,MDT_LOCKPOLICY=,MDT_DESCRIPTION=Master,RHAT_storage_domain,MDT_POOL_SPM_ID=-1,MDT_POOL_DESCRIPTION=Elementary,MDT_POOL_SPM_LVER=-1,MDT_POOL_UUID=8c3c5df9-e8ff-4313-99c9-385b6c7d896b,MDT_MASTER_VERSION=10,MDT_POOL_DOMAINS=22cf06d1-faca-4e17-ac78-d38b7fc300b1:Active&44&c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1:Active&44&96e62d18-652d-401a-b4b5-b54ecefa331c:Active&44&29f9b165-3674-4384-a1d4-7aa87d923d56:Active&44&1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062:Active,MDT__SH A_CKSUM=7ea9af890755d96563cb7a736f8e3f46ea986f67,MDT_ROLE=Regular|134217728|67103744|
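The pipe-separated line above is the machine-readable report that vdsm asks for with `--separator '|'` and the `-o uuid,name,attr,...` field list; commas inside the domain-metadata tags appear to be escaped as `&44&`. A minimal sketch of decoding such a line, assuming that field order and escape convention (the helper name is mine, not vdsm's):

```python
# Field order as given by the -o option in the vgs command above.
FIELDS = ["uuid", "name", "attr", "size", "free", "extent_size",
          "extent_count", "free_count", "tags", "vg_mda_size",
          "vg_mda_free", "lv_count", "pv_count", "pv_name"]

def parse_vgs_line(line):
    """Parse one pipe-separated `lvm vgs` report line into a dict."""
    vg = dict(zip(FIELDS, line.strip().split("|")))
    # Tags are comma-separated KEY=VALUE pairs; literal commas inside a
    # value seem to be stored as &44& (assumption based on the output
    # above). Tags without '=' (e.g. RHAT_storage_domain) are skipped.
    vg["tags"] = {t.split("=", 1)[0]: t.split("=", 1)[1].replace("&44&", ",")
                  for t in vg["tags"].split(",") if "=" in t}
    return vg
```

With the line quoted above, `parse_vgs_line(...)["tags"]["MDT_SDUUID"]` would give the storage-domain UUID, and `MDT_PV0` would come back with its `&44&` escapes turned back into commas.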
Re: [ovirt-users] Storage Domain Issue
some more errors:

Thread-19::DEBUG::2014-12-08 10:20:02,700::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgck --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' f130d166-546e-4905-8b8f-55a1c1dd2e4f (cwd None)

Thread-20::DEBUG::2014-12-08 10:20:02,817::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgck --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' eb912657-8a8c-4173-9d24-92d2b09a773c (cwd None)

Thread-20::DEBUG::2014-12-08 10:20:03,388::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name eb912657-8a8c-4173-9d24-92d2b09a773c (cwd None)

Thread-17::ERROR::2014-12-08 10:20:03,469::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece

Thread-17::ERROR::2014-12-08 10:20:03,472::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece

Thread-17::DEBUG::2014-12-08 10:20:03,482::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)

Thread-17::DEBUG::2014-12-08 10:20:03,572::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)

Thread-17::DEBUG::2014-12-08 10:20:03,631::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name f130d166-546e-4905-8b8f-55a1c1dd2e4f eb912657-8a8c-4173-9d24-92d2b09a773c (cwd None)

Thread-14::ERROR::2014-12-08 10:20:05,785::task::866::Storage.TaskManager.Task::(_setError) Task=`ffaf5100-e833-4d29-ac5d-f6f7f8ce2b5d
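One pattern worth noticing across the logs in this thread: vdsm passes an LVM `--config` device filter that accepts only the `/dev/mapper` devices of the LUNs it currently knows about and rejects everything else (`r|.*|`). In the earlier logs, where the filter contains only the reject-all rule, `vgs` reports the VG as "not found" even though a plain `lvm vgs` (no filter) sees it. A hedged sketch of how such a filter string is assembled (simplified; the real command also wraps each entry in shell quoting, and the function name is mine):

```python
def build_lvm_filter(mapper_devices):
    """Build an LVM devices-section filter in the style vdsm uses:
    accept the listed /dev/mapper devices, reject everything else."""
    if mapper_devices:
        accept = "a|" + "|".join(mapper_devices) + "|"
        entries = ["'%s'" % accept, "'r|.*|'"]
    else:
        # With no visible LUNs, only the reject-all rule remains, so
        # every PV is filtered out and vgs reports the VG as not found,
        # which matches the behaviour seen earlier in this thread.
        entries = ["'r|.*|'"]
    return "filter = [ %s ]" % ", ".join(entries)
```

So a VG "disappearing" from vdsm's `vgs` output while plain `lvm vgs` still shows it can simply mean the multipath device never made it back into the accept list.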
[ovirt-users] Storage Domain Issue
Dear all,

We have updated our hypervisors with yum. This included an update of vdsm as well. We are now on these versions:

vdsm-4.16.7-1.gitdb83943.el6.x86_64
vdsm-python-4.16.7-1.gitdb83943.el6.noarch
vdsm-python-zombiereaper-4.16.7-1.gitdb83943.el6.noarch
vdsm-xmlrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-yajsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-jsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-cli-4.16.7-1.gitdb83943.el6.noarch

Ever since these updates we have experienced BIG troubles with our fibre connections. I have already updated the brocade cards to the latest version. This seemed to help: the hosts came back up and saw the storage domains (before the brocade update, they didn't even see their storage domains). But after a day or so, one of the hypervisors began to freak out again, coming up and going back down... Below you can find the errors:

Thread-821::ERROR::2014-12-08 07:10:33,190::task::866::Storage.TaskManager.Task::(_setError) Task=`27cb9779-a8e9-4080-988d-9772c922710b`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()

Thread-821::ERROR::2014-12-08 07:10:33,194::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}

Thread-822::ERROR::2014-12-08 07:11:03,878::task::866::Storage.TaskManager.Task::(_setError) Task=`30177931-68c0-420f-950f-da5b770fe35c`::Unexpected error

Thread-822::ERROR::2014-12-08 07:11:03,882::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}

Thread-813::ERROR::2014-12-08 07:11:07,634::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 78d84adf-7274-4efe-a711-fbec31196ece

Thread-813::ERROR::2014-12-08 07:11:07,634::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 78d84adf-7274-4efe-a711-fbec31196ece

Thread-813::DEBUG::2014-12-08 07:11:07,638::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n
/sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)

Thread-813::DEBUG::2014-12-08 07:11:07,835::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''a|/dev/mapper/36005076802810d489062|/dev/mapper/36005076802810d48e0ae|/dev/mapper/36005076802810d48e0de|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 78d84adf-7274-4efe-a711-fbec31196ece (cwd None)

Thread-813::ERROR::2014-12-08 07:11:07,896::spbackends::271::Storage.StoragePoolDiskBackend::(validateMasterDomainVersion) Requested master domain 78d84adf-7274-4efe-a711-fbec31196ece does not have expected version 42 it is version 17

Thread-813::ERROR::2014-12-08 07:11:07,903::task::866::Storage.TaskManager.Task::(_setError) Task=`c434f325-5193-4236-a04d-2fee9ac095bc`::Unexpected error

Thread-813::ERROR::2014-12-08 07:11:07,946::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Wrong Master domain or its version: 'SD=78d84adf-7274-4efe-a711-fbec31196ece, pool=1d03dc05-008b-4d14-97ce-b17bd714183d'", 'code': 324}}

Thread-823::ERROR::2014-12-08 07:11:43,993::task::866::Storage.TaskManager.Task::(_setError) Task=`9abbccd9-88a7-4632-b350-f9af1f65bebd`::Unexpected error

Thread-823::ERROR::2014-12-08 07:11:43,998::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': "Unknown pool id, pool not connected: ('1d03dc05-008b-4d14-97ce-b17bd714183d',)", 'code': 309}}

Thread-823::ERROR::2014-12-08 07:11:44,003::task::866::Storage.TaskManager.Task::(_setError) Task=`7ef1ac39-e7c2-4538-b30b-ab2fcefac01d`::Unexpected error
raise se.SpmStatusError()
SpmStatusError: Not SPM: ()

Thread-823::ERROR::2014-12-08 07:11:44,007::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}

Thread-823::ERROR::2014-12-08 07:11:44,133::task::866::Storage.TaskManager.Task
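The "Wrong Master domain or its version" error above comes from the `validateMasterDomainVersion` check: the pool metadata records which domain should be master and at what version, and here the domain's own metadata disagrees (expected 42, found 17), so vdsm refuses to connect the pool. A simplified sketch of that check, with illustrative names that are not vdsm's actual API:

```python
class WrongMasterDomainVersion(Exception):
    """Raised when a storage domain's recorded master version does not
    match what the pool metadata expects (illustrative, not vdsm's
    real exception class)."""

def validate_master_domain_version(expected_version, domain_version):
    # Mirrors the log line "does not have expected version 42 it is
    # version 17": a mismatch aborts connecting the storage pool.
    if expected_version != domain_version:
        raise WrongMasterDomainVersion(
            "expected version %s, domain reports version %s"
            % (expected_version, domain_version))
```

A stale version number like this typically means the domain's metadata was not updated during a master-domain migration or reconstruct, which is why restarting vdsm alone does not clear it.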
Re: [ovirt-users] Storage domain issue
Never mind. Installed a new, updated brocade driver on the hypervisor and restarted the engine; after 10 minutes, the engine restored everything itself :-)

2014-12-04 9:04 GMT+01:00 Koen Vanoppen :
> [original message quoted; see below]

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Storage domain issue
Dear All,

After we updated our hypervisors to:

vdsm-4.16.7-1.gitdb83943.el6.x86_64
vdsm-python-4.16.7-1.gitdb83943.el6.noarch
vdsm-python-zombiereaper-4.16.7-1.gitdb83943.el6.noarch
vdsm-xmlrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-yajsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-jsonrpc-4.16.7-1.gitdb83943.el6.noarch
vdsm-cli-4.16.7-1.gitdb83943.el6.noarch

we don't have access to our storage domains anymore. It just doesn't see the disks anymore...

I attached the vdsm log and engine log. I'm clueless... Before the update it did see the storage domains; after the system update it doesn't anymore...

Thanks in advance,

Kind regards,
Koen

[root@ovirtmgmt01prod ~]# tail -f /var/log/ovirt-engine/engine.log

2014-12-04 08:58:50,192 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-96) Command HSMGetAllTasksStatusesVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e) execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM: ()

2014-12-04 08:59:20,247 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-96) Command ConnectStoragePoolVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e, vdsId = dfddc678-f8ee-45eb-897a-885c83de870e, storagePoolId = 1d03dc05-008b-4d14-97ce-b17bd714183d, masterVersion = 8) execution failed.
Exception: VDSNetworkException: java.util.concurrent.TimeoutException

2014-12-04 08:59:20,248 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-96) IrsBroker::Failed::GetStoragePoolInfoVDS due to: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Could not connect host to Data Center(Storage issue)

2014-12-04 08:59:30,418 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-45) Command HSMGetAllTasksStatusesVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e) execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM: ()

2014-12-04 08:59:47,985 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-45) Domain eb912657-8a8c-4173-9d24-92d2b09a773c:StoragePoolDMZ03 was reported by all hosts in status UP as problematic. Not moving the domain to NonOperational because it is being reconstructed now.

2014-12-04 08:59:50,188 WARN [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistributionBalancePolicyUnit] (DefaultQuartzScheduler_Worker-68) There is no host with less than 5 running guests

2014-12-04 08:59:50,189 WARN [org.ovirt.engine.core.bll.scheduling.PolicyUnitImpl] (DefaultQuartzScheduler_Worker-68) All hosts are over-utilized, cant balance the cluster SandyBridgeCluster

2014-12-04 09:00:00,469 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-45) Command ConnectStoragePoolVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e, vdsId = dfddc678-f8ee-45eb-897a-885c83de870e, storagePoolId = 1d03dc05-008b-4d14-97ce-b17bd714183d, masterVersion = 8) execution failed.
Exception: VDSNetworkException: java.util.concurrent.TimeoutException

2014-12-04 09:00:00,470 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-45) IrsBroker::Failed::GetStoragePoolInfoVDS due to: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Could not connect host to Data Center(Storage issue)

2014-12-04 09:00:10,597 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-81) Command HSMGetAllTasksStatusesVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e) execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM: ()

2014-12-04 09:00:40,718 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-81) Command ConnectStoragePoolVDSCommand(HostName = mercury2, HostId = dfddc678-f8ee-45eb-897a-885c83de870e, vdsId = dfddc678-f8ee-45eb-897a-885c83de870e, storagePoolId = 1d03dc05-008b-4d14-97ce-b17bd714183d, masterVersion = 8) execution failed. Exception: VDSNetworkException: java.util.concurrent.TimeoutException

2014-12-04 09:00:40,719 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-81) IrsBroker::Failed::GetStoragePoolInfoVDS due to: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Could not connect host to Data Center(Storage issue)

2014-12-04 09:00:50,439 WARN [org.ovirt.engine.core.bll.scheduling.policyunits.EvenGuestDistribut
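When an engine.log fills up with repeating entries like the ones above, it helps to tally how often each error recurs rather than reading line by line. A small sketch that pulls the timestamp, level, and logger class out of each line, with the line format inferred from these excerpts rather than from any formal spec:

```python
import re

# Matches lines like:
# 2014-12-04 08:58:50,192 ERROR [org.ovirt...VDSCommand] (Worker-96) ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>ERROR|WARN|INFO) "
    r"\[(?P<logger>[^\]]+)\]")

def parse_engine_line(line):
    """Return {'ts', 'level', 'logger'} for a matching engine.log line,
    or None for continuation/non-matching lines."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None
```

Feeding the log through this and counting by `logger` quickly shows, for instance, how many times `ConnectStoragePoolVDSCommand` failed versus `HSMGetAllTasksStatusesVDSCommand` before the engine restart cleared things up.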