[ovirt-users] Hosted engine FCP SAN can not activate data domain
-04-20 07:01:26,974::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-39 /usr/bin/sudo -n /usr/sbin/lvm lvchange --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/360050768018182b6c99e|[unknown]|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --refresh bd616961-6da7-4eb0-939e-330b0a3fea6e/metadata bd616961-6da7-4eb0-939e-330b0a3fea6e/leases bd616961-6da7-4eb0-939e-330b0a3fea6e/ids bd616961-6da7-4eb0-939e-330b0a3fea6e/inbox bd616961-6da7-4eb0-939e-330b0a3fea6e/outbox bd616961-6da7-4eb0-939e-330b0a3fea6e/master (cwd None)
Reactor thread::INFO::2017-04-20 07:01:27,069::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from ::1:44692
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,070::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = " WARNING: Not using lvmetad because config setting use_lvmetad=0.\n WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\n Couldn't find device with uuid jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U.\n"; <rc> = 0
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,070::sp::662::Storage.StoragePool::(_stopWatchingDomainsState) Stop watching domains state
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,070::resourceManager::628::Storage.ResourceManager::(releaseResource) Trying to release resource 'Storage.58493e81-01dc-01d8-0390-0032'
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::647::Storage.ResourceManager::(releaseResource) Released resource 'Storage.58493e81-01dc-01d8-0390-0032' (0 active users)
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::653::Storage.ResourceManager::(releaseResource) Resource 'Storage.58493e81-01dc-01d8-0390-0032' is free, finding out if anyone is waiting for it.
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::661::Storage.ResourceManager::(releaseResource) No one is waiting for resource 'Storage.58493e81-01dc-01d8-0390-0032', Clearing records.
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::628::Storage.ResourceManager::(releaseResource) Trying to release resource 'Storage.HsmDomainMonitorLock'
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::647::Storage.ResourceManager::(releaseResource) Released resource 'Storage.HsmDomainMonitorLock' (0 active users)
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::653::Storage.ResourceManager::(releaseResource) Resource 'Storage.HsmDomainMonitorLock' is free, finding out if anyone is waiting for it.
jsonrpc.Executor/5::DEBUG::2017-04-20 07:01:27,071::resourceManager::661::Storage.ResourceManager::(releaseResource) No one is waiting for resource 'Storage.HsmDomainMonitorLock', Clearing records.
jsonrpc.Executor/5::ERROR::2017-04-20 07:01:27,072::task::868::Storage.TaskManager.Task::(_setError) Task=`15122a21-4fb7-45bf-9a9a-4b97f27bc1e1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 988, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/share/vdsm/storage/hsm.py", line 1053, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 646, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1219, in __rebuild
    self.setMasterDomain(msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1427, in setMasterDomain
    domain = sdCache.produce(msdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 101, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 125, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 144, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 1441, in findDomain
    return BlockStorageDomain(BlockStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/blockSD.py", line 814, in __init__
    lvm.checkVGBlockSizes(sdUUID, (self.logBlkSize, self.phyBlkSize))
  File "
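
The LVM output captured in the log above includes a warning that names a physical volume UUID which LVM could not find. As a rough aid for scanning a longer vdsm.log for such warnings, here is a minimal Python sketch. It assumes the log text is already available as a string; the function name missing_pv_uuids and the shortened sample string are illustrative only and are not part of vdsm or LVM.

```python
import re

# Matches the LVM warning that names a PV UUID it cannot locate.
# The UUID is taken to be letters, digits and hyphens (an assumption
# based on the one UUID shown in this thread).
MISSING_PV = re.compile(r"Couldn't find device with uuid ([A-Za-z0-9-]+)")


def missing_pv_uuids(log_text):
    """Return the unique PV UUIDs reported missing, in first-seen order."""
    seen = []
    for uuid in MISSING_PV.findall(log_text):
        if uuid not in seen:
            seen.append(uuid)
    return seen


# Shortened, illustrative excerpt of the warning text from this thread.
sample = (
    " WARNING: Not using lvmetad because config setting use_lvmetad=0. "
    "Couldn't find device with uuid jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U. "
    "Couldn't find device with uuid jDB9VW-bNqY-UIKc-XxXp-xnyK-ZTlt-7Cpa1U."
)
print(missing_pv_uuids(sample))
```

A UUID reported this way only says that LVM metadata references a PV it cannot see through the configured device filter; it does not by itself say why the device is missing.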
Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain
Hi,

LUN is not in pvs output, but I found it in lsblk output, apparently without any partitions on it.

$ sudo pvs
  PV                                VG                                   Fmt  Attr PSize   PFree
  /dev/mapper/360050768018182b6c990 data                                 lvm2 a--  200.00g 180.00g
  /dev/mapper/360050768018182b6c998 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a--  499.62g 484.50g
  /dev/sda2                         system                               lvm2 a--  278.78g 208.41g

$ sudo lvs
  LV                                   VG                                   Attr   LSize    Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  34a9328f-87fe-4190-96e9-a3580b0734fc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
  506ff043-1058-448c-bbab-5c864adb2bfc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-   10.00g
  65449c88-bc28-4275--5fc75b692cbc     9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  e2ee95ce-8105-4a20-8e1f-9f6dfa16bf59 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
  ids                                  9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
  inbox                                9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  leases                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    2.00g
  master                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
  metadata                             9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  512.00m
  outbox                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
  data                                 data                                 -wi-ao   20.00g
  home                                 system                               -wi-ao 1000.00m
  prod                                 system                               -wi-ao    4.88g
  root                                 system                               -wi-ao    7.81g
  swap                                 system                               -wi-ao    4.00g
  swap7system                                                               -wi-ao   20.00g
  tmp                                  system                               -wi-ao    4.88g
  var                                  system                               -wi-ao   27.81g

$ sudo lsblk

sdq                        65:0    0  500G  0 disk
└─360050768018182b6c9d7   253:33   0  500G  0 mpath

Data domain was made with one 500 GB LUN and extended with 500 GB more.

On Tue, Apr 25, 2017 at 2:17 PM, Fred Rolland wrote:
> Hi,
>
> Do you see the LUN in the host ?
> Can you share pvs and lvs output ?
>
> Thanks,
>
> Fred
>
> On Mon, Apr 24, 2017 at 1:05 PM, Jens Oechsler wrote:
>>
>> Hello
>> I have a problem with oVirt Hosted Engine Setup version:
>> 4.0.5.5-1.el7.centos.
>> Setup is using FCP SAN for data and engine.
>> Cluster has worked fine for a while. It has two hosts with VMs running.
>> I extended storage with an additional LUN recently. This LUN seems to
>> be gone from data domain and one VM is paused which I assume has data
>> on that device.
>>
>> Got these errors in events:
>>
>> Apr 24, 2017 10:26:05 AM
>> Failed to activate Storage Domain SD (Data Center DC) by
>> admin@internal-authz
>> Apr 10, 2017 3:38:08 PM
>> Status of host cl01 was set to Up.
>> Apr 10, 2017 3:38:03 PM
>> Host cl01 does not enforce SELinux. Current status: DISABLED
>> Apr 10, 2017 3:37:58 PM
>> Host cl01 is initializing. Message: Recovering from crash or Initializing
>> Apr 10, 2017 3:37:58 PM
>> VDSM cl01 command failed: Recovering from crash or Initializing
>> Apr 10, 2017 3:37:46 PM
>> Failed to Reconstruct Master Domain for Data Center DC.
>> Apr 10, 2017 3:37:46 PM
>> Host cl01 is not responding. Host cannot be fenced automatically
>> because power management for the host is disabled.
>> Apr 10, 2017 3:37:46 PM
>> VDSM cl01 command failed: Broken pipe
>> Apr 10, 2017 3:37:46 PM
>> VDSM cl01 command failed: Broken pipe
>> Apr 10, 2017 3:32:45 PM
>> Invalid status on Data Center DC. Setting Data Center status to Non
>> Responsive (On host cl01, Error: General Exception).
>> Apr 10, 2017 3:32:45 PM
>> VDSM cl01 command failed: [Errno 19] Could not find dm device named
>> `[unknown]`
>> Apr 7, 2017 1:28:04 PM
>> VM HostedEngine is down with error. Exit message: resource busy:
>> Failed to acquire lock: error -243.
>> Apr 7, 2017 1:28:02 PM
>> Storage Pool Manager runs on Host cl01 (Address: cl01).
>> Apr 7, 2017 1:27:59 PM
>> Invalid status on Data Center DC. Setting status to Non Responsive.
>> Apr 7, 2017 1:27:53 PM
>> Host cl02 does not enforce SELinux. Current status: DISABLED
>> Apr 7, 2017 1:27:52 PM
>> Host cl01 does not enforce SELinux. Current status: DISABLED
>> Apr 7, 2017 1:27:49 PM
>> Affinity Rules Enforcement Manager started.
>> Apr 7, 2017 1:27:34 PM
>> ETL Service Started
>> Apr 7, 2017 1:26:01 PM
>> ETL Service Stopped
>> Apr 3, 2017 1:22:54 PM
>> Sh
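
The check described in the message above (a LUN that shows up as a multipath device in lsblk but is absent from pvs) can also be done mechanically. Below is a minimal Python sketch under stated assumptions: the inputs are the text produced by `lsblk -rn -o NAME,TYPE` and `pvs --noheadings -o pv_name`, and multipath PVs appear in pvs as /dev/mapper/<name>. The function name mpath_without_pv is hypothetical, and the sample strings only reuse the device names quoted in this thread.

```python
def mpath_without_pv(lsblk_text, pvs_text):
    """Return multipath device names that pvs does not list as a PV.

    lsblk_text: assumed output of `lsblk -rn -o NAME,TYPE`
    pvs_text:   assumed output of `pvs --noheadings -o pv_name`
    """
    pv_names = {line.strip() for line in pvs_text.splitlines() if line.strip()}
    missing = []
    for line in lsblk_text.splitlines():
        fields = line.split()
        # Raw lsblk rows look like "<name> <type>"; keep only multipath maps.
        if len(fields) == 2 and fields[1] == "mpath":
            name = fields[0]
            # A multipath map is listed once per path, so skip repeats.
            if "/dev/mapper/" + name not in pv_names and name not in missing:
                missing.append(name)
    return missing


# Illustrative inputs built from the device names quoted in this thread.
lsblk_sample = """sdq disk
360050768018182b6c9d7 mpath
"""
pvs_sample = """  /dev/mapper/360050768018182b6c990
  /dev/mapper/360050768018182b6c998
  /dev/sda2
"""
print(mpath_without_pv(lsblk_sample, pvs_sample))
```

A device returned by this check is only a candidate: a multipath LUN without a PV label may be an unused LUN, or a PV that LVM cannot read, and the sketch cannot tell the two apart.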
Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain
Greetings,

Is there any way to get the oVirt Data Center described below active again?

On Tue, Apr 25, 2017 at 4:11 PM, Jens Oechsler wrote:
> Hi,
>
> LUN is not in pvs output, but I found it in lsblk output, apparently
> without any partitions on it.
>
> $ sudo pvs
>   PV                                VG                                   Fmt  Attr PSize   PFree
>   /dev/mapper/360050768018182b6c990 data                                 lvm2 a--  200.00g 180.00g
>   /dev/mapper/360050768018182b6c998 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a--  499.62g 484.50g
>   /dev/sda2                         system                               lvm2 a--  278.78g 208.41g
>
> $ sudo lvs
>   LV                                   VG                                   Attr   LSize    Pool Origin Data% Meta% Move Log Cpy%Sync Convert
>   34a9328f-87fe-4190-96e9-a3580b0734fc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
>   506ff043-1058-448c-bbab-5c864adb2bfc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-   10.00g
>   65449c88-bc28-4275--5fc75b692cbc     9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>   e2ee95ce-8105-4a20-8e1f-9f6dfa16bf59 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
>   ids                                  9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
>   inbox                                9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>   leases                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    2.00g
>   master                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
>   metadata                             9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  512.00m
>   outbox                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>   data                                 data                                 -wi-ao   20.00g
>   home                                 system                               -wi-ao 1000.00m
>   prod                                 system                               -wi-ao    4.88g
>   root                                 system                               -wi-ao    7.81g
>   swap                                 system                               -wi-ao    4.00g
>   swap7system                                                               -wi-ao   20.00g
>   tmp                                  system                               -wi-ao    4.88g
>   var                                  system                               -wi-ao   27.81g
>
> $ sudo lsblk
>
> sdq                        65:0    0  500G  0 disk
> └─360050768018182b6c9d7   253:33   0  500G  0 mpath
>
> Data domain was made with one 500 GB LUN and extended with 500 GB more.
>
> On Tue, Apr 25, 2017 at 2:17 PM, Fred Rolland wrote:
>> Hi,
>>
>> Do you see the LUN in the host ?
>> Can you share pvs and lvs output ?
>>
>> Thanks,
>>
>> Fred
>>
>> On Mon, Apr 24, 2017 at 1:05 PM, Jens Oechsler wrote:
>>>
>>> Hello
>>> I have a problem with oVirt Hosted Engine Setup version:
>>> 4.0.5.5-1.el7.centos.
>>> Setup is using FCP SAN for data and engine.
>>> Cluster has worked fine for a while. It has two hosts with VMs running.
>>> I extended storage with an additional LUN recently. This LUN seems to
>>> be gone from data domain and one VM is paused which I assume has data
>>> on that device.
>>>
>>> Got these errors in events:
>>>
>>> Apr 24, 2017 10:26:05 AM
>>> Failed to activate Storage Domain SD (Data Center DC) by
>>> admin@internal-authz
>>> Apr 10, 2017 3:38:08 PM
>>> Status of host cl01 was set to Up.
>>> Apr 10, 2017 3:38:03 PM
>>> Host cl01 does not enforce SELinux. Current status: DISABLED
>>> Apr 10, 2017 3:37:58 PM
>>> Host cl01 is initializing. Message: Recovering from crash or Initializing
>>> Apr 10, 2017 3:37:58 PM
>>> VDSM cl01 command failed: Recovering from crash or Initializing
>>> Apr 10, 2017 3:37:46 PM
>>> Failed to Reconstruct Master Domain for Data Center DC.
>>> Apr 10, 2017 3:37:46 PM
>>> Host cl01 is not responding. Host cannot be fenced automatically
>>> because power management for the host is disabled.
>>> Apr 10, 2017 3:37:46 PM
>>> VDSM cl01 command failed: Broken pipe
>>> Apr 10, 2017 3:37:46 PM
>>> VDSM cl01 command failed: Broken pipe
>>> Apr 10, 2017 3:32:45 PM
>>> Invalid status on Data Center DC. Setting Data Center status to Non
>>> Responsive (On host cl01, Error: General Exception).
>>> Apr 10, 2017 3:32:45 PM
>>> VDSM cl01 command failed: [Errno 19] Could not find dm device named
>>> `[unknown]`
>>> Apr 7, 2017 1:28:04 PM
>>> VM HostedEngine is down with error. Exit message: resource busy:
>>> Failed to acquire loc
Re: [ovirt-users] Hosted engine FCP SAN can not activate data domain
Hi,

Thanks for the reply.
I have tried to gather the logs from the hosts here on Google Drive:
https://drive.google.com/open?id=0B7R4U330JfWpbkNhb2pxZWhmUUk

On Sun, Apr 30, 2017 at 10:50 AM, Fred Rolland wrote:
> Hi,
>
> Can you provide the vdsm and engine logs ?
>
> Thanks,
> Fred
>
> On Wed, Apr 26, 2017 at 5:30 PM, Jens Oechsler wrote:
>>
>> Greetings,
>>
>> Is there any way to get the oVirt Data Center described below active
>> again?
>>
>> On Tue, Apr 25, 2017 at 4:11 PM, Jens Oechsler wrote:
>> > Hi,
>> >
>> > LUN is not in pvs output, but I found it in lsblk output, apparently
>> > without any partitions on it.
>> >
>> > $ sudo pvs
>> >   PV                                VG                                   Fmt  Attr PSize   PFree
>> >   /dev/mapper/360050768018182b6c990 data                                 lvm2 a--  200.00g 180.00g
>> >   /dev/mapper/360050768018182b6c998 9f10e00f-ae39-46a0-86da-8b157c6de7bc lvm2 a--  499.62g 484.50g
>> >   /dev/sda2                         system                               lvm2 a--  278.78g 208.41g
>> >
>> > $ sudo lvs
>> >   LV                                   VG                                   Attr   LSize    Pool Origin Data% Meta% Move Log Cpy%Sync Convert
>> >   34a9328f-87fe-4190-96e9-a3580b0734fc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
>> >   506ff043-1058-448c-bbab-5c864adb2bfc 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-   10.00g
>> >   65449c88-bc28-4275--5fc75b692cbc     9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>> >   e2ee95ce-8105-4a20-8e1f-9f6dfa16bf59 9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
>> >   ids                                  9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-ao  128.00m
>> >   inbox                                9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>> >   leases                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    2.00g
>> >   master                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-    1.00g
>> >   metadata                             9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  512.00m
>> >   outbox                               9f10e00f-ae39-46a0-86da-8b157c6de7bc -wi-a-  128.00m
>> >   data                                 data                                 -wi-ao   20.00g
>> >   home                                 system                               -wi-ao 1000.00m
>> >   prod                                 system                               -wi-ao    4.88g
>> >   root                                 system                               -wi-ao    7.81g
>> >   swap                                 system                               -wi-ao    4.00g
>> >   swap7system                                                               -wi-ao   20.00g
>> >   tmp                                  system                               -wi-ao    4.88g
>> >   var                                  system                               -wi-ao   27.81g
>> >
>> > $ sudo lsblk
>> >
>> > sdq                        65:0    0  500G  0 disk
>> > └─360050768018182b6c9d7   253:33   0  500G  0 mpath
>> >
>> > Data domain was made with one 500 GB LUN and extended with 500 GB more.
>> >
>> > On Tue, Apr 25, 2017 at 2:17 PM, Fred Rolland wrote:
>> >> Hi,
>> >>
>> >> Do you see the LUN in the host ?
>> >> Can you share pvs and lvs output ?
>> >>
>> >> Thanks,
>> >>
>> >> Fred
>> >>
>> >> On Mon, Apr 24, 2017 at 1:05 PM, Jens Oechsler wrote:
>> >>>
>> >>> Hello
>> >>> I have a problem with oVirt Hosted Engine Setup version:
>> >>> 4.0.5.5-1.el7.centos.
>> >>> Setup is using FCP SAN for data and engine.
>> >>> Cluster has worked fine for a while. It has two hosts with VMs
>> >>> running.
>> >>> I extended storage with an additional LUN recently. This LUN seems to
>> >>> be gone from data domain and one VM is paused which I assume has data
>> >>> on that device.
>> >>>
>> >>> Got these errors in events:
>> >>>
>> >>> Apr 24, 2017 10:26:05 AM
>> >>> Failed to activate Storage Domain SD (Data Center DC) by
>> >>> admin@internal-authz
>> >>> Apr 10, 2017 3:38:08 PM
>> >>> Status of host cl01 was set to Up