Hi all, I'm testing oVirt 4.2.6, with ovirt-engine installed as a service on a physical host.
I restored an engine-backup taken from a previous oVirt deployment, where ovirt-engine was self-hosted inside a VM, and everything seemed to work properly: the cluster, hosts, and network configuration were redeployed correctly.

The problem appeared after I put all the nodes in the new cluster into Maintenance to upgrade them. When they came back to the Up state, the data center was in Non Responsive status because the Master Storage Domain is Inactive. If I try to reactivate it, I get this error message:

Failed Activating Storage Domain gpfs_kvm on Data Center BSC-CNS

No host in the cluster currently holds the SPM role. Given this situation, I have some questions:

- Is it possible to force one host to become the SPM?
- Was it a mistake to put all the nodes into Maintenance at once for the upgrade? Is it better to keep one host active as SPM to protect the cluster and the storage?
- Could restoring the configuration from a different architecture (hosted-engine vs. engine installed on a host) affect the storage?

(Previously the Master Storage Domain was on NFS, but I deleted it and moved the Master role to a POSIX domain that I had added.)

In the logs on one of the compute nodes, I can see these events:

2018-09-19 11:40:11,878+0200 INFO (jsonrpc/5) [vdsm.api] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=32168096-b763-11e8-a7aa-000af7b8b6ba, msdUUID=cb99c414-0d93-41b9-9396-d8a607652b49' from=::ffff:10.2.1.101,58538, task_id=eedeba4d-99bc-4e18-8f24-f501b73e0da6 (api:50)
2018-09-19 11:40:11,879+0200 ERROR (jsonrpc/5) [storage.TaskManager.Task] (Task='eedeba4d-99bc-4e18-8f24-f501b73e0da6') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in connectStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1035, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1097, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 700, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1274, in __rebuild
    self.setMasterDomain(msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1495, in setMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: u'spUUID=32168096-b763-11e8-a7aa-000af7b8b6ba, msdUUID=cb99c414-0d93-41b9-9396-d8a607652b49'
2018-09-19 11:40:11,879+0200 INFO (jsonrpc/5) [storage.TaskManager.Task] (Task='eedeba4d-99bc-4e18-8f24-f501b73e0da6') aborting: Task is aborted: "Cannot find master domain: u'spUUID=32168096-b763-11e8-a7aa-000af7b8b6ba, msdUUID=cb99c414-0d93-41b9-9396-d8a607652b49'" - code 304 (task:1181)
2018-09-19 11:40:11,879+0200 ERROR (jsonrpc/5) [storage.Dispatcher] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=32168096-b763-11e8-a7aa-000af7b8b6ba, msdUUID=cb99c414-0d93-41b9-9396-d8a607652b49' (dispatcher:82)
2018-09-19 11:40:11,880+0200 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call StoragePool.connect failed (error 304) in 0.31 seconds (__init__:573)
2018-09-19 11:40:11,907+0200 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID=u'32168096-b763-11e8-a7aa-000af7b8b6ba', options=None) from=::ffff:10.2.1.101,58538, task_id=4133fa8c-88f8-44c5-a8b8-18ac6c67771e (api:46)
2018-09-19 11:40:11,907+0200 INFO (jsonrpc/1) [vdsm.api] FINISH getSpmStatus error=Unknown pool id, pool not connected: (u'32168096-b763-11e8-a7aa-000af7b8b6ba',) from=::ffff:10.2.1.101,58538, task_id=4133fa8c-88f8-44c5-a8b8-18ac6c67771e (api:50)
2018-09-19 11:40:11,908+0200 ERROR (jsonrpc/1) [storage.TaskManager.Task] (Task='4133fa8c-88f8-44c5-a8b8-18ac6c67771e') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getSpmStatus
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 634, in getSpmStatus
    pool = self.getPool(spUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 350, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: (u'32168096-b763-11e8-a7aa-000af7b8b6ba',)
2018-09-19 11:40:11,908+0200 INFO (jsonrpc/1) [storage.TaskManager.Task] (Task='4133fa8c-88f8-44c5-a8b8-18ac6c67771e') aborting: Task is aborted: "Unknown pool id, pool not connected: (u'32168096-b763-11e8-a7aa-000af7b8b6ba',)" - code 309 (task:1181)
2018-09-19 11:40:11,908+0200 ERROR (jsonrpc/1) [storage.Dispatcher] FINISH getSpmStatus error=Unknown pool id, pool not connected: (u'32168096-b763-11e8-a7aa-000af7b8b6ba',) (dispatcher:82)
2018-09-19 11:40:11,908+0200 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call StoragePool.getSpmStatus failed (error 309) in 0.01 seconds (__init__:573)

I compared my situation with cases that other people have reported, but none of them matches my logged errors exactly. Could you suggest anything? The infrastructure is still in testing, so I can redeploy it if necessary.

Thanks in advance.

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/MSGSOEEOI4WIYZFMZ5HZOX6OEXNXNS5W/