On Thursday, 3 September 2020 22:49:17 CEST Gillingham, Eric J (US 393D) via Users
wrote:
> I recently removed a host from my cluster to upgrade it to 4.4. After I
> removed the host from the datacenter, VMs started to pause on the second
> system, which they had all migrated to. Investigating via the engine showed
> the storage domain as "unknown"; when I try to activate it via the engine,
> it cycles to locked and then to unknown again.
 
> /var/log/sanlock.log contains a repeating:
> add_lockspace e1270474-108c-4cae-83d6-51698cffebbf:1:/dev/e1270474-108c-4cae-83d6-51698cffebbf/ids:0
> conflicts with name of list1 s1
> e1270474-108c-4cae-83d6-51698cffebbf:3:/dev/e1270474-108c-4cae-83d6-51698cffebbf/ids:0

How did you remove the first host? Did you put it into maintenance first? I
wonder how this situation (two lockspaces with conflicting names) can occur.

You can try to re-initialize the lockspace directly using the sanlock command
(see man sanlock), but it would be good to understand the situation first.
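
To help read the sanlock.log message quoted above, here is a minimal Python
sketch that splits the two lockspace strings and reports which fields differ.
It assumes both strings follow sanlock's name:host_id:path:offset lockspace
format; the names Lockspace, parse_lockspace and differing_fields are
hypothetical helpers written for this illustration, not part of sanlock or
vdsm.

```python
from typing import List, NamedTuple


class Lockspace(NamedTuple):
    name: str      # lockspace name (here, the storage domain UUID)
    host_id: int   # host id used to join the lockspace
    path: str      # device or file that holds the lockspace
    offset: int    # offset within that path


def parse_lockspace(spec: str) -> Lockspace:
    """Split an assumed 'name:host_id:path:offset' string into fields."""
    name, host_id, rest = spec.split(":", 2)
    path, offset = rest.rsplit(":", 1)
    return Lockspace(name, int(host_id), path, int(offset))


def differing_fields(a: Lockspace, b: Lockspace) -> List[str]:
    """Return the names of the fields whose values differ."""
    return [f for f in Lockspace._fields if getattr(a, f) != getattr(b, f)]


# The two strings from the quoted sanlock.log message.
SD = "e1270474-108c-4cae-83d6-51698cffebbf"
requested = parse_lockspace(f"{SD}:1:/dev/{SD}/ids:0")
existing = parse_lockspace(f"{SD}:3:/dev/{SD}/ids:0")

print(differing_fields(requested, existing))
```

Only the host id differs (1 in the request, 3 in the existing entry), which
may mean the host is already joined to this lockspace with host id 3 while
vdsm asks for host id 1. That is one reading of the log, not a confirmed
diagnosis.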


> 
> vdsm.log contains these (maybe related) snippets:
> ---
> 2020-09-03 20:19:53,483+0000 INFO  (jsonrpc/6) [vdsm.api] FINISH getAllTasksStatuses error=Secured object is not in safe state from=::ffff:137.79.52.43,36326, flow_id=18031a91, task_id=8e92f059-743a-48c8-aa9d-e7c4c836337b (api:52)
> 2020-09-03 20:19:53,483+0000 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='8e92f059-743a-48c8-aa9d-e7c4c836337b') Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>     return fn(*args, **kargs)
>   File "<string>", line 2, in getAllTasksStatuses
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
>     ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2201, in getAllTasksStatuses
>     allTasksStatus = self._pool.getAllTasksStatuses()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
>     raise SecureError("Secured object is not in safe state")
> SecureError: Secured object is not in safe state
> 2020-09-03 20:19:53,483+0000 INFO  (jsonrpc/6) [storage.TaskManager.Task] (Task='8e92f059-743a-48c8-aa9d-e7c4c836337b') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1181)
> 2020-09-03 20:19:53,483+0000 ERROR (jsonrpc/6) [storage.Dispatcher] FINISH getAllTasksStatuses error=Secured object is not in safe state (dispatcher:87)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 74, in wrapper
>     result = ctask.prepare(func, *args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
>     return m(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
>     raise self.error
> SecureError: Secured object is not in safe state
> ---
> 2020-09-03 20:44:23,252+0000 INFO  (tasks/2) [storage.ThreadPool.WorkerThread] START task 76415a77-9d29-4b72-ade1-53207cfc503b (cmd=<bound method Task.commit of <vdsm.storage.task.Task instance at 0x7fe99c27dea8>>, args=None) (threadPool:208)
> 2020-09-03 20:44:23,266+0000 INFO  (tasks/2) [storage.SANLock] Acquiring host id for domain e1270474-108c-4cae-83d6-51698cffebbf (id=1, wait=True) (clusterlock:313)
> 2020-09-03 20:44:23,267+0000 ERROR (tasks/2) [storage.TaskManager.Task] (Task='76415a77-9d29-4b72-ade1-53207cfc503b') Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>     return fn(*args, **kargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
>     return self.cmd(*self.argslist, **self.argsdict)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 317, in startSpm
>     self.masterDomain.acquireHostId(self.id)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 957, in acquireHostId
>     self._manifest.acquireHostId(hostId, wait)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 501, in acquireHostId
>     self._domainLock.acquireHostId(hostId, wait)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 344, in acquireHostId
>     raise se.AcquireHostIdFailure(self._sdUUID, e)
> AcquireHostIdFailure: Cannot acquire host id: ('e1270474-108c-4cae-83d6-51698cffebbf', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
> ---
> 
> Another symptom: in the hosts view of the engine, SPM bounces between
> "Normal" and "Contending". When it's Normal and I select Management ->
> Select as SPM, I get "Error while executing action: Cannot force select SPM.
> Unknown Data Center status."
>
> I've tried rebooting the one remaining host in the cluster to no avail;
> hosted-engine --reinitialize-lockspace also does not seem to solve the issue.
 
> 
> I'm kind of stumped as to what else to try, would appreciate any guidance on
> how to resolve this.
 
> Thank You
> 
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/FMJZV2OEKHPTSTROSPLCQ3WJUIPB6CKL/


