[ovirt-users] Re: Error when trying to change master storage domain

2021-08-01 Thread Shani Leviim
Hi Matthew,

You might need to sync back the master version and domain between the
engine and vdsm.
To verify those parameters on vdsm, run this command on the SPM host:
vdsm-client StoragePool getInfo storagepoolID="f72ec125-69a1-4c1b-a5e1-313fcb70b6ff"

The result should be something like:
"info": {
"domains": "1234:Active,5678:Active,91011:Active",
"isoprefix": "",
"lver": 6,
"master_uuid": "123",
"master_ver": 14,
"name": "No Description",
"pool_status": "connected",
"spm_id": 1,
"type": "NFS",
"version": "5"
}
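
If it helps, the fields of interest can be pulled out of that output with a few lines of Python. This is only a sketch: it assumes the command prints a JSON object that contains the "info" key shown above, and the function name parse_pool_info is made up for illustration.

```python
import json


def parse_pool_info(raw: str) -> dict:
    """Extract the master domain fields from `StoragePool getInfo` JSON output.

    Assumes the output is a JSON object with an "info" key, as in the
    sample above.
    """
    info = json.loads(raw)["info"]
    # "domains" is a comma-separated list of "<uuid>:<status>" pairs.
    domains = dict(
        item.split(":", 1) for item in info["domains"].split(",") if item
    )
    return {
        "master_uuid": info["master_uuid"],
        "master_ver": int(info["master_ver"]),
        "domains": domains,
    }


if __name__ == "__main__":
    # Sample values copied from the example output above.
    sample = """
    {"info": {"domains": "1234:Active,5678:Active,91011:Active",
              "master_uuid": "123", "master_ver": 14,
              "pool_status": "connected", "spm_id": 1}}
    """
    print(parse_pool_info(sample))
```

You could feed it the saved output of the vdsm-client command above and note down master_uuid and master_ver for the comparison with the engine.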


Then, compare the master version value with the engine:
engine=> select * from storage_pool where id =
'f72ec125-69a1-4c1b-a5e1-313fcb70b6ff';

And the master domain:
engine=> select * from storage_domains where
storage_pool_id='f72ec125-69a1-4c1b-a5e1-313fcb70b6ff'  and
storage_domain_type='0';

(0 means master; for reference, see
https://github.com/oVirt/ovirt-engine/blob/a65cf0eae8858ab2278c3f537dc427e3ff20eba7/backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/businessentities/StorageDomainType.java)

With those results we can get the bigger picture (and update the engine
data to match vdsm).
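
The comparison itself is simple enough to sketch in Python. The dictionaries below are filled in by hand from the vdsm-client output and the SQL results; the key names and the function name master_mismatches are assumptions for illustration, not engine column names. The version numbers are the ones from the warning in your first mail (DB: 283, VDSM: 280), and the UUID is the placeholder from the sample output.

```python
def master_mismatches(vdsm: dict, engine: dict) -> list:
    """Return human-readable differences between the vdsm and engine views.

    Both arguments are hand-built dicts with the hypothetical keys
    "master_uuid" and "master_ver".
    """
    problems = []
    for key in ("master_uuid", "master_ver"):
        # Compare as strings so 283 and "283" are treated as equal.
        if str(vdsm[key]) != str(engine[key]):
            problems.append(f"{key}: engine={engine[key]} vdsm={vdsm[key]}")
    return problems


if __name__ == "__main__":
    vdsm_view = {"master_uuid": "123", "master_ver": 280}
    engine_view = {"master_uuid": "123", "master_ver": 283}
    for line in master_mismatches(vdsm_view, engine_view):
        print(line)
```

An empty list means both sides agree on the master domain and its version.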


*Regards,*

*Shani Leviim*



[ovirt-users] Re: Error when trying to change master storage domain

2021-07-30 Thread Matthew Benstead
Thanks Shani - yes we plan to upgrade to 4.4 in the future, but we're on
4.3 right now due to only running CentOS 7 at the moment.

I was able to clear the job from the SPM:

[root@daccs01 ~]# vdsm-client Host getAllTasksStatuses
{
    "5fa9edf0-56c3-40e4-9327-47bf7764d28d": {
        "message": "1 jobs completed successfully",
        "code": 0,
        "taskID": "5fa9edf0-56c3-40e4-9327-47bf7764d28d",
        "taskResult": "success",
        "taskState": "finished"
    }
}
[root@daccs01 ~]# vdsm-client Task clear taskID=5fa9edf0-56c3-40e4-9327-47bf7764d28d
true
[root@daccs01 ~]# vdsm-client Host getAllTasksStatuses
{}

And confirmed there were no async_tasks:

engine=# select * from async_tasks;
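 task_id | action_type | status | result | step_id | command_id | started_at | storage_pool_id | task_type | vdsm_task_id | root_command_id | user_id
---------+-------------+--------+--------+---------+------------+------------+-----------------+-----------+--------------+-----------------+---------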
(0 rows)


However, when I put the vm-storage-ssd domain into maintenance mode,
it failed again.

Here are some of the log entries - anything else I can look at?


2021-07-29 10:30:37,848-07 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47] EVENT_ID:
VDS_BROKER_COMMAND_FAILURE(10,802),
 VDSM compute7.pcic.uvic.ca command ConnectStoragePoolVDS failed: Wrong
Master domain or its version: u'SD=a5a83df1-47e2-4927-9add-079199ca7ef8,
pool=f72ec125-69a1-4c1b-a5e1-313fcb70b6ff'
2021-07-29 10:30:37,848-07 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47] Command
'org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand'
return value 'StatusOnlyReturn
[status=Status [code=324, message=Wrong Master domain or its version:
u'SD=a5a83df1-47e2-4927-9add-079199ca7ef8,
pool=f72ec125-69a1-4c1b-a5e1-313fcb70b6ff']]'
...
2021-07-29 10:30:37,848-07 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47] HostName =
compute7.pcic.uvic.ca
2021-07-29 10:30:37,849-07 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47] Command
'ConnectStoragePoolVDSCommand(HostName = compute7.pcic.uvic.ca,
ConnectStoragePoolVDSCommandParameters:{hostId='51769733-0cf6-4270-8288-ec96474b7609',
vdsId='51769733-0cf6-4270-8288-ec96474b7609',
storagePoolId='f72ec125-69a1-4c1b-a5e1-313fcb70b6ff',
masterVersion='288'})' execution failed: IRSGenericException:
IRSErrorException: IRSNoMasterDomainException: Wrong Master domain or
its version: u'SD=a5a83df1-47e2-4927-9add-079199ca7ef8,
pool=f72ec125-69a1-4c1b-a5e1-313fcb70b6ff'
...
2021-07-29 10:30:37,849-07 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47]
IrsBroker::Failed::DeactivateStorageDomainVDS: IRSGenericException:
IRSErrorException: IRSNoMasterDomainException: Wrong Master
domain or its version: u'SD=a5a83df1-47e2-4927-9add-079199ca7ef8,
pool=f72ec125-69a1-4c1b-a5e1-313fcb70b6ff'
2021-07-29 10:30:37,855-07 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.DeactivateStorageDomainVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47] FINISH,
DeactivateStorageDomainVDSCommand, return: , log id: 1c215ca4
2021-07-29 10:30:37,855-07 ERROR
[org.ovirt.engine.core.bll.storage.domain.DeactivateStorageDomainCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-25) [35c5b47]
DeactivateStorageDomainVDS failed 'a5a83df1-47e2-4927-9add-079199ca7ef8':
org.ovirt.engine.core.common.errors.EngineException: EngineException:
org.ovirt.engine.core.vdsbroker.irsbroker.IRSNoMasterDomainException:
IRSGenericException: IRSErrorException:
 IRSNoMasterDomainException: Wrong Master domain or its version:
u'SD=a5a83df1-47e2-4927-9add-079199ca7ef8,
pool=f72ec125-69a1-4c1b-a5e1-313fcb70b6ff' (Failed with error
StoragePoolWrongMaster and code 324)
    at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]
    at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2112) [bll.jar:]
    at org.ovirt.engine.core.bll.storage.domain.DeactivateStorageDomainCommand.dectivateStorageDomain(DeactivateStorageDomainCommand.java:340) [bll.jar:]
...
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_292]
    at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_292]
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent-1.0.jar:]
Caused by:

[ovirt-users] Re: Error when trying to change master storage domain

2021-07-29 Thread Shani Leviim
Hi Matthew,
Actually, what you describe relates to two features available in oVirt
4.4.5:
1. The ability to switch the master storage domain while domains are up and
running [1]
2. Clearing the finished tasks from REST API [2] and UI [3].

We recommend you upgrade your engine to enjoy those features.

In the meantime, as you've described, you can move the Master role from one
storage domain to another by putting the current master domain into
maintenance.
To clear the finished tasks from the SPM, first list them:
   vdsm-client Host getAllTasksStatuses

The output should look something like this:
{
"1dc4d885-577a-4b6a-b01f-e682602a907c": {
"code": 0,
"message": "1 jobs completed successfully",
"taskID": "1dc4d885-577a-4b6a-b01f-e682602a907c",
"taskResult": "success",
"taskState": "finished"
}
}

Then clear each finished task:
   vdsm-client Task clear taskID=12345
Once it gets cleared, the reconstruction can be finished.

To verify there are no more finished async tasks, you can run this SQL
query on the engine:
engine=# select * from async_tasks WHERE storage_pool_id = '123';
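
If there are several finished tasks, the list-then-clear steps above can be scripted. The Python sketch below only builds the clear commands from saved getAllTasksStatuses output and prints them for review; it does not run anything. It assumes the output has the JSON shape shown above, and the function name clear_commands is hypothetical.

```python
import json


def clear_commands(raw: str) -> list:
    """Build one `vdsm-client Task clear` command per finished task.

    `raw` is the JSON printed by `vdsm-client Host getAllTasksStatuses`,
    assumed to map each task ID to a status object with a "taskState" key.
    """
    tasks = json.loads(raw)
    return [
        f"vdsm-client Task clear taskID={task_id}"
        for task_id, status in tasks.items()
        if status.get("taskState") == "finished"
    ]


if __name__ == "__main__":
    # Sample copied from the example output above.
    sample = """
    {"1dc4d885-577a-4b6a-b01f-e682602a907c": {
        "code": 0,
        "message": "1 jobs completed successfully",
        "taskID": "1dc4d885-577a-4b6a-b01f-e682602a907c",
        "taskResult": "success",
        "taskState": "finished"}}
    """
    for command in clear_commands(sample):
        print(command)
```

Printing the commands instead of executing them leaves the decision about which tasks to clear with you.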

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1910022
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1627997
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1910302


*Regards,*

*Shani Leviim*


On Thu, Jul 29, 2021 at 8:33 AM Matthew Benstead wrote:

> Hello,
>
> I'm trying to decommission the old master storage domain in ovirt, and
> replace it with a new one. All of the VMs have been migrated off of the old
> master, and everything has been running on the new storage domain for a
> couple months. But when I try to put the old domain into maintenance mode I
> get an error.
>
> Old Master: vm-storage-ssd
> New Domain: vm-storage-ssd2
>
> The error is:
>
> Failed to Reconstruct Master Domain for Data Center EDC2
>
> As well as:
>
> Sync Error on Master Domain between Host daccs01 and oVirt Engine. Domain:
> vm-storage-ssd is marked as Master in oVirt Engine database but not on the
> Storage side. Please consult with Support on how to fix this issue.
>
> 2021-07-28 11:41:34,870-07 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
> (EE-ManagedThreadFactory-engine-Thread-23) [] Master domain version is not
> in sync between DB and VDSM. Domain vm-storage-ssd
>  marked as master, but the version in DB: 283 and in VDSM: 280
>
> And:
>
> Not stopping SPM on vds daccs01, pool id
> f72ec125-69a1-4c1b-a5e1-313fcb70b6ff as there are uncleared tasks Task
> '5fa9edf0-56c3-40e4-9327-47bf7764d28d', status 'finished'
>
>
> After a couple minutes all the domains are marked as active again and
> things continue, but vm-storage-ssd is still listed as the master domain.
> Any thoughts?
>
> This is on 4.3.10.4-1.el7 on CentOS 7.
>
> engine=# SELECT storage_name, storage_pool_id, storage, status FROM
> storage_pool_with_storage_domain ORDER BY storage_name;
>      storage_name      |           storage_pool_id            |                storage                 | status
> -----------------------+--------------------------------------+----------------------------------------+--------
>  compute1-iscsi-ssd    | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | yvUESE-yWUv-VIWL-qX90-aAq7-gK0I-EqppRL |      1
>  compute7-iscsi-ssd    | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | 8ekHdv-u0RJ-B0FO-LUUK-wDWs-iaxb-sh3W3J |      1
>  export-domain-storage | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | d3932528-6844-481a-bfed-542872ace9e5   |      1
>  iso-storage           | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | f800b7a6-6a0c-4560-8476-2f294412d87d   |      1
>  vm-storage-7200rpm    | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | a0bff472-1348-4302-a5c7-f1177efa45a9   |      1
>  vm-storage-ssd        | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | 95acd9a4-a6fb-4208-80dd-1c53d6aacad0   |      1
>  vm-storage-ssd2       | f72ec125-69a1-4c1b-a5e1-313fcb70b6ff | 829d0600-c3f7-4dae-a749-d7f05c6a6ca4   |      1
> (7 rows)
>
> Thanks,
>  -Matthew
> --
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/OXOXW6B2NWXOUGZV3OKO4OMDXVDJSQLZ/
>