Hi,

I will answer my own question, but if you have comments or a better solution, please leave a comment.

The ovirt-engine-setup log shows the SELECT statements that are used to test the global maintenance state. In my case

engine=# SELECT vm_guid, run_on_vds FROM vms WHERE vm_name = 'HostedEngine';
               vm_guid                |              run_on_vds
--------------------------------------+--------------------------------------
 96a6b6a7-75a9-472a-9d4f-1502b415470a | e24f0dcc-51f3-4d1a-acf5-2833a9dc584a
(1 row)

and

engine=# SELECT vds_id, ha_global_maintenance FROM vds_statistics WHERE vds_id = 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a';
                vds_id                | ha_global_maintenance
--------------------------------------+-----------------------
 e24f0dcc-51f3-4d1a-acf5-2833a9dc584a | f
(1 row)

Because I believe global maintenance really is enabled, I updated the ha_global_maintenance state with

engine=# UPDATE vds_statistics SET ha_global_maintenance = true WHERE vds_id = 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a';
UPDATE 1

After that I ran

engine-setup --offline

and chose Yes at the prompt: Renew certificates? (Yes, No) [No]: Yes

After that all hosts came up and the VMs were recovered (except the VMs on the failed and restarted host).
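
For what it's worth, the two database steps above could probably be collapsed into one statement, so the host UUID does not have to be copied by hand. This is only a minimal sketch: the function name he_maintenance_sql is made up for illustration, and how you connect to the engine database (psql invocation, user) is an assumption that depends on your setup. The function only prints the SQL; nothing is executed.

```shell
#!/bin/sh
# he_maintenance_sql [VM_NAME]
# Hypothetical helper: prints an UPDATE that resolves the host running the
# hosted-engine VM via a subquery instead of a hand-copied UUID.
# Table and column names are the ones seen in the ovirt-engine-setup log.
he_maintenance_sql() {
    vm_name=${1:-HostedEngine}
    cat <<EOF
UPDATE vds_statistics
   SET ha_global_maintenance = true
 WHERE vds_id = (SELECT run_on_vds FROM vms WHERE vm_name = '${vm_name}');
EOF
}

# Print the statement for review.
he_maintenance_sql HostedEngine

# To apply it, pipe it into psql. The exact connection is an assumption:
#   he_maintenance_sql | psql engine
```

If the subquery matches more than one row, PostgreSQL rejects the statement instead of updating several hosts, and if it matches none, nothing is updated. I would only apply it when hosted-engine --vm-status really reports global maintenance, as it did here.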

Cheers,

Jiri


On 5/2/22 11:16, Jiří Sléžka wrote:
Hello,

I am stuck in this situation...

It looks like the engine certificate (engine.cer) expired a few days ago

[root@ovirt ~]# openssl x509 -in /etc/pki/ovirt-engine/certs/engine.cer -noout -dates
notBefore=Mar 23 21:34:19 2021 GMT
notAfter=Apr 26 21:34:19 2022 GMT

The CA and other certs are still valid
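
A check like the one above can be scripted so that several certificates are compared against the current time in one go. The following is a rough sketch, not an official oVirt tool: cert_status is a made-up helper name, and it assumes GNU date, which can parse the date format that openssl prints. The commented loop assumes the other certificates live next to engine.cer.

```shell
#!/bin/sh
# cert_status NOT_AFTER [NOW_EPOCH]
# Hypothetical helper: prints "expired" or "valid" for an openssl-style
# notAfter date. NOW_EPOCH defaults to the current time.
# Assumes GNU date (date -d) is available.
cert_status() {
    end=$(date -u -d "$1" +%s) || return 2
    now=${2:-$(date -u +%s)}
    if [ "$end" -le "$now" ]; then
        echo expired
    else
        echo valid
    fi
}

# The engine.cer end date from the output above.
cert_status "Apr 26 21:34:19 2022 GMT"

# Sketch of a loop over real files (directory contents are an assumption):
# for f in /etc/pki/ovirt-engine/certs/*.cer; do
#     d=$(openssl x509 -in "$f" -noout -enddate | cut -d= -f2)
#     printf '%s: %s\n' "$f" "$(cert_status "$d")"
# done
```

openssl x509 also has a -checkend option that reports through its exit status whether a certificate expires within a given number of seconds, which may be simpler if only a yes/no answer is needed.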

Yesterday I had an outage on one host and the HE restarted on another host. But it cannot communicate with any of the hosts due to the certificate expiration

lnav /var/log/ovirt-engine/engine.log

...
2022-05-02 11:02:29,127+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-43) [] Unable to RefreshCapabilities: VDSNetworkException: VDSGenericException: VDSNetworkException: Received fatal alert: certificate_expired
...

There are VMs still running on the hosts.

Is there a way to (manually?) renew the engine cert and recover from this situation?

I have tried running engine-setup (and selecting certificate renewal during the setup)

[root@ovirt ~]# engine-setup --offline

but it fails with

[ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.

[ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set.

But global maintenance is enabled on the host...

[root@ovirt06 ~]# hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host ovirt05.net.slu.cz (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 38627
Score                              : 3400
Engine status                      : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
Hostname                           : ovirt05.net.slu.cz
Local maintenance                  : False
stopped                            : False
crc32                              : b719664d
conf_on_shared_storage             : True
local_conf_timestamp               : 38627
Status up-to-date                  : True
Extra metadata (valid at timestamp):
     metadata_parse_version=1
     metadata_feature_version=1
     timestamp=38627 (Mon May  2 10:55:43 2022)
     host-id=1
     score=3400
     vm_conf_refresh_time=38627 (Mon May  2 10:55:43 2022)
     conf_on_shared_storage=True
     maintenance=False
     state=EngineDown
     stopped=False

--== Host ovirt06.net.slu.cz (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 8858161
Score                              : 3400
Engine status                      : {"vm": "up", "health": "good", "detail": "Up"}
Hostname                           : ovirt06.net.slu.cz
Local maintenance                  : False
stopped                            : False
crc32                              : 414a980b
conf_on_shared_storage             : True
local_conf_timestamp               : 8858161
Status up-to-date                  : True
Extra metadata (valid at timestamp):
     metadata_parse_version=1
     metadata_feature_version=1
     timestamp=8858161 (Mon May  2 10:55:48 2022)
     host-id=2
     score=3400
     vm_conf_refresh_time=8858161 (Mon May  2 10:55:48 2022)
     conf_on_shared_storage=True
     maintenance=False
     state=GlobalMaintenance
     stopped=False

!! Cluster is in GLOBAL MAINTENANCE mode !!
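
To compare what each host's HA agent reports without reading the whole dump, the output above can be reduced to one line per host. A small sketch, assuming the output keeps the layout shown here (a Hostname line followed later by a state= line in each host block); he_states is a made-up helper name.

```shell
#!/bin/sh
# he_states: reads `hosted-engine --vm-status` text on stdin and prints
# "<hostname> <state>" for every host block.
# Hypothetical helper; it relies on the layout shown in the output above.
he_states() {
    awk '
        /^Hostname/     { sub(/^[^:]*:[ \t]*/, ""); host = $0 }
        /^[ \t]*state=/ { sub(/^[ \t]*state=/, ""); print host, $0 }
    '
}

# Example using lines taken from the output above.
he_states <<'EOF'
Hostname                           : ovirt05.net.slu.cz
     state=EngineDown
Hostname                           : ovirt06.net.slu.cz
     state=GlobalMaintenance
EOF

# On a host it would be fed from the real command:
#   hosted-engine --vm-status | he_states
```

For the status above this prints ovirt05.net.slu.cz with EngineDown and ovirt06.net.slu.cz with GlobalMaintenance.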

The relevant lines from the ovirt-engine-setup log are

...
2022-05-02 11:08:02,194+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:239 Creating own connection
2022-05-02 11:08:02,233+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:284 Result: [{'vm_guid': '96a6b6a7-75a9-472a-9d4f-1502b415470a', 'run_on_vds': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a'}]
2022-05-02 11:08:02,234+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:234 Database: 'None', Statement: '
                         SELECT vds_id, ha_global_maintenance
                         FROM vds_statistics
                         WHERE vds_id = %(VdsId)s;
                    ', args: {'VdsId': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a'}
2022-05-02 11:08:02,234+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:239 Creating own connection
2022-05-02 11:08:02,250+0200 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:284 Result: [{'vds_id': 'e24f0dcc-51f3-4d1a-acf5-2833a9dc584a', 'ha_global_maintenance': False}]
2022-05-02 11:08:02,250+0200 ERROR otopi.plugins.ovirt_engine_common.ovirt_engine.system.he he._validate:114 It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
...

Thanks in advance for any advice,

Jiri

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/KJOWRWXM2EAZEFJ7PLBXZ3JCLCQCFMTI/


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/D5P3NQ2JYDTOWFP4JYPZRVPDSS7VMGY3/
