[ovirt-users] Re: Unable to remove host from ovirt engine
That host does have gluster bricks, and they are all healthy. I noticed that under Compute -> Hosts -> Edit -> Hosted Engine there is an option to Undeploy; maybe that is what I need? Any idea what actions that takes?

> On Apr 30, 2022, at 15:26, Strahil Nikolov wrote:
>
> You need to first replace the host on gluster level (it looks like it has/had a
> brick) before ovirt allows you to remove the host.
>
> Can you check if the gluster volumes have a brick from that host ?
>
> Best Regards,
> Strahil Nikolov

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/UV6AY5QUN7RNZ6BUWDHCUAAKJENETVAC/
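Checking whether any gluster volume still has a brick on the host in question can be scripted. The following is a minimal sketch, assuming the text being parsed has the usual `gluster volume info` layout with `BrickN: host:/path` lines. The volume name and brick paths in the sample are invented for illustration and do not come from this thread, and the helper name `bricks_by_host` is hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample in the usual `gluster volume info` layout. The volume
# name and brick paths are made up; only the host names follow the thread.
SAMPLE = """\
Volume Name: data
Type: Replicate
Status: Started
Bricks:
Brick1: ovirt-1.x.com:/gluster_bricks/data/data
Brick2: ovirt-2.x.com:/gluster_bricks/data/data
Brick3: ovirt-3.x.com:/gluster_bricks/data/data (arbiter)
"""

# Matches "Brick1: host:/path"; the bare "Bricks:" heading does not match.
BRICK_RE = re.compile(r"^Brick\d+:\s+(?P<host>[^:\s]+):(?P<path>\S+)")


def bricks_by_host(volume_info):
    """Map each host name to a list of (volume, brick path) pairs."""
    result = defaultdict(list)
    volume = None
    for raw in volume_info.splitlines():
        line = raw.strip()
        if line.startswith("Volume Name:"):
            volume = line.split(":", 1)[1].strip()
            continue
        match = BRICK_RE.match(line)
        if match:
            result[match.group("host")].append((volume, match.group("path")))
    return dict(result)


if __name__ == "__main__":
    for volume, path in bricks_by_host(SAMPLE).get("ovirt-1.x.com", []):
        print("ovirt-1 still serves a brick of {}: {}".format(volume, path))
```

In practice the text would come from running `gluster volume info` on one of the healthy nodes. Hosts addressed by IPv6 literals are not handled by this sketch.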
[ovirt-users] Re: Unable to remove host from ovirt engine
Using GlusterFS for storage.

> On Apr 30, 2022, at 06:55, Strahil Nikolov via Users wrote:
>
> *storage
>
> On Sat, Apr 30, 2022 at 13:50, Strahil Nikolov via Users wrote:
[ovirt-users] Re: Unable to remove host from ovirt engine
Yes, that is how I am getting the message saying it cannot confirm host has been rebooted because it isn't in a valid state.

> On Apr 29, 2022, at 13:59, Strahil Nikolov wrote:
>
> See my last email... Use the 3 dots menu
>
> On Fri, Apr 29, 2022 at 14:09, Joseph Gelinas wrote:
[ovirt-users] Re: Unable to remove host from ovirt engine
Unfortunately, I can't do that either.

> On Apr 28, 2022, at 15:26, Strahil Nikolov via Users wrote:
>
> Have you tried setting it to maintenance ?
>
> Best Regards,
> Strahil Nikolov
>
> On Thu, Apr 28, 2022 at 21:46, Joseph Gelinas wrote:
[ovirt-users] Re: Unable to remove host from ovirt engine
That did remove the ovirt-1 host from `hosted-engine --vm-status` on ovirt-3; however, it still appears in the web interface as an Unassigned host after restarting ovirt-engine. If I rerun `hosted-engine --clean-metadata --host-id=1` on ovirt-2 or ovirt-3 I get a message about an unclean metadata block, but perhaps that is expected given that the host no longer exists in the vm-status output?

[root@ovirt-2 ~]# hosted-engine --clean-metadata --host-id=1
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha agent 2.4.5 started
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Certificate common name not found, using hostname to identify host
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Initializing ha-broker connection
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Broker initialized, all submonitors started
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Malformed metadata for host 1: received 0 of 512 expected bytes
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Cannot clean unclean metadata block. Consider --force-clean.
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down

> On Apr 25, 2022, at 16:39, Strahil Nikolov via Users wrote:
>
> I think you can use 'hosted-engine --clean-metadata --host-id=1'
>
> In my case I had to use --force-cleanup, but I wouldn't recommend using it.
>
> Best Regards,
> Strahil Nikolov
[ovirt-users] Unable to remove host from ovirt engine
Recently our host and ovirt engine certificates expired, and with some ideas from Strahil we were able to get 2 of the 3 ovirt hosts updated with usable certificates and move all of our VMs to those two nodes.

https://lists.ovirt.org/archives/list/users@ovirt.org/thread/QCFPKQ3OKPOUV266MFJUMVTNG2OHLJVW/

Not having any luck with the last host, we figured we'd just try to remove it from ovirt engine and re-add it. While it seems `hosted-engine --vm-status` on one node no longer shows the removed host, the other good host and the web interface still show ovirt-1 in the mix. What is the best way to remove a NonResponsive host from ovirt and re-add it?

[root@ovirt-1 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

[root@ovirt-2 ~]# hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 12515451
Score                              : 3274
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 9cf92792
conf_on_shared_storage             : True
local_conf_timestamp               : 12515451
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12515451 (Mon Apr 25 14:08:51 2022)
    host-id=2
    score=3274
    vm_conf_refresh_time=12515451 (Mon Apr 25 14:08:51 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

--== Host ovirt-2.x.com (id: 3) status ==--

Host ID                            : 3
Host timestamp                     : 12513269
Score                              : 3400
Engine status                      : {"vm": "up", "health": "good", "detail": "Up"}
Hostname                           : ovirt-2.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 4a89d706
conf_on_shared_storage             : True
local_conf_timestamp               : 12513269
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12513269 (Mon Apr 25 14:09:00 2022)
    host-id=3
    score=3400
    vm_conf_refresh_time=12513269 (Mon Apr 25 14:09:00 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

!! Cluster is in GLOBAL MAINTENANCE mode !!

[root@ovirt-3 ~]# hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host ovirt-1.x.com (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 6750990
Score                              : 0
Engine status                      : unknown stale-data
Hostname                           : ovirt-1.x.com
Local maintenance                  : False
stopped                            : True
crc32                              : 5290657b
conf_on_shared_storage             : True
local_conf_timestamp               : 6750950
Status up-to-date                  : False
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=6750990 (Thu Feb 17 22:17:53 2022)
    host-id=1
    score=0
    vm_conf_refresh_time=6750950 (Thu Feb 17 22:17:12 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=AgentStopped
    stopped=True

--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 12515501
Score                              : 3279
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 0845cd93
conf_on_shared_storage             : True
local_conf_timestamp               : 12515501
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12515501 (Mon Apr 25 14:09:42 2022)
    host-id=2
    score=3279
    vm_conf_refresh_time=12515501 (Mon Apr 25 14:09:42 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

--== Host ovirt-2
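Comparing what each node reports is easier once the status text is turned into data. Below is a minimal sketch, assuming the line-per-field layout that `hosted-engine --vm-status` normally prints (one `Key : value` pair per line under a `--== Host ... ==--` header). The sample is an abridged excerpt of the ovirt-3 output above, and the function names are hypothetical.

```python
import re

# Abridged excerpt of the ovirt-3 output quoted in this message.
SAMPLE = """\
!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host ovirt-1.x.com (id: 1) status ==--

Host ID                            : 1
Score                              : 0
Engine status                      : unknown stale-data
Hostname                           : ovirt-1.x.com
Status up-to-date                  : False
Extra metadata (valid at timestamp):
    timestamp=6750990 (Thu Feb 17 22:17:53 2022)
    state=AgentStopped

--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Score                              : 3279
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
Status up-to-date                  : True
"""

HEADER_RE = re.compile(r"^--== Host (?P<name>\S+) \(id: (?P<id>\d+)\) status ==--$")


def parse_vm_status(text):
    """Return {host_id: {field: value}} for each host section in the text."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {"Hostname": header.group("name")}
            hosts[int(header.group("id"))] = current
            continue
        if current is None:
            continue
        key, sep, value = line.partition(":")
        # Skip the key=value "extra metadata" lines, whose timestamps
        # also contain colons.
        if sep and "=" not in key:
            current[key.strip()] = value.strip()
    return hosts


def stale_hosts(hosts):
    """Host IDs whose 'Status up-to-date' field is anything but True."""
    return sorted(hid for hid, fields in hosts.items()
                  if fields.get("Status up-to-date") != "True")


if __name__ == "__main__":
    hosts = parse_vm_status(SAMPLE)
    for hid in stale_hosts(hosts):
        print("host id {} ({}) has stale data".format(hid, hosts[hid]["Hostname"]))
```

Running the parser on the output from each node and comparing the sets of host IDs would show which nodes still carry a metadata entry for ovirt-1.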
[ovirt-users] Re: Certificate expiration
No. I don't have any of the options under Installation.

> On Feb 20, 2022, at 07:52, Strahil Nikolov via Users wrote:
>
> Do you have the option to use 'Install' -> enroll certificate (or whatever is
> the entry in UI ) ?
>
> Best Regards,
> Strahil Nikolov
[ovirt-users] Re: Certificate expiration
Is there a way to do so without the web frontend? I don't have the option to migrate it.

> On Feb 20, 2022, at 07:56, Strahil Nikolov via Users wrote:
>
> Did you manage to move the engine VM to the only node that's in global
> maintenance ?
>
> Best Regards,
> Strahil Nikolov
[ovirt-users] Re: Certificate expiration
Right, I don't have those options because the hosts are listed as unassigned. I can't migrate the engine, and I can't put anything into maintenance so that the installation menu becomes available.

> On Feb 20, 2022, at 07:52, Strahil Nikolov wrote:
>
> Do you have the option to use 'Install' -> enroll certificate (or whatever is
> the entry in UI ) ?
>
> Best Regards,
> Strahil Nikolov
[ovirt-users] Re: Certificate expiration
Both, I guess. The host certificates expired on the 15th; the console expires on the 23rd. Right now, since the engine sees the hosts as unassigned, I don't get the option to set hosts to maintenance mode, and if I try Enable Global Maintenance I get the message: "Cannot edit VM Cluster. Operation can be performed only when Host status is Up."

> On Feb 19, 2022, at 14:55, Strahil Nikolov wrote:
>
> Is your issue with the host certificates or the engine ?
>
> You can try to set a node in maintenance (or at least try that) and then try
> to reenroll the certificate from the UI.
>
> Best Regards,
> Strahil Nikolov
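To see how close each certificate is to its expiry date, the `notAfter` line printed by `openssl x509 -noout -enddate -in <certificate>` can be converted into a day count. This is a minimal sketch using only the Python standard library. Only the days of the month (the 15th and the 23rd) come from this thread; the month is inferred from the message dates, and the times of day and the reference "now" are assumptions made for the example.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days until a certificate's notAfter time; negative once expired.

    `not_after` is the date string openssl prints, with or without the
    leading "notAfter=", for example "Feb 15 00:00:00 2022 GMT".
    """
    prefix = "notAfter="
    if not_after.startswith(prefix):
        not_after = not_after[len(prefix):]
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


if __name__ == "__main__":
    # Assumed values: midnight GMT expiry times and a fixed reference date.
    now = datetime(2022, 2, 20, tzinfo=timezone.utc)
    samples = [
        ("host", "notAfter=Feb 15 00:00:00 2022 GMT"),
        ("console", "notAfter=Feb 23 00:00:00 2022 GMT"),
    ]
    for label, not_after in samples:
        print(label, round(days_until_expiry(not_after, now), 1))
```

A negative result means the certificate has already expired, which matches the host certificates described above.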
[ovirt-users] Re: Certificate expiration
I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an option to reenroll that way. When it prompted and asked whether that was really what I wanted to do, I pressed ctrl-D or said no, but it ran something anyway, so I pressed ctrl-C to get out of it; maybe that is what messed up vdsm on that node. I'm not sure about ovirt-3. Is there a way to fix that?

> On Feb 18, 2022, at 17:21, Joseph Gelinas wrote:
>
> Unfortunately ovirt-ha-broker & ovirt-ha-agent are just in continual restart
> loops on ovirt-1 & ovirt-3 (ovirt-engine is currently on ovirt-3).
>
> The output for broker.log:
>
> MainThread::ERROR::2022-02-18 22:08:58,101::broker::72::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Trying to restart the broker
> MainThread::INFO::2022-02-18 22:08:58,453::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.4.5 started
> MainThread::INFO::2022-02-18 22:09:00,456::monitor::45::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> MainThread::INFO::2022-02-18 22:09:00,456::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> MainThread::INFO::2022-02-18 22:09:00,457::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
> MainThread::INFO::2022-02-18 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> MainThread::INFO::2022-02-18 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> MainThread::INFO::2022-02-18 22:09:00,460::monitor::63::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
> MainThread::WARNING::2022-02-18 22:10:00,788::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Couldn't connect to VDSM within 60 seconds
> MainThread::ERROR::2022-02-18 22:10:00,788::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: Couldn't connect to VDSM within 60 seconds
> MainThread::ERROR::2022-02-18 22:10:00,789::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 64, in run
>     self._storage_broker_instance = self._get_storage_broker()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 143, in _get_storage_broker
>     return storage_broker.StorageBroker()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 97, in __init__
>     self._backend.connect()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 370, in connect
>     connection = util.connect_vdsm_json_rpc(logger=self._logger)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 472, in connect_vdsm_json_rpc
>     __vdsm_json_rpc_connect(logger, timeout)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 415, in __vdsm_json_rpc_connect
>     timeout=VDSM_MAX_RETRY * VDSM_DELAY
> RuntimeError: Couldn't connect to VDSM within 60 seconds
>
> vdsm.log:
>
> 2022-02-18 22:14:43,939+ INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:726)
> 2022-02-18 22:14:44,071+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48832 (protocoldetector:61)
> 2022-02-18 22:14:44,074+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269)
> 2022-02-18 22:14:44,442+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48836 (pro
[ovirt-users] Re: Certificate expiration
AcceptorImpl] Accepted connection from ::1:48840 (protocoldetector:61) 2022-02-18 22:14:45,449+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:46,082+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48842 (protocoldetector:61) 2022-02-18 22:14:46,084+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:46,452+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48844 (protocoldetector:61) 2022-02-18 22:14:46,455+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:47,087+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48846 (protocoldetector:61) 2022-02-18 22:14:47,089+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:47,457+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48848 (protocoldetector:61) 2022-02-18 22:14:47,459+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:48,092+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48850 (protocoldetector:61) 2022-02-18 22:14:48,094+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:48,461+ INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48852 (protocoldetector:61) 2022-02-18 22:14:48,464+ ERROR (Reactor thread) [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 (sslutils:269) 2022-02-18 22:14:48,941+ INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) 
from=internal, task_id=75ef5d5f-c56b-4595-95c8-3dc64caa3a83 (api:48) 2022-02-18 22:14:48,942+ INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=75ef5d5f-c56b-4595-95c8-3dc64caa3a83 (api:54) > On Feb 18, 2022, at 16:35, Strahil Nikolov via Users wrote: > > ovirt-2 is 'state=GlobalMaintenance' , but the other 2 nodes is uknown. > Try to start ovirt-ha-broker & ovirt-ha-agent > > Also, you may try to move the hosted-engine to ovirt-2 and try again > > > Best Regards, > Strahil Nikolov > > On Fri, Feb 18, 2022 at 21:48, Joseph Gelinas > wrote: > I may be in maintenance mode, I did try to set it in the beginning of this, > but engine-setup doesn't see it. At this point my nodes say they can't > connect to the HA daemon, or have stale data. > > [root@ovirt-1 ~]# hosted-engine --set-maintenance --mode=global > Cannot connect to the HA daemon, please check the logs. > > [root@ovirt-3 ~]# hosted-engine --set-maintenance --mode=global > Cannot connect to the HA daemon, please check the logs. > > [root@ovirt-2 ~]# hosted-engine --set-maintenance --mode=global > [root@ovirt-2 ~]# hosted-engine --vm-status > > > !! Cluster is in GLOBAL MAINTENANCE mode !! 
> > > > --== Host ovirt-1.xx.com (id: 1) status ==-- > > Host ID: 1 > Host timestamp: 6750990 > Score : 0 > Engine status : unknown stale-data > Hostname : ovirt-1.xx.com > Local maintenance : False > stopped: True > crc32 : 5290657b > conf_on_shared_storage: True > local_conf_timestamp : 6750950 > Status up-to-date : False > Extra metadata (valid at timestamp): > metadata_parse_version=1 > metadata_feature_version=1 > timestamp=6750990 (Thu Feb 17 22:17:53 2022) > host-id=1 > score=0 > vm_conf_refresh_time=6750950 (Thu Feb 17 22:17:12 2022) > conf_on_shared_storage=True > maintenance=False > state=AgentStopped > stopped=True > > > --== Host ovirt-3.xx.com (id: 2) status ==-- > > Host ID: 2 > Host timestamp: 6731526 > Score : 0 > Engine status : unknown stale-data > Hostname : ovirt-3.xx.com > Local maintenance : False > stopped: True > crc32 : 12c6b5c9 > conf_on_shared_storage: True > loca
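The vdsm.log excerpt above shows every local connection from ::1 being rejected with `ssl handshake: SSLError`, which fits the broker's "Couldn't connect to VDSM within 60 seconds". A minimal sketch for counting those rejections in a log follows; the function name is made up for illustration, and the log path in the usage note is the usual VDSM location, assumed rather than taken from the thread.

```shell
# Hypothetical helper: reads vdsm.log text on stdin and prints how many
# lines report a failed TLS handshake.
count_ssl_handshake_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status clean in that case.
    grep -c 'ssl handshake: SSLError' || true
}
```

Possible usage on a host: `count_ssl_handshake_errors < /var/log/vdsm/vdsm.log`. A count that keeps growing while the HA broker restarts suggests the local client and vdsm still disagree about certificates.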
[ovirt-users] Re: Certificate expiration
usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 472, in connect_vdsm_json_rpc
    __vdsm_json_rpc_connect(logger, timeout)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 415, in __vdsm_json_rpc_connect
    timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't connect to VDSM within 60 seconds

ovirt-2's ovirt-hosted-engine-ha/agent.log has entries detecting global maintenance, though `systemctl status ovirt-ha-agent` shows Python exception errors from yesterday.

MainThread::INFO::2022-02-18 19:39:10,452::state_decorators::51::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Global maintenance detected
MainThread::INFO::2022-02-18 19:39:10,524::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state GlobalMaintenance (score: 3400)

Feb 17 18:49:12 ovirt-2.us1.vricon.com python3[1324125]: detected unhandled Python exception in '/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py'

> On Feb 18, 2022, at 14:20, Strahil Nikolov wrote:
>
> To set the engine into maintenance mode you can ssh to any Hypervisor and run:
> 'hosted-engine --set-maintenance --mode=global'
> wait 1 minute and run 'hosted-engine --vm-status' to validate.
>
> Best Regards,
> Strahil Nikolov
>
> On Fri, Feb 18, 2022 at 19:03, Joseph Gelinas wrote:
> Hi,
>
> The certificates on our oVirt stack recently expired, while all the VMs are still up, I can't put the cluster into global maintenance via ovirt-engine, or do anything via ovirt-engine for that matter. Just get event logs about cert validity.
> > VDSM ovirt-1.x.com command Get Host Capabilities failed: PKIX path > validation failed: java.security.cert.CertPathValidatorException: validity > check failed > VDSM ovirt-2.x.com command Get Host Capabilities failed: PKIX path > validation failed: java.security.cert.CertPathValidatorException: validity > check failed > VDSM ovirt-3.x.com command Get Host Capabilities failed: PKIX path > validation failed: java.security.cert.CertPathValidatorException: validity > check failed > > Under Compute -> Hosts, all are status Unassigned. Default data center is > status Non Responsive. > > I have tried a couple of solutions to regenerate the certificates without > much luck and have copied the originals back in place. > > https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/upgrade_guide/replacing_sha-1_certificates_with_sha-256_certificates_4-1_local_db#Replacing_All_Signed_Certificates_with_SHA-256_4-1_local_db > > https://access.redhat.com/solutions/2409751 > > > I have seen things saying running engine-setup will generate new certs, > however engine doesn't think the cluster is in global maintenance so won't > run that, I believe I can get around the check with `engine-setup > --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True` but is > that the right thing to do? Will it deploy the certs on to the hosts as well > so things communicate properly? Looks like one is supposed to put a node into > maintenance and reenroll it after doing the engine-setup, but will it even be > able to put the nodes into maintenance given I can't do anything with them > now? > > Appreciate any ideas. 
> > > ___ > Users mailing list -- users@ovirt.org > To unsubscribe send an email to users-le...@ovirt.org > Privacy Statement: https://www.ovirt.org/privacy-policy.html > oVirt Code of Conduct: > https://www.ovirt.org/community/about/community-guidelines/ > List Archives: > https://lists.ovirt.org/archives/list/users@ovirt.org/message/QCFPKQ3OKPOUV266MFJUMVTNG2OHLJVW/ ___ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XOQBFYM5W7SCJISJHQ7PZZ3VZWKY6GEZ/
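Strahil's suggestion quoted above is to set global maintenance from a hypervisor, wait a minute, and validate with `hosted-engine --vm-status`. A rough sketch of that validation step, assuming the banner text matches the `!! Cluster is in GLOBAL MAINTENANCE mode !!` line shown earlier in this thread; the function name is invented here.

```shell
# Hypothetical helper: reads `hosted-engine --vm-status` output on stdin and
# succeeds only if the global-maintenance banner is present.
in_global_maintenance() {
    grep -q 'Cluster is in GLOBAL MAINTENANCE mode'
}

# Example (commented out so that sourcing this snippet changes nothing):
# hosted-engine --set-maintenance --mode=global
# sleep 60
# hosted-engine --vm-status | in_global_maintenance && echo "global maintenance is set"
```

This only reports what the local HA agent sees. On ovirt-1 and ovirt-3 in this thread the command itself fails with "Cannot connect to the HA daemon", so the check is only meaningful on a host whose agent is running.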
[ovirt-users] Certificate expiration
Hi,

The certificates on our oVirt stack recently expired. All the VMs are still up, but I can't put the cluster into global maintenance via ovirt-engine, or do anything via ovirt-engine for that matter. I just get event logs about cert validity:

VDSM ovirt-1.x.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
VDSM ovirt-2.x.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed
VDSM ovirt-3.x.com command Get Host Capabilities failed: PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed

Under Compute -> Hosts, all hosts are in status Unassigned. The Default data center is in status Non Responsive.

I have tried a couple of solutions to regenerate the certificates without much luck, and have copied the originals back in place:

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/upgrade_guide/replacing_sha-1_certificates_with_sha-256_certificates_4-1_local_db#Replacing_All_Signed_Certificates_with_SHA-256_4-1_local_db
https://access.redhat.com/solutions/2409751

I have seen suggestions that running engine-setup will generate new certs. However, the engine doesn't think the cluster is in global maintenance, so it won't run. I believe I can get around the check with `engine-setup --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True`, but is that the right thing to do? Will it deploy the certs onto the hosts as well so things communicate properly? It looks like one is supposed to put a node into maintenance and reenroll it after doing the engine-setup, but will it even be able to put the nodes into maintenance given that I can't do anything with them now?

Appreciate any ideas.
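Before regenerating anything, it can help to confirm which certificates have actually expired. The following is a hedged sketch built on `openssl x509 -enddate`; the helper names are invented, the certificate paths in the usage note are the customary oVirt locations and are an assumption here, and `date -d` assumes GNU date.

```shell
# Hypothetical helper: whole days between an expiry time ($1) and "now" ($2),
# both given as Unix epoch seconds. The division truncates toward zero, so 0
# means less than a day either way and a negative value means expired.
days_left() {
    echo $(( ($1 - $2) / 86400 ))
}

# Hypothetical helper: print the days remaining for one PEM certificate file.
cert_days_left() {
    # openssl prints a line of the form "notAfter=<date>"; keep only the date.
    end=$(openssl x509 -in "$1" -noout -enddate | cut -d= -f2)
    days_left "$(date -d "$end" +%s)" "$(date +%s)"
}
```

For example, `cert_days_left /etc/pki/vdsm/certs/vdsmcert.pem` on a host, or `cert_days_left /etc/pki/ovirt-engine/certs/engine.cer` on the engine VM (both paths assumed). Checking each file separately shows whether only the host certificates or also the engine's own certificates need renewing.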
[ovirt-users] Re: Hosted Engine stuck in bios
> On Jan 22, 2021, at 10:11, Arik Hadas wrote:
>
> On Thu, Jan 21, 2021 at 9:27 PM Joseph Gelinas wrote:
> I found `engine-setup --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True` from [1] and now have the ovirt-engine web interface reachable again. But do have one more question; when I try to change the Custom Chipset/Firmware Type to Q35 Chipset with BIOS, I get the error; HostedEngine: There was an attempt to change the Hosted Engine VM values that are locked.
>
> How do I make the removal of the loader/nvram lines permanent?
>
> Can you please check the output of:
> select custom_bios_type from vm_static where origin=6;
>
> If it returns 0 then you can change the custom bios type to Q35 + BIOS with:
> update vm_static set custom_bios_type = 2, db_generation = db_generation + 1 where origin = 6;
>
> If it returns 2 as it is supposed to, you can change any field of the hosted engine VM (e.g., "comment") via the UI to trigger an update of the OVF_STORE.

That did indeed return 0. Thanks for your help Arik.

> [1] https://lists.ovirt.org/archives/list/users@ovirt.org/thread/2AC57LTHFKJBU6OYZPYSCMTBF6NE3QO2/
>
> > On Jan 21, 2021, at 10:15, Joseph Gelinas wrote:
> >
> > Removing those two lines got the hosted engine vm booting again, so that is a great help. Thank you.
> >
> > Now I just need the web interface of ovirt-engine to work again. I feel like I might have run things out of order and forgot to do `engine-setup` as part of the update of hosted engine. Though when I try to do that now it bails out claiming the cluster isn't in global maintenance yet it is.
> >
> > [ INFO ] Stage: Setup validation
> > [ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode.
> > In that case you should put the system into the "Global > > Maintenance" mode before running engine-setup, or the hosted-engine HA > > agent might kill the machine, which might corrupt your data. > > > > [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup > > detected, but Global Maintenance is not set. > > > > > > I see engine.log says it can't contact the database but I certainly see > > Postgres processes running. > > > > /var/log/ovirt-engine/engine.log > > > > 2021-01-21 14:47:31,502Z ERROR > > [org.ovirt.engine.core.services.HealthStatus] (default task-15) [] Failed > > to run Health Status. > > 2021-01-21 14:47:31,502Z ERROR > > [org.ovirt.engine.core.services.HealthStatus] (default task-14) [] Unable > > to contact Database!: java.lang.InterruptedException > > > > > > > > > >> On Jan 21, 2021, at 03:19, Arik Hadas wrote: > >> > >> > >> > >> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas wrote: > >> Hi, > >> > >> I recently did some updates of ovirt from 4.4.1 or 4.4.3 to 4.4.4, also > >> setting the default datacenter from 4.4 to 4.5 and making the default bios > >> q35+eufi. Unfortunately quite a few things. Now however hosted engine > >> doesn't boot up anymore and `hosted-engine --console` just shows the > >> below bios/firmware output: > >> > >> RHEL > >> > >> RHEL-8.1.0 PC (Q35 + ICH9, 2009)2.00 GHz > >> > >> 0.0.0 16384 MB RAM > >> > >> > >> > >> > >> Select Language This is the option > >> > >> one adjusts to > >> change > >>> Device Managerthe language for > >>> the > >>> Boot Manager current system > >>> > >>> Boot Maintenance Manager > >>> > >> > >> Continue > >> > >> Reset > >> > >> > >> > >> > >> > >> > >> > >> > >> ^v=Move Highlight =Select Entry > >>
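Arik's advice earlier in this message branches on the value of `custom_bios_type`. The sketch below only encodes that branch as text; the function name is invented, and it deliberately does not touch the database. How the query is run (for instance through `psql` against the engine database on the engine VM) depends on the installation and is an assumption.

```shell
# Hypothetical helper: given the value returned by
#   select custom_bios_type from vm_static where origin=6;
# print the follow-up step described in the thread.
bios_type_next_step() {
    case "$1" in
        0) echo "update vm_static set custom_bios_type = 2, db_generation = db_generation + 1 where origin = 6;" ;;
        2) echo "edit any field of the hosted engine VM in the UI to trigger an OVF_STORE update" ;;
        *) echo "value $1 is not covered by the advice in this thread" ;;
    esac
}
```

In this thread the query returned 0, so `bios_type_next_step 0` prints the `update` statement Arik gave. Printing the statement instead of executing it leaves room to review it and back up the database first.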
[ovirt-users] Re: Hosted Engine stuck in bios
I found `engine-setup --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True` from [1] and now have the ovirt-engine web interface reachable again. But I do have one more question: when I try to change the Custom Chipset/Firmware Type to Q35 Chipset with BIOS, I get the error "HostedEngine: There was an attempt to change the Hosted Engine VM values that are locked."

How do I make the removal of the loader/nvram lines permanent?

[1] https://lists.ovirt.org/archives/list/users@ovirt.org/thread/2AC57LTHFKJBU6OYZPYSCMTBF6NE3QO2/

> On Jan 21, 2021, at 10:15, Joseph Gelinas wrote:
>
> Removing those two lines got the hosted engine vm booting again, so that is a great help. Thank you.
>
> Now I just need the web interface of ovirt-engine to work again. I feel like I might have run things out of order and forgot to do `engine-setup` as part of the update of hosted engine. Though when I try to do that now it bails out claiming the cluster isn't in global maintenance yet it is.
>
> [ INFO ] Stage: Setup validation
> [ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
> [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set.
>
> I see engine.log says it can't contact the database but I certainly see Postgres processes running.
>
> /var/log/ovirt-engine/engine.log
>
> 2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] (default task-15) [] Failed to run Health Status.
> 2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] > (default task-14) [] Unable to contact Database!: > java.lang.InterruptedException > > > > >> On Jan 21, 2021, at 03:19, Arik Hadas wrote: >> >> >> >> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas wrote: >> Hi, >> >> I recently did some updates of ovirt from 4.4.1 or 4.4.3 to 4.4.4, also >> setting the default datacenter from 4.4 to 4.5 and making the default bios >> q35+eufi. Unfortunately quite a few things. Now however hosted engine >> doesn't boot up anymore and `hosted-engine --console` just shows the below >> bios/firmware output: >> >> RHEL >> >> RHEL-8.1.0 PC (Q35 + ICH9, 2009)2.00 GHz >> >> 0.0.0 16384 MB RAM >> >> >> >> >> Select Language This is the option >> >> one adjusts to >> change >>> Device Managerthe language for the >>> >>> Boot Manager current system >>> >>> Boot Maintenance Manager >>> >> >> Continue >> >> Reset >> >> >> >> >> >> >> >> >> ^v=Move Highlight =Select Entry >> >> >> >> When in this state `hosted-engine --vm-status` says it is up but failed >> liveliness check >> >> hosted-engine --vm-status | grep -i engine\ status >> Engine status : {"vm": "down", "health": "bad", >> "detail": "unknown", "reason": "vm not running on this host"} >> Engine status : {"vm": "up", "health": "bad", "detail": >> "Up", "reason": "failed liveliness check"} >> Engine status : {"vm": "down", "health": "bad", >> "detail": "Down", "reason": "bad vm status"} >> >> I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: >> Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI) >> however how to fix that isn't really described. I have tried starting hosted
[ovirt-users] Re: Hosted Engine stuck in bios
Removing those two lines got the hosted engine VM booting again, so that is a great help. Thank you.

Now I just need the web interface of ovirt-engine to work again. I feel like I might have run things out of order and forgotten to do `engine-setup` as part of the update of the hosted engine. When I try to do that now, though, it bails out claiming the cluster isn't in global maintenance, yet it is.

[ INFO ] Stage: Setup validation
[ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
[ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set.

I see engine.log says it can't contact the database, but I certainly see Postgres processes running.

/var/log/ovirt-engine/engine.log

2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] (default task-15) [] Failed to run Health Status.
2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] (default task-14) [] Unable to contact Database!: java.lang.InterruptedException

> On Jan 21, 2021, at 03:19, Arik Hadas wrote:
>
> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas wrote:
> Hi,
>
> I recently did some updates of ovirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the default datacenter from 4.4 to 4.5 and making the default bios q35+uefi. Unfortunately quite a few things.
Now however hosted engine doesn't > boot up anymore and `hosted-engine --console` just shows the below > bios/firmware output: > > RHEL > > RHEL-8.1.0 PC (Q35 + ICH9, 2009)2.00 GHz > > 0.0.0 16384 MB RAM > > > > >Select Language This is the option > > one adjusts to > change > > Device Managerthe language for the > > > Boot Manager current system > > > Boot Maintenance Manager > > >Continue > >Reset > > > > > > > > > ^v=Move Highlight =Select Entry > > > > When in this state `hosted-engine --vm-status` says it is up but failed > liveliness check > > hosted-engine --vm-status | grep -i engine\ status > Engine status : {"vm": "down", "health": "bad", > "detail": "unknown", "reason": "vm not running on this host"} > Engine status : {"vm": "up", "health": "bad", "detail": > "Up", "reason": "failed liveliness check"} > Engine status : {"vm": "down", "health": "bad", > "detail": "Down", "reason": "bad vm status"} > > I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: > Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI) > however how to fix that isn't really described. I have tried starting hosted > engine paused (`hosted-engine --vm-start-paused`) and editing the config > (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf > edit HostedEngine`) to have pc-i440fx instead and removing a bunch of pcie > lines etc until it will accept the config and then resuming hosted engine > (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf > resume HostedEngine`) but haven't come up with something that is able to > start. > > Anyone know how to resolve this? Am I even chasing the right path? > > Let's start with the negative - this should have been prevented by [1]. > Can it be that the custom bios type that the hosted engine VM was set with > was manually dropped in this environment? > > The positive is that the VM starts. 
This means that from the chipset > perspective, the configuration is valid. > So I wouldn't try to change it to i440fx, but only to switch the firmware to &
[ovirt-users] Hosted Engine stuck in bios
Hi,

I recently did some updates of ovirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the default datacenter from 4.4 to 4.5 and making the default bios q35+uefi. Unfortunately quite a few things. Now, however, the hosted engine doesn't boot up anymore and `hosted-engine --console` just shows the below bios/firmware output:

 RHEL
 RHEL-8.1.0 PC (Q35 + ICH9, 2009)        2.00 GHz
 0.0.0                                   16384 MB RAM

   Select Language                       This is the option one adjusts to change
 > Device Manager                        the language for the current system
 > Boot Manager
 > Boot Maintenance Manager
   Continue
   Reset

 ^v=Move Highlight       <Enter>=Select Entry

When in this state, `hosted-engine --vm-status` says it is up but failed the liveliness check:

hosted-engine --vm-status | grep -i engine\ status
Engine status : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Engine status : {"vm": "up", "health": "bad", "detail": "Up", "reason": "failed liveliness check"}
Engine status : {"vm": "down", "health": "bad", "detail": "Down", "reason": "bad vm status"}

I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI); however, how to fix that isn't really described. I have tried starting the hosted engine paused (`hosted-engine --vm-start-paused`), editing the config (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit HostedEngine`) to use pc-i440fx instead and removing a bunch of pcie lines etc. until it would accept the config, and then resuming the hosted engine (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf resume HostedEngine`), but I haven't come up with something that is able to start.

Does anyone know how to resolve this? Am I even chasing the right path?
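The three `Engine status` lines above carry the useful detail in their `reason` field. Here is a small sketch for pulling that field out of `hosted-engine --vm-status` output, assuming each status is printed on a single line as in the excerpt; the function name is made up.

```shell
# Hypothetical helper: reads `hosted-engine --vm-status` output on stdin and
# prints the "reason" value from every "Engine status" line, one per line.
engine_status_reasons() {
    sed -n '/Engine status/s/.*"reason": "\([^"]*\)".*/\1/p'
}
```

Possible usage: `hosted-engine --vm-status | engine_status_reasons`. With the output quoted in this message it would list one reason per host, which makes it easier to spot the single host where the VM is up but failing its liveliness check.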
/var/log/libvirt/qemu/HostedEngine.log

2021-01-20 15:31:56.500+: starting up libvirt version: 6.6.0, package: 7.1.el8 (CBS , 2020-12-10-14:05:40, ), qemu version: 5.1.0qemu-kvm-5.1.0-14.el8.1, kernel: 4.18.0-240.1.1.el8_3.x86_64, hostname: ovirt-3
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.config \
QEMU_AUDIO_DRV=spice \
/usr/libexec/qemu-kvm \
-name guest=HostedEngine,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-25-HostedEngine/master-key.aes \
-blockdev '{"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/81816cd3-5816-4185-b553-b5a636156fbd.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-q35-rhel8.1.0,accel=kvm,usb=off,dump-guest-core=off,pflash0=libvirt-pflash0-forma
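The qemu command line in this log is where the UEFI firmware shows up: the `-blockdev` entries point at an OVMF code image and a per-VM nvram file. A minimal sketch for listing those references from a libvirt qemu log, assuming the file names follow the pattern seen above and that `grep` supports `-E` and `-o`; the function name is invented.

```shell
# Hypothetical helper: reads a libvirt qemu log on stdin and prints the
# distinct firmware-related file names (OVMF image, nvram file) it mentions.
firmware_files() {
    grep -Eo '"filename":"[^"]*(OVMF|nvram)[^"]*"' | cut -d'"' -f4 | sort -u
}
```

Possible usage: `firmware_files < /var/log/libvirt/qemu/HostedEngine.log`. Empty output after a later start would suggest the VM is no longer being launched with the UEFI loader and nvram.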