[ovirt-users] Re: Unable to remove host from ovirt engine

2022-05-02 Thread Joseph Gelinas
That host does have gluster bricks that are all healthy. I have noticed under 
Compute -> Hosts -> Edit -> Hosted Engine there is the option to Undeploy, 
maybe that is what I need? Any ideas what actions that takes?
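
Checking which volumes still have a brick on a given host can be scripted. The following is a minimal Python sketch, assuming the text layout printed by `gluster volume info`. The volume names and brick paths in the sample are made up for illustration and are not taken from this cluster; `bricks_on_host` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Hypothetical sample in the layout printed by `gluster volume info`;
# volume names and brick paths are invented for illustration only.
SAMPLE = """\
Volume Name: data
Type: Replicate
Status: Started
Number of Bricks: 1 x 3 = 3
Bricks:
Brick1: ovirt-1.x.com:/gluster_bricks/data/data
Brick2: ovirt-2.x.com:/gluster_bricks/data/data
Brick3: ovirt-3.x.com:/gluster_bricks/data/data

Volume Name: engine
Type: Replicate
Status: Started
Number of Bricks: 1 x 3 = 3
Bricks:
Brick1: ovirt-1.x.com:/gluster_bricks/engine/engine
Brick2: ovirt-2.x.com:/gluster_bricks/engine/engine
Brick3: ovirt-3.x.com:/gluster_bricks/engine/engine
"""

# A brick line looks like "Brick1: host:/path".
BRICK_RE = re.compile(r"^Brick\d+:\s*(?P<host>[^:]+):(?P<path>\S+)")


def bricks_on_host(volume_info, host):
    """Map volume name -> list of brick paths that live on `host`."""
    found = defaultdict(list)
    volume = None
    for raw in volume_info.splitlines():
        line = raw.strip()
        if line.startswith("Volume Name:"):
            volume = line.split(":", 1)[1].strip()
            continue
        match = BRICK_RE.match(line)
        if match and volume and match.group("host") == host:
            found[volume].append(match.group("path"))
    return dict(found)


if __name__ == "__main__":
    for vol, paths in bricks_on_host(SAMPLE, "ovirt-1.x.com").items():
        print(vol, paths)
```

An empty result for a host would suggest that no volume references it any more; any non-empty result lists the bricks that would have to be replaced or removed first.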


> On Apr 30, 2022, at 15:26, Strahil Nikolov  wrote:
> 
> You need to first replace the host at the gluster level (it looks like it
> has/had a brick) before ovirt allows you to remove the host.
> 
> Can you check if the gluster volumes have a brick from that host ?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sat, Apr 30, 2022 at 15:24, Joseph Gelinas
>  wrote:
> Using GlusterFS for storage.
> 
> > On Apr 30, 2022, at 06:55, Strahil Nikolov via Users  
> > wrote:
> > 
> > *storage 
> > 
> > On Sat, Apr 30, 2022 at 13:50, Strahil Nikolov via Users
> >  wrote:
> > ___
> > Users mailing list -- users@ovirt.org
> > To unsubscribe send an email to users-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > oVirt Code of Conduct: 
> > https://www.ovirt.org/community/about/community-guidelines/
> > List Archives:
> > https://lists.ovirt.org/archives/list/users@ovirt.org/message/MWDNE4IK5F7ZH4LCY3I5EJWXDD54ILFH/
> > 


[ovirt-users] Re: Unable to remove host from ovirt engine

2022-04-30 Thread Joseph Gelinas
Using GlusterFS for storage.

> On Apr 30, 2022, at 06:55, Strahil Nikolov via Users  wrote:
> 
> *storage 
> 
> On Sat, Apr 30, 2022 at 13:50, Strahil Nikolov via Users
>  wrote:


[ovirt-users] Re: Unable to remove host from ovirt engine

2022-04-29 Thread Joseph Gelinas
Yes, that is how I am getting the message saying it cannot confirm host has 
been rebooted because it isn't in a valid state.


> On Apr 29, 2022, at 13:59, Strahil Nikolov  wrote:
> 
> See my last email... Use the 3 dots menu
> 
> 
> 
> 
> On Fri, Apr 29, 2022 at 14:09, Joseph Gelinas
>  wrote:


[ovirt-users] Re: Unable to remove host from ovirt engine

2022-04-28 Thread Joseph Gelinas
Unfortunately can't do that either.

> On Apr 28, 2022, at 15:26, Strahil Nikolov via Users  wrote:
> 
> Have you tried setting it to maintenance ?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Thu, Apr 28, 2022 at 21:46, Joseph Gelinas
>  wrote:


[ovirt-users] Re: Unable to remove host from ovirt engine

2022-04-26 Thread Joseph Gelinas
That did remove the ovirt-1 host from `hosted-engine --vm-status` on ovirt-3, 
however it still appears in the web interface as an Unassigned host after 
restarting ovirt-engine.

If I rerun `hosted-engine --clean-metadata --host-id=1` on ovirt-2 or ovirt-3 I 
get a message about an unclean metadata block, but perhaps that is expected 
given it doesn't exist in the vm-status output anymore?

[root@ovirt-2 ~]# hosted-engine --clean-metadata --host-id=1
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha agent 2.4.5 started
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Certificate common name not found, using hostname to identify host
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Initializing ha-broker connection
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Broker initialized, all submonitors started
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Malformed metadata for host 1: received 0 of 512 expected bytes
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Cannot clean unclean metadata block. Consider --force-clean.
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
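
To pull the relevant errors out of a run like this one, here is a minimal Python sketch that filters the `LEVEL:logger:message` lines and extracts the host ID and byte counts from the "Malformed metadata" error. The function names are hypothetical. Reading "received 0 of 512" as "the block for host 1 is already empty" is an assumption and is not confirmed anywhere in this thread.

```python
import re

# Output of the run above, with the mail-wrapped lines rejoined.
OUTPUT = """\
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:ovirt-hosted-engine-ha agent 2.4.5 started
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Certificate common name not found, using hostname to identify host
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Initializing ha-broker connection
INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Broker initialized, all submonitors started
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Malformed metadata for host 1: received 0 of 512 expected bytes
ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Cannot clean unclean metadata block. Consider --force-clean.
INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
"""

LINE_RE = re.compile(r"^(?P<level>[A-Z]+):(?P<logger>[\w.]+):(?P<message>.*)$")
MALFORMED_RE = re.compile(
    r"Malformed metadata for host (?P<host_id>\d+): "
    r"received (?P<got>\d+) of (?P<want>\d+) expected bytes"
)


def error_messages(output):
    """Return the message part of every ERROR line."""
    messages = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR":
            messages.append(match.group("message"))
    return messages


def empty_metadata_hosts(output):
    """Host IDs whose metadata read returned zero bytes."""
    host_ids = []
    for message in error_messages(output):
        match = MALFORMED_RE.search(message)
        if match and int(match.group("got")) == 0:
            host_ids.append(int(match.group("host_id")))
    return host_ids


if __name__ == "__main__":
    print(error_messages(OUTPUT))
    print(empty_metadata_hosts(OUTPUT))
```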




> On Apr 25, 2022, at 16:39, Strahil Nikolov via Users  wrote:
> 
> I think you can use 'hosted-engine --clean-metadata --host-id=1'
> 
> In my case I had to use --force-clean, but I wouldn't recommend using it.
> 
> Best Regards,
> Strahil Nikolov
> 
> On Mon, Apr 25, 2022 at 18:08, Joseph Gelinas
>  wrote:
> Recently our host and ovirt engine certificates expired and with some ideas 
> from Strahil we were able to get 2 of the 3 ovirt hosts updated with usable 
> certificates and move all of our VMs to those two nodes.
> 
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/QCFPKQ3OKPOUV266MFJUMVTNG2OHLJVW/
> 
> Not having any luck with the last host we figured we'd just try to remove it 
> from ovirt engine and re-add it. While it seems `hosted-engine --vm-status` 
> on one node no longer shows the removed host, the other good host and the web 
> interface still show ovirt-1 in the mix. What is the best way to remove a 
> NonResponsive host from ovirt and re-add it?
> 
> 

[ovirt-users] Unable to remove host from ovirt engine

2022-04-25 Thread Joseph Gelinas
Recently our host and ovirt engine certificates expired and with some ideas 
from Strahil we were able to get 2 of the 3 ovirt hosts updated with usable 
certificates and move all of our VMs to those two nodes.

https://lists.ovirt.org/archives/list/users@ovirt.org/thread/QCFPKQ3OKPOUV266MFJUMVTNG2OHLJVW/

Not having any luck with the last host we figured we'd just try to remove it 
from ovirt engine and re-add it. While it seems `hosted-engine --vm-status` on 
one node no longer shows the removed host, the other good host and the web 
interface still show ovirt-1 in the mix. What is the best way to remove a 
NonResponsive host from ovirt and re-add it?


[root@ovirt-1 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. 
Please ensure that ovirt-ha-agent is running and the storage server is 
reachable.



[root@ovirt-2 ~]# hosted-engine --vm-status


!! Cluster is in GLOBAL MAINTENANCE mode !!



--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 12515451
Score                              : 3274
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 9cf92792
conf_on_shared_storage             : True
local_conf_timestamp               : 12515451
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12515451 (Mon Apr 25 14:08:51 2022)
    host-id=2
    score=3274
    vm_conf_refresh_time=12515451 (Mon Apr 25 14:08:51 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False


--== Host ovirt-2.x.com (id: 3) status ==--

Host ID                            : 3
Host timestamp                     : 12513269
Score                              : 3400
Engine status                      : {"vm": "up", "health": "good", "detail": "Up"}
Hostname                           : ovirt-2.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 4a89d706
conf_on_shared_storage             : True
local_conf_timestamp               : 12513269
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12513269 (Mon Apr 25 14:09:00 2022)
    host-id=3
    score=3400
    vm_conf_refresh_time=12513269 (Mon Apr 25 14:09:00 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False


!! Cluster is in GLOBAL MAINTENANCE mode !!
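
Comparing these dumps by eye across hosts is error-prone, so here is a minimal Python sketch that parses `hosted-engine --vm-status` text into one dict per host ID and lists the hosts whose status is not up to date. The sample is abridged from the ovirt-3 output in this message; `parse_vm_status` and `stale_hosts` are hypothetical helper names, and the parser assumes the "key : value" layout shown here.

```python
import re

# Abridged from the ovirt-3 output in this message.
SAMPLE = """\
--== Host ovirt-1.x.com (id: 1) status ==--

Host ID                            : 1
Score                              : 0
Engine status                      : unknown stale-data
Hostname                           : ovirt-1.x.com
stopped                            : True
Status up-to-date                  : False
Extra metadata (valid at timestamp):
    timestamp=6750990 (Thu Feb 17 22:17:53 2022)
    state=AgentStopped

--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Score                              : 3279
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
stopped                            : False
Status up-to-date                  : True
"""

HEADER_RE = re.compile(r"^--== Host (?P<name>\S+) \(id: (?P<id>\d+)\) status ==--$")


def parse_vm_status(text):
    """Return {host_id: {field: value}} for each host section."""
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = hosts.setdefault(
                int(header.group("id")), {"name": header.group("name")}
            )
            continue
        if line.startswith("Extra metadata"):
            # Skip the key=value metadata lines until the next host header.
            current = None
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return hosts


def stale_hosts(hosts):
    """Host IDs that report 'Status up-to-date: False'."""
    return sorted(
        host_id
        for host_id, fields in hosts.items()
        if fields.get("Status up-to-date") == "False"
    )


if __name__ == "__main__":
    print(stale_hosts(parse_vm_status(SAMPLE)))
```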





[root@ovirt-3 ~]# hosted-engine --vm-status


!! Cluster is in GLOBAL MAINTENANCE mode !!



--== Host ovirt-1.x.com (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 6750990
Score                              : 0
Engine status                      : unknown stale-data
Hostname                           : ovirt-1.x.com
Local maintenance                  : False
stopped                            : True
crc32                              : 5290657b
conf_on_shared_storage             : True
local_conf_timestamp               : 6750950
Status up-to-date                  : False
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=6750990 (Thu Feb 17 22:17:53 2022)
    host-id=1
    score=0
    vm_conf_refresh_time=6750950 (Thu Feb 17 22:17:12 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=AgentStopped
    stopped=True


--== Host ovirt-3.x.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 12515501
Score                              : 3279
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : ovirt-3.x.com
Local maintenance                  : False
stopped                            : False
crc32                              : 0845cd93
conf_on_shared_storage             : True
local_conf_timestamp               : 12515501
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=12515501 (Mon Apr 25 14:09:42 2022)
    host-id=2
    score=3279
    vm_conf_refresh_time=12515501 (Mon Apr 25 14:09:42 2022)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False


--== Host ovirt-2

[ovirt-users] Re: Certificate expiration

2022-02-20 Thread Joseph Gelinas
No. I don't have any of the options under Installation.

> On Feb 20, 2022, at 07:52, Strahil Nikolov via Users  wrote:
> 
> Do you have the option to use 'Install' -> enroll certificate (or whatever the
> entry is called in the UI)?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sun, Feb 20, 2022 at 8:05, Joseph Gelinas
>  wrote:
> Both I guess. The host certificates expired on the 15th; the console expires
> on the 23rd. Right now, since the engine sees the hosts as unassigned, I don't
> get the option to set hosts to maintenance mode, and if I try to set Enable
> Global Maintenance I get the message: "Cannot edit VM Cluster. Operation can
> be performed only when Host status is Up."
> 
> 
> > On Feb 19, 2022, at 14:55, Strahil Nikolov  wrote:
> > 
> > Is your issue with the host certificates or the engine ?
> > 
> > You can try to set a node in maintenance (or at least try that) and then 
> > try to reenroll the certificate from the UI.
> > 
> > Best Regards,
> > Strahil Nikolov
> > 
> > On Sat, Feb 19, 2022 at 9:48, Joseph Gelinas
> >  wrote:
> > I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an 
> > option to reenroll that way, but when it prompted and asked if it was 
> > really what I wanted to do I ctrl-D or said no and it ran something 
> > anyways, so I ctrl-C out of it and maybe that is what messed up vdsm on 
> > that node. Not sure about ovirt-3, is there a way to fix that?
> > 

[ovirt-users] Re: Certificate expiration

2022-02-20 Thread Joseph Gelinas
Is there a way to do so without the web frontend? I don't have the option to
migrate it there.

> On Feb 20, 2022, at 07:56, Strahil Nikolov via Users  wrote:
> 
> Did you manage to move the engine VM to the only node that's in global 
> maintenance ?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sun, Feb 20, 2022 at 8:05, Joseph Gelinas
>  wrote:
> Both I guess. The host certificates expired on the 15th; the console expires
> on the 23rd. Right now, since the engine sees the hosts as unassigned, I don't
> get the option to set hosts to maintenance mode, and if I try to set Enable
> Global Maintenance I get the message: "Cannot edit VM Cluster. Operation can
> be performed only when Host status is Up."
> 
> 
> > On Feb 19, 2022, at 14:55, Strahil Nikolov  wrote:
> > 
> > Is your issue with the host certificates or the engine ?
> > 
> > You can try to set a node in maintenance (or at least try that) and then 
> > try to reenroll the certificate from the UI.
> > 
> > Best Regards,
> > Strahil Nikolov
> > 
> > On Sat, Feb 19, 2022 at 9:48, Joseph Gelinas
> >  wrote:
> > I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an 
> > option to reenroll that way, but when it prompted and asked if it was 
> > really what I wanted to do I ctrl-D or said no and it ran something 
> > anyways, so I ctrl-C out of it and maybe that is what messed up vdsm on 
> > that node. Not sure about ovirt-3, is there a way to fix that?
> > 

[ovirt-users] Re: Certificate expiration

2022-02-20 Thread Joseph Gelinas
Right, I don't have those options, because the hosts are listed as unassigned. 
I can't migrate the engine. I can't put anything into maintenance so the 
installation menu becomes available.
 

> On Feb 20, 2022, at 07:52, Strahil Nikolov  wrote:
> 
> Do you have the option to use 'Install' -> enroll certificate (or whatever the
> entry is called in the UI)?
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sun, Feb 20, 2022 at 8:05, Joseph Gelinas
>  wrote:
> Both I guess. The host certificates expired on the 15th; the console expires
> on the 23rd. Right now, since the engine sees the hosts as unassigned, I don't
> get the option to set hosts to maintenance mode, and if I try to set Enable
> Global Maintenance I get the message: "Cannot edit VM Cluster. Operation can
> be performed only when Host status is Up."
> 
> 
> > On Feb 19, 2022, at 14:55, Strahil Nikolov  wrote:
> > 
> > Is your issue with the host certificates or the engine ?
> > 
> > You can try to set a node in maintenance (or at least try that) and then 
> > try to reenroll the certificate from the UI.
> > 
> > Best Regards,
> > Strahil Nikolov
> > 
> > On Sat, Feb 19, 2022 at 9:48, Joseph Gelinas
> >  wrote:
> > I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an 
> > option to reenroll that way, but when it prompted and asked if it was 
> > really what I wanted to do I ctrl-D or said no and it ran something 
> > anyways, so I ctrl-C out of it and maybe that is what messed up vdsm on 
> > that node. Not sure about ovirt-3, is there a way to fix that?
> > 

[ovirt-users] Re: Certificate expiration

2022-02-19 Thread Joseph Gelinas
Both I guess. The host certificates expired on the 15th; the console expires on
the 23rd. Right now, since the engine sees the hosts as unassigned, I don't get
the option to set hosts to maintenance mode, and if I try to set Enable Global
Maintenance I get the message: "Cannot edit VM Cluster. Operation can be
performed only when Host status is Up."
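
Expiry dates like these can be checked ahead of time from the `notAfter=` line that `openssl x509 -noout -enddate -in <cert>` prints. The following is a minimal Python sketch that turns such a line into days remaining. The times of day in the usage example are assumptions, since only the dates (the 15th and the 23rd) are known from this thread, and `days_left` is a hypothetical helper name.

```python
import ssl
from datetime import datetime, timezone


def parse_enddate(line):
    """Parse a 'notAfter=Feb 15 12:00:00 2022 GMT' line into an aware datetime."""
    prefix = "notAfter="
    if not line.startswith(prefix):
        raise ValueError("unexpected line: %r" % line)
    # ssl.cert_time_to_seconds parses the certificate time format (GMT only).
    seconds = ssl.cert_time_to_seconds(line[len(prefix):].strip())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_left(line, now):
    """Days until expiry; negative once the certificate has expired."""
    return (parse_enddate(line) - now).total_seconds() / 86400


if __name__ == "__main__":
    # Dates follow this thread; the 12:00:00 times are made up.
    now = datetime(2022, 2, 18, tzinfo=timezone.utc)
    for label, line in [
        ("host", "notAfter=Feb 15 12:00:00 2022 GMT"),
        ("console", "notAfter=Feb 23 12:00:00 2022 GMT"),
    ]:
        print(label, days_left(line, now))
```

Running a check like this periodically against each certificate file would give some warning before hosts drop to an unassigned state.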


> On Feb 19, 2022, at 14:55, Strahil Nikolov  wrote:
> 
> Is your issue with the host certificates or the engine ?
> 
> You can try to set a node in maintenance (or at least try that) and then try 
> to reenroll the certificate from the UI.
> 
> Best Regards,
> Strahil Nikolov
> 
> On Sat, Feb 19, 2022 at 9:48, Joseph Gelinas
>  wrote:
> I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an 
> option to reenroll that way, but when it prompted and asked if it was really 
> what I wanted to do I ctrl-D or said no and it ran something anyways, so I 
> ctrl-C out of it and maybe that is what messed up vdsm on that node. Not sure 
> about ovirt-3, is there a way to fix that?
> 
> > On Feb 18, 2022, at 17:21, Joseph Gelinas  wrote:
> > 
> > Unfortunately ovirt-ha-broker & ovirt-ha-agent are just in continual 
> > restart loops on ovirt-1 & ovirt-3 (ovirt-engine is currently on ovirt-3).
> > 
> > The output for broker.log:
> > 
> > MainThread::ERROR::2022-02-18 22:08:58,101::broker::72::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Trying to restart the broker
> > MainThread::INFO::2022-02-18 22:08:58,453::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.4.5 started
> > MainThread::INFO::2022-02-18 22:09:00,456::monitor::45::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> > MainThread::INFO::2022-02-18 22:09:00,456::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> > MainThread::INFO::2022-02-18 22:09:00,457::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> > MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> > MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> > MainThread::INFO::2022-02-18 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
> > MainThread::INFO::2022-02-18 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> > MainThread::INFO::2022-02-18 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> > MainThread::INFO::2022-02-18 22:09:00,460::monitor::63::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
> > MainThread::WARNING::2022-02-18 22:10:00,788::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Couldn't  connect to VDSM within 60 seconds
> > MainThread::ERROR::2022-02-18 22:10:00,788::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: Couldn't  connect to VDSM within 60 seconds
> > MainThread::ERROR::2022-02-18 22:10:00,789::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Traceback (most recent call last):
> >   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 64, in run
> >     self._storage_broker_instance = self._get_storage_broker()
> >   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 143, in _get_storage_broker
> >     return storage_broker.StorageBroker()
> >   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 97, in __init__
> >     self._backend.connect()
> >   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 370, in connect
> >     connec

[ovirt-users] Re: Certificate expiration

2022-02-18 Thread Joseph Gelinas
I believe I ran `hosted-engine --deploy` on ovirt-1 to see if there was an
option to reenroll that way. When it prompted and asked whether that was really
what I wanted to do, I pressed Ctrl-D or said no, and it ran something anyway,
so I pressed Ctrl-C to get out of it; maybe that is what messed up vdsm on that
node. I'm not sure about ovirt-3. Is there a way to fix that?

> On Feb 18, 2022, at 17:21, Joseph Gelinas  wrote:
> 
> Unfortunately ovirt-ha-broker & ovirt-ha-agent are just in continual restart 
> loops on ovirt-1 & ovirt-3 (ovirt-engine is currently on ovirt-3).
> 
> The output for broker.log:
> 
> MainThread::ERROR::2022-02-18 
> 22:08:58,101::broker::72::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Trying to restart the broker
> MainThread::INFO::2022-02-18 
> 22:08:58,453::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> ovirt-hosted-engine-ha broker 2.4.5 started
> MainThread::INFO::2022-02-18 
> 22:09:00,456::monitor::45::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Searching for submonitors in 
> /usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> MainThread::INFO::2022-02-18 
> 22:09:00,456::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor mem-free
> MainThread::INFO::2022-02-18 
> 22:09:00,457::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor engine-health
> MainThread::INFO::2022-02-18 
> 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor cpu-load-no-engine
> MainThread::INFO::2022-02-18 
> 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor mgmt-bridge
> MainThread::INFO::2022-02-18 
> 22:09:00,459::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor network
> MainThread::INFO::2022-02-18 
> 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor storage-domain
> MainThread::INFO::2022-02-18 
> 22:09:00,460::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Loaded submonitor cpu-load
> MainThread::INFO::2022-02-18 
> 22:09:00,460::monitor::63::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors)
>  Finished loading submonitors
> MainThread::WARNING::2022-02-18 
> 22:10:00,788::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__)
>  Can't connect vdsm storage: Couldn't  connect to VDSM within 60 seconds 
> MainThread::ERROR::2022-02-18 
> 22:10:00,788::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Failed initializing the broker: Couldn't  connect to VDSM within 60 seconds
> MainThread::ERROR::2022-02-18 
> 22:10:00,789::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) 
> Traceback (most recent call last):
>  File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 64, in run
>self._storage_broker_instance = self._get_storage_broker()
>  File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", 
> line 143, in _get_storage_broker
>return storage_broker.StorageBroker()
>  File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
>  line 97, in __init__
>self._backend.connect()
>  File 
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
>  line 370, in connect
>connection = util.connect_vdsm_json_rpc(logger=self._logger)
>  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 472, in connect_vdsm_json_rpc
>__vdsm_json_rpc_connect(logger, timeout)
>  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
> line 415, in __vdsm_json_rpc_connect
>timeout=VDSM_MAX_RETRY * VDSM_DELAY
> RuntimeError: Couldn't  connect to VDSM within 60 seconds
> 
> 
> vdsm.log:
> 
> 2022-02-18 22:14:43,939+ INFO  (vmrecovery) [vds] recovery: waiting for 
> storage pool to go up (clientIF:726)
> 2022-02-18 22:14:44,071+ INFO  (Reactor thread) 
> [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48832 
> (protocoldetector:61)
> 2022-02-18 22:14:44,074+ ERROR (Reactor thread) 
> [ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: 
> ::1 (sslutils:269)
> 2022-02-18 22:14:44,442+ INFO  (Reactor thread) 
> [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48836 
> (pro

[ovirt-users] Re: Certificate expiration

2022-02-18 Thread Joseph Gelinas
AcceptorImpl] Accepted connection from ::1:48840 
(protocoldetector:61)
2022-02-18 22:14:45,449+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:46,082+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48842 
(protocoldetector:61)
2022-02-18 22:14:46,084+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:46,452+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48844 
(protocoldetector:61)
2022-02-18 22:14:46,455+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:47,087+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48846 
(protocoldetector:61)
2022-02-18 22:14:47,089+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:47,457+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48848 
(protocoldetector:61)
2022-02-18 22:14:47,459+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:48,092+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48850 
(protocoldetector:61)
2022-02-18 22:14:48,094+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:48,461+ INFO  (Reactor thread) 
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48852 
(protocoldetector:61)
2022-02-18 22:14:48,464+ ERROR (Reactor thread) 
[ProtocolDetector.SSLHandshakeDispatcher] ssl handshake: SSLError, address: ::1 
(sslutils:269)
2022-02-18 22:14:48,941+ INFO  (vmrecovery) [vdsm.api] START 
getConnectedStoragePoolsList(options=None) from=internal, 
task_id=75ef5d5f-c56b-4595-95c8-3dc64caa3a83 (api:48)
2022-02-18 22:14:48,942+ INFO  (vmrecovery) [vdsm.api] FINISH 
getConnectedStoragePoolsList return={'poollist': []} from=internal, 
task_id=75ef5d5f-c56b-4595-95c8-3dc64caa3a83 (api:54)
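
To see where the failed handshakes in an excerpt like the one above come from, the errors can be tallied per peer address. This is only a rough sketch: the function name is made up for illustration, and the pattern is taken from the log lines quoted here, so it may not match other vdsm versions.

```python
import re
from collections import Counter

# Pattern copied from the handshake failure entries in the excerpt above.
HANDSHAKE_ERROR = re.compile(r"ssl handshake: SSLError, address:\s+(\S+)")


def count_handshake_errors(log_text):
    """Return a Counter mapping peer address -> number of SSL handshake errors."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flattened = " ".join(log_text.split())
    return Counter(HANDSHAKE_ERROR.findall(flattened))
```

Calling `count_handshake_errors(open("/var/log/vdsm/vdsm.log").read())` on a host (the log path is an assumption) should show whether every failure comes from `::1`, which would point at a local client rather than the engine.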



> On Feb 18, 2022, at 16:35, Strahil Nikolov via Users  wrote:
> 
> ovirt-2 is 'state=GlobalMaintenance', but the state of the other 2 nodes is unknown.
> Try to start ovirt-ha-broker & ovirt-ha-agent
> 
> Also, you may try to move the hosted-engine to ovirt-2 and try again
> 
> 
> Best Regards,
> Strahil Nikolov
> 
> On Fri, Feb 18, 2022 at 21:48, Joseph Gelinas
>  wrote:
> I may be in maintenance mode; I did try to set it at the beginning of this, 
> but engine-setup doesn't see it. At this point my nodes say they can't 
> connect to the HA daemon, or have stale data.
> 
> [root@ovirt-1 ~]# hosted-engine --set-maintenance --mode=global
> Cannot connect to the HA daemon, please check the logs.
> 
> [root@ovirt-3 ~]# hosted-engine --set-maintenance --mode=global
> Cannot connect to the HA daemon, please check the logs.
> 
> [root@ovirt-2 ~]# hosted-engine --set-maintenance --mode=global
> [root@ovirt-2 ~]# hosted-engine --vm-status
> 
> 
> !! Cluster is in GLOBAL MAINTENANCE mode !!
> 
> 
> 
> --== Host ovirt-1.xx.com (id: 1) status ==--
> 
> Host ID                            : 1
> Host timestamp                     : 6750990
> Score                              : 0
> Engine status                      : unknown stale-data
> Hostname                           : ovirt-1.xx.com
> Local maintenance                  : False
> stopped                            : True
> crc32                              : 5290657b
> conf_on_shared_storage             : True
> local_conf_timestamp               : 6750950
> Status up-to-date                  : False
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=6750990 (Thu Feb 17 22:17:53 2022)
>     host-id=1
>     score=0
>     vm_conf_refresh_time=6750950 (Thu Feb 17 22:17:12 2022)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=AgentStopped
>     stopped=True
> 
> 
> --== Host ovirt-3.xx.com (id: 2) status ==--
> 
> Host ID                            : 2
> Host timestamp                     : 6731526
> Score                              : 0
> Engine status                      : unknown stale-data
> Hostname                           : ovirt-3.xx.com
> Local maintenance                  : False
> stopped                            : True
> crc32                              : 12c6b5c9
> conf_on_shared_storage             : True
> loca

[ovirt-users] Re: Certificate expiration

2022-02-18 Thread Joseph Gelinas
usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
line 472, in connect_vdsm_json_rpc
__vdsm_json_rpc_connect(logger, timeout)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", 
line 415, in __vdsm_json_rpc_connect
timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't  connect to VDSM within 60 seconds


ovirt-2's ovirt-hosted-engine-ha/agent.log has entries detecting global 
maintenance, though `systemctl status ovirt-ha-agent` shows Python exception 
errors from yesterday.

MainThread::INFO::2022-02-18 
19:39:10,452::state_decorators::51::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Global maintenance detected
MainThread::INFO::2022-02-18 
19:39:10,524::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state GlobalMaintenance (score: 3400)


Feb 17 18:49:12 ovirt-2.us1.vricon.com python3[1324125]: detected unhandled 
Python exception in 
'/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py'



> On Feb 18, 2022, at 14:20, Strahil Nikolov  wrote:
> 
> To set the engine into maintenance mode you can ssh to any Hypervisor and run:
> 'hosted-engine --set-maintenance --mode=global'
> wait 1 minute and run 'hosted-engine --vm-status' to validate.
> 
> Best Regards,
> Strahil Nikolov
> 
> On Fri, Feb 18, 2022 at 19:03, Joseph Gelinas
>  wrote:
> Hi,
> 
> The certificates on our oVirt stack recently expired. While all the VMs are 
> still up, I can't put the cluster into global maintenance via ovirt-engine, 
> or do anything via ovirt-engine for that matter; I just get event logs about 
> cert validity.
> 
> VDSM ovirt-1.x.com command Get Host Capabilities failed: PKIX path 
> validation failed: java.security.cert.CertPathValidatorException: validity 
> check failed
> VDSM ovirt-2.x.com command Get Host Capabilities failed: PKIX path 
> validation failed: java.security.cert.CertPathValidatorException: validity 
> check failed
> VDSM ovirt-3.x.com command Get Host Capabilities failed: PKIX path 
> validation failed: java.security.cert.CertPathValidatorException: validity 
> check failed
> 
> Under Compute -> Hosts, all are status Unassigned. Default data center is 
> status Non Responsive.
> 
> I have tried a couple of solutions to regenerate the certificates without 
> much luck and have copied the originals back in place.
> 
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/upgrade_guide/replacing_sha-1_certificates_with_sha-256_certificates_4-1_local_db#Replacing_All_Signed_Certificates_with_SHA-256_4-1_local_db
> 
> https://access.redhat.com/solutions/2409751
> 
> 
> I have seen things saying that running engine-setup will generate new certs. 
> However, the engine doesn't think the cluster is in global maintenance, so 
> engine-setup won't run. I believe I can get around the check with 
> `engine-setup 
> --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True`, but is 
> that the right thing to do? Will it deploy the certs onto the hosts as well, 
> so things communicate properly? It looks like one is supposed to put a node 
> into maintenance and reenroll it after running engine-setup, but will the 
> engine even be able to put the nodes into maintenance, given that I can't do 
> anything with them now?
> 
> Appreciate any ideas.
> 
> 
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XOQBFYM5W7SCJISJHQ7PZZ3VZWKY6GEZ/


[ovirt-users] Certificate expiration

2022-02-18 Thread Joseph Gelinas
Hi,

The certificates on our oVirt stack recently expired. While all the VMs are 
still up, I can't put the cluster into global maintenance via ovirt-engine, or 
do anything via ovirt-engine for that matter; I just get event logs about cert 
validity.

VDSM ovirt-1.x.com command Get Host Capabilities failed: PKIX path 
validation failed: java.security.cert.CertPathValidatorException: validity 
check failed
VDSM ovirt-2.x.com command Get Host Capabilities failed: PKIX path 
validation failed: java.security.cert.CertPathValidatorException: validity 
check failed
VDSM ovirt-3.x.com command Get Host Capabilities failed: PKIX path 
validation failed: java.security.cert.CertPathValidatorException: validity 
check failed

Under Compute -> Hosts, all are status Unassigned. Default data center is 
status Non Responsive.

I have tried a couple of solutions to regenerate the certificates without much 
luck and have copied the originals back in place.

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/upgrade_guide/replacing_sha-1_certificates_with_sha-256_certificates_4-1_local_db#Replacing_All_Signed_Certificates_with_SHA-256_4-1_local_db

https://access.redhat.com/solutions/2409751


I have seen things saying that running engine-setup will generate new certs. 
However, the engine doesn't think the cluster is in global maintenance, so 
engine-setup won't run. I believe I can get around the check with `engine-setup 
--otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True`, but is that 
the right thing to do? Will it deploy the certs onto the hosts as well, so 
things communicate properly? It looks like one is supposed to put a node into 
maintenance and reenroll it after running engine-setup, but will the engine even 
be able to put the nodes into maintenance, given that I can't do anything with 
them now?

Appreciate any ideas.
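
One way to confirm which certificates have actually expired, before regenerating anything, is to read each certificate's notAfter date. A minimal sketch, assuming the `openssl` CLI is installed and that the host certificate sits at a path such as `/etc/pki/vdsm/certs/vdsmcert.pem` (an assumed path, adjust as needed); the function names are made up for illustration and are not part of oVirt:

```python
import ssl
import subprocess
import time


def parse_not_after(enddate_line):
    """Convert an `openssl x509 -enddate` line to seconds since the epoch (UTC)."""
    # openssl prints a line of the form "notAfter=Feb 17 12:00:00 2022 GMT".
    _, _, cert_time = enddate_line.strip().partition("=")
    return ssl.cert_time_to_seconds(cert_time)


def days_remaining(enddate_line, now=None):
    """Days until expiry; the result is negative once the certificate has expired."""
    now = time.time() if now is None else now
    return (parse_not_after(enddate_line) - now) / 86400.0


def read_enddate(cert_path):
    """Ask the openssl CLI for the notAfter line of a PEM certificate."""
    # stdout=PIPE and universal_newlines keep this usable on Python 3.6.
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", cert_path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

For example, `days_remaining(read_enddate("/etc/pki/vdsm/certs/vdsmcert.pem"))` returns a negative number for an expired certificate. The same check can be pointed at any other PEM certificate on the engine or the hosts.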


List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QCFPKQ3OKPOUV266MFJUMVTNG2OHLJVW/


[ovirt-users] Re: Hosted Engine stuck in bios

2021-01-22 Thread Joseph Gelinas

> On Jan 22, 2021, at 10:11, Arik Hadas  wrote:
> 
> On Thu, Jan 21, 2021 at 9:27 PM Joseph Gelinas  wrote:
> I found `engine-setup 
> --otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True` from [1] 
> and now have the ovirt-engine web interface reachable again. But I do have 
> one more question: when I try to change the Custom Chipset/Firmware Type to 
> Q35 Chipset with BIOS, I get the error "HostedEngine: There was an attempt to 
> change the Hosted Engine VM values that are locked."
> 
> How do I make the removal of the loader/nvram lines permanent?
> 
> Can you please check the output of:
> select custom_bios_type from vm_static where origin=6;
> 
> If it returns 0 then you can change the custom bios type to Q35 + BIOS with:
> update vm_static set custom_bios_type = 2,  db_generation = db_generation + 1 
> where origin = 6;
> 
> If it returns 2 as it is supposed to, you can change any field of the hosted 
> engine VM (e.g., "comment") via the UI to trigger an update of the OVF_STORE.

That did indeed return 0. Thanks for your help Arik.


> 
> [1] 
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/2AC57LTHFKJBU6OYZPYSCMTBF6NE3QO2/
> 
> > On Jan 21, 2021, at 10:15, Joseph Gelinas  wrote:
> > 
> > Removing those two lines got the hosted engine vm booting again, so that is 
> > a great help. Thank you.
> > 
> > Now I just need the web interface of ovirt-engine to work again. I feel 
> > like I might have run things out of order and forgot to do `engine-setup` 
> > as part of the update of hosted engine. Though when I try to do that now it 
> > bails out claiming the cluster isn't in global maintenance yet it is.
> > 
> > [ INFO  ] Stage: Setup validation
> > [ ERROR ] It seems that you are running your engine inside of the 
> > hosted-engine VM and are not in "Global Maintenance" mode.
> > In that case you should put the system into the "Global 
> > Maintenance" mode before running engine-setup, or the hosted-engine HA 
> > agent might kill the machine, which might corrupt your data.
> > 
> > [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup 
> > detected, but Global Maintenance is not set.
> > 
> > 
> > I see engine.log says it can't contact the database but I certainly see 
> > Postgres processes running.
> > 
> > /var/log/ovirt-engine/engine.log
> > 
> > 2021-01-21 14:47:31,502Z ERROR 
> > [org.ovirt.engine.core.services.HealthStatus] (default task-15) [] Failed 
> > to run Health Status.
> > 2021-01-21 14:47:31,502Z ERROR 
> > [org.ovirt.engine.core.services.HealthStatus] (default task-14) [] Unable 
> > to contact Database!: java.lang.InterruptedException
> > 
> > 
> > 
> > 
> >> On Jan 21, 2021, at 03:19, Arik Hadas  wrote:
> >> 
> >> 
> >> 
> >> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas  wrote:
> >> Hi,
> >> 
> >> I recently updated oVirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the 
> >> default datacenter from 4.4 to 4.5 and making the default BIOS type 
> >> Q35+UEFI; unfortunately, that is quite a few changes at once. Now the 
> >> hosted engine doesn't boot anymore, and `hosted-engine --console` just 
> >> shows the firmware output below:
> >> 
> >>  RHEL
> >>  RHEL-8.1.0 PC (Q35 + ICH9, 2009)    2.00 GHz
> >>  0.0.0                               16384 MB RAM
> >> 
> >>    Select Language                   This is the option
> >>                                      one adjusts to change
> >>  > Device Manager                    the language for the
> >>  > Boot Manager                      current system
> >>  > Boot Maintenance Manager
> >> 
> >>    Continue
> >>    Reset
> >> 
> >>   ^v=Move Highlight       <Enter>=Select Entry
> >> 

[ovirt-users] Re: Hosted Engine stuck in bios

2021-01-21 Thread Joseph Gelinas
I found `engine-setup 
--otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True` from [1] and 
now have the ovirt-engine web interface reachable again. But I do have one more 
question: when I try to change the Custom Chipset/Firmware Type to Q35 Chipset 
with BIOS, I get the error "HostedEngine: There was an attempt to change the 
Hosted Engine VM values that are locked."

How do I make the removal of the loader/nvram lines permanent?

[1] 
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/2AC57LTHFKJBU6OYZPYSCMTBF6NE3QO2/

> On Jan 21, 2021, at 10:15, Joseph Gelinas  wrote:
> 
> Removing those two lines got the hosted engine vm booting again, so that is a 
> great help. Thank you.
> 
> Now I just need the web interface of ovirt-engine to work again. I feel like 
> I might have run things out of order and forgot to do `engine-setup` as part 
> of the update of hosted engine. Though when I try to do that now it bails out 
> claiming the cluster isn't in global maintenance yet it is.
> 
> [ INFO  ] Stage: Setup validation
> [ ERROR ] It seems that you are running your engine inside of the 
> hosted-engine VM and are not in "Global Maintenance" mode.
> In that case you should put the system into the "Global Maintenance" 
> mode before running engine-setup, or the hosted-engine HA agent might kill 
> the machine, which might corrupt your data.
> 
> [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup 
> detected, but Global Maintenance is not set.
> 
> 
> I see engine.log says it can't contact the database but I certainly see 
> Postgres processes running.
> 
> /var/log/ovirt-engine/engine.log
> 
> 2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] 
> (default task-15) [] Failed to run Health Status.
> 2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] 
> (default task-14) [] Unable to contact Database!: 
> java.lang.InterruptedException
> 
> 
> 
> 
>> On Jan 21, 2021, at 03:19, Arik Hadas  wrote:
>> 
>> 
>> 
>> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas  wrote:
>> Hi,
>> 
>> I recently updated oVirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the 
>> default datacenter from 4.4 to 4.5 and making the default BIOS type 
>> Q35+UEFI; unfortunately, that is quite a few changes at once. Now the hosted 
>> engine doesn't boot anymore, and `hosted-engine --console` just shows the 
>> firmware output below:
>> 
>>  RHEL
>>  RHEL-8.1.0 PC (Q35 + ICH9, 2009)    2.00 GHz
>>  0.0.0                               16384 MB RAM
>> 
>>    Select Language                   This is the option
>>                                      one adjusts to change
>>  > Device Manager                    the language for the
>>  > Boot Manager                      current system
>>  > Boot Maintenance Manager
>> 
>>    Continue
>>    Reset
>> 
>>   ^v=Move Highlight       <Enter>=Select Entry
>> 
>> 
>> When in this state `hosted-engine --vm-status` says it is up but failed 
>> liveliness check
>> 
>> hosted-engine --vm-status | grep -i engine\ status
>> Engine status  : {"vm": "down", "health": "bad", 
>> "detail": "unknown", "reason": "vm not running on this host"}
>> Engine status  : {"vm": "up", "health": "bad", "detail": 
>> "Up", "reason": "failed liveliness check"}
>> Engine status  : {"vm": "down", "health": "bad", 
>> "detail": "Down", "reason": "bad vm status"}
>> 
>> I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: 
>> Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI) 
>> however how to fix that isn't really described. I have tried starting hosted 

[ovirt-users] Re: Hosted Engine stuck in bios

2021-01-21 Thread Joseph Gelinas
Removing those two lines got the hosted engine vm booting again, so that is a 
great help. Thank you.

Now I just need the web interface of ovirt-engine to work again. I feel like I 
might have run things out of order and forgot to do `engine-setup` as part of 
the update of hosted engine. Though when I try to do that now it bails out 
claiming the cluster isn't in global maintenance yet it is.

[ INFO  ] Stage: Setup validation
[ ERROR ] It seems that you are running your engine inside of the hosted-engine 
VM and are not in "Global Maintenance" mode.
 In that case you should put the system into the "Global Maintenance" 
mode before running engine-setup, or the hosted-engine HA agent might kill the 
machine, which might corrupt your data.
 
[ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup 
detected, but Global Maintenance is not set.


I see engine.log says it can't contact the database but I certainly see 
Postgres processes running.

/var/log/ovirt-engine/engine.log

2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] 
(default task-15) [] Failed to run Health Status.
2021-01-21 14:47:31,502Z ERROR [org.ovirt.engine.core.services.HealthStatus] 
(default task-14) [] Unable to contact Database!: java.lang.InterruptedException




> On Jan 21, 2021, at 03:19, Arik Hadas  wrote:
> 
> 
> 
> On Thu, Jan 21, 2021 at 8:57 AM Joseph Gelinas  wrote:
> Hi,
> 
> I recently updated oVirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the 
> default datacenter from 4.4 to 4.5 and making the default BIOS type 
> Q35+UEFI; unfortunately, that is quite a few changes at once. Now the hosted 
> engine doesn't boot anymore, and `hosted-engine --console` just shows the 
> firmware output below:
> 
>  RHEL
>  RHEL-8.1.0 PC (Q35 + ICH9, 2009)    2.00 GHz
>  0.0.0                               16384 MB RAM
> 
>    Select Language                   This is the option
>                                      one adjusts to change
>  > Device Manager                    the language for the
>  > Boot Manager                      current system
>  > Boot Maintenance Manager
> 
>    Continue
>    Reset
> 
>   ^v=Move Highlight       <Enter>=Select Entry
> 
> 
> When in this state `hosted-engine --vm-status` says it is up but failed 
> liveliness check
> 
> hosted-engine --vm-status | grep -i engine\ status
> Engine status  : {"vm": "down", "health": "bad", 
> "detail": "unknown", "reason": "vm not running on this host"}
> Engine status  : {"vm": "up", "health": "bad", "detail": 
> "Up", "reason": "failed liveliness check"}
> Engine status  : {"vm": "down", "health": "bad", 
> "detail": "Down", "reason": "bad vm status"}
> 
> I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: 
> Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI) 
> however how to fix that isn't really described. I have tried starting hosted 
> engine paused (`hosted-engine --vm-start-paused`) and editing the config 
> (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf 
> edit HostedEngine`) to have pc-i440fx instead and removing a bunch of pcie 
> lines etc until it will accept the config and then resuming hosted engine 
> (`virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf 
> resume HostedEngine`) but haven't come up with something that is able to 
> start.
> 
> Anyone know how to resolve this? Am I even chasing the right path?
> 
> Let's start with the negative - this should have been prevented by [1].
> Can it be that the custom bios type that the hosted engine VM was set with 
> was manually dropped in this environment?
> 
> The positive is that the VM starts. This means that from the chipset 
> perspective, the configuration is valid.
> So I wouldn't try to change it to i440fx, but only to switch the firmware to 
&

[ovirt-users] Hosted Engine stuck in bios

2021-01-20 Thread Joseph Gelinas
Hi,

I recently updated oVirt from 4.4.1 or 4.4.3 to 4.4.4, also setting the default 
datacenter from 4.4 to 4.5 and making the default BIOS type Q35+UEFI; 
unfortunately, that is quite a few changes at once. Now the hosted engine 
doesn't boot anymore, and `hosted-engine --console` just shows the firmware 
output below:

 RHEL
 RHEL-8.1.0 PC (Q35 + ICH9, 2009)    2.00 GHz
 0.0.0                               16384 MB RAM

   Select Language                   This is the option
                                     one adjusts to change
 > Device Manager                    the language for the
 > Boot Manager                      current system
 > Boot Maintenance Manager

   Continue
   Reset

  ^v=Move Highlight       <Enter>=Select Entry


When in this state `hosted-engine --vm-status` says it is up but failed 
liveliness check

hosted-engine --vm-status | grep -i engine\ status
Engine status  : {"vm": "down", "health": "bad", "detail": 
"unknown", "reason": "vm not running on this host"}
Engine status  : {"vm": "up", "health": "bad", "detail": 
"Up", "reason": "failed liveliness check"}
Engine status  : {"vm": "down", "health": "bad", "detail": 
"Down", "reason": "bad vm status"}
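
The "Engine status" values above are JSON, so they can be pulled out and compared per host instead of read by eye. A rough sketch, with hypothetical function names, that also tolerates the line wrapping a mail client adds:

```python
import json
import re

# An "Engine status" label followed by a flat JSON object (no nested braces).
ENGINE_STATUS = re.compile(r"Engine status\s*:\s*(\{.*?\})", re.DOTALL)


def parse_engine_status(vm_status_text):
    """Return one dict per 'Engine status' entry that carries a JSON payload."""
    statuses = []
    for match in ENGINE_STATUS.finditer(vm_status_text):
        # Re-join payloads that were wrapped across lines.
        payload = " ".join(match.group(1).split())
        statuses.append(json.loads(payload))
    return statuses


def hosts_running_engine(statuses):
    """Indexes (in output order) of the hosts that report the engine VM as up."""
    return [i for i, status in enumerate(statuses) if status.get("vm") == "up"]
```

Entries without a JSON payload, such as `unknown stale-data`, are skipped. For the output above this would report the second host as the one running the VM, with the reason "failed liveliness check".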

I assume I am running into https://access.redhat.com/solutions/5341561 (RHV: 
Hosted-Engine VM fails to start after changing the cluster to Q35/UEFI) however 
how to fix that isn't really described. I have tried starting hosted engine 
paused (`hosted-engine --vm-start-paused`) and editing the config (`virsh -c 
qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf edit 
HostedEngine`) to have pc-i440fx instead and removing a bunch of pcie lines etc 
until it will accept the config and then resuming hosted engine (`virsh -c 
qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf resume 
HostedEngine`) but haven't come up with something that is able to start.

Anyone know how to resolve this? Am I even chasing the right path?


/var/log/libvirt/qemu/HostedEngine.log 

2021-01-20 15:31:56.500+: starting up libvirt version: 6.6.0, package: 
7.1.el8 (CBS , 2020-12-10-14:05:40, ), qemu version: 
5.1.0qemu-kvm-5.1.0-14.el8.1, kernel: 4.18.0-240.1.1.el8_3.x86_64, hostname: 
ovirt-3
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-25-HostedEngine/.config \
QEMU_AUDIO_DRV=spice \
/usr/libexec/qemu-kvm \
-name guest=HostedEngine,debug-threads=on \
-S \
-object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-25-HostedEngine/master-key.aes
 \
-blockdev 
'{"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}'
 \
-blockdev 
'{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}'
 \
-blockdev 
'{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/81816cd3-5816-4185-b553-b5a636156fbd.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}'
 \
-blockdev 
'{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}'
 \
-machine 
pc-q35-rhel8.1.0,accel=kvm,usb=off,dump-guest-core=off,pflash0=libvirt-pflash0-forma