On Wed, Mar 10, 2021 at 10:16 PM penguin pages <jeremey.w...@gmail.com> wrote:
>
> well.. figured the package remove was means to get rid of "upgrade pending"
> which would then allow me to get engine failover to start working.... but...
> ya.. don't do that.

If you refer to "Use --allowerasing without fully understanding what's
going to be erased", then I definitely agree - don't do that.

>
> How to destroy engine:
> 1) yum update --allowerasing

What did it remove? If this includes vdsm, it will definitely prevent
starting the engine vm.

> 2) reboot
> 3) no more engine starting.
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/troubleshooting
>
> Validated services look ok
> [root@thor ~]# systemctl status ovirt-ha-proxy
> Unit ovirt-ha-proxy.service could not be found.
> [root@thor ~]# systemctl status ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
>    Active: active (running) since Wed 2021-03-10 14:55:17 EST; 14min ago
>  Main PID: 6390 (ovirt-ha-agent)
>     Tasks: 2 (limit: 1080501)
>    Memory: 25.8M
>    CGroup: /system.slice/ovirt-ha-agent.service
>            └─6390 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
>
> Mar 10 14:55:17 thor.penguinpages.local systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> [root@thor ~]# systemctl status -l ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
>    Active: active (running) since Wed 2021-03-10 14:55:17 EST; 16min ago
>  Main PID: 6390 (ovirt-ha-agent)
>     Tasks: 2 (limit: 1080501)
>    Memory: 25.6M
>    CGroup: /system.slice/ovirt-ha-agent.service
>            └─6390 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
>
> Mar 10 14:55:17 thor.penguinpages.local systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> [root@thor ~]# journalctl -u ovirt-ha-agent
>
> -- Logs begin at Wed 2021-03-10 14:47:34 EST, end at Wed 2021-03-10 15:12:12 EST. --
> Mar 10 14:48:35 thor.penguinpages.local systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> Mar 10 14:48:37 thor.penguinpages.local ovirt-ha-agent[3463]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
> Mar 10 14:48:37 thor.penguinpages.local ovirt-ha-agent[3463]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 85, in start_monitor

I think this is while trying to connect to ovirt-ha-broker, you might
want to check the status of that one.

>     response = self._proxy.start_monitor(type, options)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
>     return self.__send(self.__name, args)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
>     verbose=self.__verbose
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
>     return self.single_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
>     http_conn = self.send_request(host, handler, request_body, verbose)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request
>     self.send_content(connection, request_body)
>   File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
>     connection.endheaders(request_body)
>   File "/usr/lib64/python3.6/http/client.py", line 1264, in endheaders
>     self._send_output(message_body, encode_chunked=encode_chunked)
>   File "/usr/lib64/python3.6/http/client.py", line 1040, in _send_output
>     self.send(msg)
>   File "/usr/lib64/python3.6/http/client.py", line 978, in send
>     self.connect()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 74, in connect
>     self.sock.connect(base64.b16decode(self.host))
> FileNotFoundError: [Errno 2] No such file or directory
> [root@thor ~]# tail /var/log/messages
>
> error rotating in /var/log/messages but I think this is just some form of
> "engine is fubar.
> "/usr/lib64/python3.6/smtplib.py", line 336, in connect#012 self.sock = self._get_socket(host, port, self.timeout)#012 File "/usr/lib64/python3.6/smtplib.py", line 307, in _get_socket#012 self.source_address)#012 File "/usr/lib64/python3.6/socket.py", line 724, in create_connection#012 raise err#012 File "/usr/lib64/python3.6/socket.py", line 713, in create_connection#012 sock.connect(sa)#012ConnectionRefusedError: [Errno 111] Connection refused
> Mar 10 15:08:59 thor journal[1454]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERROR [Errno 111] Connection refused#012Traceback (most recent call last):#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 29, in send_email#012 timeout=float(cfg["smtp-timeout"]))#012 File "/usr/lib64/python3.6/smtplib.py", line 251, in __init__#012 (code, msg) = self.connect(host, port)#012 File "/usr/lib64/python3.6/smtplib.py", line 336, in connect#012 self.sock = self._get_socket(host, port, self.timeout)#012 File "/usr/lib64/python3.6/smtplib.py", line 307, in _get_socket#012 self.source_address)#012 File "/usr/lib64/python3.6/socket.py", line 724, in create_connection#012 raise err#012 File "/usr/lib64/python3.6/socket.py", line 713, in create_connection#012 sock.connect(sa)#012ConnectionRefusedError: [Errno 111] Connection refused
> Mar 10 15:08:59 thor journal[1454]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.notifications.Notifications ERROR [Errno 111] Connection refused#012Traceback (most recent call last):#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 29, in send_email#012 timeout=float(cfg["smtp-timeout"]))#012 File "/usr/lib64/python3.6/smtplib.py", line 251, in __init__#012 (code, msg) = self.connect(host, port)#012 File "/usr/lib64/python3.6/smtplib.py", line 336, in connect#012 self.sock = self._get_socket(host, port, self.timeout)#012 File "/usr/lib64/python3.6/smtplib.py", line 307, in _get_socket#012 self.source_address)#012 File "/usr/

I think this is while it's trying to email a notification (about the
failure?). Can be ignored, in itself - probably your sendmail is down.

>
>
> I guess I get to re-deploy.. again.

Good luck and best regards,
-- 
Didi