Hello,
It appears that my Manager / hosted-engine isn't working, and I'm unable to get 
it to start.

I have a 3-node HCI cluster, but right now, Gluster is only running on 1 host 
(so no replication).
I was hoping to upgrade / replace the storage on my 2nd host today, but aborted 
that maintenance when I found that I couldn't even get into the Manager.

The storage is mounted, but here's what I see:

> [root@cha2-storage dwhite]# hosted-engine --vm-status
> The hosted engine configuration has not been retrieved from shared storage.
> Please ensure that ovirt-ha-agent is running and the storage server is reachable.
> 

> [root@cha2-storage dwhite]# systemctl status ovirt-ha-agent
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
>    Active: active (running) since Fri 2021-08-13 11:10:51 EDT; 2h 44min ago
>  Main PID: 3591872 (ovirt-ha-agent)
>     Tasks: 1 (limit: 409676)
>    Memory: 21.5M
>    CGroup: /system.slice/ovirt-ha-agent.service
>            └─3591872 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
> 

> Aug 13 11:10:51 cha2-storage.mgt.barredowlweb.com systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.

Whenever I try to connect the engine storage, disconnect the engine storage, 
or connect to the console, the command just hangs, and I eventually have to 
Ctrl-C out of it.
Maybe I just need to be more patient? When I press Ctrl-C, I get a traceback:

> [root@cha2-storage dwhite]# hosted-engine --console
> ^CTraceback (most recent call last):
>   File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 214, in <module>
>     args.command(args)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 42, in func
>     f(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 91, in checkVmStatus
>     cli = ohautil.connect_vdsm_json_rpc()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 472, in connect_vdsm_json_rpc
>     __vdsm_json_rpc_connect(logger, timeout)
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 395, in __vdsm_json_rpc_connect
>     timeout=timeout)
>   File "/usr/lib/python3.6/site-packages/vdsm/client.py", line 154, in connect
>     outgoing_heartbeat=outgoing_heartbeat, nr_retries=nr_retries)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 426, in SimpleClient
>     nr_retries, reconnect_interval)
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/stompclient.py", line 448, in StandAloneRpcClient
>     client = StompClient(utils.create_connected_socket(host, port, sslctx),
>   File "/usr/lib/python3.6/site-packages/vdsm/utils.py", line 379, in create_connected_socket
>     sock.connect((host, port))
>   File "/usr/lib64/python3.6/ssl.py", line 1068, in connect
>     self._real_connect(addr, False)
>   File "/usr/lib64/python3.6/ssl.py", line 1059, in _real_connect
>     self.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
>     self._sslobj.do_handshake()
>   File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
>     self._sslobj.do_handshake()

This is what I see in /var/log/ovirt-hosted-engine-ha/broker.log:

> MainThread::WARNING::2021-08-11 10:24:41,596::storage_broker::100::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Connection to storage server failed
> MainThread::ERROR::2021-08-11 10:24:41,596::broker::69::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Failed initializing the broker: Connection to storage server failed
> MainThread::ERROR::2021-08-11 10:24:41,598::broker::71::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 64, in run
>     self._storage_broker_instance = self._get_storage_broker()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 143, in _get_storage_broker
>     return storage_broker.StorageBroker()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 97, in __init__
>     self._backend.connect()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 375, in connect
>     sserver.connect_storage_server()
>   File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_server.py", line 451, in connect_storage_server
>     'Connection to storage server failed'
> RuntimeError: Connection to storage server failed
> 

> MainThread::ERROR::2021-08-11 10:24:41,599::broker::72::ovirt_hosted_engine_ha.broker.broker.Broker::(run) Trying to restart the broker
> MainThread::INFO::2021-08-11 10:24:42,439::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.4.7 started
> MainThread::INFO::2021-08-11 10:24:44,442::monitor::45::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors
> MainThread::INFO::2021-08-11 10:24:44,443::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
> MainThread::INFO::2021-08-11 10:24:44,449::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
> MainThread::INFO::2021-08-11 10:24:44,450::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
> MainThread::INFO::2021-08-11 10:24:44,451::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
> MainThread::INFO::2021-08-11 10:24:44,451::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
> MainThread::INFO::2021-08-11 10:24:44,452::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor network
> MainThread::INFO::2021-08-11 10:24:44,452::monitor::62::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
> MainThread::INFO::2021-08-11 10:24:44,452::monitor::63::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors

And I see this in /var/log/vdsm/vdsm.log:

> 2021-08-13 14:08:10,844-0400 ERROR (Reactor thread) [ProtocolDetector.AcceptorImpl] Unhandled exception in acceptor (protocoldetector:76)
> Traceback (most recent call last):
>   File "/usr/lib64/python3.6/asyncore.py", line 108, in readwrite
>   File "/usr/lib64/python3.6/asyncore.py", line 417, in handle_read_event
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py", line 57, in handle_accept
>   File "/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py", line 173, in _delegate_call
>   File "/usr/lib/python3.6/site-packages/vdsm/protocoldetector.py", line 53, in handle_accept
>   File "/usr/lib64/python3.6/asyncore.py", line 348, in accept
>   File "/usr/lib64/python3.6/socket.py", line 205, in accept
> OSError: [Errno 24] Too many open files
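
The "Too many open files" error above suggests that a process has hit its 
file-descriptor limit. Below is a minimal Python sketch for checking that on 
Linux. It is only an illustration: the helper names (count_open_fds, 
soft_nofile_limit, fd_usage) are made up for this example and are not part of 
vdsm or oVirt, and it assumes /proc is mounted and that it is run as root when 
inspecting another process such as vdsmd.

```python
import os


def count_open_fds(pid):
    """Return the number of entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir("/proc/{}/fd".format(pid)))


def soft_nofile_limit(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    Returns None if the limit is 'unlimited' or the line is not found.
    """
    with open("/proc/{}/limits".format(pid)) as f:
        for line in f:
            if line.startswith("Max open files"):
                # Fields after splitting: Max, open, files, <soft>, <hard>, files
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open descriptor count, soft limit) for the given PID."""
    return count_open_fds(pid), soft_nofile_limit(pid)


if __name__ == "__main__":
    # Demonstration on this script's own process; substitute the PID of the
    # process you want to inspect (hypothetically, the vdsmd main PID).
    used, limit = fd_usage(os.getpid())
    print("open file descriptors: {}, soft limit: {}".format(used, limit))
```

If the count is at or near the soft limit, the process is likely leaking 
descriptors or the limit is too low for its workload. This sketch does not say 
which of the two it is.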

Can anyone help?

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YVD2IDBACJX4CILGSCW77WZEWAB2TLNX/
