A shot in the dark, haven't checked the log files properly.
For the hosts in the disconnected state, check them in the DB
cloud.host table (type="Routing" btw): which mgmt_server_id are they
reporting?
Then check the cloud.mshost table and see whether the management server
with that id is in there and marked as Up etc.
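
In case it helps, here is a rough Python sketch of that check. It is only
an illustration: the column names (mgmt_server_id, msid, state, removed)
are what I'd expect in a 4.x schema, and I'm assuming host.mgmt_server_id
lines up with mshost.msid rather than mshost.id, so verify both against
your own DB. The function name and the sample rows are made up; feed it
the rows returned by the two queries with whatever MySQL client you use.

```python
# Sketch only: table/column names are assumptions about a CloudStack 4.x
# schema, not taken from this thread's logs. Verify before relying on them.

HOST_QUERY = (
    "SELECT id, name, status, mgmt_server_id "
    "FROM cloud.host WHERE type = 'Routing' AND removed IS NULL"
)
MSHOST_QUERY = (
    "SELECT id, msid, name, state "
    "FROM cloud.mshost WHERE removed IS NULL"
)


def find_orphaned_hosts(hosts, mshosts):
    """Return hosts whose mgmt_server_id matches no management server in state 'Up'.

    hosts and mshosts are lists of dicts keyed by the column names selected
    in HOST_QUERY and MSHOST_QUERY (hypothetical helper, not a CloudStack API).
    """
    up_msids = {ms["msid"] for ms in mshosts if ms["state"] == "Up"}
    return [h for h in hosts if h["mgmt_server_id"] not in up_msids]


if __name__ == "__main__":
    # Made-up rows for illustration; only host id 248 comes from the thread.
    hosts = [
        {"id": 248, "name": "example-xcp", "status": "Alert", "mgmt_server_id": 111},
        {"id": 12, "name": "example-xen", "status": "Up", "mgmt_server_id": 222},
    ]
    mshosts = [
        {"id": 1, "msid": 111, "name": "ms-removed", "state": "Down"},
        {"id": 2, "msid": 222, "name": "ms-current", "state": "Up"},
    ]
    for h in find_orphaned_hosts(hosts, mshosts):
        print(h["id"], h["name"], h["mgmt_server_id"])
```

Any host it lists is pointing at a management server that is either
missing from mshost or not marked Up, which would be the first thing to
look at given that two of the three management servers were removed.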
HTH
On 2024-07-03 06:57, Janis Viklis | Files.fm wrote:
(sorry, some bad formatting in previous email)
Does anyone have any ideas why this error occurs and how to debug it?
(248 is a host id)
Monitor ComputeCapacityListener says there is an error in the connect
process for 248 due to null
Janis
On 2024-07-01 21:44, Janis Viklis | Files.fm wrote:
Hi,
looking for help after 2 weeks: what could be the reason that,
suddenly after restarting the 4.13.1 management server, all 4 Xen
(xcp-ng 8.1) hosts of one Intel cluster disconnect and go into the
"Alert" state with an error:
Monitor ComputeCapacityListener says there is an error in the connect
process for 248 due to null
I haven't been able to find the reason for 2 weeks. The other AMD
XenServer 6.5 cluster is working just fine.
Everything seems OK: the network is working. I restarted the toolstack,
both system VMs (SSVM, consolev) and one of the hosts, then removed it
and added it back.
Previously there were 3 management servers via HAProxy and Galera
MariaDB; I left only one (tried upgrading to 3.14.1, didn't help). I
can manage the hosts via XenCenter. There are 5 storage pools and 3
secondary.
Thanks, hoping for some clues or directions, Janis.
Below is the log output: