kiranchavala commented on issue #7543:
URL: https://github.com/apache/cloudstack/issues/7543#issuecomment-3782700146
I used the Sushy Tools emulator to test the Host HA feature and found that the
integration is not working.
The host HA state remains Degraded and never moves to Fenced.
Steps to reproduce the issue
1. Have a CloudStack 4.22 environment running on Ubuntu 24.04.3
2. Have a KVM cluster with 3 hosts
3. Testing Flow
CloudStack Management Server
|
| Redfish (HTTP)
v
+-------------------------+
| Sushy Emulator |
| (Redfish API) |
+-------------------------+
|
| Virtual power state
v
KVM Host (libvirt/cloudstack-agent stopped / started)
4. Run the Sushy emulator on the management server or on another standalone VM
Steps to install the Sushy Tools emulator
```
apt install -y python3-full python3-venv git jq
python3 -m venv /opt/sushy
source /opt/sushy/bin/activate
pip install sushy-tools
sushy-emulator --help
mkdir -p /etc/sushy
vi /etc/sushy/sushy.conf
SUSHY_LISTEN_IP = "0.0.0.0"
SUSHY_LISTEN_PORT = "80"
SUSHY_SSL = False
SUSHY_AUTH_FILE = "/etc/sushy/auth.conf"
vi /etc/sushy/auth.conf
root:password
chmod 600 /etc/sushy/auth.conf
source /opt/sushy/bin/activate
sushy-emulator --config /etc/sushy/sushy.conf
sushy-emulator --config /etc/sushy/sushy.conf -i 0.0.0.0 -p 80 --fake
curl -u root:password http://192.168.55.167/redfish/v1/Systems
curl -u root:password \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094
curl -u root:password \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 \
| jq '.Actions."#ComputerSystem.Reset"."ResetType@Redfish.AllowableValues"'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2821 100 2821 0 0 787k 0 --:--:-- --:--:-- --:--:-- 918k
[
"On",
"ForceOff",
"GracefulShutdown",
"GracefulRestart",
"ForceRestart",
"Nmi",
"ForceOn"
]
# Check the current power state
root@acs420:~# curl -u root:password \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 \
| jq .PowerState
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2822 100 2822 0 0 1068k 0 --:--:-- --:--:-- --:--:-- 1377k
"Off"
# Power on with ForceOn
curl -u root:password -X POST -H "Content-Type: application/json" \
-d '{"ResetType":"ForceOn"}' \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
# Check that the power state change was applied
root@acs420:~# curl -u root:password \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 \
| jq .PowerState
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2822 100 2822 0 0 1068k 0 --:--:-- --:--:-- --:--:-- 1377k
"On"
# Force a shutdown with ForceOff
curl -u root:password -X POST -H "Content-Type: application/json" \
-d '{"ResetType":"ForceOff"}' \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
```
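The curl calls above can also be scripted. The following is a minimal Python sketch, not part of the original test, that builds the same Redfish requests with the standard library only. The helper names (`system_url`, `power_state_request`, `reset_request`, `read_power_state`) are hypothetical; the address, system UUID, and credentials are the ones used in the curl commands. Sending the request is left to the caller.

```python
import base64
import json
import urllib.request

BASE = "http://192.168.55.167"  # emulator address from the curl commands
SYSTEM_ID = "27946b59-9e44-4fa7-8e91-f3527a1ef094"  # system UUID from the curl commands


def basic_auth_header(user, password):
    """Build the HTTP Basic value that `curl -u user:password` would send."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def system_url(base, system_id):
    """URL of the Redfish ComputerSystem resource."""
    return f"{base}/redfish/v1/Systems/{system_id}"


def power_state_request(base, system_id, user, password):
    """GET request for the system resource, which carries PowerState."""
    req = urllib.request.Request(system_url(base, system_id), method="GET")
    req.add_header("Authorization", basic_auth_header(user, password))
    return req


def reset_request(base, system_id, reset_type, user, password):
    """POST request for the ComputerSystem.Reset action."""
    body = json.dumps({"ResetType": reset_type}).encode()
    req = urllib.request.Request(
        system_url(base, system_id) + "/Actions/ComputerSystem.Reset",
        data=body,
        method="POST",
    )
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", basic_auth_header(user, password))
    return req


def read_power_state(response_body):
    """Extract PowerState from the JSON body of the system resource."""
    return json.loads(response_body)["PowerState"]
```

To run it against a live emulator, pass a request to `urllib.request.urlopen(...)` and feed the response body of the GET to `read_power_state`.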
5. On KVM Host 1, perform the following actions:
Enable HA
<img width="1603" height="510" alt="Image"
src="https://github.com/user-attachments/assets/0c5ea492-92f0-4103-836f-21abdfeb4da4"
/>
Configure HA
<img width="1611" height="561" alt="Image"
src="https://github.com/user-attachments/assets/1499ee14-cfef-4615-b8d9-d5f3f2710c52"
/>
Configure OOBM
<img width="1614" height="727" alt="Image"
src="https://github.com/user-attachments/assets/6181d27b-57e0-4ca4-8c3f-9e464899813c"
/>
6. Check the power state of the host; it should be On
<img width="1638" height="226" alt="Image"
src="https://github.com/user-attachments/assets/5b5fc7da-c964-4bf1-9f61-eb9ca3868565"
/>
7. Test whether the OOBM power actions work
<img width="691" height="623" alt="Image"
src="https://github.com/user-attachments/assets/4945dcf5-210b-48fc-bd30-aa3f3a2dc5ab"
/>
All the power actions work except Power-Cycle, which the emulator does not
support.
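To make the Power-Cycle failure concrete, here is a small Python sketch that compares the ResetType sent for each OOBM action with the values the emulator advertised. The ON, OFF, SOFT, and RESET entries are taken from the RedfishClient log lines in the LOGS section; the CYCLE entry (`PowerCycle`) is an assumption, because the log records only the HTTP 400 and not the value that was sent. The names `OOBM_TO_RESET_TYPE` and `unsupported_actions` are illustrative.

```python
# ResetType values the emulator listed under ResetType@Redfish.AllowableValues
EMULATOR_ALLOWED = {
    "On", "ForceOff", "GracefulShutdown", "GracefulRestart",
    "ForceRestart", "Nmi", "ForceOn",
}

# OOBM action -> ResetType, as seen in the RedfishClient log lines
OOBM_TO_RESET_TYPE = {
    "ON": "On",
    "OFF": "GracefulShutdown",
    "SOFT": "GracefulShutdown",
    "RESET": "ForceRestart",
    "CYCLE": "PowerCycle",  # ASSUMPTION: the value sent is not shown in the log
}


def unsupported_actions(mapping, allowed):
    """Return the OOBM actions whose ResetType the endpoint does not list."""
    return sorted(action for action, reset in mapping.items() if reset not in allowed)


print(unsupported_actions(OOBM_TO_RESET_TYPE, EMULATOR_ALLOWED))
```

Under that assumption, only CYCLE is flagged, which is consistent with the 400 response logged for the Cycle action.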
LOGS
```
On
root@acs420:~# cat /var/log/cloudstack/management/management-server.log
|grep -i "logid:d527953b"
2026-01-21 07:57:58,178 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"ON\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5110\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2767,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"d527953b-589a-46ce-b22c-edad6b113aaf"}
2026-01-21 07:57:58,185 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 07:57:58,186 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Sending ComputerSystem.Reset Command 'On' to host '192.168.55.167' with request
'POST
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
2026-01-21 07:57:58,191 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Complete async job-2767, jobStatus: SUCCEEDED, resultCode: 0, result:
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"Off","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"ON","description":"200","status":"true"}
2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Publish async job-2767 complete on message bus
2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Wake up jobs related to job-2767
2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Update db status for job-2767
2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Wake up jobs joined with job-2767 and disjoin all subjobs created from job- 2767
2026-01-21 07:57:58,195 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b)
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 07:57:58,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2767
2026-01-21 07:57:58,196 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Remove job-2767
from job monitoring
OFF
root@acs420:~# cat /var/log/cloudstack/management/management-server.log
|grep -i "logid:f1c83fdb"
2026-01-21 08:00:09,567 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"OFF\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5113\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2768,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"f1c83fdb-d274-4025-b16c-6e609f82ff70"}
2026-01-21 08:00:09,577 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 08:00:09,578 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Sending ComputerSystem.Reset Command 'GracefulShutdown' to host
'192.168.55.167' with request 'POST
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Complete async job-2768, jobStatus: SUCCEEDED, resultCode: 0, result:
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"OFF","description":"200","status":"true"}
2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Publish async job-2768 complete on message bus
2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Wake up jobs related to job-2768
2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Update db status for job-2768
2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Wake up jobs joined with job-2768 and disjoin all subjobs created from job- 2768
2026-01-21 08:00:09,587 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb)
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 08:00:09,588 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2768
2026-01-21 08:00:09,588 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Remove job-2768
from job monitoring
Cycle
2026-01-21 08:01:59,844 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"CYCLE\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5118\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2769,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"c78789d9-19a1-4525-89f9-120d1c7c764a"}
2026-01-21 08:01:59,853 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769, ctx-00d47bde]) (logid:c78789d9)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 08:01:59,859 ERROR [c.c.a.ApiAsyncJobDispatcher]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Unexpected
exception while executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
org.apache.cloudstack.utils.redfish.RedfishException: Failed to execute System
power command for host by performing 'POST' request on URL
'http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
and host address '192.168.55.167'. The expected HTTP status code is '2XX' but
it got '400'.
2026-01-21 08:01:59,859 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Complete async
job-2769, jobStatus: FAILED, resultCode: 530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
to execute System power command for host by performing 'POST' request on URL
'http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
and host address '192.168.55.167'. The expected HTTP status code is '2XX' but
it got '400'."}
2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Publish async
job-2769 complete on message bus
2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Wake up jobs
related to job-2769
2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Update db
status for job-2769
2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Wake up jobs
joined with job-2769 and disjoin all subjobs created from job- 2769
2026-01-21 08:01:59,864 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Retrieved
cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 08:01:59,865 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2769
2026-01-21 08:01:59,865 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Remove job-2769
from job monitoring
Reset
root@acs420:~# cat /var/log/cloudstack/management/management-server.log
|grep -i "logid:354c4b59"
2026-01-21 08:04:25,665 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"RESET\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5124\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2770,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"354c4b59-6446-487f-b507-d93b8b605733"}
2026-01-21 08:04:25,673 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 08:04:25,674 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Sending ComputerSystem.Reset Command 'ForceRestart' to host '192.168.55.167'
with request 'POST
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Complete async job-2770, jobStatus: SUCCEEDED, resultCode: 0, result:
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"RESET","description":"200","status":"true"}
2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Publish async job-2770 complete on message bus
2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Wake up jobs related to job-2770
2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Update db status for job-2770
2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Wake up jobs joined with job-2770 and disjoin all subjobs created from job- 2770
2026-01-21 08:04:25,683 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59)
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 08:04:25,684 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2770
2026-01-21 08:04:25,684 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Remove job-2770
from job monitoring
status
2026-01-21 08:05:18,784 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"STATUS\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5127\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2771,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"820aa1fd-4c2e-46f9-b71e-efef540fc9e5"}
2026-01-21 08:05:18,791 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 08:05:18,792 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Retrieved System power state 'On' with request 'GET:
http://192.168.55.167/redfish/v1/Systems/System-1'
2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Complete async job-2771, jobStatus: SUCCEEDED, resultCode: 0, result:
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"STATUS","description":"200","status":"true"}
2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Publish async job-2771 complete on message bus
2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Wake up jobs related to job-2771
2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Update db status for job-2771
2026-01-21 08:05:18,799 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Wake up jobs joined with job-2771 and disjoin all subjobs created from job- 2771
2026-01-21 08:05:18,802 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd)
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 08:05:18,803 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2771
2026-01-21 08:05:18,803 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Remove job-2771
from job monitoring
Soft
root@acs420:~# cat /var/log/cloudstack/management/management-server.log
|grep -i "logid:0295ebe0"
2026-01-21 07:54:52,937 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Executing
AsyncJob
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"SOFT\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5105\",\"ctxDetails\":\"{\\\"interface
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2766,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"0295ebe0-cc53-4e4d-b514-06d62ebce244"}
2026-01-21 07:54:52,948 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Retrieved System ID 'System-1' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-21 07:54:52,950 DEBUG [o.a.c.u.r.RedfishClient]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Sending ComputerSystem.Reset Command 'GracefulShutdown' to host
'192.168.55.167' with request 'POST
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
2026-01-21 07:54:52,955 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Complete async job-2766, jobStatus: SUCCEEDED, resultCode: 0, result:
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"Off","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"SOFT","description":"200","status":"true"}
2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Publish async job-2766 complete on message bus
2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Wake up jobs related to job-2766
2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Update db status for job-2766
2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Wake up jobs joined with job-2766 and disjoin all subjobs created from job- 2766
2026-01-21 07:54:52,960 DEBUG [c.c.a.ApiServer]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0)
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
2026-01-21 07:54:52,961 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Done executing
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
for job-2766
2026-01-21 07:54:52,961 INFO [o.a.c.f.j.i.AsyncJobMonitor]
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Remove job-2766
from job monitoring
```
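When several power actions are tested in a row, the job outcomes can be pulled out of the log mechanically. The sketch below is a hypothetical helper (`job_results` is not a CloudStack tool) that scans log text for the `Complete async job-N, jobStatus: ..., resultCode: ...` message seen above. It tolerates the line wrapping present in this paste. The `result:` payloads in the sample are truncated for brevity.

```python
import re

# Whitespace is matched loosely so that wrapped log entries still match.
JOB_DONE = re.compile(
    r"Complete\s+async\s+job-(?P<job>\d+),\s+jobStatus:\s+(?P<status>\w+),"
    r"\s+resultCode:\s+(?P<code>\d+)"
)


def job_results(log_text):
    """Return (job id, status, result code) for every completed async job."""
    return [
        (int(m["job"]), m["status"], int(m["code"]))
        for m in JOB_DONE.finditer(log_text)
    ]


# Two entries from the log above, with the result payload cut short.
sample = (
    "2026-01-21 07:57:58,191 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] "
    "(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) "
    "Complete async job-2767, jobStatus: SUCCEEDED, resultCode: 0, result:\n"
    "2026-01-21 08:01:59,859 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] "
    "(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Complete async\n"
    "job-2769, jobStatus: FAILED, resultCode: 530, result:\n"
)

for job, status, code in job_results(sample):
    print(job, status, code)
```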
8. Configure the following global settings
<img width="1643" height="854" alt="Image"
src="https://github.com/user-attachments/assets/6ad3ce53-ad01-46a9-bdd0-b3c46638242b"
/>
```
commands.timeout : CheckHealthCommand=5,CheckOnHostCommand=5
kvm.ha.fence.on.storage.heartbeat.failure to true
kvm.ha.fence.on.storage.heartbeat.failure : 1
kvm.ha.activity.check.interval : 10
kvm.ha.activity.check.max.attempts : 1
kvm.ha.activity.check.timeout : 60
kvm.ha.degraded.max.period : 300
```
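The `commands.timeout` value above is a comma-separated list of `Command=value` pairs. As a quick illustration of that format, the value can be split as follows; `parse_commands_timeout` is a hypothetical helper for reading the setting, not CloudStack's own parsing code.

```python
def parse_commands_timeout(value):
    """Split 'A=1,B=2' into {'A': 1, 'B': 2}, ignoring empty segments."""
    result = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        name, _, timeout = pair.partition("=")
        result[name.strip()] = int(timeout)
    return result


print(parse_commands_timeout("CheckHealthCommand=5,CheckOnHostCommand=5"))
```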
9. Test Host HA when there are no VMs running on the host.
Stop the CloudStack agent on KVM host 1
```
service libvirtd stop
service cloudstack-agent stop
systemctl stop multipathd
# Block traffic on port 8250 with iptables
iptables -I OUTPUT -p tcp --dport 8250 -j DROP
iptables -I INPUT -p tcp --dport 8250 -j DROP
```
10. Turn off power state manually
```
curl -u root:password -X POST -H "Content-Type: application/json" \
-d '{"ResetType":"ForceOff"}' \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
curl -u root:password \
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 \
| jq .PowerState
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2822 100 2822 0 0 217k 0 --:--:-- --:--:-- --:--:-- 229k
"Off"
```
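11. Check the management server logs
Every five minutes (kvm.ha.degraded.max.period is set to 300 above) the host is
rechecked: it moves from Degraded to Suspect and Checking, then returns to
Degraded without ever reaching Fenced.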
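The HAManagerImpl output that follows can be reduced to its state transitions. This is a minimal Python sketch, assuming the `Transitioned host HA state from: X to: Y due to event:Z` message format seen in this log and assuming each wrapped log entry has been rejoined onto one line. `ha_transitions` is a hypothetical helper, and the sample entries are abridged (the host JSON is dropped).

```python
import re
from datetime import datetime

TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*"
    r"Transitioned host HA state from: (?P<old>\w+) to: (?P<new>\w+) "
    r"due to event:(?P<event>\w+)"
)


def ha_transitions(log_text):
    """Return (timestamp, old state, new state, event) per transition entry."""
    out = []
    for line in log_text.splitlines():
        m = TRANSITION.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            out.append((ts, m["old"], m["new"], m["event"]))
    return out


# Abridged entries from the management server log for host id 6.
sample = (
    "2026-01-22 05:44:19,416 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-49:[]) (logid:) "
    "Transitioned host HA state from: Available to: Suspect due to event:HealthCheckFailed\n"
    "2026-01-22 05:44:23,371 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-3:[ctx-05db2251]) (logid:ad0ef2d6) "
    "Transitioned host HA state from: Suspect to: Checking due to event:PerformActivityCheck\n"
    "2026-01-22 05:44:23,377 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[]) (logid:) "
    "Transitioned host HA state from: Checking to: Degraded due to event:ActivityCheckFailureUnderThresholdRatio\n"
    "2026-01-22 05:49:23,773 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) "
    "Transitioned host HA state from: Degraded to: Suspect due to event:PeriodicRecheckResourceActivity\n"
)

events = ha_transitions(sample)
entered = next(ts for ts, _, new, _ in events if new == "Degraded")
left = next(ts for ts, old, _, _ in events if old == "Degraded")
degraded_seconds = (left - entered).total_seconds()
reached_fenced = any(new == "Fenced" for _, _, new, _ in events)
print(degraded_seconds, reached_fenced)
```

For these entries, the time spent in Degraded before the recheck matches the 300 configured for kvm.ha.degraded.max.period, and no transition to Fenced appears.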
```
tail -f /var/log/cloudstack/management/management-server.log | egrep -i "o.a.c.h.HAManagerImpl]"
2026-01-22 05:44:07,386 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-43:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 05:44:11,392 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-45:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 05:44:15,395 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-47:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 05:44:19,413 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-49:[])
(logid:) HA state post-transition:: new state=[Suspect], old state=[Available],
for resource id=[6], status=[true], ha config state=[Suspect].
2026-01-22 05:44:19,416 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-49:[])
(logid:) Transitioned host HA state from: Available to: Suspect due to
event:HealthCheckFailed for the host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:44:23,368 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-3:[ctx-05db2251]) (logid:ad0ef2d6) HA state
post-transition:: new state=[Checking], old state=[Suspect], for resource
id=[6], status=[true], ha config state=[Checking].
2026-01-22 05:44:23,371 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-3:[ctx-05db2251]) (logid:ad0ef2d6) Transitioned host
HA state from: Suspect to: Checking due to event:PerformActivityCheck for the
host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:44:23,374 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[])
(logid:) HA state post-transition:: new state=[Degraded], old state=[Checking],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:44:23,377 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[])
(logid:) Transitioned host HA state from: Checking to: Degraded due to
event:ActivityCheckFailureUnderThresholdRatio for the host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:44:23,406 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-2:[])
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:44:27,427 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-4:[])
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:49:23,771 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) HA state
post-transition:: new state=[Suspect], old state=[Degraded], for resource
id=[6], status=[true], ha config state=[Suspect].
2026-01-22 05:49:23,773 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) Transitioned host
HA state from: Degraded to: Suspect due to
event:PeriodicRecheckResourceActivity for the host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:49:27,790 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-4:[ctx-3a6054e2]) (logid:116db6f1) HA state
post-transition:: new state=[Checking], old state=[Suspect], for resource
id=[6], status=[true], ha config state=[Checking].
2026-01-22 05:49:27,793 DEBUG [o.a.c.h.HAManagerImpl]
(BackgroundTaskPollManager-4:[ctx-3a6054e2]) (logid:116db6f1) Transitioned host
HA state from: Suspect to: Checking due to event:PerformActivityCheck for the
host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:49:27,796 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[])
(logid:) HA state post-transition:: new state=[Degraded], old state=[Checking],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:49:27,798 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[])
(logid:) Transitioned host HA state from: Checking to: Degraded due to
event:ActivityCheckFailureUnderThresholdRatio for the host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 05:49:27,876 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-3:[])
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:49:31,895 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-5:[])
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded],
for resource id=[6], status=[true], ha config state=[Degraded].
2026-01-22 05:49:35,899 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-7:[])
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded],
for resource id=[6], status=[true], ha config state=[Degraded].
tail -f /var/log/cloudstack/management/management-server.log | grep -i "o.a.c.k.h.KVMHostActivityChecker"
2026-01-22 05:46:43,545 DEBUG [o.a.c.k.h.KVMHostActivityChecker]
(pool-3-thread-22:[]) (logid:) Investigating Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
via neighbouring Host
{"id":7,"name":"kvmhost2","type":"Routing","uuid":"c30b8e0d-0a48-40fa-bb57-c2de13513ca5"}.
2026-01-22 05:46:43,594 DEBUG [o.a.c.k.h.KVMHostActivityChecker]
(pool-3-thread-22:[]) (logid:) Neighbouring Host
{"id":7,"name":"kvmhost2","type":"Routing","uuid":"c30b8e0d-0a48-40fa-bb57-c2de13513ca5"}
returned status [Down] for the investigated Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}.
2026-01-22 05:46:43,594 DEBUG [o.a.c.k.h.KVMHostActivityChecker]
(pool-3-thread-22:[]) (logid:) Investigating Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
via neighbouring Host
{"id":13,"name":"kvmhost3","type":"Routing","uuid":"3cfc4b1f-1aa9-40e5-a549-484ab4fcc921"}.
2026-01-22 05:46:43,642 DEBUG [o.a.c.k.h.KVMHostActivityChecker]
(pool-3-thread-22:[]) (logid:) Neighbouring Host
{"id":13,"name":"kvmhost3","type":"Routing","uuid":"3cfc4b1f-1aa9-40e5-a549-484ab4fcc921"}
returned status [Down] for the investigated Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}.
2026-01-22 05:46:43,642 DEBUG [o.a.c.k.h.KVMHostActivityChecker]
(pool-3-thread-22:[]) (logid:) Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
has the status [Down].
tail -f /var/log/cloudstack/management/management-server.log | grep -i "Redfish"
2026-01-22 05:49:21,565 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-16:[ctx-394c23c2]) (logid:) Retrieved System ID
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-22 05:49:21,567 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-16:[ctx-394c23c2]) (logid:) Retrieved System power state 'Off'
with request 'GET:
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
2026-01-22 05:49:25,566 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-17:[ctx-11e469a8]) (logid:) Retrieved System ID
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-22 05:49:25,568 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-17:[ctx-11e469a8]) (logid:) Retrieved System power state 'Off'
with request 'GET:
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
tail -f /var/log/cloudstack/management/management-server.log | grep -i "AgentTaskPool"
2026-01-22 05:44:17,263 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Acquired lock on host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"},
to process agent disconnection
2026-01-22 05:44:17,263 INFO [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Host AgentAttache
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"} is
disconnecting with event ShutdownRequested
2026-01-22 05:44:17,263 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) The next status of agent Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
is Disconnected, current status is Up
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Deregistering link for
AgentAttache
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with state Disconnected
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Remove Agent : AgentAttache
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Processing disconnect [id: 6,
uuid: 650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f, name: kvmhost1]
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Disconnecting from
/192.168.55.164, Socket Address: /192.168.55.164:60614
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Seq 27-8555431917120389122:
Sending disconnect to class com.cloud.network.security.SecurityGroupListener
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener:
org.apache.cloudstack.hypervisor.external.discoverer.ExternalServerDiscoverer
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: org.apache.cloudstack.network.tungsten.service.TungstenElement
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: org.apache.cloudstack.service.NsxElement
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: org.apache.cloudstack.service.NetrisElement
2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.storage.listener.StoragePoolMonitor
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.vmware.manager.VmwareManagerImpl
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.deploy.DeploymentPlanningManagerImpl
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.security.SecurityGroupListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.storage.secondary.SecondaryStorageListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.capacity.StorageCapacityListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.capacity.ComputeCapacityListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.storage.download.DownloadListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.consoleproxy.ConsoleProxyListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.agent.manager.AgentManagerImpl$BehindOnPingListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.agent.manager.AgentManagerImpl$SetHostParamsListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.SshKeysDistriMonitor
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
2026-01-22 05:44:17,265 DEBUG [c.c.n.NetworkUsageManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Disconnected called on [id:
6, uuid: 650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f, name: kvmhost1] with status
Disconnected
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.SshKeysDistriMonitor
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.storage.LocalStoragePoolListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.storage.upload.UploadListener
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.kvm.discoverer.KvmServerDiscoverer
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.kvm.discoverer.LxcServerDiscoverer
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to
listener: com.cloud.hypervisor.discoverer.CustomServerDiscoverer
2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) [Resource state = Enabled,
Agent event = , Host = ShutdownRequested]
2026-01-22 05:44:17,269 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Notifying other nodes of to
disconnect
```
12. Check the host state, Host HA state, resource state and power state
State: Disconnected
Host HA state: Degraded
Resource state: Enabled
Power state: Off
The Host HA state remains Degraded and doesn't move to Fenced,
nor does the power state change.
<img width="1649" height="188" alt="Image"
src="https://github.com/user-attachments/assets/83b8c769-f6c4-4a4d-8c97-171073af1c70"
/>
<img width="1639" height="731" alt="Image"
src="https://github.com/user-attachments/assets/4346287d-6d4a-4890-9613-c53879bd673b"
/>
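To confirm from the logs that no transition to Fenced was ever recorded, the HA state transitions can be pulled out of the management server log. This is only a minimal sketch: `ha_transitions` is a hypothetical helper name, and it assumes each log entry sits on a single line in the actual log file (the wrapping in the excerpts above comes from the email relay).

```shell
# Print one line per Host HA state transition found on stdin:
#   <timestamp> <from> -> <to> (<event>)
# Assumes the "Transitioned host HA state from: X to: Y due to event:Z"
# wording seen in the HAManagerImpl DEBUG entries above.
ha_transitions() {
  sed -n -E 's/^([0-9-]+ [0-9:,]+) .*Transitioned host HA state from: ([A-Za-z]+) to: ([A-Za-z]+) due to event:([A-Za-z]+).*/\1 \2 -> \3 (\4)/p'
}

# Usage (path taken from the tail commands above):
#   ha_transitions < /var/log/cloudstack/management/management-server.log
```

In this test run the output would be expected to contain the `Checking -> Degraded` entry but no line ending in `-> Fenced`.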
13. Bring back the KVM HOST
```
service libvirtd start
service cloudstack-agent start
systemctl start multipathd
iptables -D OUTPUT -p tcp --dport 8250 -j DROP
iptables -D INPUT -p tcp --dport 8250 -j DROP
curl -u root:password -X POST -H "Content-Type: application/json" \
  -d '{"ResetType":"ForceOn"}' \
  http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
```
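For repeated test runs, the Redfish reset call above can be wrapped so that a mistyped `ResetType` is caught before it reaches the emulator. This is a rough sketch, not part of the original test: `redfish_reset_payload` and `redfish_reset` are hypothetical helper names, the `REDFISH_USER`/`REDFISH_PASS` variables are assumptions, and the accepted values are simply the allowable values the emulator reported earlier.

```shell
# Build the JSON body for ComputerSystem.Reset, rejecting any value the
# emulator did not list among its allowable ResetType values.
redfish_reset_payload() {
  case "$1" in
    On|ForceOff|GracefulShutdown|GracefulRestart|ForceRestart|Nmi|ForceOn)
      printf '{"ResetType":"%s"}\n' "$1" ;;
    *)
      echo "unsupported ResetType: $1" >&2
      return 1 ;;
  esac
}

# $1 = base URL, $2 = system ID, $3 = ResetType
# Credentials default to the root:password pair used in this test setup.
redfish_reset() {
  payload=$(redfish_reset_payload "$3") || return 1
  curl -sS -u "${REDFISH_USER:-root}:${REDFISH_PASS:-password}" \
    -X POST -H "Content-Type: application/json" -d "$payload" \
    "$1/redfish/v1/Systems/$2/Actions/ComputerSystem.Reset"
}

# Usage:
#   redfish_reset http://192.168.55.167 27946b59-9e44-4fa7-8e91-f3527a1ef094 ForceOn
```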
Management server logs after bringing the host back:
```
2026-01-22 06:01:08,816 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-4:[])
(logid:) Transitioned host HA state from: Degraded to: Available due to
event:HealthCheckPassed for the host Host
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
with id: 6
2026-01-22 06:01:12,796 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-6:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 06:01:16,804 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-8:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 06:01:20,809 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-10:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[true], ha config
state=[Available].
2026-01-22 06:01:24,814 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-12:[])
(logid:) HA state pre-transition:: new state=[Available], old
state=[Available], for resource id=[6], status=[tru
2026-01-22 06:05:14,046 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-4:[ctx-961227c1]) (logid:) Retrieved System ID
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
2026-01-22 06:05:14,048 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-4:[ctx-961227c1]) (logid:) Retrieved System power state 'On'
with request 'GET:
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
2026-01-22 06:05:18,051 DEBUG [o.a.c.u.r.RedfishClient]
(pool-2-thread-5:[ctx-35dbc010]) (logid:) Retrieved System ID
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET:
http://192.168.55.167/redfish/v1/Systems'
```
<img width="1643" height="150" alt="Image"
src="https://github.com/user-attachments/assets/9f01f4ac-c4ea-4021-9b47-9ac828e77941"
/>
<img width="1645" height="719" alt="Image"
src="https://github.com/user-attachments/assets/bdcdacc7-1c80-4a7e-aca3-7b6d4a0ea22f"
/>
14. Test Host HA when an HA-enabled VM is running on KVM host 1.
Launch an HA-enabled VM on host 1, perform the same steps (9, 10), and
check the management server logs.
The Host HA state remains Degraded and the HA-enabled VM remains
on KVM host 1.
VM HA doesn't start the VM on another KVM host in the cluster.
<img width="1659" height="222" alt="Image"
src="https://github.com/user-attachments/assets/6d5c7717-508f-428e-adc7-8ca9854396f8"
/>
<img width="1665" height="207" alt="Image"
src="https://github.com/user-attachments/assets/087bf97a-42b5-4da0-b43f-5ed12407d495"
/>
References:
https://cwiki.apache.org/confluence/display/cloudstack/host+ha
https://cwiki.apache.org/confluence/display/CLOUDSTACK/KVM+HA+with+IPMI+Fencing
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
https://cwiki.apache.org/confluence/display/CLOUDSTACK/High+Availability+Developer's+Guide
https://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/