kiranchavala commented on issue #7543:
URL: https://github.com/apache/cloudstack/issues/7543#issuecomment-3782700146

   
   I used the Sushy Tools emulator to test the Host HA feature and found that the integration is not working.
   
   The host HA state stays in Degraded and never moves to Fenced.
   
   Steps to reproduce the issue
   
   
   1. Have a CloudStack 4.22 environment running on Ubuntu 24.04.3
   
   2. Have a KVM cluster with 3 hosts
   
   3. Testing Flow 
   
   CloudStack Management Server
           |
           |  Redfish (HTTP)
           v
   +-------------------------+
   |  Sushy Emulator         |
   |  (Redfish API)          |
   +-------------------------+
           |
           |  Virtual power state
           v
   KVM Host (libvirt/cloudstack-agent stopped / started)
   
   
   4. Run the Sushy emulator on the management server or on another standalone VM
   
   Steps to install the Sushy Tools emulator
   
   
   ```
   apt install -y python3-full python3-venv git jq
   python3 -m venv /opt/sushy
   source /opt/sushy/bin/activate
   pip install sushy-tools
   sushy-emulator --help
   mkdir -p /etc/sushy
   vi /etc/sushy/sushy.conf
   
   SUSHY_LISTEN_IP = "0.0.0.0"
   SUSHY_LISTEN_PORT = "80"
   SUSHY_SSL = False
   SUSHY_AUTH_FILE = "/etc/sushy/auth.conf"
   
   vi /etc/sushy/auth.conf
   root:password
   
   chmod 600 /etc/sushy/auth.conf
   
   source /opt/sushy/bin/activate
   
   
   sushy-emulator --config /etc/sushy/sushy.conf
     
   sushy-emulator --config /etc/sushy/sushy.conf -i 0.0.0.0 -p 80 --fake
   
   
   curl -u root:password http://192.168.55.167/redfish/v1/Systems
   
   curl -u root:password http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094
   
   
   
   curl -u root:password \
     http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 \
     | jq '.Actions."#ComputerSystem.Reset"."ResetType@Redfish.AllowableValues"'
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  
Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  2821  100  2821    0     0   787k      0 --:--:-- --:--:-- --:--:--  
918k
   [
     "On",
     "ForceOff",
     "GracefulShutdown",
     "GracefulRestart",
     "ForceRestart",
     "Nmi",
     "ForceOn"
   ]
   
   Check the current power state
   
   root@acs420:~# curl -u root:password http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 | jq .PowerState
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  
Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  2822  100  2822    0     0  1068k      0 --:--:-- --:--:-- --:--:-- 
1377k
   "Off"
   
   To ForceOn
   
   curl -u root:password -X POST -H "Content-Type: application/json" -d '{"ResetType":"ForceOn"}' http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
   
   
   Check that the power state change was applied
   
   root@acs420:~# curl -u root:password http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 | jq .PowerState
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  
Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  2822  100  2822    0     0  1068k      0 --:--:-- --:--:-- --:--:-- 
1377k
   "On"
   
   
   
   To ForceOff
   
   
   curl -u root:password -X POST -H "Content-Type: application/json" -d '{"ResetType":"ForceOff"}' http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
   
   
   ```
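
The curl calls above can also be scripted. The following is a minimal Python sketch, not part of the original test, that builds the same `ComputerSystem.Reset` POST request with basic authentication. The function name `build_reset_request` is hypothetical; the URL layout, credentials and system UUID are taken from the commands above. The request is only constructed here; sending it requires the emulator to be reachable.

```python
import base64
import json
import urllib.request


def build_reset_request(base_url, system_id, reset_type, user, password):
    """Build (but do not send) a Redfish ComputerSystem.Reset POST request.

    Hypothetical helper mirroring the curl command used in this report.
    """
    url = f"{base_url}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    # HTTP basic auth, equivalent to `curl -u user:password`
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    req = build_reset_request(
        "http://192.168.55.167",
        "27946b59-9e44-4fa7-8e91-f3527a1ef094",
        "ForceOff",
        "root",
        "password",
    )
    print(req.get_method(), req.full_url)
    # To actually send it (needs network access to the emulator):
    # with urllib.request.urlopen(req, timeout=5) as resp:
    #     print(resp.status)
```

Keeping the request construction separate from the network call makes it easy to check the exact URL and body before pointing it at a real BMC.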
   
   
   5. On KVM host 1, perform the following actions:
   
   Enable HA 
   
   <img width="1603" height="510" alt="Image" src="https://github.com/user-attachments/assets/0c5ea492-92f0-4103-836f-21abdfeb4da4" />
   
   Configure HA 
   
   <img width="1611" height="561" alt="Image" src="https://github.com/user-attachments/assets/1499ee14-cfef-4615-b8d9-d5f3f2710c52" />
   
   Configure OOBM
   
   <img width="1614" height="727" alt="Image" src="https://github.com/user-attachments/assets/6181d27b-57e0-4ca4-8c3f-9e464899813c" />
   
   
   
   6. Check the power state on the host; it should be On
   
   <img width="1638" height="226" alt="Image" src="https://github.com/user-attachments/assets/5b5fc7da-c964-4bf1-9f61-eb9ca3868565" />
   
   7. Test whether the OOBM power actions work
   
   <img width="691" height="623" alt="Image" src="https://github.com/user-attachments/assets/4945dcf5-210b-48fc-bd30-aa3f3a2dc5ab" />
   
   
   
   All the power actions work except Power-Cycle, which is not supported by the emulator.
   
   LOGS
   
   ```
   
   ON
   
   root@acs420:~# cat /var/log/cloudstack/management/management-server.log | grep -i "logid:d527953b"
   2026-01-21 07:57:58,178 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"ON\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5110\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2767,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"d527953b-589a-46ce-b22c-edad6b113aaf"}
   2026-01-21 07:57:58,185 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 07:57:58,186 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Sending ComputerSystem.Reset Command 'On' to host '192.168.55.167' with request 
'POST 
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
   2026-01-21 07:57:58,191 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Complete async job-2767, jobStatus: SUCCEEDED, resultCode: 0, result: 
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"Off","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"ON","description":"200","status":"true"}
   2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Publish async job-2767 complete on message bus
   2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Wake up jobs related to job-2767
   2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Update db status for job-2767
   2026-01-21 07:57:58,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Wake up jobs joined with job-2767 and disjoin all subjobs created from job- 2767
   2026-01-21 07:57:58,195 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-12:[ctx-011c8120, job-2767, ctx-3ef3b976]) (logid:d527953b) 
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 07:57:58,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2767
   2026-01-21 07:57:58,196 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-12:[ctx-011c8120, job-2767]) (logid:d527953b) Remove job-2767 
from job monitoring
   
   OFF
   
   root@acs420:~# cat /var/log/cloudstack/management/management-server.log | grep -i "logid:f1c83fdb"
   2026-01-21 08:00:09,567 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"OFF\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5113\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2768,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"f1c83fdb-d274-4025-b16c-6e609f82ff70"}
   2026-01-21 08:00:09,577 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 08:00:09,578 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Sending ComputerSystem.Reset Command 'GracefulShutdown' to host 
'192.168.55.167' with request 'POST 
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
   2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Complete async job-2768, jobStatus: SUCCEEDED, resultCode: 0, result: 
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"OFF","description":"200","status":"true"}
   2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Publish async job-2768 complete on message bus
   2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Wake up jobs related to job-2768
   2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Update db status for job-2768
   2026-01-21 08:00:09,583 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Wake up jobs joined with job-2768 and disjoin all subjobs created from job- 2768
   2026-01-21 08:00:09,587 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768, ctx-fcf304e5]) (logid:f1c83fdb) 
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 08:00:09,588 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2768
   2026-01-21 08:00:09,588 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-13:[ctx-b6f50fb5, job-2768]) (logid:f1c83fdb) Remove job-2768 
from job monitoring
   
   
   CYCLE
   
   2026-01-21 08:01:59,844 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"CYCLE\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5118\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2769,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"c78789d9-19a1-4525-89f9-120d1c7c764a"}
   2026-01-21 08:01:59,853 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769, ctx-00d47bde]) (logid:c78789d9) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 08:01:59,859 ERROR [c.c.a.ApiAsyncJobDispatcher] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Unexpected 
exception while executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 org.apache.cloudstack.utils.redfish.RedfishException: Failed to execute System 
power command for host by performing 'POST' request on URL 
'http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
 and host address '192.168.55.167'. The expected HTTP status code is '2XX' but 
it got '400'.
   2026-01-21 08:01:59,859 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Complete async 
job-2769, jobStatus: FAILED, resultCode: 530, result: 
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
 to execute System power command for host by performing 'POST' request on URL 
'http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
 and host address '192.168.55.167'. The expected HTTP status code is '2XX' but 
it got '400'."}
   2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Publish async 
job-2769 complete on message bus
   2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Wake up jobs 
related to job-2769
   2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Update db 
status for job-2769
   2026-01-21 08:01:59,860 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Wake up jobs 
joined with job-2769 and disjoin all subjobs created from job- 2769
   2026-01-21 08:01:59,864 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Retrieved 
cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 08:01:59,865 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2769
   2026-01-21 08:01:59,865 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-14:[ctx-d7ff9438, job-2769]) (logid:c78789d9) Remove job-2769 
from job monitoring
   
   RESET
   
   root@acs420:~# cat /var/log/cloudstack/management/management-server.log | grep -i "logid:354c4b59"
   2026-01-21 08:04:25,665 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"RESET\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5124\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2770,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"354c4b59-6446-487f-b507-d93b8b605733"}
   2026-01-21 08:04:25,673 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 08:04:25,674 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Sending ComputerSystem.Reset Command 'ForceRestart' to host '192.168.55.167' 
with request 'POST 
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
   2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Complete async job-2770, jobStatus: SUCCEEDED, resultCode: 0, result: 
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"RESET","description":"200","status":"true"}
   2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Publish async job-2770 complete on message bus
   2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Wake up jobs related to job-2770
   2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Update db status for job-2770
   2026-01-21 08:04:25,679 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Wake up jobs joined with job-2770 and disjoin all subjobs created from job- 2770
   2026-01-21 08:04:25,683 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770, ctx-5a2162f3]) (logid:354c4b59) 
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 08:04:25,684 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2770
   2026-01-21 08:04:25,684 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-15:[ctx-ec99bb43, job-2770]) (logid:354c4b59) Remove job-2770 
from job monitoring
   
   
   STATUS
   
   2026-01-21 08:05:18,784 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"STATUS\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5127\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2771,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"820aa1fd-4c2e-46f9-b71e-efef540fc9e5"}
   2026-01-21 08:05:18,791 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 08:05:18,792 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Retrieved System power state 'On' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems/System-1'
   2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Complete async job-2771, jobStatus: SUCCEEDED, resultCode: 0, result: 
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"On","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"STATUS","description":"200","status":"true"}
   2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Publish async job-2771 complete on message bus
   2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Wake up jobs related to job-2771
   2026-01-21 08:05:18,798 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Update db status for job-2771
   2026-01-21 08:05:18,799 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Wake up jobs joined with job-2771 and disjoin all subjobs created from job- 2771
   2026-01-21 08:05:18,802 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771, ctx-78a6c8bb]) (logid:820aa1fd) 
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 08:05:18,803 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2771
   2026-01-21 08:05:18,803 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-16:[ctx-39f7b528, job-2771]) (logid:820aa1fd) Remove job-2771 
from job monitoring
   
   
   
   SOFT
   
   
   root@acs420:~# cat /var/log/cloudstack/management/management-server.log | grep -i "logid:0295ebe0"
   2026-01-21 07:54:52,937 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Executing 
AsyncJob 
{"accountId":2,"cmd":"org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd","cmdInfo":"{\"response\":\"json\",\"ctxUserId\":\"2\",\"sessionkey\":\"npHS_wT26VxYZe77mt5y3EbjedA\",\"action\":\"SOFT\",\"hostid\":\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\",\"httpmethod\":\"POST\",\"ctxStartEventId\":\"5105\",\"ctxDetails\":\"{\\\"interface
 
com.cloud.host.Host\\\":\\\"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f\\\"}\",\"ctxAccountId\":\"2\",\"cmdEventType\":\"HOST.OOBM.ACTION\"}","cmdVersion":0,"completeMsid":null,"created":null,"id":2766,"initMsid":206863097656310,"instanceId":6,"instanceType":"Host","lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"0295ebe0-cc53-4e4d-b514-06d62ebce244"}
   2026-01-21 07:54:52,948 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Retrieved System ID 'System-1' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-21 07:54:52,950 DEBUG [o.a.c.u.r.RedfishClient] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Sending ComputerSystem.Reset Command 'GracefulShutdown' to host 
'192.168.55.167' with request 'POST 
http://192.168.55.167/redfish/v1/Systems/System-1/Actions/ComputerSystem.Reset'
   2026-01-21 07:54:52,955 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Complete async job-2766, jobStatus: SUCCEEDED, resultCode: 0, result: 
org.apache.cloudstack.api.response.OutOfBandManagementResponse/outofbandmanagement/{"hostid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f","powerstate":"Off","enabled":"true","driver":"redfish","address":"192.168.55.167","port":"80","username":"root","password":"p*****","action":"SOFT","description":"200","status":"true"}
   2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Publish async job-2766 complete on message bus
   2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Wake up jobs related to job-2766
   2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Update db status for job-2766
   2026-01-21 07:54:52,956 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Wake up jobs joined with job-2766 and disjoin all subjobs created from job- 2766
   2026-01-21 07:54:52,960 DEBUG [c.c.a.ApiServer] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766, ctx-ffcf43e3]) (logid:0295ebe0) 
Retrieved cmdEventType from job info: HOST.OOBM.ACTION
   2026-01-21 07:54:52,961 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl$5] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Done executing 
org.apache.cloudstack.api.command.admin.outofbandmanagement.IssueOutOfBandManagementPowerActionCmd
 for job-2766
   2026-01-21 07:54:52,961 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(API-Job-Executor-11:[ctx-3725e9c6, job-2766]) (logid:0295ebe0) Remove job-2766 
from job monitoring
   
   
   ```
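
The logs above show which Redfish `ResetType` CloudStack sent for each OOBM action. The sketch below, which is an illustration and not CloudStack code, tabulates those observed values and compares them with the values the emulator advertised. Note that OFF and SOFT were both sent as `GracefulShutdown`. The CYCLE log only shows the HTTP 400, not the `ResetType` that was sent, so the `PowerCycle` value used here is an assumption.

```python
# CloudStack OOBM action -> Redfish ResetType, copied from the RedfishClient
# "Sending ComputerSystem.Reset Command ..." log lines in this report.
OBSERVED_RESET_TYPE = {
    "ON": "On",
    "OFF": "GracefulShutdown",
    "SOFT": "GracefulShutdown",
    "RESET": "ForceRestart",
}

# ASSUMPTION: the CYCLE log shows only the HTTP 400 response, not the
# ResetType that was sent. "PowerCycle" is a guess for illustration.
ASSUMED_CYCLE_RESET_TYPE = "PowerCycle"

# Values the emulator listed under ResetType@Redfish.AllowableValues.
EMULATOR_ALLOWED = frozenset(
    [
        "On",
        "ForceOff",
        "GracefulShutdown",
        "GracefulRestart",
        "ForceRestart",
        "Nmi",
        "ForceOn",
    ]
)


def unsupported_actions(mapping, allowed):
    """Return the actions whose ResetType the endpoint does not advertise."""
    return sorted(action for action, reset in mapping.items() if reset not in allowed)


if __name__ == "__main__":
    mapping = dict(OBSERVED_RESET_TYPE, CYCLE=ASSUMED_CYCLE_RESET_TYPE)
    print(unsupported_actions(mapping, EMULATOR_ALLOWED))
```

Under that assumption, CYCLE is the only action whose `ResetType` is missing from the emulator's allowable values, which would be consistent with the 400 response.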
   
   
   8. Configure the following global settings
   
   <img width="1643" height="854" alt="Image" src="https://github.com/user-attachments/assets/6ad3ce53-ad01-46a9-bdd0-b3c46638242b" />
   
   ```
   
   commands.timeout :  CheckHealthCommand=5,CheckOnHostCommand=5
   kvm.ha.fence.on.storage.heartbeat.failure to true 
   kvm.ha.fence.on.storage.heartbeat.failure : 1 
   kvm.ha.activity.check.interval  : 10
   kvm.ha.activity.check.max.attempts : 1
   kvm.ha.activity.check.timeout : 60 
   kvm.ha.degraded.max.period : 300 
   
   
   ```
   
   9. Test Host HA when there are no VMs running on the host.
   
   Stop the CloudStack agent on KVM host 1
   
   ```
   
   service libvirtd stop
   service cloudstack-agent stop 
   systemctl stop multipathd
   
   Block port 8250 with iptables rules
   
   iptables -I OUTPUT -p tcp --dport 8250 -j DROP
   iptables -I INPUT  -p tcp --dport 8250 -j DROP
   
   ```
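
To confirm that the iptables rules above really cut the connection on port 8250, a quick TCP check can be run from the KVM host. This is a minimal sketch, not part of the original test; the function name `port_reachable` is hypothetical and the management server address must be filled in by the reader.

```python
import socket


def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (DROP rules) and unreachable hosts.
        return False
```

With the DROP rules in place, calling `port_reachable` with the management server address and port 8250 from KVM host 1 would be expected to return `False` after the timeout.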
   
   
   10. Turn off the power state manually
   
   ```
   curl -u root:password -X POST -H "Content-Type: application/json" -d '{"ResetType":"ForceOff"}' http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
   
   curl -u root:password http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094 | jq .PowerState
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  
Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  2822  100  2822    0     0   217k      0 --:--:-- --:--:-- --:--:--  
229k
   "Off"
   
   ```
   
   11. Check the management server logs 
   
   Every five minutes the management server rechecks the HA state, based on kvm.ha.degraded.max.period (set to 300 seconds above)
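
The five-minute cycle can be checked directly from the HAManagerImpl output in this step. The sketch below is an illustration only: it parses the "Transitioned host HA state" lines and measures how long the host stayed in Degraded. The sample lines are shortened copies of lines from the log (the trailing host JSON is removed), and the names `TRANSITION` and `parse_transitions` are hypothetical.

```python
import re
from datetime import datetime

# Matches the "Transitioned host HA state from: X to: Y due to event:Z" lines.
TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Transitioned host HA state from: (?P<old>\w+) to: (?P<new>\w+) "
    r"due to event:(?P<event>\w+)"
)


def parse_transitions(lines):
    """Return (timestamp, old_state, new_state, event) for each transition line."""
    result = []
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
            result.append((ts, match["old"], match["new"], match["event"]))
    return result


SAMPLE = [
    "2026-01-22 05:44:23,377 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[]) (logid:) "
    "Transitioned host HA state from: Checking to: Degraded due to "
    "event:ActivityCheckFailureUnderThresholdRatio for the host Host",
    "2026-01-22 05:49:23,773 DEBUG [o.a.c.h.HAManagerImpl] "
    "(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) "
    "Transitioned host HA state from: Degraded to: Suspect due to "
    "event:PeriodicRecheckResourceActivity for the host Host",
    "2026-01-22 05:49:27,798 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[]) (logid:) "
    "Transitioned host HA state from: Checking to: Degraded due to "
    "event:ActivityCheckFailureUnderThresholdRatio for the host Host",
]

if __name__ == "__main__":
    transitions = parse_transitions(SAMPLE)
    degraded_for = (transitions[1][0] - transitions[0][0]).total_seconds()
    print(f"time spent in Degraded: {degraded_for:.0f}s")
    print("states entered:", sorted({new for _, _, new, _ in transitions}))
```

On these sample lines the time spent in Degraded rounds to 300 seconds, and no transition into Fenced appears.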
   
   
   
   ```
   tail -f /var/log/cloudstack/management/management-server.log | egrep -i "o.a.c.h.HAManagerImpl"
   
   2026-01-22 05:44:07,386 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-43:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 05:44:11,392 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-45:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 05:44:15,395 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-47:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 05:44:19,413 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-49:[]) 
(logid:) HA state post-transition:: new state=[Suspect], old state=[Available], 
for resource id=[6], status=[true], ha config state=[Suspect].
   2026-01-22 05:44:19,416 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-49:[]) 
(logid:) Transitioned host HA state from: Available to: Suspect due to 
event:HealthCheckFailed for the host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:44:23,368 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-3:[ctx-05db2251]) (logid:ad0ef2d6) HA state 
post-transition:: new state=[Checking], old state=[Suspect], for resource 
id=[6], status=[true], ha config state=[Checking].
   2026-01-22 05:44:23,371 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-3:[ctx-05db2251]) (logid:ad0ef2d6) Transitioned host 
HA state from: Suspect to: Checking due to event:PerformActivityCheck for the 
host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:44:23,374 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[]) 
(logid:) HA state post-transition:: new state=[Degraded], old state=[Checking], 
for resource id=[6], status=[true], ha config state=[Degraded].
   2026-01-22 05:44:23,377 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-3:[]) 
(logid:) Transitioned host HA state from: Checking to: Degraded due to 
event:ActivityCheckFailureUnderThresholdRatio for the host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:44:23,406 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-2:[]) 
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], 
for resource id=[6], status=[true], ha config state=[Degraded].
   2026-01-22 05:44:27,427 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-4:[]) 
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], 
for resource id=[6], status=[true], ha config state=[Degraded].
   
   
   2026-01-22 05:49:23,771 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) HA state 
post-transition:: new state=[Suspect], old state=[Degraded], for resource 
id=[6], status=[true], ha config state=[Suspect].
   2026-01-22 05:49:23,773 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) Transitioned host 
HA state from: Degraded to: Suspect due to 
event:PeriodicRecheckResourceActivity for the host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:49:27,790 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-4:[ctx-3a6054e2]) (logid:116db6f1) HA state 
post-transition:: new state=[Checking], old state=[Suspect], for resource 
id=[6], status=[true], ha config state=[Checking].
   2026-01-22 05:49:27,793 DEBUG [o.a.c.h.HAManagerImpl] 
(BackgroundTaskPollManager-4:[ctx-3a6054e2]) (logid:116db6f1) Transitioned host 
HA state from: Suspect to: Checking due to event:PerformActivityCheck for the 
host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:49:27,796 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[]) 
(logid:) HA state post-transition:: new state=[Degraded], old state=[Checking], 
for resource id=[6], status=[true], ha config state=[Degraded].
   2026-01-22 05:49:27,798 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[]) 
(logid:) Transitioned host HA state from: Checking to: Degraded due to 
event:ActivityCheckFailureUnderThresholdRatio for the host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 05:49:27,876 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-3:[]) 
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], 
for resource id=[6], status=[true], ha config state=[Degraded].
   2026-01-22 05:49:31,895 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-5:[]) 
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], 
for resource id=[6], status=[true], ha config state=[Degraded].
   2026-01-22 05:49:35,899 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-7:[]) 
(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], 
for resource id=[6], status=[true], ha config state=[Degraded].
   
   
   
   tail -f /var/log/cloudstack/management/management-server.log | grep -i "o.a.c.k.h.KVMHostActivityChecker"
   
   
   2026-01-22 05:46:43,545 DEBUG [o.a.c.k.h.KVMHostActivityChecker] 
(pool-3-thread-22:[]) (logid:) Investigating Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 via neighbouring Host 
{"id":7,"name":"kvmhost2","type":"Routing","uuid":"c30b8e0d-0a48-40fa-bb57-c2de13513ca5"}.
   2026-01-22 05:46:43,594 DEBUG [o.a.c.k.h.KVMHostActivityChecker] 
(pool-3-thread-22:[]) (logid:) Neighbouring Host 
{"id":7,"name":"kvmhost2","type":"Routing","uuid":"c30b8e0d-0a48-40fa-bb57-c2de13513ca5"}
 returned status [Down] for the investigated Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}.
   2026-01-22 05:46:43,594 DEBUG [o.a.c.k.h.KVMHostActivityChecker] 
(pool-3-thread-22:[]) (logid:) Investigating Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 via neighbouring Host 
{"id":13,"name":"kvmhost3","type":"Routing","uuid":"3cfc4b1f-1aa9-40e5-a549-484ab4fcc921"}.
   2026-01-22 05:46:43,642 DEBUG [o.a.c.k.h.KVMHostActivityChecker] 
(pool-3-thread-22:[]) (logid:) Neighbouring Host 
{"id":13,"name":"kvmhost3","type":"Routing","uuid":"3cfc4b1f-1aa9-40e5-a549-484ab4fcc921"}
 returned status [Down] for the investigated Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}.
   2026-01-22 05:46:43,642 DEBUG [o.a.c.k.h.KVMHostActivityChecker] 
(pool-3-thread-22:[]) (logid:) Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 has the status [Down].
   
   
   tail -f /var/log/cloudstack/management/management-server.log | grep -i "Redfish"
   
   
   2026-01-22 05:49:21,565 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-16:[ctx-394c23c2]) (logid:) Retrieved System ID 
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-22 05:49:21,567 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-16:[ctx-394c23c2]) (logid:) Retrieved System power state 'Off' 
with request 'GET: 
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
   2026-01-22 05:49:25,566 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-17:[ctx-11e469a8]) (logid:) Retrieved System ID 
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-22 05:49:25,568 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-17:[ctx-11e469a8]) (logid:) Retrieved System power state 'Off' 
with request 'GET: 
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
   
   tail -f /var/log/cloudstack/management/management-server.log | grep -i "AgentTaskPool"
   
   2026-01-22 05:44:17,263 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Acquired lock on host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"},
 to process agent disconnection
   2026-01-22 05:44:17,263 INFO  [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Host AgentAttache 
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"} is 
disconnecting with event ShutdownRequested
   2026-01-22 05:44:17,263 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) The next status of agent Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 is Disconnected, current status is Up
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Deregistering link for 
AgentAttache 
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"} 
with state Disconnected
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Remove Agent : AgentAttache 
{"_id":6,"_name":"kvmhost1","_uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Processing disconnect [id: 6, 
uuid: 650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f, name: kvmhost1]
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Disconnecting from 
/192.168.55.164, Socket Address: /192.168.55.164:60614
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentAttache] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Seq 27-8555431917120389122: 
Sending disconnect to class com.cloud.network.security.SecurityGroupListener
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: 
org.apache.cloudstack.hypervisor.external.discoverer.ExternalServerDiscoverer
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: org.apache.cloudstack.network.tungsten.service.TungstenElement
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: org.apache.cloudstack.service.NsxElement
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: org.apache.cloudstack.service.NetrisElement
   2026-01-22 05:44:17,264 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.storage.listener.StoragePoolMonitor
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.vmware.manager.VmwareManagerImpl
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.deploy.DeploymentPlanningManagerImpl
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.security.SecurityGroupListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.storage.secondary.SecondaryStorageListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.capacity.StorageCapacityListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.capacity.ComputeCapacityListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.storage.download.DownloadListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.consoleproxy.ConsoleProxyListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.agent.manager.AgentManagerImpl$BehindOnPingListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.agent.manager.AgentManagerImpl$SetHostParamsListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.SshKeysDistriMonitor
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
   2026-01-22 05:44:17,265 DEBUG [c.c.n.NetworkUsageManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Disconnected called on [id: 
6, uuid: 650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f, name: kvmhost1] with status 
Disconnected
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.SshKeysDistriMonitor
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.storage.LocalStoragePoolListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.storage.upload.UploadListener
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.kvm.discoverer.KvmServerDiscoverer
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.kvm.discoverer.LxcServerDiscoverer
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Sending Disconnect to 
listener: com.cloud.hypervisor.discoverer.CustomServerDiscoverer
   2026-01-22 05:44:17,265 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) [Resource state = Enabled, 
Agent event = , Host = ShutdownRequested]
   2026-01-22 05:44:17,269 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentTaskPool-4:[ctx-198d5c99]) (logid:a36a496a) Notifying other nodes of to 
disconnect
   
   ```
   
   12. Check the host state, Host HA state, resource state, and power state
   
   State: Disconnected 
   
   Host HA state : Degraded
   
   Resource state : Enabled 
   
   Power state : Off
   
   
   The Host HA state remains Degraded and does not move to Fenced, nor does 
the power state change.
   
   <img width="1649" height="188" alt="Image" src="https://github.com/user-attachments/assets/83b8c769-f6c4-4a4d-8c97-171073af1c70" />
   
   <img width="1639" height="731" alt="Image" src="https://github.com/user-attachments/assets/4346287d-6d4a-4890-9613-c53879bd673b" />
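
One way to confirm from the management server log that the host never reaches Fenced is to pull out the `Transitioned host HA state` lines and list the states visited. The following is only a rough Python sketch, not part of CloudStack: the function name `ha_transitions` and the regular expression are assumptions based on the log format shown above, and the sample lines are copied from this report with the e-mail line wrapping undone.

```python
import re

# Assumed pattern, derived from the HAManagerImpl DEBUG lines in this report.
TRANSITION = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Transitioned host HA state from: (?P<old>\w+) to: (?P<new>\w+) "
    r"due to event:(?P<event>\w+)"
)

HOST = '{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}'

# Sample lines taken from the log excerpt above (unwrapped).
SAMPLE = [
    "2026-01-22 05:49:23,773 DEBUG [o.a.c.h.HAManagerImpl] "
    "(BackgroundTaskPollManager-6:[ctx-7cf7f693]) (logid:28b9fc9c) Transitioned host "
    "HA state from: Degraded to: Suspect due to event:PeriodicRecheckResourceActivity "
    "for the host Host " + HOST + " with id: 6",
    "2026-01-22 05:49:27,793 DEBUG [o.a.c.h.HAManagerImpl] "
    "(BackgroundTaskPollManager-4:[ctx-3a6054e2]) (logid:116db6f1) Transitioned host "
    "HA state from: Suspect to: Checking due to event:PerformActivityCheck "
    "for the host Host " + HOST + " with id: 6",
    "2026-01-22 05:49:27,798 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-5:[]) "
    "(logid:) Transitioned host HA state from: Checking to: Degraded due to "
    "event:ActivityCheckFailureUnderThresholdRatio for the host Host " + HOST + " with id: 6",
    # A pre-transition line; it does not match and is skipped.
    "2026-01-22 05:49:27,876 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-3:[]) "
    "(logid:) HA state pre-transition:: new state=[Degraded], old state=[Degraded], "
    "for resource id=[6], status=[true], ha config state=[Degraded].",
]


def ha_transitions(lines):
    """Return (timestamp, old_state, new_state, event) for each transition line."""
    found = []
    for line in lines:
        match = TRANSITION.search(line.strip())
        if match:
            found.append((match["ts"], match["old"], match["new"], match["event"]))
    return found


transitions = ha_transitions(SAMPLE)
reached_fenced = any(new == "Fenced" for _, _, new, _ in transitions)

for ts, old, new, event in transitions:
    print(f"{ts}  {old} -> {new}  ({event})")
print("Reached Fenced:", reached_fenced)
```

To run it against the real log, the `SAMPLE` list could be replaced with `open("/var/log/cloudstack/management/management-server.log")`. On the lines above, the host only cycles Degraded, Suspect, Checking and back to Degraded.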
   
   13. Bring the KVM host back up
   ```
   service libvirtd start
   service cloudstack-agent start
   systemctl start multipathd
   iptables -D OUTPUT -p tcp --dport 8250 -j DROP
   iptables -D INPUT  -p tcp --dport 8250 -j DROP
   
   curl -u root:password -X POST \
     -H "Content-Type: application/json" \
     -d '{"ResetType":"ForceOn"}' \
     http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094/Actions/ComputerSystem.Reset
   ```
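
For scripted test runs, the Redfish reset call from the `curl` command above can also be prepared with Python's standard library. This is a sketch under assumptions: the helper name `build_reset_request` is hypothetical, the emulator address, system ID and `root:password` credentials are the ones used in this report, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_reset_request(base_url, system_id, reset_type, user, password):
    """Build (but do not send) a Redfish ComputerSystem.Reset POST request."""
    url = f"{base_url}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    # HTTP Basic auth, equivalent to `curl -u user:password`.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


req = build_reset_request(
    "http://192.168.55.167",
    "27946b59-9e44-4fa7-8e91-f3527a1ef094",
    "ForceOn",
    "root",
    "password",
)
print(req.get_method(), req.full_url)

# To actually send it to the emulator, uncomment:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

Passing `"ForceOff"` instead of `"ForceOn"` would build the power-off variant of the same request.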
   
   logs 
   
   ```
   2026-01-22 06:01:08,816 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-4:[]) 
(logid:) Transitioned host HA state from: Degraded to: Available due to 
event:HealthCheckPassed for the host Host 
{"id":6,"name":"kvmhost1","type":"Routing","uuid":"650ff5c6-4cf3-4d26-99a8-ce8c0b9b153f"}
 with id: 6
   2026-01-22 06:01:12,796 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-6:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 06:01:16,804 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-8:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 06:01:20,809 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-10:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[true], ha config 
state=[Available].
   2026-01-22 06:01:24,814 DEBUG [o.a.c.h.HAManagerImpl] (pool-3-thread-12:[]) 
(logid:) HA state pre-transition:: new state=[Available], old 
state=[Available], for resource id=[6], status=[tru
   
   
   2026-01-22 06:05:14,046 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-4:[ctx-961227c1]) (logid:) Retrieved System ID 
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   2026-01-22 06:05:14,048 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-4:[ctx-961227c1]) (logid:) Retrieved System power state 'On' 
with request 'GET: 
http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094'
   2026-01-22 06:05:18,051 DEBUG [o.a.c.u.r.RedfishClient] 
(pool-2-thread-5:[ctx-35dbc010]) (logid:) Retrieved System ID 
'27946b59-9e44-4fa7-8e91-f3527a1ef094' with request 'GET: 
http://192.168.55.167/redfish/v1/Systems'
   
   
   ```
   
   <img width="1643" height="150" alt="Image" src="https://github.com/user-attachments/assets/9f01f4ac-c4ea-4021-9b47-9ac828e77941" />
   
   <img width="1645" height="719" alt="Image" src="https://github.com/user-attachments/assets/bdcdacc7-1c80-4a7e-aca3-7b6d4a0ea22f" />
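
To line up what the Redfish client saw with the HA state changes, the `Retrieved System power state` lines can be reduced to the points where the reported value changes. Again a rough Python sketch rather than anything from CloudStack: `power_state_changes` and its regular expression are assumptions based on the `RedfishClient` lines in this report, and the sample lines are copied from the excerpts above with the wrapping undone.

```python
import re

# Assumed pattern, derived from the RedfishClient DEBUG lines in this report.
POWER_STATE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Retrieved System power state '(?P<state>[^']+)'"
)

SYSTEM_URL = "http://192.168.55.167/redfish/v1/Systems/27946b59-9e44-4fa7-8e91-f3527a1ef094"

# Sample lines taken from the log excerpts in this report (unwrapped).
SAMPLE = [
    "2026-01-22 05:49:21,567 DEBUG [o.a.c.u.r.RedfishClient] "
    "(pool-2-thread-16:[ctx-394c23c2]) (logid:) Retrieved System power state 'Off' "
    "with request 'GET: " + SYSTEM_URL + "'",
    "2026-01-22 05:49:25,568 DEBUG [o.a.c.u.r.RedfishClient] "
    "(pool-2-thread-17:[ctx-11e469a8]) (logid:) Retrieved System power state 'Off' "
    "with request 'GET: " + SYSTEM_URL + "'",
    "2026-01-22 06:05:14,048 DEBUG [o.a.c.u.r.RedfishClient] "
    "(pool-2-thread-4:[ctx-961227c1]) (logid:) Retrieved System power state 'On' "
    "with request 'GET: " + SYSTEM_URL + "'",
]


def power_state_changes(lines):
    """Return (timestamp, state) for the first reading and every later change."""
    changes = []
    previous = None
    for line in lines:
        match = POWER_STATE.search(line.strip())
        if match and match["state"] != previous:
            changes.append((match["ts"], match["state"]))
            previous = match["state"]
    return changes


changes = power_state_changes(SAMPLE)
for ts, state in changes:
    print(ts, state)
```

Repeated readings of the same value are dropped, so the output stays short even though the power state is polled every few seconds.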
   
   
   
   14. Test Host HA when an HA-enabled VM is running on kvmhost1.
   
   
   Launch an HA-enabled VM on kvmhost1, perform the same steps (9, 10), and 
check the management server logs.
   
   The Host HA state remains Degraded and the HA-enabled VM stays on 
kvmhost1.
   
   VM HA does not start the VM on another KVM host in the cluster.
   
   <img width="1659" height="222" alt="Image" src="https://github.com/user-attachments/assets/6d5c7717-508f-428e-adc7-8ca9854396f8" />
   
   <img width="1665" height="207" alt="Image" src="https://github.com/user-attachments/assets/087bf97a-42b5-4da0-b43f-5ed12407d495" />
   
   
   References:
   
   https://cwiki.apache.org/confluence/display/cloudstack/host+ha
   
   https://cwiki.apache.org/confluence/display/CLOUDSTACK/KVM+HA+with+IPMI+Fencing
   
   https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
   
   https://cwiki.apache.org/confluence/display/CLOUDSTACK/High+Availability+Developer's+Guide
   
   https://www.shapeblue.com/host-ha-for-kvm-hosts-in-cloudstack/