jmsperu opened a new issue, #12795:
URL: https://github.com/apache/cloudstack/issues/12795

   ## Description
   
   When `provisionCertificate` is called on a KVM host, the agent's 
`PostCertificateRenewalTask` consistently fails with 
`java.lang.IllegalStateException: Shutdown in progress`. This causes the host 
to permanently report `secured=false` in `StartupRoutingCommand.hostDetails`, 
showing as "Unsecure" in the UI despite having valid TLS certificates and a 
working SSL connection.
   
   ## Steps to Reproduce
   
   1. Add a new KVM host to CloudStack 4.22 with 
`ca.plugin.root.auth.strictness=true`
   2. Run `provisionCertificate hostid=<uuid>`
   3. The API returns `{"success": true}` — keystore, cert, CA cert, and key 
are all created correctly
   4. The `PostCertificateRenewalTask` attempts to restart libvirtd and 
reconnect, but the cert provisioning triggers an agent restart
   5. During the agent shutdown, `Runtime.removeShutdownHook()` throws 
`IllegalStateException: Shutdown in progress`
   6. The agent never sets `secured=true` — the host permanently shows 
"Unsecure"
   
   ## Error Log (agent.log)
   
   ```
   INFO  [resource.wrapper.LibvirtPostCertificateRenewalCommandWrapper] Restarting libvirt after certificate provisioning/renewal
   WARN  [resource.wrapper.LibvirtPostCertificateRenewalCommandWrapper] Execution of process for command [sudo service libvirtd restart ] failed.
   WARN  [cloud.agent.Agent] Failed to execute post certificate renewal command: java.lang.IllegalStateException: Shutdown in progress
        at java.base/java.lang.ApplicationShutdownHooks.remove(ApplicationShutdownHooks.java:82)
        at java.base/java.lang.Runtime.removeShutdownHook(Runtime.java:244)
        at com.cloud.agent.Agent$PostCertificateRenewalTask.runInContext(Agent.java:1377)
   
   ## Root Cause
   
   In `Agent.java:1377`, the `PostCertificateRenewalTask` calls 
`Runtime.getRuntime().removeShutdownHook()` while the JVM is already shutting 
down. `Runtime.removeShutdownHook()` is documented to throw 
`IllegalStateException` once the shutdown sequence has begun. The certificate 
provisioning triggers an agent reconnect/restart, creating a race condition in 
which the `PostCertificateRenewalTask` runs during the shutdown window.
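
The JVM behaviour behind the stack trace can be reproduced outside CloudStack. The sketch below is a standalone illustration, not the agent's code: `ShutdownHookRaceDemo`, `tryRemove`, and the hook names are hypothetical. A second shutdown hook stands in for a task that is still running after shutdown has started.

```java
final class ShutdownHookRaceDemo {

    /** Tries to remove the given hook and reports what happened. */
    static String tryRemove(Thread hook) {
        try {
            boolean removed = Runtime.getRuntime().removeShutdownHook(hook);
            return removed ? "removed" : "not registered";
        } catch (IllegalStateException e) {
            // Thrown by the JVM once the shutdown sequence has begun.
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the agent's own shutdown hook (hypothetical name).
        Thread agentHook = new Thread(() -> { }, "agent-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(agentHook);

        // Stand-in for a task that runs while the JVM is shutting down.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println(tryRemove(agentHook)),
                        "late-task"));

        // Returning from main starts the shutdown sequence and runs the hooks.
    }
}
```

Run as a normal program, the late task should print `rejected: Shutdown in progress`, the same message as in the agent log. Calling `tryRemove` before shutdown returns `removed`.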
   
   ## Impact
   
   - Host permanently shows "Unsecure" in the UI despite valid TLS
   - The actual SSL connection works correctly (keystore loads, handshake 
succeeds)
   - The `secured` flag in `host_details` DB table is overwritten to `false` on 
every agent reconnect
   - Manual DB updates are overwritten by the agent's `StartupRoutingCommand`
   - Reproduced consistently on multiple `provisionCertificate` attempts
   
   ## Suggested Fix
   
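   `PostCertificateRenewalTask.runInContext()` should catch 
`IllegalStateException` from `removeShutdownHook()` and still proceed with 
setting the secured flag. Alternatively, use a guard flag that is set when 
shutdown begins and check it before calling `removeShutdownHook()`. Checking 
`Thread.currentThread().isInterrupted()` only helps if the task's thread is 
actually interrupted during shutdown; the JVM does not interrupt threads on its 
own when shutdown starts.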
   
   ```java
   // In Agent.java PostCertificateRenewalTask.runInContext()
   try {
       Runtime.getRuntime().removeShutdownHook(shutdownThread);
   } catch (IllegalStateException e) {
       // JVM is already shutting down, skip hook removal
       LOG.debug("Skipping shutdown hook removal during shutdown", e);
   }
   ```
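
The guard-flag alternative could look roughly like the following. This is a minimal sketch under assumptions, not the actual `Agent.java` code: `ShutdownHookGuard`, `register`, and `removeIfPossible` are hypothetical names. The hook sets the flag itself, and the `catch` is kept because hooks run concurrently, so the flag may not be set yet when another thread attempts the removal.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownHookGuard {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final Thread hook;

    ShutdownHookGuard(Runnable onShutdown) {
        this.hook = new Thread(() -> {
            shuttingDown.set(true); // mark shutdown before doing any work
            onShutdown.run();
        }, "guarded-shutdown-hook");
    }

    /** Registers the hook with the JVM. */
    void register() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    /**
     * Removes the hook unless shutdown has started.
     * Returns true only if the hook was actually removed.
     */
    boolean removeIfPossible() {
        if (shuttingDown.get()) {
            return false; // hook is already running; nothing to remove
        }
        try {
            return Runtime.getRuntime().removeShutdownHook(hook);
        } catch (IllegalStateException e) {
            // Shutdown began before the flag was set; treat as "not removed".
            return false;
        }
    }
}
```

With a helper like this, the caller can continue with the remaining post-renewal steps regardless of whether the removal succeeded.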
   
   ## Environment
   
   - CloudStack: 4.22.0.0
   - OS: Ubuntu 22.04 (also reproduced on fresh provision)
   - Java: OpenJDK 11.0.30 and 17.0.18
   - KVM/libvirt: working correctly
   - `ca.plugin.root.auth.strictness`: true
   
   ## Workaround
   
   Manually update the DB: `UPDATE host_details SET value='true' WHERE 
host_id=<id> AND name='secured';`
   The value is overwritten on the next agent restart, but the TLS connection 
is functionally secure regardless of the flag.

