[GitHub] [cloudstack] mlsorensen opened a new pull request, #7963: Trigger out of band VM state update via libvirt event when VM stops

via GitHub Thu, 14 Sep 2023 16:16:33 -0700


mlsorensen opened a new pull request, #7963:
URL: https://github.com/apache/cloudstack/pull/7963


   ### Description
   
   This PR allows KVM to detect guests that stop or crash, and immediately 
trigger a power state report to update the VM state in CloudStack.
   
   In the current design, the Agent is responsible for sending pings at a fixed 
interval. Each ping contains a state report for every VM. If a VM's state 
changes between pings, it can take a long time for the change to be discovered.
   
   When the Agent first starts, it loads the `ServerResource` (the 
implementation is `LibvirtComputingResource` in the shipping agent) and 
initializes it. Later, it calls the `ServerResource`'s `getCurrentStatus()` at 
intervals to collect the host and VM status and send it in a `PingCommand`. 
This change adds two interfaces: if the `ServerResource` implements 
`ResourceStatusUpdater`, the agent registers itself as an 
`AgentStatusUpdater`, which gives the `ServerResource` a way to trigger an 
update by calling `AgentStatusUpdater.triggerUpdate()`. This keeps the 
monitoring and collection of VM status in the `ServerResource`, while the 
`Agent` still handles sending the update. Because the new interfaces are 
opt-in, existing implementations of `ServerResource` and `IAgentControl` do 
not have to change. There may be a cleaner way to do this.
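
   A minimal sketch of the opt-in registration described above. The interface 
names `ResourceStatusUpdater` and `AgentStatusUpdater` and the method 
`triggerUpdate()` come from this description; the registration method name 
`registerStatusUpdater`, the `EventAwareResource` and `SketchAgent` classes, 
and the simplified `ServerResource` are assumptions made for illustration and 
are not the actual CloudStack code.

```java
/** Sketch only: simplified stand-ins, not the CloudStack sources. */
final class StatusUpdaterSketch {

    /** Implemented by the agent; lets a resource request an immediate report. */
    interface AgentStatusUpdater {
        void triggerUpdate();
    }

    /** Opt-in interface for resources; registerStatusUpdater is an assumed name. */
    interface ResourceStatusUpdater {
        void registerStatusUpdater(AgentStatusUpdater updater);
    }

    /** Simplified stand-in for the existing interface, which stays unchanged. */
    interface ServerResource {
        String getCurrentStatus();
    }

    /** Hypothetical resource that opts in to out-of-band updates. */
    static final class EventAwareResource implements ServerResource, ResourceStatusUpdater {
        private AgentStatusUpdater updater;

        @Override
        public String getCurrentStatus() {
            return "vm-1=PowerOff";
        }

        @Override
        public void registerStatusUpdater(AgentStatusUpdater updater) {
            this.updater = updater;
        }

        /** Called by the resource's own monitoring code when a guest stops. */
        void onGuestStopped() {
            if (updater != null) {
                updater.triggerUpdate();
            }
        }
    }

    /** Hypothetical agent that registers itself only if the resource opts in. */
    static final class SketchAgent implements AgentStatusUpdater {
        private final ServerResource resource;
        private int outOfBandReports;

        SketchAgent(ServerResource resource) {
            this.resource = resource;
            if (resource instanceof ResourceStatusUpdater) {
                ((ResourceStatusUpdater) resource).registerStatusUpdater(this);
            }
        }

        @Override
        public void triggerUpdate() {
            // A real agent would build and send a PingCommand at this point.
            String status = resource.getCurrentStatus();
            outOfBandReports++;
            System.out.println("out-of-band report: " + status);
        }

        int outOfBandReports() {
            return outOfBandReports;
        }
    }

    public static void main(String[] args) {
        EventAwareResource resource = new EventAwareResource();
        SketchAgent agent = new SketchAgent(resource);
        resource.onGuestStopped();
        System.out.println("reports sent: " + agent.outOfBandReports());
    }
}
```

   A resource that does not implement `ResourceStatusUpdater` is simply never 
handed an updater, so it keeps relying on the interval pings.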
   
   In `LibvirtComputingResource`, we register an event listener and process 
domain lifecycle events, looking only for `STOPPED` events that are due to a 
crash or a shutdown. A domain stop caused by `virsh destroy` or by CloudStack 
issuing a stop has a detail of `DESTROYED` (or `MIGRATED` in the case of 
migration) rather than `CRASHED` or `SHUTDOWN`. I briefly considered adding 
code to track whether we were in the middle of a `StopCommand` or similar in 
order to filter out superfluous events, but filtering on the detail seems 
simpler, at the expense of not being able to update on an admin `virsh destroy`.
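
   The filtering rule above can be sketched as a small predicate. The enums 
below are a simplified model containing only the event and detail names 
mentioned in this description; they are not the libvirt Java binding types, 
and `shouldTriggerUpdate` is a hypothetical helper name.

```java
/** Sketch only: models the event filter, not the libvirt Java bindings. */
final class StopEventFilterSketch {

    /** Lifecycle event types; only STOPPED matters for this filter. */
    enum LifecycleEvent { STARTED, STOPPED }

    /** Reasons attached to a STOPPED event, as named in the description. */
    enum StoppedDetail { SHUTDOWN, CRASHED, DESTROYED, MIGRATED }

    /**
     * Returns true only for stops that happened without CloudStack (or an
     * admin running virsh destroy) asking for them: guest shutdown or crash.
     */
    static boolean shouldTriggerUpdate(LifecycleEvent event, StoppedDetail detail) {
        if (event != LifecycleEvent.STOPPED) {
            return false;
        }
        return detail == StoppedDetail.CRASHED || detail == StoppedDetail.SHUTDOWN;
    }

    public static void main(String[] args) {
        for (StoppedDetail detail : StoppedDetail.values()) {
            System.out.println(detail + " -> "
                    + shouldTriggerUpdate(LifecycleEvent.STOPPED, detail));
        }
    }
}
```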
   
   The `PingCommand` has been given a boolean to indicate whether the ping is 
out of band. This is important because the code that processes pings on the 
management server ignores pings that arrive more often than the expected 
interval (presumably to avoid state thrashing?). The boolean lets us force 
processing of out-of-band pings. It is therefore also important that we 
trigger these only on valid state change events, and do not issue superfluous 
updates every time a VM stops because CloudStack issued a stop, etc.
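
   To illustrate why the boolean is needed, here is a rough sketch of an 
interval gate with an out-of-band override. `PingGateSketch` and 
`shouldProcess` are hypothetical names, and the choice to reset the timer 
after an out-of-band ping is an assumption; the real management server logic 
may differ.

```java
/** Sketch only: an interval gate that an out-of-band ping can bypass. */
final class PingGateSketch {
    private final long intervalMillis;
    private long lastProcessedMillis;
    private boolean processedAny;

    PingGateSketch(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /**
     * Decides whether a ping's state report should be processed.
     * Regular pings are dropped if they arrive before the interval has
     * elapsed; out-of-band pings are always processed.
     */
    boolean shouldProcess(long nowMillis, boolean outOfBand) {
        boolean due = !processedAny || nowMillis - lastProcessedMillis >= intervalMillis;
        if (!outOfBand && !due) {
            return false;
        }
        // Assumption: any processed ping, including out-of-band, resets the timer.
        lastProcessedMillis = nowMillis;
        processedAny = true;
        return true;
    }

    public static void main(String[] args) {
        PingGateSketch gate = new PingGateSketch(60_000);
        System.out.println(gate.shouldProcess(0, false));      // first ping
        System.out.println(gate.shouldProcess(1_000, false));  // too early
        System.out.println(gate.shouldProcess(2_000, true));   // out of band
    }
}
```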
   
   
   ### Types of changes
   
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to change)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Bug fix (non-breaking change which fixes an issue)
   - [x] Enhancement (improves an existing feature and functionality)
   - [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
   
   ### Feature/Enhancement Scale or Bug Severity
   
   #### Feature/Enhancement Scale
   
   - [ ] Major
   - [x] Minor
   
   #### Bug Severity
   
   - [ ] BLOCKER
   - [ ] Critical
   - [ ] Major
   - [x] Minor
   - [ ] Trivial
   
   
   ### Screenshots (if appropriate):
   
   
   ### How Has This Been Tested?
   
   Tested locally by shutting down a VM from within the guest, compared with 
`virsh destroy` or stopping via the CloudStack API. Also tested that the 
listener still works after a libvirt restart.
   
   There are no existing tests for `VirtualMachinePowerStateSyncImpl`, and the 
change here is very minor (reacting to the boolean).
   
   It seems tricky to build a unit test for the Libvirt event listener.
   
   Could possibly write a smoke test to SSH into a VM, shut it down, and check 
the state of the VM via the API?
   
   <!-- Please read the 
[CONTRIBUTING](https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md) 
document -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
