vishesh92 opened a new pull request, #8085: URL: https://github.com/apache/cloudstack/pull/8085
### Description

When a VM deployment fails, we reset `host_id` for the failed VM to null but leave `pod_id` unchanged. As a result, the retry fails when another pod has enough capacity but the VM's original pod does not.

### Types of changes

- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI

### Feature/Enhancement Scale or Bug Severity

#### Feature/Enhancement Scale

- [ ] Major
- [ ] Minor

#### Bug Severity

- [ ] BLOCKER
- [ ] Critical
- [ ] Major
- [x] Minor
- [ ] Trivial

### Screenshots (if appropriate):

### How Has This Been Tested?

#### How did you try to break this feature and the system with this change?

Reproducing the issue and testing the fix requires an environment with 2 pods.

1. On the management server, set a breakpoint here: https://github.com/apache/cloudstack/blob/9df580cef457cdb767aa5bea926500fa8b1263ca/server/src/main/java/com/cloud/capacity/CapacityManagerImpl.java#L383
2. Deploy a VM. When the debugger stops at the line above, do the following:
   1) Run `SELECT id, state, pod_id, host_id, last_host_id FROM vm_instance ORDER BY id DESC LIMIT 1;` on the `cloud` database.
   2) Take the `pod_id` from that result and disable the pod with `UPDATE host_pod_ref SET allocation_state = 'Disabled' WHERE id = <pod id>`.
   3) Set `hostHasCpuCapability = false` in the debugger so that the first attempt fails with an error.
   4) The deployment is retried once after this failure. Before the fix, it does not stop at the breakpoint again, because no resources are available in the VM's original pod, which is now disabled. After the fix, it stops at the breakpoint again; at this point you can check that `pod_id` is different.
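The retry behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code changed in this PR: `PodRetrySketch`, `Vm`, `pickPod`, `cleanupAfterFailure` and `retryAfterFailure` are hypothetical names, and the planner rule (a VM that already carries a `pod_id` is only considered for that pod) is an assumption inferred from the description rather than the actual CloudStack planner logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model only; none of these names are CloudStack classes. */
public class PodRetrySketch {

    /** Minimal stand-in for a vm_instance row: only the two placement columns. */
    static final class Vm {
        Long podId;   // null means the planner may choose any pod
        Long hostId;  // null means the VM is not placed on a host
    }

    /** Assumed planner rule: a VM with a podId is only considered for that pod. */
    static Optional<Long> pickPod(Vm vm, Map<Long, Boolean> podEnabled) {
        if (vm.podId != null) {
            return Boolean.TRUE.equals(podEnabled.get(vm.podId))
                    ? Optional.of(vm.podId)
                    : Optional.empty();
        }
        // No pod pinned: take the first enabled pod in insertion order.
        return podEnabled.entrySet().stream()
                .filter(Map.Entry::getValue)
                .map(Map.Entry::getKey)
                .findFirst();
    }

    /** Cleanup after a failed attempt; resetPod = false models the behaviour before the fix. */
    static void cleanupAfterFailure(Vm vm, boolean resetPod) {
        vm.hostId = null;
        if (resetPod) {
            vm.podId = null;
        }
    }

    /**
     * Mirrors the test steps: the first attempt lands in pod 1, pod 1 is then
     * disabled, the attempt fails, and the pod chosen by the retry is returned
     * (empty if no pod is usable).
     */
    static Optional<Long> retryAfterFailure(boolean resetPod) {
        Map<Long, Boolean> podEnabled = new LinkedHashMap<>();
        podEnabled.put(1L, true);
        podEnabled.put(2L, true);

        Vm vm = new Vm();
        vm.podId = pickPod(vm, podEnabled).get(); // pod 1, the first enabled pod
        vm.hostId = 10L;                          // hypothetical host in pod 1

        podEnabled.put(1L, false);         // disable the pod the VM was assigned to
        cleanupAfterFailure(vm, resetPod); // the first attempt fails
        return pickPod(vm, podEnabled);    // the retry
    }

    public static void main(String[] args) {
        System.out.println("before the fix: " + retryAfterFailure(false)); // Optional.empty
        System.out.println("after the fix:  " + retryAfterFailure(true));  // Optional[2]
    }
}
```

In this model, leaving `podId` set pins the retry to the disabled pod, so no placement is found; clearing it together with `hostId` lets the retry fall through to pod 2.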
