jmsperu opened a new pull request, #12872:
URL: https://github.com/apache/cloudstack/pull/12872

   ## Summary
   
   Fix three bugs in `nasbackup.sh` that leave VMs **paused 
indefinitely** when a backup job fails (e.g. storage full, I/O error):
   
   1. **Infinite polling loop**: `backup_running_vm()` falls through the 
`Failed` case without exiting, causing the script to poll the already-failed 
job forever. Fixed by adding `exit 1` after cleanup.
   
   2. **Continued processing after failure**: `backup_stopped_vm()` continues 
processing subsequent disks after `qemu-img convert` fails. Fixed by adding 
`exit 1` after cleanup.
   
   3. **VM never resumed**: `cleanup()` removes temp files and unmounts but 
never resumes the VM that was paused by `virsh backup-begin`. Fixed by adding a 
VM state check and `virsh resume` at the top of `cleanup()`.
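
Fixes 1 and 2 share one shape: run cleanup, then stop. The sketch below illustrates that shape only and is not the code in `nasbackup.sh`. `get_job_status`, `cleanup`, the function names, and the `QEMU_IMG` override are hypothetical and assumed to be supplied by the surrounding script.

```shell
#!/usr/bin/env bash
# Sketch only. get_job_status and cleanup are hypothetical stand-ins that the
# surrounding script is assumed to define; they are not the real helpers.

# Fix 1: poll until the job reaches a terminal state, and stop on failure.
# Arguments: VM name, optional poll interval in seconds (default 5).
wait_for_backup_job() {
  local vm="$1" interval="${2:-5}" status
  while true; do
    status="$(get_job_status "$vm")"
    case "$status" in
      Completed)
        return 0
        ;;
      Failed)
        echo "backup job for $vm failed" >&2
        cleanup
        exit 1   # without this, the loop keeps polling the failed job
        ;;
    esac
    sleep "$interval"
  done
}

# Fix 2: stop at the first disk that fails to convert.
# Arguments: destination directory, then one or more source disk paths.
# QEMU_IMG is a hypothetical override so the binary can be swapped in tests.
convert_disks() {
  local dest="$1" disk
  shift
  for disk in "$@"; do
    if ! "${QEMU_IMG:-qemu-img}" convert -O qcow2 "$disk" "$dest/$(basename "$disk").qcow2"; then
      echo "qemu-img convert failed for $disk" >&2
      cleanup
      exit 1   # do not go on to the remaining disks
    fi
  done
}
```

Because both functions call `exit 1`, a caller that wants to inspect the result rather than terminate would need to run them in a subshell.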
   
   ## Root Cause
   
   When `virsh backup-begin` starts a backup, the VM may be paused. If the 
backup fails for any reason (storage full, network issue, I/O error), the 
script's cleanup path never calls `virsh resume`, leaving the VM paused until 
manual intervention.
   
   ## Test Plan
   
   - [ ] Trigger a backup with insufficient NAS storage — verify VM is resumed 
after failure
   - [ ] Trigger a backup with NAS unmounted mid-backup — verify VM is resumed
   - [ ] Normal backup completes successfully (no regression)
   - [ ] Verify `qemu-img convert` failure on stopped VM exits cleanly
   
   Fixes #12821


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.