This is an automated email from the ASF dual-hosted git repository.

rohit pushed a commit to branch 4.11
in repository https://gitbox.apache.org/repos/asf/cloudstack.git


The following commit(s) were added to refs/heads/4.11 by this push:
     new 7030549  CLOUDSTACK-10305: Rare race condition in KVM migration (#2466)
7030549 is described below

commit 703054964a9bae27a563412c35166bae515d10e3
Author: Nicolas Vazquez <[email protected]>
AuthorDate: Mon Feb 26 11:31:51 2018 -0300

    CLOUDSTACK-10305: Rare race condition in KVM migration (#2466)
    
    There is a race condition in the monitoring of the migration process on KVM.
    If the monitor wakes up in the tight window after the migration succeeds but
    before the migration thread terminates, the monitor gets a LibvirtException
    “Domain not found: no domain with matching uuid” when checking on the
    migration status. This in turn causes CloudStack to sync the VM state to
    Stopped, at which point it issues a defensive StopCommand to ensure the
    state is correctly synced.
    
    Fix: Prevent LibvirtException: "Domain not found" caused by the call to dm.getInfo()
---
 .../resource/wrapper/LibvirtMigrateCommandWrapper.java | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtMigrateCommandWrapper.java b/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtMigrateCommandWrapper.java
index 30f0e20..ad32759 100644
--- a/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtMigrateCommandWrapper.java
+++ b/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtMigrateCommandWrapper.java
@@ -172,13 +172,21 @@ public final class LibvirtMigrateCommandWrapper extends CommandWrapper<MigrateCo
 
                 // pause vm if we meet the vm.migrate.pauseafter threshold and not already paused
                 final int migratePauseAfter = libvirtComputingResource.getMigratePauseAfter();
-                if (migratePauseAfter > 0 && sleeptime > migratePauseAfter && dm.getInfo().state == DomainState.VIR_DOMAIN_RUNNING ) {
-                    s_logger.info("Pausing VM " + vmName + " due to property vm.migrate.pauseafter setting to " + migratePauseAfter+ "ms to complete migration");
+                if (migratePauseAfter > 0 && sleeptime > migratePauseAfter) {
+                    DomainState state = null;
                     try {
-                        dm.suspend();
+                        state = dm.getInfo().state;
                     } catch (final LibvirtException e) {
-                        // pause could be racy if it attempts to pause right when vm is finished, simply warn
-                        s_logger.info("Failed to pause vm " + vmName + " : " + e.getMessage());
+                        s_logger.info("Couldn't get VM domain state after " + sleeptime + "ms: " + e.getMessage());
+                    }
+                    if (state != null && state == DomainState.VIR_DOMAIN_RUNNING) {
+                        try {
+                            s_logger.info("Pausing VM " + vmName + " due to property vm.migrate.pauseafter setting to " + migratePauseAfter + "ms to complete migration");
+                            dm.suspend();
+                        } catch (final LibvirtException e) {
+                            // pause could be racy if it attempts to pause right when vm is finished, simply warn
+                            s_logger.info("Failed to pause vm " + vmName + " : " + e.getMessage());
+                        }
+                        }
                     }
                 }
             }
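
For readers who want to see the pattern in isolation, here is a minimal sketch of what the change does: the state lookup is moved out of the `if` condition into its own guarded step, so a domain that has already gone away is logged rather than propagated. It does not use the libvirt Java bindings; `PauseAfterSketch`, `Domain`, `State`, `DomainGoneException` and `pauseIfRunning` are hypothetical stand-ins invented for illustration, and the real code works with `org.libvirt.Domain`, `DomainState` and `LibvirtException` as shown in the diff.

```java
// Hypothetical, self-contained sketch; none of these types come from libvirt or CloudStack.
class PauseAfterSketch {

    /** Stand-in for LibvirtException (assumption, for illustration only). */
    static class DomainGoneException extends Exception {
        DomainGoneException(String message) {
            super(message);
        }
    }

    /** Stand-in for the domain states the monitor cares about. */
    enum State { RUNNING, PAUSED }

    /** Stand-in for the domain handle held by the migration monitor. */
    interface Domain {
        State getState() throws DomainGoneException;
        void suspend() throws DomainGoneException;
    }

    /**
     * Suspends the domain only if its state can be read and is RUNNING.
     * Returns true if suspend() was called and completed, false otherwise.
     */
    static boolean pauseIfRunning(Domain dm) {
        State state = null;
        try {
            // The lookup itself may fail if the migration finished a moment ago.
            state = dm.getState();
        } catch (DomainGoneException e) {
            System.out.println("Couldn't get domain state: " + e.getMessage());
        }
        if (state != State.RUNNING) {
            return false; // unknown state or not running: nothing to pause
        }
        try {
            dm.suspend();
            return true;
        } catch (DomainGoneException e) {
            // Pausing can still race with the end of the migration; only log it.
            System.out.println("Failed to pause: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates a domain that has already left the source host.
        Domain migrated = new Domain() {
            public State getState() throws DomainGoneException {
                throw new DomainGoneException("Domain not found");
            }
            public void suspend() { }
        };
        System.out.println("paused = " + pauseIfRunning(migrated));
    }
}
```

In this sketch a failed lookup leaves `state` as `null`, which simply skips the pause, mirroring the `state != null` guard in the patch.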
