Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
From Dan Kenigsberg:

Dan Kenigsberg has submitted this change and it was merged.

Change subject: virt: Destroy VM after post-copy migration failure

virt: Destroy VM after post-copy migration failure

As explained in the source code comment, we currently don't have a
better option than to destroy the VM remnants after a failed post-copy
migration. This may change in the future, if recovery from a failed
post-copy migration becomes available in libvirt/QEMU.

How do we know that a post-copy migration failed? On the source, the
migration simply fails and the VM either remains in a paused status
with a post-copy reason or it disappears. On the destination, the VM
can end up in the same states, but there are two more things to
consider:

1. we may not be aware that we are in post-copy (if the corresponding
   event is not sent from libvirt before the migration fails);
2. if the VM gets paused, we receive the
   VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED event.

Disappeared VMs are already handled as general QEMU crashes. We must
specifically handle the paused case on both ends, since this is a new
situation and we want to destroy the VM immediately. We also delete
the VM object on the source, since Engine handles only the destination
in post-copy migrations.

Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Signed-off-by: Milan Zamazal
Bug-Url: https://bugzilla.redhat.com/1354343
---
M lib/vdsm/virt/migration.py
M lib/vdsm/virt/vmexitreason.py
M vdsm/virt/vm.py
3 files changed, 32 insertions(+), 1 deletion(-)

Approvals:
  Jenkins CI: Passed CI tests
  Francesco Romani: Looks good to me, approved
  Milan Zamazal: Verified

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 24
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Dan Kenigsberg
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Milan Zamazal
Gerrit-Reviewer: gerrit-hooks
___
vdsm-patches mailing list -- vdsm-patches@lists.fedorahosted.org
To unsubscribe send an email to vdsm-patches-le...@lists.fedorahosted.org
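
The case analysis in the merged commit message can be summarized as a small decision function. This is only an illustrative sketch, not vdsm code: the `Side` and `Outcome` enums, the `triage` function, and the step strings are hypothetical names invented here, and the mapping follows the commit message text rather than the merged implementation.

```python
from enum import Enum


class Side(Enum):
    SOURCE = "source"
    DESTINATION = "destination"


class Outcome(Enum):
    # The two end states the commit message lists for a failed migration.
    PAUSED = "paused with a post-copy reason"
    DISAPPEARED = "disappeared"


def triage(side, outcome):
    """Return the cleanup steps the commit message describes (hypothetical)."""
    if outcome is Outcome.DISAPPEARED:
        # Already covered by the existing QEMU crash handling.
        return ["handle as a general QEMU crash"]
    # Paused case: new situation, handled on both ends.
    steps = ["destroy the paused VM"]
    if side is Side.SOURCE:
        # Engine tracks only the destination in post-copy migrations.
        steps.append("delete the VM object")
    return steps
```

The only asymmetry between the two ends in this sketch is the extra cleanup of the VM object on the source.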
Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
Francesco Romani has posted comments on this change.

Change subject: virt: Destroy VM after post-copy migration failure

Patch Set 8: Code-Review+2

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 8
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Milan Zamazal
Gerrit-Reviewer: gerrit-hooks
Gerrit-HasComments: No
Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
Milan Zamazal has posted comments on this change.

Change subject: virt: Destroy VM after post-copy migration failure

Patch Set 8:

Just rebase + resolved conflicts.

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 8
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Milan Zamazal
Gerrit-Reviewer: gerrit-hooks
Gerrit-HasComments: No
Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
gerrit-hooks has posted comments on this change.

Change subject: virt: Destroy VM after post-copy migration failure

Patch Set 8:

* Update Tracker::#1354343::OK, status: POST
* Check Bug-Url::IGNORE, not relevant for branch: master
* Check Public Bug::#1354343::OK, public bug
* Check Product::IGNORE, not relevant for branch: master
* Check TM::IGNORE, not relevant for branch: master
* Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.6', 'ovirt-4.0'])

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 8
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: gerrit-hooks
Gerrit-HasComments: No
Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
Francesco Romani has posted comments on this change.

Change subject: virt: Destroy VM after post-copy migration failure

Patch Set 7: Code-Review+2

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 7
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: gerrit-hooks
Gerrit-HasComments: No
Change in vdsm[master]: virt: Destroy VM after post-copy migration failure
Milan Zamazal has uploaded a new change for review.

Change subject: virt: Destroy VM after post-copy migration failure

virt: Destroy VM after post-copy migration failure

As explained in the source code comment, we currently don't have a
better option than to destroy the VM remnants after a failed post-copy
migration. This may change in the future, if recovery from a failed
post-copy migration becomes available in libvirt/QEMU.

Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Signed-off-by: Milan Zamazal
Bug-Url: https://bugzilla.redhat.com/1354343
---
M lib/vdsm/virt/vmexitreason.py
M vdsm/virt/vm.py
2 files changed, 24 insertions(+), 1 deletion(-)

git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/42/64142/7

diff --git a/lib/vdsm/virt/vmexitreason.py b/lib/vdsm/virt/vmexitreason.py
index 46c092b..494cd28 100644
--- a/lib/vdsm/virt/vmexitreason.py
+++ b/lib/vdsm/virt/vmexitreason.py
@@ -30,6 +30,7 @@
 MIGRATION_FAILED = 8
 LIBVIRT_DOMAIN_MISSING = 9
 DESTROYED_ON_STARTUP = 10
+POSTCOPY_MIGRATION_FAILED = 11


 exitReasons = {
@@ -44,4 +45,5 @@
     MIGRATION_FAILED: 'VM failed to migrate',
     LIBVIRT_DOMAIN_MISSING: 'Failed to find the libvirt domain',
     DESTROYED_ON_STARTUP: 'VM destroyed during the startup',
+    POSTCOPY_MIGRATION_FAILED: 'Migration failed in post-copy',
 }
diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py
index ed60354..e274388 100644
--- a/vdsm/virt/vm.py
+++ b/vdsm/virt/vm.py
@@ -4163,7 +4163,28 @@
                 else:
                     hooks.after_vm_pause(domxml, self.conf)
             elif detail == libvirt.VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED:
-                pass  # will be handled in a followup patch
+                # This can happen on both the ends of the migration.
+                # After a failed post-copy migration, the VM remains in a
+                # paused state on both the ends of the migration. There is
+                # currently no way to recover it, since the VM is missing some
+                # memory pages on the destination and the old snapshot at the
+                # source doesn't know about the changes made to the external
+                # world (network, storage, ...) during the post-copy phase.
+                # The best what we can do in such a situation is to destroy the
+                # paused VM instances on both the ends before someone tries to
+                # resume any of them, causing confusion at best or more damages
+                # in the worse case. We must also inform Engine about the
+                # fatal state of the failed migration, so we can't destroy the
+                # VM immediately on the destination (but we can do it on the
+                # source). We report the VM as down on the destination to
+                # Engine and wait for destroy request from it.
+                self.log.warning("Migration failed in post-copy, "
+                                 "destroying VM: %s" % (self.id,))
+                destroy = self.lastStatus == vmstatus.MIGRATION_SOURCE
+                self.setDownStatus(ERROR,
+                                   vmexitreason.POSTCOPY_MIGRATION_FAILED)
+                if destroy:
+                    self.destroy()
         elif event == libvirt.VIR_DOMAIN_EVENT_RESUMED:
             self._setGuestCpuRunning(True)

--
To view, visit https://gerrit.ovirt.org/64142
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1918e9afce189c8b3f617766e55afa13f1e153f1
Gerrit-PatchSet: 7
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Milan Zamazal
Gerrit-Reviewer: Arik Hadas
Gerrit-Reviewer: Francesco Romani
Gerrit-Reviewer: gerrit-hooks
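
One detail of the patch set 7 handler is the ordering: the migration role is read from the last status before the VM is reported as down. The stand-alone sketch below imitates that ordering with stubs. Everything in it is hypothetical (`FakeVm`, its methods, and the string constants are not the vdsm API), and it assumes that reporting the VM as down overwrites the last status, which the diff itself does not show.

```python
# Hypothetical stand-ins for illustration; not the vdsm API.
MIGRATION_SOURCE = "Migration Source"
MIGRATION_DESTINATION = "Migration Destination"
DOWN = "Down"
POSTCOPY_MIGRATION_FAILED = 11
EXIT_REASONS = {POSTCOPY_MIGRATION_FAILED: "Migration failed in post-copy"}


class FakeVm:
    def __init__(self, last_status):
        self.last_status = last_status
        self.exit_message = None
        self.destroyed = False

    def set_down_status(self, reason):
        # Assumption: reporting the VM as down overwrites the last status.
        self.last_status = DOWN
        self.exit_message = EXIT_REASONS[reason]

    def destroy(self):
        self.destroyed = True

    def on_postcopy_failed(self):
        # Remember the migration role before the status is overwritten.
        destroy = self.last_status == MIGRATION_SOURCE
        self.set_down_status(POSTCOPY_MIGRATION_FAILED)
        if destroy:
            # Only the source cleans up by itself; the destination stays
            # reported as down until a destroy request arrives.
            self.destroy()


source = FakeVm(MIGRATION_SOURCE)
source.on_postcopy_failed()
destination = FakeVm(MIGRATION_DESTINATION)
destination.on_postcopy_failed()
```

Under that assumption, checking the role after `set_down_status` would never match the source status, so the source VM would not be destroyed.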