[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in "paused" state forever

Serge Hallyn Fri, 22 Feb 2013 11:15:56 -0800

** Description changed:

- If a qemu/KVM VM is paused through a monitor by manual issuing of the
- "stop" command, the state of the VM in libvirtd's view will transition
- to "paused". This is because libvirtd listens to "STOP" events in the
- JSON monitor. However, libvirt does not listen to RESUME events on any
- monitor. So, when the VM is resumed by manually issuing "cont", the
- internal state will remain as "paused" even though the VM is running.
+ =================================
+ SRU Justification:
+ 1. Impact: if a VM is paused over the monitor and then resumed, libvirt will 
continue to report the running VM as paused.
+ 2. Development fix: add a hook to follow the resume event
+ 3. Stable fix: same as development fix
+ 4. Test case: see below
+ 5. Regression potential: an error in the hook could make libvirt crash in the 
above situation instead of following the VM resume.  All regression tests 
passed with this fix.
+ =================================
+ If a qemu/KVM VM is paused through a monitor by manual issuing of the "stop" 
command, the state of the VM in libvirtd's view will transition to "paused". 
This is because libvirtd listens to "STOP" events in the JSON monitor. However, 
libvirt does not listen to RESUME events on any monitor. So, when the VM is 
resumed by manually issuing "cont", the internal state will remain as "paused" 
even though the VM is running.
  
  Libvirt maintains its internal view of the state in sync for migration,
  etc. But without listening to RESUME events it cannot correctly cope
  with third parties issuing stop commands (such as GDB, virsh qemu-
  monitor-command, or software opening another QMP monitor).
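
  To illustrate the state drift described above, here is a minimal toy
  model in Python. It is not libvirt's code: the class and function names
  are hypothetical, and the events are reduced to their "event" field
  (real QMP events carry more fields, such as a timestamp). It only shows
  that a listener which handles STOP but ignores RESUME stays "paused".

```python
import json


class DomainStateTracker:
    """Toy model of a management daemon's view of one VM's run state.

    Hypothetical illustration only; this is not how libvirt is structured.
    """

    def __init__(self, follow_resume):
        # follow_resume=False models the behaviour reported in this bug;
        # follow_resume=True models a daemon that also handles RESUME.
        self.follow_resume = follow_resume
        self.state = "running"

    def on_event(self, line):
        """Consume one QMP-style event line (a JSON object with an "event" key)."""
        event = json.loads(line).get("event")
        if event == "STOP":
            self.state = "paused"
        elif event == "RESUME" and self.follow_resume:
            self.state = "running"
        return self.state


def replay(events, follow_resume):
    """Feed a sequence of event lines to a fresh tracker and return its final state."""
    tracker = DomainStateTracker(follow_resume)
    for line in events:
        tracker.on_event(line)
    return tracker.state


# Simplified events: a third party issues "stop", then "cont".
EVENTS = ['{"event": "STOP"}', '{"event": "RESUME"}']

if __name__ == "__main__":
    print("without RESUME handler:", replay(EVENTS, follow_resume=False))
    print("with RESUME handler:   ", replay(EVENTS, follow_resume=True))
```

  With follow_resume=False the tracker ends in "paused" although the guest
  was resumed, which matches the symptom in this report.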
  
  This is verified to happen on Precise and Quantal's libvirt versions.
  Since it's a bug in upstream, I expect it to be faulty in Raring as
  well.
  
  The upshot in OpenStack is that VMs, even though running, will be
  reported as paused to nova. Due to
  (https://bugs.launchpad.net/nova/+bug/1097806), nova compute will
  erroneously destroy them. This is a nova-compute problem that is
  exacerbated by this bug.
  
  Steps to Reproduce:
  # virsh list
-  Id    Name                           State
+  Id    Name                           State
  ----------------------------------------------------
-  1     instance-00000020              running
+  1     instance-00000020              running
  
  # virsh qemu-monitor-command 1 '{"execute":"stop"}'
  {"return":{},"id":"libvirt-10"}
  
  # virsh list
-  Id    Name                           State
+  Id    Name                           State
  ----------------------------------------------------
-  1     instance-00000020              paused
+  1     instance-00000020              paused
  
  # virsh qemu-monitor-command 1 '{"execute":"cont"}'
  {"return":{},"id":"libvirt-11"}
  
  # virsh list
-  Id    Name                           State
+  Id    Name                           State
  ----------------------------------------------------
-  1     instance-00000020              paused
+  1     instance-00000020              paused
  
  (the state should be "running")
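
  When automating the check above, the table printed by "virsh list" can
  be parsed with a few lines of Python. This is only a sketch: the
  function name parse_virsh_list is hypothetical, and it assumes the
  three-column layout shown in the steps, with domain names that contain
  no whitespace.

```python
def parse_virsh_list(output):
    """Return {domain name: state} from the text printed by `virsh list`.

    Assumes the Id / Name / State layout shown in this report and
    domain names without whitespace.
    """
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have at least three fields and start with a numeric Id;
        # this skips the header row, the dashed rule, and blank lines.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        # The state can be more than one word, so join the remaining fields.
        states[fields[1]] = " ".join(fields[2:])
    return states


# Output copied from the final `virsh list` in the steps above.
SAMPLE = """\
 Id    Name                           State
----------------------------------------------------
 1     instance-00000020              paused
"""

if __name__ == "__main__":
    print(parse_virsh_list(SAMPLE))
```

  A reproduction script could compare the parsed state after "cont" with
  "running" and flag the mismatch.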
  
  Another way to reproduce this is to attach GDB to qemu and start
  single-stepping: libvirt will drop dozens of RESUME events and be mightily
  confused.
  
  Client software like OpenStack will tag the VM as paused.
  
  Upstream:
  Reported to libvirt upstream: 
https://bugzilla.redhat.com/show_bug.cgi?id=892791
  Fixed in libvirt's master git: 
http://libvirt.org/git/?p=libvirt.git;a=commit;h=aedfcce33e4c2f266668a39fd655574fe34f1265
  
  I will attach a backport of the master branch fix to
  0.9.13-0ubuntu12~cloud0


-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1097824

Title:
   Libvirt does not follow RESUME qemu monitor events. VMs remain in
  "paused" state forever

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1097824/+subscriptions

