[Bug 1131284] Re: Folsom erroneously destroys paused VMs

2013-03-17 Thread Andres Lagar-Cavilla
Will a fix be released for the folsom (2012.2) packages? That is the
intent of the bug filing. Thanks!

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1131284

Title:
  Folsom erroneously destroys paused VMs

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1131284/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1131284] Re: Folsom erroneously destroys paused VMs

2013-03-01 Thread Andres Lagar-Cavilla
As mentioned in the description, the issue has been fixed in nova
grizzly and backported to nova folsom. Sorry for any confusion.


[Bug 1131284] [NEW] Folsom erroneously destroys paused VMs

2013-02-21 Thread Andres Lagar-Cavilla
Public bug reported:

Requesting to add upstream stable commit:
https://github.com/openstack/nova/commit/7ace55fcf9e1b7fea074f6c0331b6feafbbc4178

reviewed here:
https://review.openstack.org/#/c/20337/

and which addresses this upstream bug:
https://bugs.launchpad.net/nova/+bug/1097806

(updated description of bug follows)

Libvirt-managed qemu/KVM VMs can be paused outside of nova compute's
workflow through a variety of means.

* By issuing virsh suspend
* By issuing virsh qemu-monitor-command '{"execute" : "stop"}'
* By causing qemu to emit a STOP event, for example when attaching a GDB 
debugger and single-stepping
* By connecting through an additional qemu monitor and issuing any commands 
that may cause qemu to emit a STOP event.

Starting in Folsom (specifically
https://github.com/openstack/nova/commit/129b87e17daeaa9e855a70dea51e6581ea63#L6R2502
i.e. commit 129b87e diff line 2502) nova compute will destroy a VM if
libvirt reports it as paused and this doesn't fit nova compute's
recorded state for the VM.

The original rationale is to destroy VMs that are paused by IO errors
or KVM emulation errors, which also cause qemu to emit STOP events.

The problem is that this will also destroy VMs that are paused for any
of the valid reasons outlined above.

The problem is exacerbated by a libvirt bug
(https://bugzilla.redhat.com/show_bug.cgi?id=892791) which latches the
state of a VM to paused even though the VM is running. The fix is
already committed upstream
(http://libvirt.org/git/?p=libvirt.git;a=commit;h=aedfcce33e4c2f28a39fd655574fe34f1265),
has been integrated into Raring, and is triaged for backport into
Precise: https://bugs.launchpad.net/bugs/1097824.

Even with libvirt's bug fixed, there are still points in time at which
nova-compute will check a VM's state, find it paused for a valid reason,
and erroneously decide to destroy it.

The fix is to either remove this behavior, or to further query libvirt
for the paused reason, which will show conclusively whether the VM is
effectively crashed, or just paused.
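
The second option (asking libvirt for the paused reason) can be sketched
roughly as below. This is only an illustration, not the upstream patch. The
constants are redefined locally so the sketch runs without libvirt installed,
and their numeric values are assumed to mirror libvirt's virDomainState and
virDomainPausedReason enums. The function name and the choice of which reasons
count as "effectively crashed" are hypothetical.

```python
# Values assumed to mirror libvirt's virDomainState enum.
VIR_DOMAIN_RUNNING = 1
VIR_DOMAIN_PAUSED = 3

# Values assumed to mirror libvirt's virDomainPausedReason enum.
VIR_DOMAIN_PAUSED_UNKNOWN = 0
VIR_DOMAIN_PAUSED_USER = 1
VIR_DOMAIN_PAUSED_IOERROR = 5

# Hypothetical policy: which paused reasons mean "effectively crashed".
# Which reasons belong here is a deployment decision, not something
# stated in this bug report.
CRASH_LIKE_REASONS = frozenset({VIR_DOMAIN_PAUSED_IOERROR})


def paused_means_crashed(state, reason):
    """Return True only if the domain is paused for a crash-like reason."""
    return state == VIR_DOMAIN_PAUSED and reason in CRASH_LIKE_REASONS


if __name__ == "__main__":
    # A VM paused by the user (e.g. virsh suspend) must not be destroyed.
    print(paused_means_crashed(VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_USER))
    # A VM paused by an IO error is a candidate for cleanup.
    print(paused_means_crashed(VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_IOERROR))
```

With the libvirt Python bindings, the (state, reason) pair would come from
the domain object's state() call; everything else above is a stand-in.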

** Affects: nova (Ubuntu)
 Importance: Undecided
 Status: New


[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-16 Thread Andres Lagar-Cavilla
Serge, additional (and unexpected!) motivation to include this patch
http://www.redhat.com/archives/libvir-list/2013-January/msg01049.html

Thanks
Andres

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1097824

Title:
   Libvirt does not follow RESUME qemu monitor events. VMs remain in
  paused state forever

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1097824/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-11 Thread Andres Lagar-Cavilla
The backport needs a small tweak for Raring/1.0.0

** Patch added: Backport for Raring Ringtail
   
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1097824/+attachment/3480721/+files/handle_resume_1.0.0-0ubuntu4.patch


[Bug 1097824] [NEW] Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-09 Thread Andres Lagar-Cavilla
Public bug reported:

If a qemu/KVM VM is paused through a monitor by manually issuing the
stop command, the state of the VM in libvirtd's view will transition
to paused. This is because libvirtd listens to STOP events in the
JSON monitor. However, libvirt does not listen to RESUME events on any
monitor. So, when the VM is resumed by manually issuing cont, the
internal state will remain paused even though the VM is running.

Libvirt keeps its internal view of the VM state in sync for purposes
such as migration. But without listening to RESUME events it cannot
correctly cope with third parties issuing stop commands (such as GDB,
virsh qemu-monitor-command, or software opening another QMP monitor).
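
To make the failure mode concrete, here is a toy model of the state tracking
described above. It is not libvirt code: the class, the event strings, and the
handle_resume switch are all hypothetical and exist only to contrast a tracker
that ignores RESUME with one that honors it.

```python
class DomainStateTracker:
    """Toy model of a daemon tracking VM state from monitor events."""

    def __init__(self, handle_resume):
        # handle_resume=False models the behavior reported in this bug.
        self.handle_resume = handle_resume
        self.state = "running"

    def on_event(self, event):
        if event == "STOP":
            self.state = "paused"
        elif event == "RESUME" and self.handle_resume:
            self.state = "running"
        # Any other event (or an ignored RESUME) leaves the state as-is.
        return self.state


def replay(events, handle_resume):
    """Feed a sequence of events to a fresh tracker, return the final state."""
    tracker = DomainStateTracker(handle_resume)
    for event in events:
        tracker.on_event(event)
    return tracker.state


if __name__ == "__main__":
    events = ["STOP", "RESUME"]  # third party issues stop, then cont
    print(replay(events, handle_resume=False))
    print(replay(events, handle_resume=True))
```

In the first case the tracker stays latched on paused after the VM has
resumed, which is the stale state that clients then act on.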

This is verified to happen on Precise and Quantal's libvirt versions.
Since it's a bug in upstream, I expect it to be faulty in Raring as
well.

The upshot in OpenStack is that VMs, even though running, will be
reported as paused to nova. Due to
https://bugs.launchpad.net/nova/+bug/1097806, nova compute will
erroneously destroy them. This is a nova-compute problem that is
exacerbated by this bug.

Steps to Reproduce:
# virsh list
 Id Name                 State
----------------------------------
 1  instance-0020        running

# virsh qemu-monitor-command 1 '{"execute":"stop"}'
{"return":{},"id":"libvirt-10"}

# virsh list
 Id Name                 State
----------------------------------
 1  instance-0020        paused

# virsh qemu-monitor-command 1 '{"execute":"cont"}'
{"return":{},"id":"libvirt-11"}

# virsh list
 Id Name                 State
----------------------------------
 1  instance-0020        paused

(the state should be running)
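
For anyone scripting the reproduction, the monitor commands in the steps above
are JSON payloads, so the quoting is easier to get right when the payload is
generated rather than typed. A minimal sketch follows; the helper name is
hypothetical, and it only builds the argument list without running virsh.

```python
import json


def qmp_virsh_argv(domain, command):
    """Build the argv for 'virsh qemu-monitor-command' with a JSON payload."""
    # Compact separators give the same form as in the steps above.
    payload = json.dumps({"execute": command}, separators=(",", ":"))
    return ["virsh", "qemu-monitor-command", str(domain), payload]


if __name__ == "__main__":
    for cmd in ("stop", "cont"):
        print(qmp_virsh_argv(1, cmd))
```

The resulting list can be handed to subprocess.run() on a host where virsh is
available; passing a list rather than a shell string avoids a second layer of
quoting.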

Another way to reproduce this is to attach GDB to qemu and start
single-stepping: libvirt will drop dozens of RESUME events and be
mightily confused.

Client software like OpenStack will tag the VM as paused.

Upstream:
Reported to libvirt upstream: https://bugzilla.redhat.com/show_bug.cgi?id=892791
Fixed in libvirt's master git: 
http://libvirt.org/git/?p=libvirt.git;a=commit;h=aedfcce33e4c2f28a39fd655574fe34f1265

I will attach a backport of the master branch fix to
0.9.13-0ubuntu12~cloud0

** Affects: libvirt (Ubuntu)
 Importance: Undecided
 Status: New


[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-09 Thread Andres Lagar-Cavilla
** Patch added: handle_resume_0.9.13-0ubuntu12~cloud0.patch
   
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1097824/+attachment/3478189/+files/handle_resume_0.9.13-0ubuntu12%7Ecloud0.patch


[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-09 Thread Andres Lagar-Cavilla
With the above patch:
# virsh list
 Id Name                 State
----------------------------------
 1  instance-0022        running

# virsh qemu-monitor-command 1 '{"execute":"stop"}'
{"return":{},"id":"libvirt-12"}

# virsh list
 Id Name                 State
----------------------------------
 1  instance-0022        paused

# virsh qemu-monitor-command 1 '{"execute":"cont"}'
{"return":{},"id":"libvirt-13"}

# virsh list
 Id Name                 State
----------------------------------
 1  instance-0022        running


[Bug 1097824] Re: Libvirt does not follow RESUME qemu monitor events. VMs remain in paused state forever

2013-01-09 Thread Andres Lagar-Cavilla
Serge, no problem. What is the status for Raring? The upstream commit is
not in 1.0.0. I bet the patch as-is won't apply; should I rebase?

Thanks
