Re: [Openstack] Random libvirt hangs

2012-06-08 Thread Daniel P. Berrange
On Thu, May 31, 2012 at 08:19:47AM +0200, Christian Wittwer wrote:
> Hi Daniel,
>
> > I'd file a bug against libvirt in Oneiric, requesting that they
> > backport the 4 changesets mentioned in
>
> Do you know if that bug is now fixed in Oneiric?

No idea, I'm afraid. I only maintain libvirt upstream and in Fedora/RHEL,
so I don't track Ubuntu bugs.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Random libvirt hangs

2012-03-14 Thread Lorin Hochstein
From what I understand, the SmokeStack integration tests are running on Ubuntu 
11.10 (XenServer) and Fedora 16 (Libvirt), so as a practical matter those 
platforms will be more battle-tested even if they aren't officially blessed as 
supported platforms.

Take care,

Lorin
--
Lorin Hochstein
Lead Architect - Cloud Services
Nimbis Services, Inc.
www.nimbisservices.com





Re: [Openstack] Random libvirt hangs

2012-03-13 Thread Justin Santa Barbara
I certainly understand your position, Thierry.  However, I think it is
important that we target one 'golden' platform, and that we take
responsibility for any issues with that platform.  Otherwise we simply
end up pointing fingers and being blocked on backports, and the end
result is a system that just doesn't work for the people actually
deploying it.  'File a bug upstream' is an appropriate response for
me, but it's not really OK for end-users.

We could then have a policy that 'if Essex fails on TargetPlatform
it's an OpenStack issue, otherwise it's a distro issue'.  We can
either work around the bug or work with TargetPlatform to get a bugfix
integrated.  Other distros can look to the golden platform to
understand what patches are needed and how things are supposed to
work.

It sounds like Precise is a good candidate for Essex: it is an LTS
release, and we have time to ensure that any required bugfixes (that
we don't want to work around) make it into the official release.

If that's agreeable, then we should probably retarget, e.g., devstack
and our documentation from Oneiric to Precise.  We should probably gate
on Precise as well.

I will be much happier if we just say we aim to support X; I don't
really care what X is.  I'm just going to be running OpenStack on the
machine, so I'm not picking my distro e.g. based on how I feel about
Unity.  I'd imagine most users are in a similar camp.

Justin



On Tue, Mar 13, 2012 at 5:00 AM, Thierry Carrez thie...@openstack.org wrote:
> Justin Santa Barbara wrote:
> > Which operating system(s) are we aiming to support for Essex?  Is the
> > plan to backport the latest libvirt to Oneiric, or are we going to wait
> > for Precise?
>
> The question is the other way around: which operating systems aim to
> support Essex? We try to set the dependencies for OpenStack to a
> reasonable set of versions (generally compatible with the release under
> development of the major Linux distributions), but it's up to the
> distributions themselves to make sure they align if they want to support
> a given version of OpenStack.
>
> Ubuntu will ship Essex in 12.04 LTS. I don't think there are any plans
> to backport it to 11.10. Fedora will support Essex in Fedora 17, etc.
>
> --
> Thierry Carrez (ttx)
> Release Manager, OpenStack



Re: [Openstack] Random libvirt hangs

2012-03-12 Thread Daniel P. Berrange
On Mon, Mar 12, 2012 at 02:17:49PM -0400, David Kranz wrote:
> In the spirit of Jay's message, we have a long-running cluster
> (diablo/kvm) where about once every 3-4 weeks a user will complain
> that she cannot connect to a vm. Examining the compute node shows
> that libvirt-bin is hung. Sometimes restarting this process fixes
> the problem. Sometimes it does not, but rebooting the compute node
> and then the vm does. I just heard from people in my company
> operating another cluster (essex/kvm) that they have also seen this.
> I filed a bug about a month ago
>
> https://bugs.launchpad.net/nova/+bug/931540
>
> Has anyone been running a kvm cluster for a long time with real
> users and never seen this issue?

There have been various scenarios which can cause libvirtd to hang in
the past, but that bug report doesn't have enough useful data to
diagnose the issue. If libvirtd itself is hanging, then you need to
attach to the daemon with GDB, and run 'thread apply all bt' to collect
stack traces across all threads. Make sure you have debug symbols
available when you do this.
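
As a rough illustration of the step described above, here is a minimal
sketch that builds a non-interactive gdb invocation for collecting those
traces. It assumes gdb is installed, that you have permission to attach to
the daemon (typically root), and that you already know the libvirtd PID.
The function names are hypothetical and not part of libvirt or OpenStack;
only the gdb options (-p, -batch, -ex) and the 'thread apply all bt'
command are real.

```python
import shlex
import subprocess
from typing import List


def build_gdb_backtrace_cmd(pid: int) -> List[str]:
    """Return a gdb argv that attaches to `pid` and dumps all thread stacks.

    Hypothetical helper for illustration only.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the commands below, then exit
        "-ex", "thread apply all bt",   # backtrace for every thread
    ]


def format_cmd(pid: int) -> str:
    """Render the command as a shell-safe string, e.g. for a bug report."""
    return shlex.join(build_gdb_backtrace_cmd(pid))


def collect_backtraces(pid: int, timeout: float = 60.0) -> str:
    """Run gdb against `pid` and return whatever it printed on stdout.

    The timeout value is an arbitrary assumption; without debug symbols
    the traces will contain mostly unresolved addresses.
    """
    result = subprocess.run(
        build_gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.stdout
```

Calling collect_backtraces() with the PID of the hung libvirtd and
attaching the returned text to the bug report should give the kind of data
asked for here. format_cmd() is only a convenience for showing the exact
command that was run.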

Regards,
Daniel



Re: [Openstack] Random libvirt hangs

2012-03-12 Thread Justin Santa Barbara
Just today I was able to diagnose a libvirt hang.  It appears to be
(or to be similar to) a known bug in libvirt, likely fixed in the latest
Fedora, but it does not appear to be fixed in Ubuntu Oneiric; I think
the fix is in Precise: https://bugs.launchpad.net/nova/+bug/953656

I believe this is the upstream bug:
https://bugzilla.redhat.com/show_bug.cgi?id=757382


Which operating system(s) are we aiming to support for Essex?  Is the
plan to backport the latest libvirt to Oneiric, or are we going to wait
for Precise?

Justin

