Re: [Openstack] Random libvirt hangs
On Thu, May 31, 2012 at 08:19:47AM +0200, Christian Wittwer wrote:
> Hi Daniel,
>
> > I'd file a bug against libvirt in Oneiric, requesting that they
> > backport the 4 changesets mentioned in
>
> Do you know if that bug is now fixed in Oneiric?

No idea, I'm afraid. I only maintain libvirt upstream or in Fedora/RHEL, so I don't track Ubuntu bugs.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
Re: [Openstack] Random libvirt hangs
From what I understand, the SmokeStack integration tests run on Ubuntu 11.10 (XenServer) and Fedora 16 (libvirt), so as a practical matter those platforms will be more battle-tested even if they aren't officially blessed as supported platforms.

Take care,
Lorin
--
Lorin Hochstein
Lead Architect - Cloud Services
Nimbis Services, Inc.
www.nimbisservices.com

On Mar 13, 2012, at 3:15 PM, Justin Santa Barbara wrote:
> I certainly understand your position, Thierry. However, I think it is important that we target one 'golden' platform, and that we take responsibility for any issues with that platform. Otherwise we simply end up pointing fingers and being blocked on backports, and the end result is a system that just doesn't work for the people actually deploying it. 'File a bug upstream' is an appropriate response for me, but it's not really OK for end users.
>
> We could then have a policy that 'if Essex fails on TargetPlatform it's an OpenStack issue, otherwise it's a distro issue'. We can either work around the bug or work with TargetPlatform to get a bugfix integrated. Other distros can look to the golden platform to understand what patches are needed and how things are supposed to work.
>
> It sounds like Precise is a good candidate for Essex: it is an LTS release, and we have time to ensure that any required bugfixes (that we don't want to work around) make it into the official release. If that's agreeable, then we should probably retarget, for example, devstack and our documentation from Oneiric to Precise. We should probably gate on Precise as well.
>
> I will be much happier if we just say we aim to support X; I don't really care what X is. I'm just going to be running OpenStack on the machine, so I'm not picking my distro based on, say, how I feel about Unity. I'd imagine most users are in a similar camp.
>
> Justin
>
> On Tue, Mar 13, 2012 at 5:00 AM, Thierry Carrez thie...@openstack.org wrote:
> > Justin Santa Barbara wrote:
> > > Which operating system(s) are we aiming to support for Essex? Is the plan to backport the latest libvirt to Oneiric, or are we going to wait for Precise?
> >
> > The question is the other way around: which operating systems aim to support Essex? We try to set the dependencies for OpenStack to a reasonable set of versions (generally compatible with the release under development of the major Linux distributions), but it's up to the distributions themselves to make sure they align if they want to support a given version of OpenStack.
> >
> > Ubuntu will ship Essex in 12.04 LTS. I don't think there are any plans to backport it to 11.10. Fedora will support Essex in Fedora 17, etc.
> >
> > --
> > Thierry Carrez (ttx)
> > Release Manager, OpenStack
Re: [Openstack] Random libvirt hangs
I certainly understand your position, Thierry. However, I think it is important that we target one 'golden' platform, and that we take responsibility for any issues with that platform. Otherwise we simply end up pointing fingers and being blocked on backports, and the end result is a system that just doesn't work for the people actually deploying it. 'File a bug upstream' is an appropriate response for me, but it's not really OK for end users.

We could then have a policy that 'if Essex fails on TargetPlatform it's an OpenStack issue, otherwise it's a distro issue'. We can either work around the bug or work with TargetPlatform to get a bugfix integrated. Other distros can look to the golden platform to understand what patches are needed and how things are supposed to work.

It sounds like Precise is a good candidate for Essex: it is an LTS release, and we have time to ensure that any required bugfixes (that we don't want to work around) make it into the official release. If that's agreeable, then we should probably retarget, for example, devstack and our documentation from Oneiric to Precise. We should probably gate on Precise as well.

I will be much happier if we just say we aim to support X; I don't really care what X is. I'm just going to be running OpenStack on the machine, so I'm not picking my distro based on, say, how I feel about Unity. I'd imagine most users are in a similar camp.

Justin

On Tue, Mar 13, 2012 at 5:00 AM, Thierry Carrez thie...@openstack.org wrote:
> Justin Santa Barbara wrote:
> > Which operating system(s) are we aiming to support for Essex? Is the plan to backport the latest libvirt to Oneiric, or are we going to wait for Precise?
>
> The question is the other way around: which operating systems aim to support Essex? We try to set the dependencies for OpenStack to a reasonable set of versions (generally compatible with the release under development of the major Linux distributions), but it's up to the distributions themselves to make sure they align if they want to support a given version of OpenStack.
>
> Ubuntu will ship Essex in 12.04 LTS. I don't think there are any plans to backport it to 11.10. Fedora will support Essex in Fedora 17, etc.
>
> --
> Thierry Carrez (ttx)
> Release Manager, OpenStack
Re: [Openstack] Random libvirt hangs
On Mon, Mar 12, 2012 at 02:17:49PM -0400, David Kranz wrote:
> In the spirit of Jay's message, we have a long-running cluster (diablo/kvm) where about once every 3-4 weeks a user will complain that she cannot connect to a VM. Examining the compute node shows that libvirt-bin is hung. Sometimes restarting this process fixes the problem. Sometimes it does not, but rebooting the compute node and then the VM does. I just heard from people in my company operating another cluster (essex/kvm) that they have also seen this. I filed a bug about a month ago:
>
> https://bugs.launchpad.net/nova/+bug/931540
>
> Has anyone been running a KVM cluster for a long time with real users and never seen this issue?

There have been various scenarios in the past that can cause libvirtd to hang, but that bug report doesn't have enough useful data to diagnose the issue. If libvirtd itself is hanging, then you need to attach to the daemon with GDB and run 'thread apply all bt' to collect stack traces across all threads. Make sure you have debug symbols available when you do this.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
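For anyone who wants to script the GDB step Daniel describes, here is a minimal sketch in Python. It only wraps the 'thread apply all bt' command from his message in a non-interactive gdb invocation. The helper names (gdb_backtrace_command, collect_backtraces) are hypothetical, and it assumes you already know the PID of the hung libvirtd, that gdb is installed, and that you run it with enough privileges to attach. Debug symbols still have to be installed separately for the traces to be useful.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid` and dumps every thread's stack."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [
        "gdb",
        "-p", str(pid),                 # attach to the already-running process
        "-batch",                       # run the -ex commands, then exit
        "-ex", "thread apply all bt",   # backtrace for every thread
    ]


def collect_backtraces(pid, timeout=120):
    """Run gdb against the process and return everything it printed.

    Hypothetical helper: needs gdb on PATH and permission to attach
    (typically root when the target is a system daemon).
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling collect_backtraces(pid) and attaching the returned text to the bug report would give the per-thread stack traces asked for above.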
Re: [Openstack] Random libvirt hangs
I was able to diagnose a libvirt hang just today. It appears to be (similar to) a known bug in libvirt, likely fixed in the latest Fedora, but it does not appear to be fixed in Ubuntu Oneiric; I think the fix is in Precise:

https://bugs.launchpad.net/nova/+bug/953656

I believe this is the upstream bug:

https://bugzilla.redhat.com/show_bug.cgi?id=757382

Which operating system(s) are we aiming to support for Essex? Is the plan to backport the latest libvirt to Oneiric, or are we going to wait for Precise?

Justin

On Mon, Mar 12, 2012 at 1:31 PM, Daniel P. Berrange berra...@redhat.com wrote:
> On Mon, Mar 12, 2012 at 02:17:49PM -0400, David Kranz wrote:
> > In the spirit of Jay's message, we have a long-running cluster (diablo/kvm) where about once every 3-4 weeks a user will complain that she cannot connect to a VM. Examining the compute node shows that libvirt-bin is hung. Sometimes restarting this process fixes the problem. Sometimes it does not, but rebooting the compute node and then the VM does. I just heard from people in my company operating another cluster (essex/kvm) that they have also seen this. I filed a bug about a month ago:
> >
> > https://bugs.launchpad.net/nova/+bug/931540
> >
> > Has anyone been running a KVM cluster for a long time with real users and never seen this issue?
>
> There have been various scenarios in the past that can cause libvirtd to hang, but that bug report doesn't have enough useful data to diagnose the issue. If libvirtd itself is hanging, then you need to attach to the daemon with GDB and run 'thread apply all bt' to collect stack traces across all threads. Make sure you have debug symbols available when you do this.
>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
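Since the hang shows up only every few weeks, it may help to notice it before a user does. The following is a rough sketch, not something proposed in this thread: it runs a probe command with a timeout and treats a timeout as "probably hung". The function name daemon_responds is hypothetical, and using 'virsh list --all' as the probe is an assumption that a hung libvirtd makes the client call block rather than fail quickly.

```python
import subprocess


def daemon_responds(probe_cmd, timeout_s=10.0):
    """Return True if probe_cmd exits with status 0 within timeout_s seconds."""
    try:
        completed = subprocess.run(
            probe_cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The probe blocked past the deadline: the daemon is probably hung.
        return False
    except OSError:
        # The probe binary is missing or cannot be executed.
        return False
    return completed.returncode == 0
```

On a compute node this could be called as daemon_responds(["virsh", "list", "--all"]); a False result would be the cue to collect GDB stack traces before restarting anything.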