On Mon, Mar 12, 2012 at 02:17:49PM -0400, David Kranz wrote:
> In the spirit of Jay's message, we have a long-running cluster
> (diablo/kvm) where about once every 3-4 weeks a user will complain
> that she cannot connect to a vm. Examining the compute node shows
> that libvirt-bin is hung. Sometimes restarting this process fixes
> the problem. Sometimes it does not, but rebooting the compute node
> and then the vm does. I just heard from people in my company
> operating another cluster (essex/kvm) that they have also seen this.
> I filed a bug about a month ago
>
> https://bugs.launchpad.net/nova/+bug/931540
>
> Has any one been running a kvm cluster for a long time with real
> users and never seen this issue?

Various scenarios have caused libvirtd to hang in the past, but that
bug report doesn't have enough useful data to diagnose the issue.

If libvirtd itself is hanging, then you need to attach to the daemon
with GDB, and run 'thread apply all bt' to collect stack traces across
all threads. Make sure you have debug symbols available when you do this.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
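
For anyone who wants to script the GDB step, here is a minimal sketch in
Python. It is only an illustration, not something from the bug report: the
helper names build_gdb_command and collect_backtraces are hypothetical, and
it assumes gdb is installed, that you run it with enough privileges to
attach to the daemon (typically root), and that you already know the
libvirtd PID. It relies only on the standard gdb options -p (attach to a
PID), -batch (run the commands, then exit) and -ex (execute a command).

```python
import subprocess


def build_gdb_command(pid):
    """Return the argv for a non-interactive GDB run that prints a
    backtrace for every thread of the process with the given PID.

    Hypothetical helper, written for illustration only.
    """
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the -ex commands, then exit
        "-ex", "thread apply all bt",   # backtrace of every thread
    ]


def collect_backtraces(pid):
    """Run GDB against the process and return whatever it printed.

    Assumes gdb is on PATH and the caller may attach to the process.
    """
    result = subprocess.run(
        build_gdb_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; keep the output anyway
    )
    return result.stdout + result.stderr
```

Calling collect_backtraces() with the PID of the hung libvirtd and saving
the returned text gives you something to attach to the bug. The traces are
far more useful with debug symbols installed; without them most frames
show only addresses.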