Package: qemu-system-x86
Version: 2.1+dfsg-12~bpo70+1
Severity: important

Dear Maintainer,

We are seeing guests lock up/hang with qemu. The guests hang with 100% CPU
usage. The problem seems to be storage/IO related, but there is not
necessarily high IO happening on the host at the time the guest hangs. When a
guest hangs, the VNC console is unresponsive and the only way to recover is to
forcibly power the guest off and back on.

The guest shown below is running Debian Wheezy, but the problem seems to
affect other guest operating systems too, such as Windows and CentOS.

The host's storage, where the guest disk images are stored, is OCFS2 running
over iSCSI.

Guests configured with cache='writeback' and a qcow2 disk image created with
qemu-img -o preallocation=metadata seem to be affected less frequently, but
are nonetheless still affected.

We have also tried virtio-blk, virtio-scsi, scsi and standard IDE as the disk
controller on the guest, but that doesn't seem to improve things.

    qemu-system-x86_64 -enable-kvm -name guest1 -S \
      -machine pc-i440fx-2.1,accel=kvm,usb=off -m 1024 -realtime mlock=off \
      -smp 1,sockets=1,cores=1,threads=1 \
      -uuid 973bf27b-04f9-61dd-9272-de2467b599d5 \
      -no-user-config -nodefaults \
      -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/guest1.monitor,server,nowait \
      -mon chardev=charmonitor,id=monitor,mode=control \
      -rtc base=utc -no-shutdown -boot strict=on \
      -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
      -device lsi,id=scsi0,bus=pci.0,addr=0x4 \
      -drive file=/mnt/vm/guest1.img,if=none,id=drive-scsi0-0-0,format=qcow2,cache=none \
      -device scsi-hd,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0,bootindex=1 \
      -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=42 \
      -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9a:36:10,bus=pci.0,addr=0x3 \
      -chardev pty,id=charserial0 \
      -device isa-serial,chardev=charserial0,id=serial0 \
      -device usb-tablet,id=input0 \
      -vnc 127.0.0.1:9 \
      -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
      -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
      -msg timestamp=on

A backtrace of a hung
guest shows:

(gdb) bt
#0  0x00007f6ce44fed5c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f6ce44fa3a9 in _L_lock_926 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f6ce44fa1cb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f6cea9849f9 in ?? ()
#4  0x00007f6cea9313bb in ?? ()
#5  0x00007f6cea66ebed in main ()

(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0x7f6cdac4a700 (LWP 29599) "qemu-system-x86" 0x00007f6ce4236de1 in ppoll () from /lib/x86_64-linux-gnu/libc.so.6
  2    Thread 0x7f6c993ff700 (LWP 29601) "qemu-system-x86" 0x00007f6ce44fc344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
* 1    Thread 0x7f6cea49b900 (LWP 29596) "qemu-system-x86" 0x00007f6ce44fed5c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0

(gdb) thread apply all bt

Thread 3 (Thread 0x7f6cdac4a700 (LWP 29599)):
#0  0x00007f6ce4236de1 in ppoll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f6cea931f1b in ?? ()
#2  0x00007f6cea933230 in ?? ()
#3  0x00007f6cea924fdd in ?? ()
#4  0x00007f6cea8925b6 in ?? ()
#5  0x00007f6cea899676 in ?? ()
#6  0x00007f6cea8998c5 in ?? ()
#7  0x00007f6cea891bb7 in ?? ()
#8  0x00007f6cea824929 in ?? ()
#9  0x00007f6cea824078 in ?? ()
#10 0x00007f6cea823f98 in ?? ()
#11 0x00007f6cea6b2c79 in ?? ()
#12 0x00007f6cea6b89bf in ?? ()
#13 0x00007f6cea679163 in ?? ()
#14 0x00007f6cea6b1cf5 in ?? ()
#15 0x00007f6cea69d25c in ?? ()
#16 0x00007f6ce44f7b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#17 0x00007f6ce424195d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#18 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f6c993ff700 (LWP 29601)):
#0  0x00007f6ce44fc344 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f6cea984c19 in ?? ()
#2  0x00007f6cea920b7b in ?? ()
#3  0x00007f6cea920f50 in ?? ()
#4  0x00007f6ce44f7b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007f6ce424195d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f6cea49b900 (LWP 29596)):
#0  0x00007f6ce44fed5c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f6ce44fa3a9 in _L_lock_926 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f6ce44fa1cb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f6cea9849f9 in ?? ()
#4  0x00007f6cea9313bb in ?? ()
#5  0x00007f6cea66ebed in main ()

Host system information:

Linux 3.14-0.bpo.2-amd64 #1 SMP Debian 3.14.15-2~bpo70+1 (2014-08-21) x86_64 GNU/Linux
qemu-kvm         1:2.1+dfsg-12~bpo70+1  amd64
qemu-system-x86  1:2.1+dfsg-12~bpo70+1  amd64

Though not quite the same situation, Google yields similar problems:

https://lists.gnu.org/archive/html/qemu-devel/2014-08/msg01545.html
https://lists.nongnu.org/archive/html/qemu-devel/2010-05/msg01098.html

Unlike the setups described in the threads above, we are not using multipath
on our storage.

-- System Information:
Debian Release: 7.8
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i686 i386

Kernel: Linux 3.12-0.bpo.1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
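
For anyone trying to reproduce this, the thread listing and backtraces shown in this report can be collected from a hung qemu process non-interactively. The following is a minimal sketch, not something taken from the original setup: the helper names gdb_backtrace_command and capture_backtrace are hypothetical, and it assumes gdb is installed on the host and that the caller is allowed to attach to the process (typically root or the user that owns it).

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, lists its threads,
    prints a backtrace for every thread, and then exits."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the -ex commands, then exit
        "-ex", "info threads",
        "-ex", "thread apply all bt",
    ]


def capture_backtrace(pid):
    """Run gdb against `pid` and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # assumption: keep partial output even if gdb reports an error
    )
    return result.stdout
```

Calling capture_backtrace(29596) would target the process whose main thread appears as LWP 29596 in this report. Frames printed as "??" indicate that gdb had no symbols for the qemu binary, so the output is likely to be more useful if matching debug symbols are available on the host.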