Hi Stefan, Am 02.08.2013 um 17:24 schrieb Stefan Hajnoczi <1207...@bugs.launchpad.net>:
> On Fri, Aug 02, 2013 at 09:58:29AM -0000, Oliver Francke wrote: >> after some testing I tried to narrow down a problem, which was initially >> reported by some users. >> Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as >> reported by now. >> >> All using some flavour of linux-3.2.x kernel. >> >> Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which >> solves the problem. > > Is that a guest kernel upgrade? yeah, sorry if that was not clear enough. > >> Problem could be triggert with some workload ala: >> >> spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat >> and in parallel do some apt-get install/remove/whatever. >> >> That results in a somewhat stuck qemu-session with the bad >> "kernel_hung_task..." messages. >> >> A typical command-line is as follows: >> >> /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet -enable- >> kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor >> unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run/qemu- >> server/760.vnc,password -qmp unix:/var/run/qemu- >> server/760.qmp,server,nowait -nodefaults -serial none -parallel none >> -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev >> type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh >> -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 >> -device virtio-blk-pci,drive=virtio0 -drive >> format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native >> -drive >> format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native >> -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive >> if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc >> >> no "system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring- >> session is accepted, need to hard-kill the process. > > Yesterday I saw a possibly related report on IRC. It was a Windows > guest running under OpenStack with images on Ceph. > > They reported that the QEMU process would lock up - ping would not work > and their management tools showed 0 CPU activity for the guest. > However, they were able to "kick" the guest by taking a VNC screenshot > (I think). Then it would come back to life. > > If you have a Linux guest that is reporting kernel_hung_task, then it > could be a similar scenario. > > Please confirm that the hung task message is from inside the guest. > confirmed. > If you are able to reproduce this and have an alternative non-Ceph > storage pool, please try that since Ceph is common to both these bug > reports. > I can reproduce it with: kernel 3.2.something + qemu-1.[456] ( never spent much time on 1.3) and high I/O. I took this VM later this day and converted it to local-storage-qcow2, no prob with any kernel. I already asked on ceph-users-list for assistance, especially from Josh ( if he's not on summer holiday ;) ) What is strange, I have a session via VNC-console opened and have a loop ala: while true; do apt-get install -y ntp libopts25; apt-get remove -y ntp-libopts25; done and and parallel spew as described, the apt-"session" dies and one can see the hung_task-thingy, but I still can restart the spew-test. Just for completeness. Thnx for you attention, Oliver. > Stefan > > -- > You received this bug notification because you are subscribed to the bug > report. > https://bugs.launchpad.net/bugs/1207686 > > Title: > qemu-1.4.0 and onwards, linux kernel 3.2.x, heavy I/O leads to > kernel_hung_tasks_timout_secs message and unresponsive qemu-process > > Status in QEMU: > New > > Bug description: > Hi, > > after some testing I tried to narrow down a problem, which was initially > reported by some users. > Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as > reported by now. > > All using some flavour of linux-3.2.x kernel. > > Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which > solves the problem. > Problem could be triggert with some workload ala: > > spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat > and in parallel do some apt-get install/remove/whatever. > > That results in a somewhat stuck qemu-session with the bad > "kernel_hung_task..." messages. > > A typical command-line is as follows: > > /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet > -enable-kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor > unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run > /qemu-server/760.vnc,password -qmp unix:/var/run/qemu- > server/760.qmp,server,nowait -nodefaults -serial none -parallel none > -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev > > type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh > -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 > -device virtio-blk-pci,drive=virtio0 -drive > > format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native > -drive > > format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native > -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive > if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc > > no "system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring- > session is accepted, need to hard-kill the process. > > Please give any advice on what to do for tracing/debugging, because > the number of tickets here are raising, and noone knows, what users > are doing inside their VM. > > Kind regards, > > Oliver Francke. > > To manage notifications about this bug go to: > https://bugs.launchpad.net/qemu/+bug/1207686/+subscriptions -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1207686 Title: qemu-1.4.0 and onwards, linux kernel 3.2.x, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process Status in QEMU: New Bug description: Hi, after some testing I tried to narrow down a problem, which was initially reported by some users. Seen on different distros - debian 7.1, ubuntu 12.04 LTS, IPFire-2.3 as reported by now. All using some flavour of linux-3.2.x kernel. Tried e.g. under Ubuntu an upgrade to "Linux 3.8.0-27-generic x86_64" which solves the problem. Problem could be triggert with some workload ala: spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat and in parallel do some apt-get install/remove/whatever. That results in a somewhat stuck qemu-session with the bad "kernel_hung_task..." messages. A typical command-line is as follows: /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet -enable-kvm -daemonize -pidfile /var/run/qemu-server/760.pid -monitor unix:/var/run/qemu-server/760.mon,server,nowait -vnc unix:/var/run /qemu-server/760.vnc,password -qmp unix:/var/run/qemu- server/760.qmp,server,nowait -nodefaults -serial none -parallel none -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1 -device virtio-blk-pci,drive=virtio0 -drive format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native -drive format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc no "system_reset", "sendkey ctrl-alt-delete" or "q" in monitoring- session is accepted, need to hard-kill the process. Please give any advice on what to do for tracing/debugging, because the number of tickets here are raising, and noone knows, what users are doing inside their VM. Kind regards, Oliver Francke. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1207686/+subscriptions