Re: I/O errors in guest OS after repeated migration
On Tue, Nov 6, 2012 at 12:07 PM, Guido Winkelmann guido-k...@thisisnotatest.de wrote:
> On Monday, 29 October 2012, 12:29:01, Stefan Hajnoczi wrote:
>> On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann guido-k...@thisisnotatest.de wrote:
>>> On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
>>>> On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
>>>>> On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
>>>>>> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.
>>>>>
>>>>> I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.
>>>>
>>>> If you let the guest go idle (no I/O), then migrate it, then restart the I/O, do the errors show?
>>>
>>> Just tested - yes, they do.
>>
>> The -EIO error does not really reveal why there is a problem. You can use SystemTap probes in QEMU to find out more about the nature of the error.
>>
>> # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU
>
> This does not work for me. When I try running this, I'm getting many pages of errors like this:
>
> ==
> # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x 1623
> parse error: expected statement
>     saw: keyword at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:3
>     source:   function = $arg3;
>               ^
> parse error: expected identifier
>     saw: operator '=' at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:12
>     source:   function = $arg3;
>                        ^
> 2 parse errors.
> ==
>
> Unfortunately, I don't know the first thing about systemtap, so I don't really know what's happening here...

The QEMU .stp files on your system use a SystemTap keyword ("function") as a variable name, and the SystemTap parser rejects that.

Fixes have been merged in upstream qemu.git so that this does not happen again in the future. For now, you could edit the QEMU .stp files on disk and replace "function" with "function_".

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
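The workaround suggested above could be scripted along the following lines. This is only a sketch, not the upstream fix: it assumes the offending statements all have the form "function = $argN;" as shown in the parse error, and the probe name in the sample is made up for illustration. Back up the tapset files before changing them.

```python
import re

# Matches the assignment form reported in the parse error, e.g. "function = $arg3;".
# Restricting the match to "function = $argN" is an assumption, made so that any
# other use of the word (comments, the SystemTap "function" keyword) is left alone.
_ASSIGNMENT = re.compile(r"\bfunction(\s*=\s*\$arg\d+)")


def rename_reserved_variable(tapset_text):
    """Return tapset_text with 'function = $argN' rewritten to 'function_ = $argN'."""
    return _ASSIGNMENT.sub(r"function_\1", tapset_text)


if __name__ == "__main__":
    # Hypothetical probe body; only the "function = $arg3;" line comes from the thread.
    sample = (
        'probe qemu.kvm.example = process("qemu-kvm").mark("example")\n'
        "{\n"
        "  function = $arg3;\n"
        "}\n"
    )
    print(rename_reserved_variable(sample))
```

Applying the function to the contents of each affected file under /usr/share/systemtap/tapset/ and writing the result back would complete the edit; running it a second time changes nothing, because "function_" no longer matches the pattern.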
Re: I/O errors in guest OS after repeated migration
On Monday, 29 October 2012, 12:29:01, Stefan Hajnoczi wrote:
> On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann guido-k...@thisisnotatest.de wrote:
>> On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
>>> On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
>>>> On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
>>>>> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.
>>>>
>>>> I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.
>>>
>>> If you let the guest go idle (no I/O), then migrate it, then restart the I/O, do the errors show?
>>
>> Just tested - yes, they do.
>
> The -EIO error does not really reveal why there is a problem. You can use SystemTap probes in QEMU to find out more about the nature of the error.
>
> # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU

This does not work for me. When I try running this, I'm getting many pages of errors like this:

==
# stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x 1623
parse error: expected statement
    saw: keyword at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:3
    source:   function = $arg3;
              ^
parse error: expected identifier
    saw: operator '=' at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:12
    source:   function = $arg3;
                       ^
2 parse errors.
==

Unfortunately, I don't know the first thing about systemtap, so I don't really know what's happening here...

Guido
Re: I/O errors in guest OS after repeated migration
On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann guido-k...@thisisnotatest.de wrote:
> On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
>> On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
>>> On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
>>>> On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
>>>>> [...] vda1, logical block 1858771
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
>>>>> Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779
>>>>>
>>>>> (I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)
>>>>
>>>> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.
>>>
>>> I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.
>>
>> If you let the guest go idle (no I/O), then migrate it, then restart the I/O, do the errors show?
>
> Just tested - yes, they do.

The -EIO error does not really reveal why there is a problem. You can use SystemTap probes in QEMU to find out more about the nature of the error.

# stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU

Output looks like this:

bdrv_co_readv($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1)
bdrv_co_io_em($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1 $arg4=0x0 $arg5=0x7fb239da6f60)
virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=0x0)
virtio_blk_req_complete($arg1=0x7fb23982ed10 $arg2=0x0)

In virtio_blk_rw_complete, $arg2=-5 means -EIO, so look for that. The events leading up to it will reveal what is happening when the error occurs.

Stefan
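Searching the probe output for the failing completion can be sketched as follows. The line format is taken from the sample output in this message; how SystemTap prints a negative status is an assumption here (the sketch accepts a plain "-5" as well as a 32-bit or 64-bit unsigned hex rendering), and the failing line in the usage example is hypothetical.

```python
import re

EIO = 5  # Linux errno value for EIO; a completion status of -5 means -EIO

_LINE = re.compile(r"^(?P<probe>\w+)\((?P<args>.*)\)\s*$")
_ARG = re.compile(r"\$(arg\d+)=(\S+)")


def to_signed(text):
    """Parse "-5", "5" or "0x..." and undo a possible unsigned hex rendering."""
    value = int(text, 0)
    # Assumption: a negative C int may show up as unsigned 32-bit or 64-bit hex.
    for bits in (32, 64):
        if (1 << (bits - 1)) <= value < (1 << bits):
            return value - (1 << bits)
    return value


def find_eio_completions(lines):
    """Yield (line_number, line) for virtio_blk_rw_complete events with $arg2 == -EIO."""
    for number, line in enumerate(lines, start=1):
        match = _LINE.match(line.strip())
        if not match or match.group("probe") != "virtio_blk_rw_complete":
            continue
        args = dict(_ARG.findall(match.group("args")))
        if "arg2" in args and to_signed(args["arg2"]) == -EIO:
            yield number, line.strip()


if __name__ == "__main__":
    trace = [
        "bdrv_co_readv($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1)",
        "virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=0x0)",
        # Hypothetical failing completion, not taken from the thread:
        "virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=-5)",
    ]
    for number, line in find_eio_completions(trace):
        print(number, line)
```

Once a matching line is found, the bdrv_* and paio_* events printed just before it are the ones worth reading.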
Re: I/O errors in guest OS after repeated migration
On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
> On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
>> On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
>>> On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
>>>> [...] vda1, logical block 1858771
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
>>>> Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779
>>>>
>>>> (I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)
>>>
>>> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.
>>
>> I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.
>
> If you let the guest go idle (no I/O), then migrate it, then restart the I/O, do the errors show?

Just tested - yes, they do.

Guido
Re: I/O errors in guest OS after repeated migration
On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
> On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
>> [...] vda1, logical block 1858771
>> Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
>> Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
>> Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
>> Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
>> Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
>> Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
>> Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
>> Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779
>>
>> (I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)
>
> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.

I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.

Guido
Re: I/O errors in guest OS after repeated migration
On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
> On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
>> On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
>>> [...] vda1, logical block 1858771
>>> Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
>>> Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
>>> Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
>>> Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
>>> Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
>>> Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
>>> Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
>>> Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779
>>>
>>> (I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)
>>
>> What about newer versions of qemu/kvm? But of course if those work, your next task is going to be git bisect it or file a bug with your distro that is using an ancient version of qemu/kvm.
>
> I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/). The bug is still there.

If you let the guest go idle (no I/O), then migrate it, then restart the I/O, do the errors show?

--
error compiling committee.c: too many arguments to function
Re: I/O errors in guest OS after repeated migration
On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
> On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
>> [...]
>> The commandline, as generated by libvirtd, looks like this:
>>
>>     LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
>>     /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
>>     -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
>>     -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
>>     -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
>>     -mon chardev=charmonitor,id=monitor,mode=control
>>     -rtc base=utc -no-reboot -no-shutdown
>>     -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>     -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>     -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>>     -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
>>     -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
>>     -vnc 127.0.0.1:2,password -k de -vga cirrus
>>     -incoming tcp:0.0.0.0:49153
>>     -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
>
> I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have you tried other formats or different qemu/kvm versions?

Are you sure about that? Because I'm fairly certain I have been using live migration since at least 0.14, if not 0.13, and I have always been using qcow2 as the image format for the disks...

I can still try with other image formats, though.

Guido
Re: I/O errors in guest OS after repeated migration
On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
> On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
>> The commandline, as generated by libvirtd, looks like this:
>>
>>     LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
>>     /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
>>     -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
>>     -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
>>     -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
>>     -mon chardev=charmonitor,id=monitor,mode=control
>>     -rtc base=utc -no-reboot -no-shutdown
>>     -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>     -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>     -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>>     -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
>>     -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
>>     -vnc 127.0.0.1:2,password -k de -vga cirrus
>>     -incoming tcp:0.0.0.0:49153
>>     -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
>
> I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have you tried other formats or different qemu/kvm versions?

I tried the same thing with a raw image file instead of qcow2, and the problem still happens. From the /var/log/messages of the guest:

Oct 17 17:10:34 localhost sshd[2368]: nss_ldap: could not search LDAP server - Server is unavailable
Oct 17 17:10:39 localhost kernel: [ 126.800075] eth0: no IPv6 routers present
Oct 17 17:10:52 localhost kernel: [ 140.335783] Clocksource tsc unstable (delta = -70265501 ns)
Oct 17 17:12:04 localhost /O error on device vda1, logical block 1858765
Oct 17 17:12:04 localhost kernel: [ 212.070584] Buffer I/O error on device vda1, logical block 1858766
Oct 17 17:12:04 localhost kernel: [ 212.070587] Buffer I/O error on device vda1, logical block 1858767
Oct 17 17:12:04 localhost kernel: [ 212.070589] Buffer I/O error on device vda1, logical block 1858768
Oct 17 17:12:04 localhost kernel: [ 212.070592] Buffer I/O error on device vda1, logical block 1858769
Oct 17 17:12:04 localhost kernel: [ 212.070595] Buffer I/O error on device vda1, logical block 1858770
Oct 17 17:12:04 localhost kernel: [ 212.070597] Buffer I/O error on device vda1, logical block 1858771
Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779

(I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)

Guido
Re: I/O errors in guest OS after repeated migration
On Wednesday, October 17, 2012 06:54:00 AM Guido Winkelmann wrote:
> On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
>> On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
>>> [...]
>>> The commandline, as generated by libvirtd, looks like this:
>>>
>>>     LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
>>>     /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
>>>     -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
>>>     -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
>>>     -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
>>>     -mon chardev=charmonitor,id=monitor,mode=control
>>>     -rtc base=utc -no-reboot -no-shutdown
>>>     -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>>     -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>>     -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>>>     -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
>>>     -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
>>>     -vnc 127.0.0.1:2,password -k de -vga cirrus
>>>     -incoming tcp:0.0.0.0:49153
>>>     -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
>>
>> I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have you tried other formats or different qemu/kvm versions?
>
> Are you sure about that? Because I'm fairly certain I have been using live migration since at least 0.14, if not 0.13, and I have always been using qcow2 as the image format for the disks...
>
> I can still try with other image formats, though.

Yes, see the release notes for 1.0. It may have worked by chance before that, but it wasn't guaranteed to work. Back then there was no blacklisting feature to stop it, as there is now.
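For checking which image format and cache mode each disk uses, the -drive options can be pulled out of a command line like the one quoted above. This is a rough sketch that assumes no option value contains an escaped comma (",,"); the function name is made up for illustration.

```python
import shlex


def drive_options(command_line):
    """Return one dict of key=value settings per -drive option on the command line."""
    args = shlex.split(command_line)
    drives = []
    for flag, value in zip(args, args[1:]):
        if flag == "-drive":
            # Assumption: values never contain an escaped comma (",,").
            settings = dict(item.split("=", 1) for item in value.split(",") if "=" in item)
            drives.append(settings)
    return drives


if __name__ == "__main__":
    # Shortened excerpt of the command line from the thread.
    excerpt = (
        "/usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024 "
        "-drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none "
        "-drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none"
    )
    for settings in drive_options(excerpt):
        print(settings["file"], settings.get("format"), settings.get("cache"))
```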
Re: I/O errors in guest OS after repeated migration
On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
> On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
>> On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
>>> The commandline, as generated by libvirtd, looks like this:
>>>
>>>     LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
>>>     /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
>>>     -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
>>>     -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
>>>     -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
>>>     -mon chardev=charmonitor,id=monitor,mode=control
>>>     -rtc base=utc -no-reboot -no-shutdown
>>>     -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>>     -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>>     -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>>>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>>>     -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
>>>     -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
>>>     -vnc 127.0.0.1:2,password -k de -vga cirrus
>>>     -incoming tcp:0.0.0.0:49153
>>>     -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
>>
>> I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have you tried other formats or different qemu/kvm versions?
>
> I tried the same thing with a raw image file instead of qcow2, and the problem still happens. From the /var/log/messages of the guest:
>
> Oct 17 17:10:34 localhost sshd[2368]: nss_ldap: could not search LDAP server - Server is unavailable
> Oct 17 17:10:39 localhost kernel: [ 126.800075] eth0: no IPv6 routers present
> Oct 17 17:10:52 localhost kernel: [ 140.335783] Clocksource tsc unstable (delta = -70265501 ns)
> Oct 17 17:12:04 localhost /O error on device vda1, logical block 1858765
> Oct 17 17:12:04 localhost kernel: [ 212.070584] Buffer I/O error on device vda1, logical block 1858766
> Oct 17 17:12:04 localhost kernel: [ 212.070587] Buffer I/O error on device vda1, logical block 1858767
> Oct 17 17:12:04 localhost kernel: [ 212.070589] Buffer I/O error on device vda1, logical block 1858768
> Oct 17 17:12:04 localhost kernel: [ 212.070592] Buffer I/O error on device vda1, logical block 1858769
> Oct 17 17:12:04 localhost kernel: [ 212.070595] Buffer I/O error on device vda1, logical block 1858770
> Oct 17 17:12:04 localhost kernel: [ 212.070597] Buffer I/O error on device vda1, logical block 1858771
> Oct 17 17:12:04 localhost kernel: [ 212.070600] Buffer I/O error on device vda1, logical block 1858772
> Oct 17 17:12:04 localhost kernel: [ 212.070602] Buffer I/O error on device vda1, logical block 1858773
> Oct 17 17:12:04 localhost kernel: [ 212.070605] Buffer I/O error on device vda1, logical block 1858774
> Oct 17 17:12:04 localhost kernel: [ 212.070607] Buffer I/O error on device vda1, logical block 1858775
> Oct 17 17:12:04 localhost kernel: [ 212.070610] Buffer I/O error on device vda1, logical block 1858776
> Oct 17 17:12:04 localhost kernel: [ 212.070612] Buffer I/O error on device vda1, logical block 1858777
> Oct 17 17:12:04 localhost kernel: [ 212.070615] Buffer I/O error on device vda1, logical block 1858778
> Oct 17 17:12:04 localhost kernel: [ 212.070617] Buffer I/O error on device vda1, logical block 1858779
>
> (I was writing a large file at the time, to make sure I actually catch I/O errors as they happen)

What about newer versions of qemu/kvm? But of course, if those work, your next task is going to be to git bisect it or file a bug with your distro for shipping an ancient version of qemu/kvm.

> Guido
I/O errors in guest OS after repeated migration
Hi,

I'm experiencing I/O errors in a guest machine after migrating it from one host to another, and then back to the original host. After doing this, I find the following in the dmesg output of the guest machine:

[ 345.390543] end_request: I/O error, dev vda, sector 273871
[ 345.391125] end_request: I/O error, dev vda, sector 273871
[ 345.391705] end_request: I/O error, dev vda, sector 273871
[ 345.394796] end_request: I/O error, dev vda, sector 1745983
[ 345.396005] end_request: I/O error, dev vda, sector 1745983
[ 346.083160] end_request: I/O error, dev vdb, sector 54528008
[ 346.083179] Buffer I/O error on device dm-0, logical block 6815745
[ 346.083181] lost page write due to I/O error on dm-0
[ 346.083193] end_request: I/O error, dev vdb, sector 54528264
[ 346.083195] Buffer I/O error on device dm-0, logical block 6815777
[ 346.083197] lost page write due to I/O error on dm-0
[ 346.083201] end_request: I/O error, dev vdb, sector 2056
[ 346.083204] Buffer I/O error on device dm-0, logical block 1
[ 346.083206] lost page write due to I/O error on dm-0
[ 346.083209] Buffer I/O error on device dm-0, logical block 2
[ 346.083211] lost page write due to I/O error on dm-0
[ 346.083215] end_request: I/O error, dev vdb, sector 10248
[ 346.083217] Buffer I/O error on device dm-0, logical block 1025
[ 346.083219] lost page write due to I/O error on dm-0
[ 346.091499] end_request: I/O error, dev vdb, sector 76240
[ 346.091506] Buffer I/O error on device dm-0, logical block 9274
[ 346.091508] lost page write due to I/O error on dm-0
[ 346.091572] JBD2: Detected IO errors while flushing file data on dm-0-8
[ 346.091915] end_request: I/O error, dev vdb, sector 38017360
[ 346.091956] Aborting journal on device dm-0-8.
[ 346.092557] end_request: I/O error, dev vdb, sector 38012928
[ 346.092566] Buffer I/O error on device dm-0, logical block 4751360
[ 346.092569] lost page write due to I/O error on dm-0
[ 346.092624] JBD2: I/O error detected when updating journal superblock for dm-0-8.
[ 346.100940] end_request: I/O error, dev vdb, sector 2048
[ 346.100948] Buffer I/O error on device dm-0, logical block 0
[ 346.100952] lost page write due to I/O error on dm-0
[ 346.101027] EXT4-fs error (device dm-0): ext4_journal_start_sb:327: Detected aborted journal
[ 346.101038] EXT4-fs (dm-0): Remounting filesystem read-only
[ 346.101051] EXT4-fs (dm-0): previous I/O error to superblock detected
[ 346.101836] end_request: I/O error, dev vdb, sector 2048
[ 346.101845] Buffer I/O error on device dm-0, logical block 0
[ 346.101849] lost page write due to I/O error on dm-0
[ 373.006680] end_request: I/O error, dev vda, sector 624319
[ 373.007543] end_request: I/O error, dev vda, sector 624319
[ 373.008327] end_request: I/O error, dev vda, sector 624319
[ 374.886674] end_request: I/O error, dev vda, sector 624319
[ 374.887563] end_request: I/O error, dev vda, sector 624319

The hosts are both running Fedora 17 with qemu-kvm-1.0.1-1.fc17.x86_64. The guest machine has been started and migrated using libvirt (0.9.11). Kernel version is 3.5.6-1.fc17.x86_64 on the first host and 3.5.5-2.fc17.x86_64 on the second. The guest machine is on kernel 3.3.8 and uses ext4 on its disks.

The commandline, as generated by libvirtd, looks like this:

    LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
    /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
    -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
    -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
    -mon chardev=charmonitor,id=monitor,mode=control
    -rtc base=utc -no-reboot -no-shutdown
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
    -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
    -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
    -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
    -vnc 127.0.0.1:2,password -k de -vga cirrus
    -incoming tcp:0.0.0.0:49153
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

The second host has an ext4 filesystem mounted under /data, which it exports using NFSv3 over TCP to the first host, which also mounts it under /data.

So far, the problem seems reproducible: When I start another guest machine and do the same thing with it, the same problem happens.

Can anybody help me with this problem?

Guido
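When reading the dmesg excerpt in this message, it can help to see how the vdb sector numbers relate to the dm-0 logical block numbers reported next to them. The sketch below is an inference from the log, not something stated in the thread: it assumes 512-byte sectors, 4096-byte blocks, and that dm-0 begins at vdb sector 2048 (the sector reported together with dm-0 logical block 0).

```python
SECTOR_SIZE = 512         # bytes; end_request reports 512-byte sectors
BLOCK_SIZE = 4096         # bytes; assumed block size of the filesystem on dm-0
DM0_START_SECTOR = 2048   # inferred: vdb sector 2048 appears alongside dm-0 block 0


def vdb_sector_to_dm0_block(sector):
    """Map a vdb sector number to the dm-0 logical block that contains it."""
    sectors_per_block = BLOCK_SIZE // SECTOR_SIZE
    return (sector - DM0_START_SECTOR) // sectors_per_block


# (vdb sector, dm-0 logical block) pairs that appear next to each other in the log.
PAIRS_FROM_LOG = [
    (54528008, 6815745),
    (54528264, 6815777),
    (2056, 1),
    (10248, 1025),
    (76240, 9274),
    (38012928, 4751360),
    (2048, 0),
]

if __name__ == "__main__":
    for sector, block in PAIRS_FROM_LOG:
        print(sector, block, vdb_sector_to_dm0_block(sector) == block)
```

Every pair in the log agrees with this mapping, which suggests the dm-0 errors are the same failed vdb requests seen one layer up rather than a separate problem.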
Re: I/O errors in guest OS after repeated migration
On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
> Hi,
>
> I'm experiencing I/O errors in a guest machine after migrating it from one host to another, and then back to the original host. After doing this, I find the following in the dmesg output of the guest machine:
>
> [ 345.390543] end_request: I/O error, dev vda, sector 273871
> [ 345.391125] end_request: I/O error, dev vda, sector 273871
> [ 345.391705] end_request: I/O error, dev vda, sector 273871
> [ 345.394796] end_request: I/O error, dev vda, sector 1745983
> [ 345.396005] end_request: I/O error, dev vda, sector 1745983
> [ 346.083160] end_request: I/O error, dev vdb, sector 54528008
> [ 346.083179] Buffer I/O error on device dm-0, logical block 6815745
> [ 346.083181] lost page write due to I/O error on dm-0
> [ 346.083193] end_request: I/O error, dev vdb, sector 54528264
> [ 346.083195] Buffer I/O error on device dm-0, logical block 6815777
> [ 346.083197] lost page write due to I/O error on dm-0
> [ 346.083201] end_request: I/O error, dev vdb, sector 2056
> [ 346.083204] Buffer I/O error on device dm-0, logical block 1
> [ 346.083206] lost page write due to I/O error on dm-0
> [ 346.083209] Buffer I/O error on device dm-0, logical block 2
> [ 346.083211] lost page write due to I/O error on dm-0
> [ 346.083215] end_request: I/O error, dev vdb, sector 10248
> [ 346.083217] Buffer I/O error on device dm-0, logical block 1025
> [ 346.083219] lost page write due to I/O error on dm-0
> [ 346.091499] end_request: I/O error, dev vdb, sector 76240
> [ 346.091506] Buffer I/O error on device dm-0, logical block 9274
> [ 346.091508] lost page write due to I/O error on dm-0
> [ 346.091572] JBD2: Detected IO errors while flushing file data on dm-0-8
> [ 346.091915] end_request: I/O error, dev vdb, sector 38017360
> [ 346.091956] Aborting journal on device dm-0-8.
> [ 346.092557] end_request: I/O error, dev vdb, sector 38012928
> [ 346.092566] Buffer I/O error on device dm-0, logical block 4751360
> [ 346.092569] lost page write due to I/O error on dm-0
> [ 346.092624] JBD2: I/O error detected when updating journal superblock for dm-0-8.
> [ 346.100940] end_request: I/O error, dev vdb, sector 2048
> [ 346.100948] Buffer I/O error on device dm-0, logical block 0
> [ 346.100952] lost page write due to I/O error on dm-0
> [ 346.101027] EXT4-fs error (device dm-0): ext4_journal_start_sb:327: Detected aborted journal
> [ 346.101038] EXT4-fs (dm-0): Remounting filesystem read-only
> [ 346.101051] EXT4-fs (dm-0): previous I/O error to superblock detected
> [ 346.101836] end_request: I/O error, dev vdb, sector 2048
> [ 346.101845] Buffer I/O error on device dm-0, logical block 0
> [ 346.101849] lost page write due to I/O error on dm-0
> [ 373.006680] end_request: I/O error, dev vda, sector 624319
> [ 373.007543] end_request: I/O error, dev vda, sector 624319
> [ 373.008327] end_request: I/O error, dev vda, sector 624319
> [ 374.886674] end_request: I/O error, dev vda, sector 624319
> [ 374.887563] end_request: I/O error, dev vda, sector 624319
>
> The hosts are both running Fedora 17 with qemu-kvm-1.0.1-1.fc17.x86_64. The guest machine has been started and migrated using libvirt (0.9.11). Kernel version is 3.5.6-1.fc17.x86_64 on the first host and 3.5.5-2.fc17.x86_64 on the second. The guest machine is on kernel 3.3.8 and uses ext4 on its disks.
>
> The commandline, as generated by libvirtd, looks like this:
>
>     LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none
>     /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
>     -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
>     -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
>     -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
>     -mon chardev=charmonitor,id=monitor,mode=control
>     -rtc base=utc -no-reboot -no-shutdown
>     -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>     -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>     -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>     -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
>     -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
>     -vnc 127.0.0.1:2,password -k de -vga cirrus
>     -incoming tcp:0.0.0.0:49153
>     -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have you tried other formats or different qemu/kvm versions?

> The second host has an ext4 filesystem mounted under /data, which it exports using NFSv3 over TCP to the first host, which also mounts it under /data.
>
> So far, the problem seems reproducible: When I start another guest machine and do the same thing with