Re: I/O errors in guest OS after repeated migration

2012-11-08 Thread Stefan Hajnoczi
On Tue, Nov 6, 2012 at 12:07 PM, Guido Winkelmann
guido-k...@thisisnotatest.de wrote:
 On Monday, 29 October 2012, 12:29:01, Stefan Hajnoczi wrote:
 On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann
 guido-k...@thisisnotatest.de wrote:
  On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
  On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
   On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
  
   What about newer versions of qemu/kvm? But of course if those work,
   your
   next task is going to be git bisect it or file a bug with your distro
   that
   is using an ancient version of qemu/kvm.
  
   I've just upgraded both hosts to qemu-kvm 1.2.0
   (qemu-1.2.0-14.fc17.x86_64,
   built from spec files under
   http://pkgs.fedoraproject.org/cgit/qemu.git/).
  
   The bug is still there.
 
  If you let the guest go idle (no I/O), then migrate it, then restart the
  I/O, do the errors show?
 
  Just tested - yes, they do.

 The -EIO error does not really reveal why there is a problem.  You can
 use SystemTap probes in QEMU to find out more about the nature of the
 error.

 # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*,
 qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU

 This does not work for me. When I try running this, I'm getting many pages of
 errors like this:

 ==
 # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* {
 printf("%s(%s)\n", probefunc(), $$parms) }' -x 1623
 parse error: expected statement
 saw: keyword at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:3
  source:   function = $arg3;
^
 parse error: expected identifier
 saw: operator '=' at /usr/share/systemtap/tapset/qemu-
 alpha.stp:1455:12
  source:   function = $arg3;
 ^
 2 parse errors.
 ==

 Unfortunately, I don't know the first thing about systemtap, so I don't really
 know what's happening here...

The QEMU .stp files on your system use a SystemTap keyword ("function")
as a variable name.  SystemTap rejects this with a parse error.

Fixes have been merged in upstream qemu.git, so this will not happen
again in the future.

For now, you could edit the QEMU .stp files on disk to replace
"function" with "function_".
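
The rename suggested above could be scripted roughly as follows. This is only
a sketch: it assumes the clashing variable appears solely as an assignment
target of the form "function = ...", as in the parse error quoted earlier, and
the sample tapset text is hypothetical. Back up the .stp files before
rewriting them.

```python
import re

# Match "function" only when it is the target of an assignment at the start
# of a line, e.g. "  function = $arg3;". A real SystemTap function definition
# ("function foo() { ... }") has no "=" directly after the keyword and is
# therefore left untouched. (Assumption: no other usage form occurs.)
ASSIGNMENT = re.compile(r"^([ \t]*)function([ \t]*=)", re.MULTILINE)


def rename_function_variable(stp_source: str) -> str:
    """Return tapset text with 'function = ...' rewritten to 'function_ = ...'."""
    return ASSIGNMENT.sub(r"\1function_\2", stp_source)


# Hypothetical excerpt shaped like the line reported by stap.
sample = "{\n  function = $arg3;\n}\n"
print(rename_function_variable(sample))
```

To apply it, one would read each affected file (for example the
/usr/share/systemtap/tapset/qemu-alpha.stp named in the error), pass its
contents through rename_function_variable(), and write the result back.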

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: I/O errors in guest OS after repeated migration

2012-11-06 Thread Guido Winkelmann
On Monday, 29 October 2012, 12:29:01, Stefan Hajnoczi wrote:
 On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann
 guido-k...@thisisnotatest.de wrote:
  On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
  On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
   On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
   
   What about newer versions of qemu/kvm? But of course if those work,
   your
   next task is going to be git bisect it or file a bug with your distro
   that
   is using an ancient version of qemu/kvm.
   
   I've just upgraded both hosts to qemu-kvm 1.2.0
   (qemu-1.2.0-14.fc17.x86_64,
   built from spec files under
   http://pkgs.fedoraproject.org/cgit/qemu.git/).
   
   The bug is still there.
  
  If you let the guest go idle (no I/O), then migrate it, then restart the
  I/O, do the errors show?
  
  Just tested - yes, they do.
 
 The -EIO error does not really reveal why there is a problem.  You can
 use SystemTap probes in QEMU to find out more about the nature of the
 error.
 
 # stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*,
 qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU

This does not work for me. When I try running this, I'm getting many pages of 
errors like this:

==
# stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*, qemu.kvm.paio_* { 
printf("%s(%s)\n", probefunc(), $$parms) }' -x 1623
parse error: expected statement
saw: keyword at /usr/share/systemtap/tapset/qemu-alpha.stp:1455:3
 source:   function = $arg3;
   ^
parse error: expected identifier
saw: operator '=' at /usr/share/systemtap/tapset/qemu-
alpha.stp:1455:12
 source:   function = $arg3;
^
2 parse errors.
==

Unfortunately, I don't know the first thing about systemtap, so I don't really 
know what's happening here...

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-29 Thread Stefan Hajnoczi
On Fri, Oct 19, 2012 at 2:55 PM, Guido Winkelmann
guido-k...@thisisnotatest.de wrote:
 On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
 On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
  On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
  On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
   vda1, logical block 1858771
   Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on
   device
   vda1, logical block 1858772
   Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on
   device
   vda1, logical block 1858773
   Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on
   device
   vda1, logical block 1858774
   Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on
   device
   vda1, logical block 1858775
   Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on
   device
   vda1, logical block 1858776
   Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on
   device
   vda1, logical block 1858777
   Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on
   device
   vda1, logical block 1858778
   Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on
   device
   vda1, logical block 1858779
  
   (I was writing a large file at the time, to make sure I actually catch
   I/O
   errors as they happen)
 
  What about newer versions of qemu/kvm? But of course if those work, your
  next task is going to be git bisect it or file a bug with your distro
  that
  is using an ancient version of qemu/kvm.
 
  I've just upgraded both hosts to qemu-kvm 1.2.0
  (qemu-1.2.0-14.fc17.x86_64,
  built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/).
 
  The bug is still there.

 If you let the guest go idle (no I/O), then migrate it, then restart the
 I/O, do the errors show?

 Just tested - yes, they do.

The -EIO error does not really reveal why there is a problem.  You can
use SystemTap probes in QEMU to find out more about the nature of the
error.

# stap -e 'probe qemu.kvm.bdrv_*, qemu.kvm.virtio_blk_*,
qemu.kvm.paio_* { printf("%s(%s)\n", probefunc(), $$parms) }' -x $PID_OF_QEMU

Output looks like this:

bdrv_co_readv($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1)
bdrv_co_io_em($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1 $arg4=0x0
$arg5=0x7fb239da6f60)
virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=0x0)
virtio_blk_req_complete($arg1=0x7fb23982ed10 $arg2=0x0)

virtio_blk_rw_complete with $arg2=-5 means -EIO, so look for that.
This will reveal what is happening when the error occurs.
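
As a rough aid for scanning a long trace, the check described above could be
automated along these lines. This is a sketch under assumptions: the line
format is taken from the sample output, the failing line in the example is
made up for illustration, and the signed 64-bit conversion is only there in
case the value is printed as unsigned hex rather than as -5.

```python
import errno
import re

# Assumed line shape, based on the sample output: name($arg1=... $arg2=...)
LINE = re.compile(r"^(\w+)\((.*)\)$")


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as signed; pass other values through."""
    return value - (1 << 64) if value >= (1 << 63) else value


def failed_completions(lines):
    """Yield (probe, errno_name) for virtio_blk_rw_complete lines with a negative $arg2."""
    for line in lines:
        match = LINE.match(line.strip())
        if not match or match.group(1) != "virtio_blk_rw_complete":
            continue
        args = dict(pair.split("=", 1) for pair in match.group(2).split())
        raw = args.get("$arg2")
        if raw is None:
            continue
        ret = to_signed64(int(raw, 0))  # base 0 accepts both "-5" and "0x..."
        if ret < 0:
            yield match.group(1), errno.errorcode.get(-ret, str(-ret))


trace = [
    "bdrv_co_readv($arg1=0x7fb2397cc580 $arg2=0x80c $arg3=0x1)",
    "virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=0x0)",
    "virtio_blk_rw_complete($arg1=0x7fb23982ed10 $arg2=-5)",  # hypothetical failure
]
for probe, name in failed_completions(trace):
    print(probe, name)
```

The probe lines just before a reported failure are the interesting ones, so in
practice one would also keep a few lines of preceding context.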

Stefan


Re: I/O errors in guest OS after repeated migration

2012-10-19 Thread Guido Winkelmann
On Thursday, 18 October 2012, 18:05:39, Avi Kivity wrote:
 On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
  On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
  On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
   vda1, logical block 1858771
   Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on
   device
   vda1, logical block 1858772
   Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on
   device
   vda1, logical block 1858773
   Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on
   device
   vda1, logical block 1858774
   Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on
   device
   vda1, logical block 1858775
   Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on
   device
   vda1, logical block 1858776
   Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on
   device
   vda1, logical block 1858777
   Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on
   device
   vda1, logical block 1858778
   Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on
   device
   vda1, logical block 1858779
   
   (I was writing a large file at the time, to make sure I actually catch
   I/O
   errors as they happen)
  
  What about newer versions of qemu/kvm? But of course if those work, your
  next task is going to be git bisect it or file a bug with your distro
  that
  is using an ancient version of qemu/kvm.
  
  I've just upgraded both hosts to qemu-kvm 1.2.0
  (qemu-1.2.0-14.fc17.x86_64,
  built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/).
  
  The bug is still there.
 
 If you let the guest go idle (no I/O), then migrate it, then restart the
 I/O, do the errors show?

Just tested - yes, they do.

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-18 Thread Guido Winkelmann
On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
 On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
  vda1, logical block 1858771
  Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on
  device
  vda1, logical block 1858772
  Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on
  device
  vda1, logical block 1858773
  Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on
  device
  vda1, logical block 1858774
  Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on
  device
  vda1, logical block 1858775
  Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on
  device
  vda1, logical block 1858776
  Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on
  device
  vda1, logical block 1858777
  Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on
  device
  vda1, logical block 1858778
  Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on
  device
  vda1, logical block 1858779
  
  (I was writing a large file at the time, to make sure I actually catch I/O
  errors as they happen)
 
 What about newer versions of qemu/kvm? But of course if those work, your
 next task is going to be git bisect it or file a bug with your distro that
 is using an ancient version of qemu/kvm.

I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, 
built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/).

The bug is still there.

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-18 Thread Avi Kivity
On 10/18/2012 05:50 PM, Guido Winkelmann wrote:
 On Wednesday, 17 October 2012, 13:25:45, Brian Jackson wrote:
 On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
  vda1, logical block 1858771
  Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on
  device
  vda1, logical block 1858772
  Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on
  device
  vda1, logical block 1858773
  Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on
  device
  vda1, logical block 1858774
  Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on
  device
  vda1, logical block 1858775
  Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on
  device
  vda1, logical block 1858776
  Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on
  device
  vda1, logical block 1858777
  Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on
  device
  vda1, logical block 1858778
  Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on
  device
  vda1, logical block 1858779
  
  (I was writing a large file at the time, to make sure I actually catch I/O
  errors as they happen)
 
 What about newer versions of qemu/kvm? But of course if those work, your
 next task is going to be git bisect it or file a bug with your distro that
 is using an ancient version of qemu/kvm.
 
 I've just upgraded both hosts to qemu-kvm 1.2.0 (qemu-1.2.0-14.fc17.x86_64, 
 built from spec files under http://pkgs.fedoraproject.org/cgit/qemu.git/).
 
 The bug is still there.
 

If you let the guest go idle (no I/O), then migrate it, then restart the
I/O, do the errors show?


-- 
error compiling committee.c: too many arguments to function


Re: I/O errors in guest OS after repeated migration

2012-10-17 Thread Guido Winkelmann
On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
 On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
[...]
  The commandline, as generated by libvirtd, looks like this:
  
  LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
  QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
  -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
  -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
  -no-reboot -no-shutdown
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
  -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
  -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
  -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
  -vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
 
 I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have
 you tried other formats or different qemu/kvm versions?

Are you sure about that? Because I'm fairly certain I have been using live 
migration since at least 0.14, if not 0.13, and I have always been using qcow2 
as the image format for the disks...

I can still try with other image formats, though.

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-17 Thread Guido Winkelmann
On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
 On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
  The commandline, as generated by libvirtd, looks like this:
  
  LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
  QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
  -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
  -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
  -no-reboot -no-shutdown
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
  -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
  -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
  -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
  -vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
 
 I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have
 you tried other formats or different qemu/kvm versions?

I tried the same thing with a raw image file instead of qcow2, and the problem 
still happens. From the /var/log/messages of the guest:

Oct 17 17:10:34 localhost sshd[2368]: nss_ldap: could not search LDAP server - 
Server is unavailable
Oct 17 17:10:39 localhost kernel: [  126.800075] eth0: no IPv6 routers present
Oct 17 17:10:52 localhost kernel: [  140.335783] Clocksource tsc unstable 
(delta = -70265501 ns)
Oct 17 17:12:04 localhost /O error on device vda1, logical block 1858765
Oct 17 17:12:04 localhost kernel: [  212.070584] Buffer I/O error on device 
vda1, logical block 1858766
Oct 17 17:12:04 localhost kernel: [  212.070587] Buffer I/O error on device 
vda1, logical block 1858767
Oct 17 17:12:04 localhost kernel: [  212.070589] Buffer I/O error on device 
vda1, logical block 1858768
Oct 17 17:12:04 localhost kernel: [  212.070592] Buffer I/O error on device 
vda1, logical block 1858769
Oct 17 17:12:04 localhost kernel: [  212.070595] Buffer I/O error on device 
vda1, logical block 1858770
Oct 17 17:12:04 localhost kernel: [  212.070597] Buffer I/O error on device 
vda1, logical block 1858771
Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on device 
vda1, logical block 1858772
Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on device 
vda1, logical block 1858773
Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on device 
vda1, logical block 1858774
Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on device 
vda1, logical block 1858775
Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on device 
vda1, logical block 1858776
Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on device 
vda1, logical block 1858777
Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on device 
vda1, logical block 1858778
Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on device 
vda1, logical block 1858779

(I was writing a large file at the time, to make sure I actually catch I/O 
errors as they happen)

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-17 Thread Brian Jackson
On Wednesday, October 17, 2012 06:54:00 AM Guido Winkelmann wrote:
 On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
  On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
 [...]
 
   The commandline, as generated by libvirtd, looks like this:
   
   LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
   QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
   -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
   -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
   -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
   -no-reboot -no-shutdown
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
   -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
   -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
   -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
   -vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
  
  I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0.
  Have you tried other formats or different qemu/kvm versions?
 
 Are you sure about that? Because I'm fairly certain I have been using live
 migration since at least 0.14, if not 0.13, and I have always been using
 qcow2 as the image format for the disks...
 
 I can still try with other image formats, though.


Yes, see the release notes for 1.0. It may have worked by chance before that, 
but it wasn't guaranteed to work. There was no blacklisting feature then like 
there is now to stop it.


 
   Guido


Re: I/O errors in guest OS after repeated migration

2012-10-17 Thread Brian Jackson
On Wednesday, October 17, 2012 10:45:14 AM Guido Winkelmann wrote:
 On Tuesday, 16 October 2012, 12:44:27, Brian Jackson wrote:
  On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
   The commandline, as generated by libvirtd, looks like this:
   
   LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
   QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
   -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
   -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
   -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
   -no-reboot -no-shutdown
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
   -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
   -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
   -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
   -vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
  
  I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0.
  Have you tried other formats or different qemu/kvm versions?
 
 I tried the same thing with a raw image file instead of qcow2, and the
 problem still happens. From the /var/log/messages of the guest:
 
 Oct 17 17:10:34 localhost sshd[2368]: nss_ldap: could not search LDAP
 server - Server is unavailable
 Oct 17 17:10:39 localhost kernel: [  126.800075] eth0: no IPv6 routers
 present
 Oct 17 17:10:52 localhost kernel: [  140.335783] Clocksource tsc
 unstable (delta = -70265501 ns)
 Oct 17 17:12:04 localhost /O error on device vda1, logical block 1858765
 Oct 17 17:12:04 localhost kernel: [  212.070584] Buffer I/O error on device
 vda1, logical block 1858766
 Oct 17 17:12:04 localhost kernel: [  212.070587] Buffer I/O error on device
 vda1, logical block 1858767
 Oct 17 17:12:04 localhost kernel: [  212.070589] Buffer I/O error on device
 vda1, logical block 1858768
 Oct 17 17:12:04 localhost kernel: [  212.070592] Buffer I/O error on device
 vda1, logical block 1858769
 Oct 17 17:12:04 localhost kernel: [  212.070595] Buffer I/O error on device
 vda1, logical block 1858770
 Oct 17 17:12:04 localhost kernel: [  212.070597] Buffer I/O error on device
 vda1, logical block 1858771
 Oct 17 17:12:04 localhost kernel: [  212.070600] Buffer I/O error on device
 vda1, logical block 1858772
 Oct 17 17:12:04 localhost kernel: [  212.070602] Buffer I/O error on device
 vda1, logical block 1858773
 Oct 17 17:12:04 localhost kernel: [  212.070605] Buffer I/O error on device
 vda1, logical block 1858774
 Oct 17 17:12:04 localhost kernel: [  212.070607] Buffer I/O error on device
 vda1, logical block 1858775
 Oct 17 17:12:04 localhost kernel: [  212.070610] Buffer I/O error on device
 vda1, logical block 1858776
 Oct 17 17:12:04 localhost kernel: [  212.070612] Buffer I/O error on device
 vda1, logical block 1858777
 Oct 17 17:12:04 localhost kernel: [  212.070615] Buffer I/O error on device
 vda1, logical block 1858778
 Oct 17 17:12:04 localhost kernel: [  212.070617] Buffer I/O error on device
 vda1, logical block 1858779
 
 (I was writing a large file at the time, to make sure I actually catch I/O
 errors as they happen)


What about newer versions of qemu/kvm? Of course, if those work, your next 
task is going to be to git bisect it, or to file a bug with your distro, which 
is using an ancient version of qemu/kvm.


 
   Guido


I/O errors in guest OS after repeated migration

2012-10-16 Thread Guido Winkelmann
Hi,

I'm experiencing I/O errors in a guest machine after migrating it from one 
host to another, and then back to the original host. After doing this, I find 
the following in the dmesg output of the guest machine:

[  345.390543] end_request: I/O error, dev vda, sector 273871
[  345.391125] end_request: I/O error, dev vda, sector 273871
[  345.391705] end_request: I/O error, dev vda, sector 273871
[  345.394796] end_request: I/O error, dev vda, sector 1745983
[  345.396005] end_request: I/O error, dev vda, sector 1745983
[  346.083160] end_request: I/O error, dev vdb, sector 54528008
[  346.083179] Buffer I/O error on device dm-0, logical block 6815745
[  346.083181] lost page write due to I/O error on dm-0
[  346.083193] end_request: I/O error, dev vdb, sector 54528264
[  346.083195] Buffer I/O error on device dm-0, logical block 6815777
[  346.083197] lost page write due to I/O error on dm-0
[  346.083201] end_request: I/O error, dev vdb, sector 2056
[  346.083204] Buffer I/O error on device dm-0, logical block 1
[  346.083206] lost page write due to I/O error on dm-0
[  346.083209] Buffer I/O error on device dm-0, logical block 2
[  346.083211] lost page write due to I/O error on dm-0
[  346.083215] end_request: I/O error, dev vdb, sector 10248
[  346.083217] Buffer I/O error on device dm-0, logical block 1025
[  346.083219] lost page write due to I/O error on dm-0
[  346.091499] end_request: I/O error, dev vdb, sector 76240
[  346.091506] Buffer I/O error on device dm-0, logical block 9274
[  346.091508] lost page write due to I/O error on dm-0
[  346.091572] JBD2: Detected IO errors while flushing file data on dm-0-8
[  346.091915] end_request: I/O error, dev vdb, sector 38017360
[  346.091956] Aborting journal on device dm-0-8.
[  346.092557] end_request: I/O error, dev vdb, sector 38012928
[  346.092566] Buffer I/O error on device dm-0, logical block 4751360
[  346.092569] lost page write due to I/O error on dm-0
[  346.092624] JBD2: I/O error detected when updating journal superblock for 
dm-0-8.
[  346.100940] end_request: I/O error, dev vdb, sector 2048
[  346.100948] Buffer I/O error on device dm-0, logical block 0
[  346.100952] lost page write due to I/O error on dm-0
[  346.101027] EXT4-fs error (device dm-0): ext4_journal_start_sb:327: 
Detected aborted journal
[  346.101038] EXT4-fs (dm-0): Remounting filesystem read-only
[  346.101051] EXT4-fs (dm-0): previous I/O error to superblock detected
[  346.101836] end_request: I/O error, dev vdb, sector 2048
[  346.101845] Buffer I/O error on device dm-0, logical block 0
[  346.101849] lost page write due to I/O error on dm-0
[  373.006680] end_request: I/O error, dev vda, sector 624319
[  373.007543] end_request: I/O error, dev vda, sector 624319
[  373.008327] end_request: I/O error, dev vda, sector 624319
[  374.886674] end_request: I/O error, dev vda, sector 624319
[  374.887563] end_request: I/O error, dev vda, sector 624319

The hosts are both running Fedora 17 with qemu-kvm-1.0.1-1.fc17.x86_64. The 
guest machine has been started and migrated using libvirt (0.9.11). Kernel 
version is 3.5.6-1.fc17.x86_64 on the first host and 3.5.5-2.fc17.x86_64 on 
the second.
The guest machine is on Kernel 3.3.8 and uses ext4 on its disks.

The commandline, as generated by libvirtd, looks like this:

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
-smp 1,sockets=1,cores=1,threads=1 -name migratetest2
-uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-reboot -no-shutdown
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
-netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
-vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
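
For anyone comparing setups, the disk-related settings can be pulled out of
the two -drive arguments above with a few lines of code. This is a minimal
sketch: the helper name is made up for illustration, and it does not handle
QEMU's ",," escape for literal commas in values.

```python
def parse_drive_option(value: str) -> dict:
    """Split a QEMU -drive argument into key/value pairs (no ',,' escape handling)."""
    return dict(item.split("=", 1) for item in value.split(","))


# The two -drive arguments from the command line, with wrapped tokens rejoined.
drives = [
    "file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none",
    "file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none",
]

for drive in drives:
    opts = parse_drive_option(drive)
    print(opts["id"], opts["file"], opts["format"], opts["cache"])
```

Both disks come out as qcow2 images under /data with cache=none.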

The second host has an ext4 filesystem mounted under /data, which it exports 
using NFSv3 over TCP to the first host, which also mounts it under /data.

So far, the problem seems reproducible: When I start another guest machine and 
do the same thing with it, the same problem happens.

Can anybody help me with this problem?

Guido


Re: I/O errors in guest OS after repeated migration

2012-10-16 Thread Brian Jackson
On Tuesday, October 16, 2012 11:33:44 AM Guido Winkelmann wrote:
 Hi,
 
 I'm experiencing I/O errors in a guest machine after migrating it from one
 host to another, and then back to the original host. After doing this, I
 find the following in the dmesg output of the guest machine:
 
 [  345.390543] end_request: I/O error, dev vda, sector 273871
 [  345.391125] end_request: I/O error, dev vda, sector 273871
 [  345.391705] end_request: I/O error, dev vda, sector 273871
 [  345.394796] end_request: I/O error, dev vda, sector 1745983
 [  345.396005] end_request: I/O error, dev vda, sector 1745983
 [  346.083160] end_request: I/O error, dev vdb, sector 54528008
 [  346.083179] Buffer I/O error on device dm-0, logical block 6815745
 [  346.083181] lost page write due to I/O error on dm-0
 [  346.083193] end_request: I/O error, dev vdb, sector 54528264
 [  346.083195] Buffer I/O error on device dm-0, logical block 6815777
 [  346.083197] lost page write due to I/O error on dm-0
 [  346.083201] end_request: I/O error, dev vdb, sector 2056
 [  346.083204] Buffer I/O error on device dm-0, logical block 1
 [  346.083206] lost page write due to I/O error on dm-0
 [  346.083209] Buffer I/O error on device dm-0, logical block 2
 [  346.083211] lost page write due to I/O error on dm-0
 [  346.083215] end_request: I/O error, dev vdb, sector 10248
 [  346.083217] Buffer I/O error on device dm-0, logical block 1025
 [  346.083219] lost page write due to I/O error on dm-0
 [  346.091499] end_request: I/O error, dev vdb, sector 76240
 [  346.091506] Buffer I/O error on device dm-0, logical block 9274
 [  346.091508] lost page write due to I/O error on dm-0
 [  346.091572] JBD2: Detected IO errors while flushing file data on dm-0-8
 [  346.091915] end_request: I/O error, dev vdb, sector 38017360
 [  346.091956] Aborting journal on device dm-0-8.
 [  346.092557] end_request: I/O error, dev vdb, sector 38012928
 [  346.092566] Buffer I/O error on device dm-0, logical block 4751360
 [  346.092569] lost page write due to I/O error on dm-0
 [  346.092624] JBD2: I/O error detected when updating journal superblock for dm-0-8.
 [  346.100940] end_request: I/O error, dev vdb, sector 2048
 [  346.100948] Buffer I/O error on device dm-0, logical block 0
 [  346.100952] lost page write due to I/O error on dm-0
 [  346.101027] EXT4-fs error (device dm-0): ext4_journal_start_sb:327: Detected aborted journal
 [  346.101038] EXT4-fs (dm-0): Remounting filesystem read-only
 [  346.101051] EXT4-fs (dm-0): previous I/O error to superblock detected
 [  346.101836] end_request: I/O error, dev vdb, sector 2048
 [  346.101845] Buffer I/O error on device dm-0, logical block 0
 [  346.101849] lost page write due to I/O error on dm-0
 [  373.006680] end_request: I/O error, dev vda, sector 624319
 [  373.007543] end_request: I/O error, dev vda, sector 624319
 [  373.008327] end_request: I/O error, dev vda, sector 624319
 [  374.886674] end_request: I/O error, dev vda, sector 624319
 [  374.887563] end_request: I/O error, dev vda, sector 624319
 
 The hosts are both running Fedora 17 with qemu-kvm-1.0.1-1.fc17.x86_64. The
 guest machine was started and migrated using libvirt (0.9.11). The host kernel
 version is 3.5.6-1.fc17.x86_64 on the first host and 3.5.5-2.fc17.x86_64 on
 the second. The guest machine runs kernel 3.3.8 and uses ext4 on its disks.
 
 The commandline, as generated by libvirtd, looks like this:
 
 LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
 QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.15 -enable-kvm -m 1024
 -smp 1,sockets=1,cores=1,threads=1 -name migratetest2
 -uuid ddbf11e9-387e-902b-4849-8c3067dc42a2 -nodefconfig -nodefaults
 -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/migratetest2.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
 -no-reboot -no-shutdown
 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
 -drive file=/data/migratetest2_system,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -drive file=/data/migratetest2_data-1,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28
 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:00:00:00:0c,bus=pci.0,addr=0x3
 -vnc 127.0.0.1:2,password -k de -vga cirrus -incoming tcp:0.0.0.0:49153
 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6


I see qcow2 in there. Live migration of qcow2 was a new feature in 1.0. Have 
you tried other formats or different qemu/kvm versions?
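
When comparing runs across formats or versions, it may help to tally the
guest's errors per device rather than eyeballing dmesg. The following is only
a rough sketch, assuming the guest's dmesg output has been saved as text; the
function name summarize_io_errors is made up for illustration, and the sample
lines are copied from the log quoted above.

```python
import re
from collections import defaultdict

# Matches guest kernel lines such as:
# [  345.390543] end_request: I/O error, dev vda, sector 273871
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize_io_errors(dmesg_text):
    """Return {device: sorted list of distinct failing sectors}.

    Hypothetical helper for illustration; not part of qemu or libvirt.
    """
    sectors = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            sectors[match.group(1)].add(int(match.group(2)))
    return {dev: sorted(found) for dev, found in sectors.items()}


# Sample taken from the dmesg output in the original report.
sample = """\
[  345.390543] end_request: I/O error, dev vda, sector 273871
[  345.391125] end_request: I/O error, dev vda, sector 273871
[  345.394796] end_request: I/O error, dev vda, sector 1745983
[  346.083160] end_request: I/O error, dev vdb, sector 54528008
[  346.083179] Buffer I/O error on device dm-0, logical block 6815745
"""

summary = summarize_io_errors(sample)
for dev, found in sorted(summary.items()):
    print(dev, found)
```

Running the same summary after each test (qcow2 vs. another format, old vs.
new qemu-kvm) would make it easier to see whether the set of affected devices
changes.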


 
 The second host has an ext4 filesystem mounted under /data, which it
 exports using NFSv3 over TCP to the first host, which also mounts it under
 /data.
 
 So far, the problem seems reproducible: When I start another guest machine
 and do the same thing with