Bug#627615: VM terminates when doing a live migration

2011-05-26 Thread Michael Tokarev
25.05.2011 22:42, Daniel Bareiro wrote:

> I tested the new version of qemu-kvm (0.12.5+dfsg-5+squeeze2) available
> on Squeeze, which was announced in DSA 2241-1, but the problem still
> persists.

Sure it persists because DSA 2241-1 has absolutely nothing to
do with the problem.  It was an unrelated security update.
You may guess that by the fact that this bug has not been
closed by the update.  I told you where the fix is.

/mjt



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#627615: VM terminates when doing a live migration

2011-05-26 Thread Daniel Bareiro
On Thursday, 26 May 2011 12:22:36 +0400,
Michael Tokarev wrote:

> > I tested the new version of qemu-kvm (0.12.5+dfsg-5+squeeze2)
> > available on Squeeze, which was announced in DSA 2241-1, but the
> > problem still persists.
>
> Sure it persists because DSA 2241-1 has absolutely nothing to
> do with the problem.  It was an unrelated security update.
> You may guess that by the fact that this bug has not been
> closed by the update.

Yes, I know it has nothing to do with the problem. But as you told me
that both problems will be fixed in the upcoming Squeeze version of
qemu-kvm, I wanted to try this version to see if there was any
difference.

I found it odd that nobody has answered your report in #625571.

> I told you where the fix is.

Great! Thanks!

Regards,
Daniel
-- 
Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
17:03:20 up 5 days,  3:22, 10 users,  load average: 0.07, 0.04, 0.00




Bug#627615: VM terminates when doing a live migration

2011-05-25 Thread Daniel Bareiro
On Monday, 23 May 2011 15:45:20 +0400,
Michael Tokarev wrote:

> This bug has 2 halves, one half is general 32bit migration issue
> (it does not actually work on 32bits, the fact it worked for you
> is pure luck), and second half is special case of 32bit userspace
> running on 64bit kernel (due to wrong kernel/user space communications).
> Both will be fixed in the upcoming squeeze version of qemu-kvm.

I tested the new version of qemu-kvm (0.12.5+dfsg-5+squeeze2) available
on Squeeze, which was announced in DSA 2241-1, but the problem still
persists.

Regards,
Daniel
-- 
Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
15:34:52 up 4 days,  1:54,  9 users,  load average: 0.07, 0.10, 0.04




Bug#627615: VM terminates when doing a live migration

2011-05-24 Thread Daniel Bareiro
On Monday, 23 May 2011 15:45:20 +0400,
Michael Tokarev wrote:

> > This is the output without daemonizing:
> >
> >
> > *** glibc detected *** kvm: free(): invalid next size (fast): 0x09fad3c0 ***
> [...]
> > Can we confirm that it is the same problem? If you need to do another
> > test, please don't hesitate to ask me.
>
> Yes it's exactly this problem, you can check the other bug I
> mentioned - it shows this very memory corruption too.
> That's why I merged the two.

I have not even tested migration between two 64-bit VM hosts. Is there
any problem in that scenario?

> > As I noted earlier, when migrating from Defiant (Debian GNU/Linux
> > 5.0.8 with Linux 2.6.32-15~bpo50+1 and qemu-kvm
> > 0.12.5+dfsg-3~bpo50+2) to SS01, this problem does not occur. Both
> > installations are 32-bit, but the kernel in SS01 is amd64 and the
> > kernel in Defiant is i686.
> >
> > I.e. both have a 32-bit userspace, with the difference that ss01 has
> > a 64-bit kernel. Is that where the problem lies? The versions of
> > Linux and qemu-kvm otherwise look the same.
>
> This bug has 2 halves, one half is general 32bit migration issue
> (it does not actually work on 32bits, the fact it worked for you
> is pure luck), and second half is special case of 32bit userspace
> running on 64bit kernel (due to wrong kernel/user space communications).
> Both will be fixed in the upcoming squeeze version of qemu-kvm.

From what I read in #625571, the problem was detected in
0.12.5+dfsg-5+squeeze1 (the same version as on ss01) and 0.12.0+dfsg-5
(Lenny backports?), but on the VM host where the problem did not appear
I am using 0.12.5+dfsg-3~bpo50+2 (Lenny backports). Perhaps
0.12.5+dfsg-3~bpo50+2 is not affected.

> You can try patching and rebuilding your qemu-kvm using patches in
> the package awaiting upload.  Unfortunately right now the anonscm.debian.org
> service (with http access to the git repository) does not work due to
> system maintenance.  The fix is in git://git.debian.org/collab-maint/qemu-kvm.git;
> I extracted the patch to my site here:
>  http://www.corpit.ru/mjt/tmp/fix-crash-in-migration-32-bit-51b0c6065a.diff
> This patch fixes both halves of the problem.

Perfect! Thanks!


Thanks for your reply.

Regards,
Daniel
-- 
Daniel Bareiro - GNU/Linux registered user #188.598
Proudly running Debian GNU/Linux with uptime:
17:51:22 up 3 days,  4:10, 10 users,  load average: 0.00, 0.00, 0.00




Bug#627615: VM terminates when doing a live migration

2011-05-23 Thread Daniel Bareiro
On Monday, 23 May 2011 01:54:01 +0400,
Michael Tokarev wrote:

> forcemerge 625571 627615
> thanks
>
> 22.05.2011 22:11, Daniel Bareiro wrote:
>
> > Package: qemu-kvm
> > Version: 0.12.5+dfsg-5+squeeze1
> > Severity: important
>
> > model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
>
> > -- System Information:
> > Debian Release: 6.0.1
> >   APT prefers stable
> >   APT policy: (500, 'stable')
> > Architecture: i386 (x86_64)
> >
> > Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
>
> > ss01:~# kvm -m 256 -boot d -net nic,vlan=0,macaddr=52:54:67:92:9d:63 \
> >  -net tap -daemonize -vnc :15 -k es -localtime -cdrom \
> >  /mnt/systemrescuecd-x86-2.0.1.iso -monitor telnet:localhost:4055,server,nowait
> >
> >
> > Destination:
> >
> > defiant:~# kvm -m 256 -boot d -net nic,vlan=0,macaddr=52:54:67:92:9d:63 -net tap \
> >  -daemonize -vnc :1 -k es -localtime -cdrom /mnt/systemrescuecd-x86-2.0.1.iso -monitor \
> >  telnet:localhost:4041,server,nowait -incoming tcp:0:4455
> >
> > Migration:
> >
> > ss01:~# telnet localhost 4055
> > Trying ::1...
> > Connected to localhost.
> > Escape character is '^]'.
> > QEMU 0.12.5 monitor - type 'help' for more information
> > (qemu) migrate -d tcp:10.1.0.65:4455
> > (qemu) Connection closed by foreign host.
> >
> > ss01:~# ps ax|grep systemrescuecd
> > 15640 pts/0    R+     0:00 grep systemrescuecd
>
> When debugging don't enable daemonizing, instead run it in foreground
> to see what messages, if any, it prints.
>
> But this is, with a very good chance, #625571 - migration
> fails on 32bit userspace always.  That bug is finally fixed,
> after more than 2 years, and is pending upload after we will
> sort out other, more important issues.

This is the output without daemonizing:


*** glibc detected *** kvm: free(): invalid next size (fast): 0x09fad3c0 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6(+0x6b281)[0xf723e281]
/lib/i686/cmov/libc.so.6(+0x6cad8)[0xf723fad8]
/lib/i686/cmov/libc.so.6(cfree+0x6d)[0xf7242bbd]
kvm[0x806f6e7]
kvm[0x806f7d3]
kvm[0x8051c85]
kvm[0x8051e1b]
kvm[0x810d3c7]
kvm[0x8104fe9]
kvm[0x8105e06]
kvm[0x80529b0]
kvm[0x806de64]
kvm[0x8055a95]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf71e9c76]
kvm[0x804f3a1]
======= Memory map: ========
08048000-0823e000 r-xp 00000000 09:01 426838 /usr/bin/kvm
0823e000-08250000 rw-p 001f6000 09:01 426838 /usr/bin/kvm
08250000-08460000 rw-p 00000000 00:00 0
09f60000-09f89000 rw-p 00000000 00:00 0
09f89000-09f91000 rw-p 00000000 00:00 0
09f91000-0a081000 rw-p 00000000 00:00 0
0a081000-0a091000 rw-p 00000000 00:00 0
0a091000-0a160000 rw-p 00000000 00:00 0
e4400000-e4421000 rw-p 00000000 00:00 0
e4421000-e4500000 ---p 00000000 00:00 0
e4586000-e45a3000 r-xp 00000000 09:01 16315 /lib/libgcc_s.so.1
e45a3000-e45a4000 rw-p 0001c000 09:01 16315 /lib/libgcc_s.so.1
e45a4000-e45a5000 ---p 00000000 00:00 0
e45a5000-e4da5000 rwxp 00000000 00:00 0
e4da5000-e4e06000 rw-p 00000000 00:00 0
e4e19000-e4f33000 rw-p 00000000 00:00 0
e4f33000-e5096000 r-xp 00000000 09:01 424347 /usr/lib/libdb-4.8.so
e5096000-e5099000 rw-p 00163000 09:01 424347 /usr/lib/libdb-4.8.so
e509e000-e50a2000 r-xp 00000000 09:01 440618 /usr/lib/sasl2/libsasldb.so.2.0.23
e50a2000-e50a3000 rw-p 00004000 09:01 440618 /usr/lib/sasl2/libsasldb.so.2.0.23
e50a3000-e52a2000 rw-p 00000000 00:00 0
e52b5000-e52b6000 rw-p 00000000 00:00 0
e52b6000-e62b6000 rw-p 00000000 00:00 0
e62b6000-e62b8000 rw-p 00000000 00:00 0
e62b8000-e62d8000 rw-p 00000000 00:00 0
e62d8000-e62d9000 rw-p 00000000 00:00 0
e62fa000-e62fb000 rw-p 00000000 00:00 0
e62fb000-e631b000 rw-p 00000000 00:00 0
e631b000-e631d000 rw-p 00000000 00:00 0
e631d000-f631d000 rw-p 00000000 00:00 0
f631d000-f631e000 rw-p 00000000 00:00 0
f631e000-f631f000 ---p 00000000 00:00 0
f631f000-f6b1f000 rwxp 00000000 00:00 0
f6b1f000-f6b29000 r-xp 00000000 09:01 27073 /lib/i686/cmov/libnss_files-2.11.2.so
f6b29000-f6b2a000 r--p 00009000 09:01 27073 /lib/i686/cmov/libnss_files-2.11.2.so
f6b2a000-f6b2b000 rw-p 0000a000 09:01 27073 /lib/i686/cmov/libnss_files-2.11.2.so
f6b2b000-f6b2e000 rw-p 00000000 00:00 0
f6b2e000-f6b33000 r-xp 00000000 09:01 425645 /usr/lib/libogg.so.0.7.0
f6b33000-f6b34000 rw-p 00004000 09:01 425645 /usr/lib/libogg.so.0.7.0
f6b34000-f6b5b000 r-xp 00000000 09:01 425649 /usr/lib/libvorbis.so.0.4.4
f6b5b000-f6b5c000 rw-p 00026000 09:01 425649 /usr/lib/libvorbis.so.0.4.4
f6b5c000-f6b5d000 rw-p 00000000 00:00 0
f6b5d000-f6cc2000 r-xp 00000000 09:01 425652 /usr/lib/libvorbisenc.so.2.0.7
f6cc2000-f6cd3000 rw-p 00165000 09:01 425652 /usr/lib/libvorbisenc.so.2.0.7

Bug#627615: VM terminates when doing a live migration

2011-05-23 Thread Michael Tokarev
23.05.2011 15:30, Daniel Bareiro wrote:

> This is the output without daemonizing:
>
>
> *** glibc detected *** kvm: free(): invalid next size (fast): 0x09fad3c0 ***
[...]
> Can we confirm that it is the same problem? If you need to do another
> test, please don't hesitate to ask me.

Yes it's exactly this problem, you can check the other bug I
mentioned - it shows this very memory corruption too.
That's why I merged the two.

> As I noted earlier, when migrating from Defiant (Debian GNU/Linux
> 5.0.8 with Linux 2.6.32-15~bpo50+1 and qemu-kvm 0.12.5+dfsg-3~bpo50+2)
> to SS01, this problem does not occur. Both installations are 32-bit,
> but the kernel in SS01 is amd64 and the kernel in Defiant is i686.
>
> I.e. both have a 32-bit userspace, with the difference that ss01 has a
> 64-bit kernel. Is that where the problem lies? The versions of Linux and
> qemu-kvm otherwise look the same.

This bug has 2 halves, one half is general 32bit migration issue
(it does not actually work on 32bits, the fact it worked for you
is pure luck), and second half is special case of 32bit userspace
running on 64bit kernel (due to wrong kernel/user space communications).
Both will be fixed in the upcoming squeeze version of qemu-kvm.
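
A minimal sketch of how one might check which of the two cases a host falls into, assuming x86 hosts only. The function name classify_host is made up for this illustration; getconf LONG_BIT reports the userspace word size and uname -m the kernel architecture.

```shell
#!/bin/sh
# Classify a host by userspace word size and kernel architecture.
# $1: userspace word size as printed by `getconf LONG_BIT` (32 or 64)
# $2: kernel machine name as printed by `uname -m` (e.g. i686, x86_64)
# Assumption: only x86 hosts are considered; anything that is not
# x86_64/amd64 is treated as a 32-bit kernel.
classify_host() {
    bits=$1
    machine=$2
    case "$machine" in
        x86_64|amd64) kernel=64 ;;
        *)            kernel=32 ;;
    esac
    if [ "$bits" = 32 ] && [ "$kernel" = 64 ]; then
        echo "32-bit userspace on 64-bit kernel"
    elif [ "$bits" = 32 ]; then
        echo "32-bit userspace on 32-bit kernel"
    else
        echo "64-bit userspace"
    fi
}

# Report the combination for the machine this script runs on.
classify_host "$(getconf LONG_BIT)" "$(uname -m)"
```

Going by the reports in this thread, ss01 (i386 userspace, amd64 kernel) would fall into the first case and Defiant (i686 kernel) into the second.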

You can try patching and rebuilding your qemu-kvm using patches in
the package awaiting upload.  Unfortunately right now the anonscm.debian.org
service (with http access to the git repository) does not work due to
system maintenance.  The fix is in git://git.debian.org/collab-maint/qemu-kvm.git;
I extracted the patch to my site here:
 http://www.corpit.ru/mjt/tmp/fix-crash-in-migration-32-bit-51b0c6065a.diff
This patch fixes both halves of the problem.
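
One possible way to do such a local rebuild is sketched below. It is not a tested recipe: it assumes deb-src entries are configured, that the diff applies with -p1 from the top of the unpacked source tree, and that build dependencies can be installed as root. The helper names run and rebuild_with_patch and the file name migration-fix.diff are made up for this sketch. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: rebuild qemu-kvm locally with the migration fix applied.
# By default every command is only printed; set DRY_RUN=0 to execute.
PATCH_URL=http://www.corpit.ru/mjt/tmp/fix-crash-in-migration-32-bit-51b0c6065a.diff

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_with_patch() {
    # Needs root: install the packages required to build qemu-kvm.
    run apt-get build-dep qemu-kvm
    # Needs deb-src lines in sources.list: fetch and unpack the source.
    run apt-get source qemu-kvm
    run wget -O migration-fix.diff "$PATCH_URL"
    # Assumption: the diff applies with -p1 at the top of the source tree.
    run sh -c 'cd qemu-kvm-*/ && patch -p1 < ../migration-fix.diff && dpkg-buildpackage -us -uc -b'
}

rebuild_with_patch
```

The resulting .deb files would then be installed with dpkg -i on both the source and destination hosts before retrying the migration.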

/mjt






Bug#627615: VM terminates when doing a live migration

2011-05-22 Thread Daniel Bareiro

Package: qemu-kvm
Version: 0.12.5+dfsg-5+squeeze1
Severity: important



-- Package-specific info:


/proc/cpuinfo:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 75
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
stepping: 2
cpu MHz : 2009.081
cache size  : 512 KB
physical id : 0
siblings: 2
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 1
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips: 4018.16
TLB size: 1024 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

processor   : 1
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 75
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
stepping: 2
cpu MHz : 2009.081
cache size  : 512 KB
physical id : 0
siblings: 2
core id : 1
cpu cores   : 2
apicid  : 1
initial apicid  : 1
fpu : yes
fpu_exception   : yes
cpuid level : 1
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips: 4018.54
TLB size: 1024 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc




-- System Information:
Debian Release: 6.0.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=es_AR.UTF-8, LC_CTYPE=es_AR.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to es_AR.UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages qemu-kvm depends on:
ii  adduser          3.112+nmu2         add and remove users and groups
ii  bridge-utils     1.4-5              Utilities for configuring the Linu
ii  iproute          20100519-3         networking and traffic control too
ii  libaio1          0.3.107-7          Linux kernel AIO access library -
ii  libasound2       1.0.23-2.1         shared library for ALSA applicatio
ii  libbluetooth3    4.66-3             Library to use the BlueZ Linux Blu
ii  libbrlapi0.5     4.2-7              braille display access via BRLTTY
ii  libc6            2.11.2-10          Embedded GNU C Library: Shared lib
ii  libcurl3-gnutls  7.21.0-1           Multi-protocol file transfer libra
ii  libgnutls26      2.8.6-1            the GNU TLS library - runtime libr
ii  libncurses5      5.7+20100313-5     shared libraries for terminal hand
ii  libpci3          1:3.1.7-6          Linux PCI Utilities (shared librar
ii  libpulse0        0.9.21-3+squeeze1  PulseAudio client libraries
ii  libsasl2-2       2.1.23.dfsg1-7     Cyrus SASL - authentication abstra
ii  libsdl1.2debian  1.2.14-6.1         Simple DirectMedia Layer
ii  libuuid1         2.17.2-9           Universally Unique ID library
ii  libvdeplug2      2.2.3-3            Virtual Distributed Ethernet - Plu
ii  libx11-6         2:1.3.3-4          X11 client-side library
ii  python           2.6.6-3+squeeze6   interactive high-level object-orie
ii  zlib1g           1:1.2.3.4.dfsg-3   compression library - runtime

Versions of packages qemu-kvm recommends:
ii  linux-image-2.6.32-5-686 [lin 2.6.32-31  Linux 2.6.32 for modern PCs
ii  linux-image-2.6.32-5-amd64 [l 2.6.32-31  Linux 2.6.32 for 64-bit PCs

Versions of packages qemu-kvm suggests:
pn  debootstrap      <none>             (no description available)
pn  samba            <none>             (no description available)
pn  vde2             <none>             (no description available)

-- Configuration Files:
/etc/kvm/kvm-ifup changed:
switch=$(ip route ls | \
awk '/^default / {
  for(i=0;i<NF;i++) { if ($i == "dev") { print $(i+1); exit; } }
 }'
)
/sbin/ifconfig $1 0.0.0.0 up
if [ -n "$switch" -a -d /sys/class/net/$switch/bridge/. ]; then
  /usr/sbin/brctl addif $switch $1 || :
fi
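
The awk step of the script above can be tried in isolation. The sketch below wraps it in a function that reads `ip route ls` style output on stdin and prints the interface following "dev" on the default route. The function name default_route_dev, the addresses and the bridge name br0 are made up for illustration.

```shell
#!/bin/sh
# Print the interface named after "dev" on the default route.
# Reads `ip route ls` style output on standard input.
default_route_dev() {
    awk '/^default / {
        for (i = 1; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Hypothetical sample input: one link route and one default route via br0.
printf '%s\n' \
    '192.0.2.0/24 dev br0  proto kernel  scope link  src 192.0.2.10' \
    'default via 192.0.2.1 dev br0' | default_route_dev
```

If there is no default route the function prints nothing, which is the case the script guards against with its -n test on $switch.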


-- no debconf information

First Test:
===========

Starting the VM in the source VMHost and starting the VM on the
destination VMHost with the exact same parameters as the VM on the
source, in migration-listen mode: 

Source:

ss01:~# kvm -m 256 -boot d -net nic,vlan=0,macaddr=52:54:67:92:9d:63 \
 -net tap -daemonize -vnc :15 -k es -localtime -cdrom \
 /mnt/systemrescuecd-x86-2.0.1.iso -monitor telnet:localhost:4055,server,nowait


Destination:

defiant:~# kvm -m 256 

Bug#627615: VM terminates when doing a live migration

2011-05-22 Thread Michael Tokarev
forcemerge 625571 627615
thanks

22.05.2011 22:11, Daniel Bareiro wrote:
 
> Package: qemu-kvm
> Version: 0.12.5+dfsg-5+squeeze1
> Severity: important

> model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+

> -- System Information:
> Debian Release: 6.0.1
>   APT prefers stable
>   APT policy: (500, 'stable')
> Architecture: i386 (x86_64)
>
> Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)

> ss01:~# kvm -m 256 -boot d -net nic,vlan=0,macaddr=52:54:67:92:9d:63 \
>  -net tap -daemonize -vnc :15 -k es -localtime -cdrom \
>  /mnt/systemrescuecd-x86-2.0.1.iso -monitor telnet:localhost:4055,server,nowait
>
>
> Destination:
>
> defiant:~# kvm -m 256 -boot d -net nic,vlan=0,macaddr=52:54:67:92:9d:63 -net tap \
>  -daemonize -vnc :1 -k es -localtime -cdrom /mnt/systemrescuecd-x86-2.0.1.iso -monitor \
>  telnet:localhost:4041,server,nowait -incoming tcp:0:4455
>
> Migration:
>
> ss01:~# telnet localhost 4055
> Trying ::1...
> Connected to localhost.
> Escape character is '^]'.
> QEMU 0.12.5 monitor - type 'help' for more information
> (qemu) migrate -d tcp:10.1.0.65:4455
> (qemu) Connection closed by foreign host.
>
> ss01:~# ps ax|grep systemrescuecd
> 15640 pts/0    R+     0:00 grep systemrescuecd

When debugging don't enable daemonizing, instead run it in foreground
to see what messages, if any, it prints.

But this is, with a very good chance, #625571 - migration
fails on 32bit userspace always.  That bug is finally fixed,
after more than 2 years, and is pending upload after we will
sort out other, more important issues.


> ss01:~# telnet localhost 4055
> Trying ::1...
> Connected to localhost.
> Escape character is '^]'.
> QEMU 0.12.5 monitor - type 'help' for more information
> (qemu) stop
> (qemu) migrate_set_speed 4095m
> (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> Connection closed by foreign host.
>
> ss01:~# ps ax|grep systemrescuecd
> 26564 pts/0    S+     0:00 grep systemrescuecd

Again, it is a very bad idea to run it in the background when
debugging.  It's impossible to tell what exactly it is doing
this way.
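
A small sketch of the foreground approach: run the command without -daemonize and keep a copy of everything it prints. The helper name run_logged and the log file name are made up for this illustration. LIBC_FATAL_STDERR_=1 is set because glibc may otherwise write its "*** glibc detected ***" report to the terminal rather than to stderr, in which case it would not reach the log.

```shell
#!/bin/sh
# Run a command in the foreground and keep a copy of everything it prints.
# $1: log file; remaining arguments: the command to run.
run_logged() {
    log=$1
    shift
    # Merge stderr into stdout so diagnostics end up in the log as well.
    # Note: the pipeline's exit status is that of tee, not of the command.
    LIBC_FATAL_STDERR_=1 "$@" 2>&1 | tee "$log"
}

# Usage on the source host (options from the report, minus -daemonize):
#   run_logged kvm-ss01.log kvm -m 256 -boot d \
#       -net nic,vlan=0,macaddr=52:54:67:92:9d:63 -net tap \
#       -vnc :15 -k es -localtime -cdrom /mnt/systemrescuecd-x86-2.0.1.iso \
#       -monitor telnet:localhost:4055,server,nowait
```

The monitor stays reachable over telnet as before, so the migrate command can be issued from a second terminal while the first one shows the output.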

/mjt


