Bugs item #2351676, was opened at 2008-11-26 19:59
Message generated for change (Comment added) made by z-image
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2351676&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chris Jones (c_jones)
Assigned to: Nobody/Anonymous (nobody)
Summary: Guests hang periodically on Ubuntu-8.10

Initial Comment:
I'm seeing periodic hangs on my guests.  I've been unable so far to find a 
trigger - they always boot fine, but after anywhere from 10 minutes to 24 hours 
they eventually hang completely.

My setup:
  * AMD Athlon X2 4850e (2500 MHz dual core)
  * 4Gig memory
  * Ubuntu 8.10 server, 64-bit
  * KVMs tried:
    : kvm-72 (shipped with ubuntu)
    : kvm-79 (built myself, --patched-kernel option)
  * Kernels tried:
    : 2.6.27.7 (kernel.org, self built)
    : 2.6.27-7-server from Ubuntu 8.10 distribution

  In guests
  * Ubuntu 8.10 server, 64-bit (virtual machine install)
  * kernel 2.6.27-7-server from Ubuntu 8.10

I'm running the guests like:
  sudo /usr/local/bin/qemu-system-x86_64        \
     -daemonize                                 \
     -no-kvm-irqchip                            \
     -hda Imgs/ndev_root.img                    \
     -m 1024                                    \
     -cdrom ISOs/ubuntu-8.10-server-amd64.iso   \
     -vnc :4                                    \
     -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
     -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh 

The problem does not happen if I use -no-kvm.

I've tried some other options that have no effect:
  -no-kvm-pit
  -no-acpi

The disk images are raw format.

When the guests hang, I cannot ping them, and the vnc console is hung.  The 
qemu monitor is still accessible, and the guests recover if I issue a 
system_reset command from the monitor.  However, often, the console will not 
take keyboard after doing so.

When the guest is hung, kvm_stat shows all 0s for the counters:

efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall
+insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage
+mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc
+mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex
+tlb_flush
>          0          0          0          0          0          0          0
+0          0          0          0          0          0          0          0
+0          0          0          0          0          0          0          0
+0          0          0          0          0          0
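
The all-zero readings above suggest the guest has stopped entering KVM
altogether. A minimal sketch of checking for that without watching kvm_stat
by hand, assuming the counters are exposed as plain files under
/sys/kernel/debug/kvm (debugfs mounted, usually readable by root only). The
function names and the KVM_DEBUGFS override are hypothetical, added only for
illustration:

```shell
#!/bin/sh
# Sketch: report whether any KVM counter moved during an interval.
# KVM_DEBUGFS is an assumed override, not something from the original report.
kvm_snapshot() {
    dir=${KVM_DEBUGFS:-/sys/kernel/debug/kvm}
    for f in "$dir"/*; do
        # Only plain files are counters; skip subdirectories and empty globs.
        if [ -f "$f" ]; then
            printf '%s %s\n' "${f##*/}" "$(cat "$f")"
        fi
    done
}

kvm_is_idle() {
    before=$(kvm_snapshot)
    sleep "${1:-5}"            # interval in seconds, default 5
    after=$(kvm_snapshot)
    [ "$before" = "$after" ]   # succeeds when no counter changed
}

# Usage: kvm_is_idle 10 && echo "no KVM activity in the last 10 seconds"
```

With several guests on one host the top-level counters are aggregated, so
this only helps when the hung guest is the only one running.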

gdb shows two threads - both waiting:

(gdb) info threads
  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait ()
   from /lib/libc.so.6
  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select ()
   from /lib/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482
+in select () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f084b482 in select () from /lib/libc.so.6
#1  0x00000000004094cb in main_loop_wait (timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
#2  0x000000000050a7ea in kvm_main_loop ()
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
#3  0x000000000040fafc in main (argc=<value optimized out>,
    argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
(gdb) thread 2
[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in
+sigtimedwait () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()
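
The same information can be collected non-interactively, which may be
easier to attach to a report. A minimal sketch, assuming gdb is installed on
the host and is allowed to attach to the running qemu process. The function
name and the GDB override variable are hypothetical:

```shell
#!/bin/sh
# Sketch: dump the thread list and all backtraces of a running process.
# GDB is an assumed override so a different gdb binary can be used.
dump_threads() {
    pid=$1
    "${GDB:-gdb}" -batch -p "$pid" \
        -ex "info threads" \
        -ex "thread apply all bt"
}

# Usage (pidof may print several PIDs if more than one guest is running):
#   dump_threads 414 > qemu-threads.txt
```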


Any clues to help me resolve this would be much appreciated.


----------------------------------------------------------------------

Comment By: Teodor Milkov (z-image)
Date: 2009-08-24 10:45

Message:
With 2.6.31-rc6 it has been running fine for almost 72 hours. Looks like the
problem is gone in 2.6.31.

----------------------------------------------------------------------

Comment By: Teodor Milkov (z-image)
Date: 2009-08-21 11:53

Message:
With -no-kvm-pit it has been running fine for almost 20 hours. It didn't
survive that long without -no-kvm-pit.

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-08-20 18:20

Message:
I'm still investigating, but I have some new information so far. There seem
to be different issues that cause different crashes:

- dynamic CPU throttling on the host
- oopses due to the paravirt kvm support in the guest. I got hit by
http://bugzilla.kernel.org/show_bug.cgi?id=12405 and I'm now investigating
whether disabling highmem helps, as someone suggested. I don't know if this
also affects 64-bit guests, which seem to run more stably on other machines
here.

It helps to set up netconsole and let syslog-ng write its output to a log
file, so oopses can be logged nicely.
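
A minimal sketch of the guest side of such a netconsole setup, assuming the
netconsole module is available in the guest kernel. The helper name is
hypothetical and every address in the usage line is a placeholder, not a
value from this thread. Ports 6665 (source) and 6666 (target) are the
module's documented defaults:

```shell
#!/bin/sh
# Sketch: build the netconsole= module parameter.
# Format: [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=$4
    echo "netconsole=6665@${src_ip}/${dev},6666@${tgt_ip}/${tgt_mac}"
}

# Usage inside the guest, as root (placeholder addresses):
#   modprobe netconsole "$(netconsole_param 192.0.2.10 eth0 192.0.2.1 00:11:22:33:44:55)"
```

The log host then needs something listening on UDP port 6666 (syslog-ng in
the setup described above) to write the messages to a file.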

----------------------------------------------------------------------

Comment By: Teodor Milkov (z-image)
Date: 2009-08-20 16:12

Message:
On a closer look, qemu had actually exited, but virt-manager was still
holding its monitoring console. Here's the full transcript of what happened
in a shell session:

gdb --args /usr/local/bin/qemu-system-x86_64 -S -M pc -m 2047 -smp 3 -name
kvm2 -uuid 4f484293-7e31-2fb9-f2c8-246b5f87f301 -monitor stdio -boot c
-drive file=/dev/vg0/kvm2,if=virtio,index=0,boot=on -serial none -parallel
none -vnc 213.145.98.164:1 -k en-us

GNU gdb (GDB) 6.8.50.20090628-cvs-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>

(gdb) run
Starting program: /usr/local/bin/qemu-system-x86_64 -S -M pc -m 2047 -smp
3 -name kvm2 -uuid 4f484293-7e31-2fb9-f2c8-246b5f87f301 -monitor stdio
-boot c -drive file=/dev/vg0/kvm2,if=virtio,index=0,boot=on -serial none
-parallel none -vnc 213.145.98.164:1 -k en-us
[Thread debugging using libthread_db enabled]
[New Thread 0xb7dafb90 (LWP 19769)]
[New Thread 0xb75aab90 (LWP 19770)]
[New Thread 0xb6da6b90 (LWP 19771)]

QEMU 0.10.50 monitor - type 'help' for more information
(qemu) c

[New Thread 0x35a2db90 (LWP 19772)]
[New Thread 0x3522cb90 (LWP 19773)]
[New Thread 0x348ffb90 (LWP 19774)]
[New Thread 0x340feb90 (LWP 19775)]
[New Thread 0x338fdb90 (LWP 19776)]
[New Thread 0x330fcb90 (LWP 19777)]
[New Thread 0x328fbb90 (LWP 19778)]
[New Thread 0x320fab90 (LWP 19779)]
[New Thread 0x318f9b90 (LWP 19780)]
[New Thread 0x310f8b90 (LWP 19781)]
[New Thread 0x308f7b90 (LWP 19782)]
[New Thread 0x300f6b90 (LWP 19783)]
[New Thread 0x2f8f5b90 (LWP 19784)]
[New Thread 0x2f0f4b90 (LWP 19785)]
[New Thread 0x2e8f3b90 (LWP 19786)]
[New Thread 0x2e0f2b90 (LWP 19787)]
[New Thread 0x2d8f1b90 (LWP 19788)]
[New Thread 0x2d0f0b90 (LWP 19789)]
[New Thread 0x2c8efb90 (LWP 19790)]
[New Thread 0x2c0eeb90 (LWP 19791)]
[New Thread 0x2b8edb90 (LWP 19792)]
[New Thread 0x2b0ecb90 (LWP 19793)]
[New Thread 0x2a8ebb90 (LWP 19794)]
[New Thread 0x2a0eab90 (LWP 19795)]
[New Thread 0x298e9b90 (LWP 19796)]
[New Thread 0x290e8b90 (LWP 19797)]
[New Thread 0x288e7b90 (LWP 19798)]
[New Thread 0x280e6b90 (LWP 19799)]
[New Thread 0x278e5b90 (LWP 19800)]
[New Thread 0x270e4b90 (LWP 19801)]
[New Thread 0x268e3b90 (LWP 19802)]
[Thread 0x2e0f2b90 (LWP 19787) exited]
[Thread 0x338fdb90 (LWP 19776) exited]
[Thread 0x2b8edb90 (LWP 19792) exited]
[Thread 0x2e8f3b90 (LWP 19786) exited]
[Thread 0x2f8f5b90 (LWP 19784) exited]
[Thread 0x308f7b90 (LWP 19782) exited]
[Thread 0x300f6b90 (LWP 19783) exited]
[New Thread 0x300f6b90 (LWP 19808)]
[New Thread 0x308f7b90 (LWP 19813)]
[New Thread 0x2f8f5b90 (LWP 19814)]
[New Thread 0x2e8f3b90 (LWP 19815)]
[New Thread 0x2b8edb90 (LWP 19816)]
[New Thread 0x260e2b90 (LWP 19817)]
[New Thread 0x258e1b90 (LWP 19818)]
[New Thread 0x250e0b90 (LWP 19819)]
[New Thread 0x248dfb90 (LWP 19820)]
[New Thread 0x240deb90 (LWP 19821)]
[New Thread 0x236ffb90 (LWP 19822)]
[New Thread 0x22cffb90 (LWP 19823)]
[New Thread 0x224feb90 (LWP 19824)]
[New Thread 0x21affb90 (LWP 19825)]
[New Thread 0x212feb90 (LWP 19828)]
[New Thread 0x20afdb90 (LWP 19829)]
[New Thread 0x202fcb90 (LWP 19830)]
[New Thread 0x1fafbb90 (LWP 19831)]
kvm: unhandled exit 31
kvm_run returned -22
[Thread 0x2c0eeb90 (LWP 19791) exited]
[Thread 0x270e4b90 (LWP 19801) exited]
[Thread 0x2f8f5b90 (LWP 19814) exited]
[Thread 0x320fab90 (LWP 19779) exited]
[Thread 0x310f8b90 (LWP 19781) exited]
[Thread 0x2e8f3b90 (LWP 19815) exited]
[Thread 0x21affb90 (LWP 19825) exited]
[Thread 0x300f6b90 (LWP 19808) exited]
[Thread 0x328fbb90 (LWP 19778) exited]
[Thread 0x2c8efb90 (LWP 19790) exited]
[Thread 0x2d0f0b90 (LWP 19789) exited]
[Thread 0x260e2b90 (LWP 19817) exited]
[Thread 0x268e3b90 (LWP 19802) exited]
[Thread 0x240deb90 (LWP 19821) exited]
[Thread 0x290e8b90 (LWP 19797) exited]
[Thread 0x280e6b90 (LWP 19799) exited]
[Thread 0x2a8ebb90 (LWP 19794) exited]
[Thread 0x20afdb90 (LWP 19829) exited]
[Thread 0x2a0eab90 (LWP 19795) exited]
[Thread 0x2d8f1b90 (LWP 19788) exited]
[Thread 0x248dfb90 (LWP 19820) exited]
[Thread 0x2b8edb90 (LWP 19816) exited]
[Thread 0x278e5b90 (LWP 19800) exited]
[Thread 0x2b0ecb90 (LWP 19793) exited]
[Thread 0x2f0f4b90 (LWP 19785) exited]
[Thread 0x298e9b90 (LWP 19796) exited]
[Thread 0x35a2db90 (LWP 19772) exited]
[Thread 0x318f9b90 (LWP 19780) exited]
[Thread 0x236ffb90 (LWP 19822) exited]
[Thread 0x258e1b90 (LWP 19818) exited]
[Thread 0x348ffb90 (LWP 19774) exited]
[Thread 0x308f7b90 (LWP 19813) exited]
[Thread 0x288e7b90 (LWP 19798) exited]
[Thread 0x202fcb90 (LWP 19830) exited]
[Thread 0x340feb90 (LWP 19775) exited]
[Thread 0x3522cb90 (LWP 19773) exited]
[Thread 0x330fcb90 (LWP 19777) exited]
[Thread 0x1fafbb90 (LWP 19831) exited]
[Thread 0x250e0b90 (LWP 19819) exited]
[Thread 0x22cffb90 (LWP 19823) exited]
[Thread 0x224feb90 (LWP 19824) exited]
[Thread 0x212feb90 (LWP 19828) exited]

(qemu)

I'm going to try it with -no-kvm-pit now...

----------------------------------------------------------------------

Comment By: Teodor Milkov (z-image)
Date: 2009-08-20 12:48

Message:
I believe I may have hit the same bug.

* CPU is 2x 8 core + SMT (so it looks like 16 cores) Nehalem (Intel(R)
Xeon(R) CPU E5520  @ 2.27GHz)
* Host kernel is i386 and not x86_64: Debian sid package
linux-image-2.6.30-1-686-bigmem 2.6.30-5
* QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)
* Guests:
   * Debian Etch with backports 32 bit kernel 2.6.26-bpo.2-686-bigmem
   * Debian Etch with custom compiled 32 bit kernel 2.6.30.4

Load testing with stress (http://weather.ou.edu/~apw/projects/stress/).
Guests are configured to use 2047MB memory and 3 VCPUs (tried with 2VCPUs
as well).

After some time - anywhere from 30 minutes to several hours - the virtual
machine hangs. It doesn't crash, just doesn't respond anymore to keyboard,
vnc, ping or anything else. I tried to run a gdb session on the two guests
and the results are more or less equal:

gdb --args /usr/local/bin/qemu-system-x86_64 -S -M pc -m 2047 -smp 3 -name
kvm2 -uuid 4f484293-7e31-2fb9-f2c8-246b5f87f301 -monitor pty -boot c -drive
file=/var/lib/libvirt/images/iso/debian-40r8-etchnhalf-i386-netinst.iso,if=ide,media=cdrom,index=2
-drive file=/dev/vg0/kvm2,if=virtio,index=0,boot=on -net
nic,macaddr=54:52:00:31:be:e3,vlan=0,model=virtio -net tap,fd=29,vlan=0
-serial pty -parallel none -usb -vnc 127.0.0.1:1 -k en-us    
GNU gdb (GDB) 6.8.50.20090628-cvs-debian

...

^C
Program received signal SIGINT, Interrupt.
0xb8036424 in __kernel_vsyscall ()

(gdb) info threads
  27 Thread 0xb7e10b90 (LWP 19064)  0xb8036424 in __kernel_vsyscall ()
  26 Thread 0xb760bb90 (LWP 19065)  0xb8036424 in __kernel_vsyscall ()
  25 Thread 0xb6e07b90 (LWP 19066)  0xb8036424 in __kernel_vsyscall ()
* 1 Thread 0xb7e11a70 (LWP 19060)  0xb8036424 in __kernel_vsyscall ()

(gdb) thread 1
[Switching to thread 1 (Thread 0xb7e11a70 (LWP 19060))]#0  0xb8036424 in
__kernel_vsyscall ()
(gdb) bt
#0  0xb8036424 in __kernel_vsyscall ()
#1  0xb7f06fe1 in select () from /lib/i686/cmov/libc.so.6
#2  0x0804c3c6 in qemu_select (max_fd=30, rfds=0xbfd46f00,
wfds=0xbfd46e80, xfds=0xbfd46e00, tv=0xbfd46df4) at
/home/zimage/kvm/qemu-kvm-devel-88/vl.c:313
#3  0x08052958 in main_loop_wait (timeout=1000) at
/home/zimage/kvm/qemu-kvm-devel-88/vl.c:4339
#4  0x0818777e in kvm_main_loop () at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2194
#5  0x080530c9 in main_loop () at
/home/zimage/kvm/qemu-kvm-devel-88/vl.c:4550
#6  0x08056799 in main (argc=33, argv=0xbfd47424, envp=0xbfd474ac) at
/home/zimage/kvm/qemu-kvm-devel-88/vl.c:6416

(gdb) thread 25
[Switching to thread 25 (Thread 0xb6e07b90 (LWP 19066))]#0  0xb8036424 in
__kernel_vsyscall ()
(gdb) bt
#0  0xb8036424 in __kernel_vsyscall ()
#1  0xb7e59551 in sigtimedwait () from /lib/i686/cmov/libc.so.6
#2  0x08186e1b in kvm_main_loop_wait (env=0x9aad960, timeout=1000) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:1869
#3  0x08187231 in kvm_main_loop_cpu (env=0x9aad960) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2009
#4  0x08187340 in ap_main_loop (_env=0x9aad960) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2044
#5  0xb7fd74b5 in start_thread () from /lib/i686/cmov/libpthread.so.0
#6  0xb7f0ea5e in clone () from /lib/i686/cmov/libc.so.6

(gdb) thread 26
[Switching to thread 26 (Thread 0xb760bb90 (LWP 19065))]#0  0xb8036424 in
__kernel_vsyscall ()
(gdb) bt
#0  0xb8036424 in __kernel_vsyscall ()
#1  0xb7e59551 in sigtimedwait () from /lib/i686/cmov/libc.so.6
#2  0x08186e1b in kvm_main_loop_wait (env=0x9aa4028, timeout=1000) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:1869
#3  0x08187231 in kvm_main_loop_cpu (env=0x9aa4028) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2009
#4  0x08187340 in ap_main_loop (_env=0x9aa4028) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2044
#5  0xb7fd74b5 in start_thread () from /lib/i686/cmov/libpthread.so.0
#6  0xb7f0ea5e in clone () from /lib/i686/cmov/libc.so.6

(gdb) thread 27
[Switching to thread 27 (Thread 0xb7e10b90 (LWP 19064))]#0  0xb8036424 in
__kernel_vsyscall ()
(gdb) bt
#0  0xb8036424 in __kernel_vsyscall ()
#1  0xb7e59551 in sigtimedwait () from /lib/i686/cmov/libc.so.6
#2  0x08186e1b in kvm_main_loop_wait (env=0x9a93df0, timeout=1000) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:1869
#3  0x08187231 in kvm_main_loop_cpu (env=0x9a93df0) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2009
#4  0x08187340 in ap_main_loop (_env=0x9a93df0) at
/home/zimage/kvm/qemu-kvm-devel-88/qemu-kvm.c:2044
#5  0xb7fd74b5 in start_thread () from /lib/i686/cmov/libpthread.so.0
#6  0xb7f0ea5e in clone () from /lib/i686/cmov/libc.so.6


----------------------------------------------------------------------

Comment By: Bryan Cameron Lesiuk (clesiuk)
Date: 2009-03-25 19:35

Message:
I have a problem similar to the original poster's.

I've discovered a possible workaround: disable CPU frequency scaling in
the host:
# apt-get remove powernowd

I'm running with disabled frequency scaling and so far my system is
stable.

I set the host frequency manually: 
# cd /sys/devices/system/cpu/cpu0/cpufreq
# cat scaling_available_frequencies
>     2500000 2400000 2200000 2000000 1800000 1000000 
# cat scaling_available_governors
>     conservative ondemand userspace powersave performance 
# echo powersave > scaling_governor    (minimum frequency)
# echo performance > scaling_governor  (maximum frequency)
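
The commands above only touch cpu0. A minimal sketch of applying one
governor to every core, assuming each core exposes its own
cpufreq/scaling_governor file under /sys/devices/system/cpu. The function
name and the CPU_ROOT override are hypothetical:

```shell
#!/bin/sh
# Sketch: write the same cpufreq governor to every CPU core.
# CPU_ROOT is an assumed override, handy for a dry run against a copy.
set_governor() {
    governor=$1
    root=${CPU_ROOT:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue          # glob matched nothing
        echo "$governor" > "$f" || return 1
    done
}

# Usage (as root): set_governor performance
```

The setting does not survive a reboot, so it would have to be reapplied from
an init script.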

Here's my rig: 
* AMD Athlon X2 4850e (2500 MHz dual core)
* 4Gig memory, 800MHz, dual channel
* 780G chipset (Jetway NC81-LF motherboard)

I tried combinations of Host/Guest using:
* Ubuntu 8.10 server, i686, KVM-72 
* Ubuntu 8.10 server, amd64, KVM-72
* Ubuntu 9.04 server, amd64, KVM-84 (22 March 2009 beta)

Stuff I've tried which had no discernible effect: 
* clock source: kvm-clock, acpi_pm
* block device: ide, virtual
* network device: e1000, virtual

----------------------------------------------------------------------

Comment By: Michael Tokarev (mjtsf)
Date: 2009-02-09 15:52

Message:
Ok, I have a very similar issue here as well.
Host - 4-core Phenom CPU and AMD 780G chipset, running 2.6.28.4-x86-64
(from kernel.org).
kvm-83 32bits
Guest - 2.6.27.13-i686smp, also from kernel.org.

The guest is running with KVM_GUEST stuff enabled, using kvm timer and
virtio network and block.  The system is Debian (lenny-to-be) on both, but
I don't think it matters since both use custom-compiled kernels.

Guest - at least one of them - hangs, especially when many guests are
running in parallel (we've 4 windows machines and 4 linux machines, mostly
idle).  When it hangs, nothing really works - console, ping, etc.  It
usually continues working after 1..2 minutes or more.  During the hang, the
host is either silent or is spewing tons of "vcpu not ready for
apic_round_robin" messages (several 1000s of them) -- but I can't be sure
that message is directly related to the hangs.

Nothing is logged on guest.

The so-far-only-affected guest is assigned 2 virtual CPUs, -- I'll try to
reboot it with single cpu only to see if it will change anything.

I wasn't able to check gdb/trace/etc so far, because the guest that hangs
is my main working machine, which is a terminal server, so I have to run to
another room to server's console and check there.

----------------------------------------------------------------------

Comment By: Dustin Kirkland (dustin_kirkland)
Date: 2009-02-09 14:38

Message:
In the Ubuntu 8.10 guest, can you try the linux-image-virtual kernel?  The
current one points to linux-image-2.6.27-11-virtual.

:-Dustin

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-18 08:18

Message:
New stability info on my side.

Host:
Linux dirus-dom 2.6.28-2-server #3-Ubuntu SMP Thu Dec 4 22:35:12 UTC 2008
x86_64 GNU/Linux


Guest:
2.6.28 x86_64 
- disabled all kvm guest options (with kvm_clock disabled)
- enabled virtio_block 
- started with -smp 1 and -smp 2

They haven't crashed yet, with -smp 1 or -smp 2. I think disabling kvm guest
support did the trick.
However, using NFS from the guest is quite slow and doesn't seem very
stable; I have the feeling the guest lags quite often.
But even with loads up to 11, running crashme, a high -j kernel build and
file transfers didn't crash the machine.

----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-15 09:30

Message:
Update: 

I installed Ubuntu 8.10 server and upgraded to 2.6.29-rc1 and KVM-83. I am
still able to reproduce when kvm -smp > 1.  New behavior in this
configuration is the printing of the message "Stuck??" to the console,
followed shortly by a kernel panic.   

KVM Host:

Ubuntu Server 8.10
Linux 2.6.29-RC1
KVM-83 

KVM Guest: 

Ubuntu Server 8.10
2.6.27-9-server



----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-15 09:20

Message:
Hello, 

I am able to reliably reproduce a condition where a guest goes into a
tight loop or spinlock on all running cores.  The scenario is exactly as
described in bug 2351676, though my environment differs as detailed below.
My observation is that the issue is correlated to the number of VCPUs
assigned to the guest and CPU load. The higher the number of VCPUs and CPU
utilization, the more easily it is triggered.  If a KVM developer is
interested in debugging live, I might be able to arrange getting the system
in question into a DMZ.  A review of the kvm tracker leads me to believe
that the following bugs are possibly related:

[ 2351676 ] Guests hang periodically on Ubuntu-8.10
[ 2353811 ] Solaris 10 guest unstable
[ 2494730 ] Guests "stalling" on kvm-82
[ 2138079 ] kvm locks up system
[ 2113643 ] guests AND host still getting stuck under CPU load

KVM Host Configuration:

4 x Quad-Core AMD Opteron Processors (8346 HE @ 1.8Ghz)
64GB DDR2 667Mhz
Fedora 10 x64
Kernel 2.6.28
KVM-82 

KVM Guest Configuration:
32GB Memory
1 to 16 VCPUs
Centos 5.2 x64
Kernel 2.6.28
IDE disk
e1000 NIC

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-13 21:11

Message:
I have a very similar setup.

Host: 
Ubuntu 8.10. 
Kernel 2.6.28-2-server
KVM: 72, 80, 81, 82, 83 tried (using the up to date kvm module, too)

Guests:
Endian Firewall (centos based.) 
Kernel 2.6.22.19-72.endian15
Is stable so far. Sometimes loses usb devices.

Ubuntu 8.10
Kernel 2.6.27, 2.6.28-2-server, 2.6.28 vanilla home brew
Very unstable.

As Ubuntu 8.10 is also unstable when using the 2.6.28 vanilla kernel,
I'm not so sure it's a guest problem.
I will now compile a 2.6.28 kernel not having any kvm guest support.

Things that don't seem to have an effect:
- using ide instead of virtio
- using e1000 instead of virtio

However, it seems that it may be caused by I/O access, but it is not
easily reproducible.

The last thing I tried: using kernel parameters "clocksource=acpi_pm notsc"
in the guest. Still investigating whether it makes the guest stable.
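
When testing boot parameters like these, it may be worth confirming which
clocksource the guest kernel actually selected. A minimal sketch, assuming
the guest exposes the usual sysfs clocksource files. The function name and
the CS_DIR override are hypothetical:

```shell
#!/bin/sh
# Sketch: print the clocksource in use and the ones the kernel offers.
# CS_DIR is an assumed override for illustration.
show_clocksource() {
    dir=${CS_DIR:-/sys/devices/system/clocksource/clocksource0}
    echo "current:   $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}

# Usage inside the guest: show_clocksource
```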

Btw, with kvm-82 I saw around 100 io_exits when only the crashed Ubuntu
8.10 guest was running, and nothing else.

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-10 22:29

Message:
Actually, I was too quick to say that a Fedora 8 guest is stable.  Even
there, I'm seeing hangs once I get my application fully installed
(basically, once I introduce some load).

I also did an update to kvm-80 and the problem still exists (on all the
guests I've tried).  That's with kvm-80 kernel modules and the kvm-80 user,
running on linux-2.6.27.8.

Thanks,
Chris

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-01 21:09

Message:
Alexey,

Thanks for the response.  As you advised, I tried a Fedora 8 guest, and it
does seem to be much more stable.  However, I really need a Debian base
system for my application.  Not necessarily Ubuntu 8.10, but I haven't had
much luck with others either.  Do you have any recommendations on one that
is particularly stable?

Over the weekend I tried:
  Fedora 8       : Seems very stable, but I really need a debian base.
  Ubuntu 8.04LTS : Same periodic hangs I was seeing on 8.10
  Debian 4.0 Etch: Seems stable on the guest, but on the host, qemu
                   process is running 100% busy while the guest is idle.

Any chance you know a workaround for the issue I'm seeing on etch, or can
recommend a Debian base distribution which works well with KVM?

Thanks much,
Chris

----------------------------------------------------------------------

Comment By: Technologov (technologov)
Date: 2008-11-27 14:54

Message:
In my opinion it is not the Ubuntu host that is problematic - but the guest
on KVM.

I mean that Ubuntu 8.10 guest is unstable on KVM. I have not found out
why.

If you try some better-tested guest (Fedora 7/8 or Windows XP), it
should be lots more stable.

And if you try some other host (i.e. a Fedora host running an Ubuntu 8.10
guest), it will be unstable.

In short - in my opinion - the problem is not the host OS, but either KVM or
its connection with the guest OS.

-Alexey E. "Technologov", 27.11.2008.

----------------------------------------------------------------------
