Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2009-01-07 Thread Lukas Kolbe
Hi Moritz,

 Does this bug still persist with the current Lenny kernel?

Lucky me (sort of): we stumbled upon this bug again on another server
(8 cores, 8 GB RAM) using 2.6.26-1-amd64_2.6.26-11. As luck would have it,
that server will be under heavy load tomorrow, so we can try the -12
kernel then. I'll report back on it.

 Cheers,
 Moritz

-- 
Lukas




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2009-01-06 Thread Lukas Kolbe
On Sunday, 14.12.2008 at 23:50 +0100, Moritz Muehlenhoff wrote:
 On Mon, Sep 01, 2008 at 03:56:49PM +0200, Lukas Kolbe wrote:
  Hi!
  
   seeing if it is fixed in 2.6.27-rc5 might be more interesting.
   thanks
   
   2.6.27-rc5 has now been running fine in the guest for more than four
   hours (and me restarting jboss every now and then). I'll report back
   tomorrow evening, that would be the timeframe the bug should've
   triggered.
  
  So far the 2.6.27-rc5 seems to be stable, at least it hasn't crashed on
  me. What can I do to help to get the needed fix to testing? (I know that
  this kernel won't make it and that's a good thing, but I don't really
  know how to identify what's needed to fix this).
 
 Does this bug still persist with the current Lenny kernel?

I am sorry to report that I cannot reproduce this bug anymore, but
that's because the system locks up hard (I have no serial console to it
and it is remotely accessible only).

When I do an rsync of ca. 100GB of data on a guest from a remote
location (both guest and host with 2.6.26-1-amd64 2.6.26-12, but it also
happened with -9), after around 10-25GB the host (!) locks up. Ping still
works, but the shells are dead and no further access (via ssh) is
possible at all. After a forced reboot, the logs offer no clue at all
(as in: syslog marks, but nothing else until the new bootup messages).

When I do the rsync in the host system (chroot'ed to the guest system
for convenience), I can transfer all 100GB without a hitch.

I am using virtio by the way, as in:

#!/bin/bash

KERNEL=2.6.26-1-amd64
NAME=myname

kvm -smp 2 \
 -drive if=virtio,file=/dev/vg0/${NAME}-root,cache=on,boot=on \
 -drive if=virtio,file=/dev/vg0/${NAME}-log,cache=on,boot=off \
 -m 512 \
 -nographic \
 -daemonize \
 -name ${NAME} \
 -kernel /boot/kvm/${NAME}/vmlinuz-${KERNEL} \
 -initrd /boot/kvm/${NAME}/initrd.img-${KERNEL} \
 -append "root=/dev/vda ro console=ttyS0,115200" \
 -serial mon:unix:/etc/kvm/consoles/${NAME}.sock,server,nowait \
 -net nic,macaddr=DE:AD:BE:EF:21:75,model=virtio \
 -net tap,ifname=tap04,script=/etc/kvm/kvm-ifup \
 -net nic,macaddr=DE:AD:BE:EF:21:76,model=virtio \
 -net tap,ifname=tap14,script=/etc/kvm/kvm-ifup-${NAME}
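
The -serial option in the script above exposes the guest console on a Unix
socket. What follows is only a minimal sketch, assuming socat is installed on
the host, of how that console could be copied to a file so that guest lockup
traces survive a crash. The helper names and the log path are hypothetical and
not part of the original setup.

```shell
#!/bin/sh
# Print the console socket path for a guest, following the naming used in
# the -serial option of the start script (/etc/kvm/consoles/NAME.sock).
console_socket() {
    printf '/etc/kvm/consoles/%s.sock\n' "$1"
}

# Copy everything the guest prints on its serial console into a log file.
# socat -u runs in one direction only: from the socket to standard output.
# The log file location is an assumption made for this example.
log_console() {
    name=$1
    socat -u "UNIX-CONNECT:$(console_socket "$name")" STDOUT \
        >> "/var/log/kvm-console-${name}.log"
}
```

For the guest in the script this would be started as `log_console myname` and
left running in the background.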

I'm sorry that at the moment I can't reproduce the problem that led me to
open this bug. Maybe I should open a new bug for this?

 Cheers,
 Moritz

All the best and a happy new year, 
Lukas








Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-12-15 Thread Lukas Kolbe
Hi,

  So far the 2.6.27-rc5 seems to be stable, at least it hasn't crashed on
  me. What can I do to help to get the needed fix to testing? (I know that
  this kernel won't make it and that's a good thing, but I don't really
  know how to identify what's needed to fix this).
 
 Does this bug still persist with the current Lenny kernel?

Sorry, I have no machine to test this on at the moment. Perhaps I will
find one in the next few days ...

 Cheers,
 Moritz

-- 
Lukas








Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-12-14 Thread Moritz Muehlenhoff
On Mon, Sep 01, 2008 at 03:56:49PM +0200, Lukas Kolbe wrote:
 Hi!
 
  seeing if it is fixed in 2.6.27-rc5 might be more interesting.
  thanks
  
  2.6.27-rc5 has now been running fine in the guest for more than four
  hours (and me restarting jboss every now and then). I'll report back
  tomorrow evening, that would be the timeframe the bug should've
  triggered.
 
 So far the 2.6.27-rc5 seems to be stable, at least it hasn't crashed on
 me. What can I do to help to get the needed fix to testing? (I know that
 this kernel won't make it and that's a good thing, but I don't really
 know how to identify what's needed to fix this).

Does this bug still persist with the current Lenny kernel?

Cheers,
Moritz






Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-09-01 Thread Lukas Kolbe
Hi!

 seeing if it is fixed in 2.6.27-rc5 might be more interesting.
 thanks
 
 2.6.27-rc5 has now been running fine in the guest for more than four
 hours (and me restarting jboss every now and then). I'll report back
 tomorrow evening, that would be the timeframe the bug should've
 triggered.

So far the 2.6.27-rc5 seems to be stable, at least it hasn't crashed on
me. What can I do to help to get the needed fix to testing? (I know that
this kernel won't make it and that's a good thing, but I don't really
know how to identify what's needed to fix this).


 -- 
 maks

-- 
Lukas








Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-29 Thread Lukas Kolbe
maximilian attems wrote:

should be fixed in 2.6.26-4, should be available tomorrow in unstable.
otherwise find sid snapshots http://wiki.debian.org/DebianKernel


Sorry, but it crashed on me again - this time stuck in swapper.

[36037.786125] BUG: soft lockup - CPU#1 stuck for 4097s! [swapper:0]
[36037.786125] Modules linked in: ipv6 snd_pcm snd_timer snd soundcore snd_page_alloc parport_pc parport serio_raw psmouse pcspkr i2c_piix4 i2c_core button evdev joydev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
[36037.786125] CPU 1:
[36037.786125] Modules linked in: ipv6 snd_pcm snd_timer snd soundcore snd_page_alloc parport_pc parport serio_raw psmouse pcspkr i2c_piix4 i2c_core button evdev joydev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
[36037.786125] Pid: 0, comm: swapper Not tainted 2.6.26-1-amd64 #1
[36037.786125] RIP: 0010:[8021eb20]  [8021eb20] native_safe_halt+0x2/0x3
[36037.786125] RSP: 0018:8100c6ea5f38  EFLAGS: 0246
[36037.786125] RAX: 8100c6ea5fd8 RBX:  RCX:
[36037.786125] RDX:  RSI: 0001 RDI: 804fadf0
[36037.786125] RBP: 00ce8b7c R08: 8100010276a0 R09: 8100c55b4b10
[36037.786125] R10: 8100c45dbbc8 R11: 8100c485f160 R12: 8100c6ea5ed8
[36037.786125] R13:  R14: 8023cd02 R15: 196c2157187c
[36037.786125] FS:  414e8960() GS:8100c6e4a0c0() knlGS:
[36037.786125] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[36037.786125] CR2: 7f4a0d0c4000 CR3: c548c000 CR4: 06e0
[36037.786125] DR0:  DR1:  DR2:
[36037.786125] DR3:  DR6: 0ff0 DR7: 0400
[36037.786125]
[36037.786125] Call Trace:
[36037.786125]  [8020b0cd] ? default_idle+0x2a/0x49
[36037.786125]  [8020ac79] ? cpu_idle+0x89/0xb3
[36037.786125]
[40435.828242] BUG: soft lockup - CPU#0 stuck for 8193s! [cron:2082]
[9223956581.178656] Modules linked in: ipv6 snd_pcm snd_timer snd soundcore snd_page_alloc parport_pc parport serio_raw psmouse pcspkr i2c_piix4 i2c_core button evdev joydev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
[9223956581.178656] CPU 0:
[9223956581.178656] Modules linked in: ipv6 snd_pcm snd_timer snd soundcore snd_page_alloc parport_pc parport serio_raw psmouse pcspkr i2c_piix4 i2c_core button evdev joydev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
[9223956581.178656] Pid: 2082, comm: cron Not tainted 2.6.26-1-amd64 #1
[9223956581.178656] RIP: 0010:[8024aa86]  [8024aa86] getnstimeofday+0x9/0x98
[9223956581.178656] RSP: 0018:8100c545df18  EFLAGS: 0202
[9223956581.178656] RAX: 00ce8b7d RBX: 00ce8b7d RCX: 07d8
[9223956581.178656] RDX: 0002 RSI:  RDI: 8100c545df38
[9223956581.178656] RBP: 0008 R08: 0003 R09: 07d8
[9223956581.178656] R10: 07d8 R11: 0246 R12: 1000
[9223956581.178656] R13: 01c8 R14:  R15:
[9223956581.178656] FS:  7fcf5ad0e6d0() GS:8053b000() knlGS:
[9223956581.178656] CS:  0010 DS:  ES:  CR0: 8005003b
[9223956581.178656] CR2: 00418238 CR3: c41ac000 CR4: 06e0
[9223956581.178656] DR0:  DR1:  DR2:
[9223956581.178656] DR3:  DR6: 0ff0 DR7: 0400
[9223956581.178656]
[9223956581.178656] Call Trace:
[9223956581.178656]  [8024aab6] ? getnstimeofday+0x39/0x98
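
As a reading aid for logs like the one in this report, here is a small sketch
that pulls the CPU number, the stuck duration, the task name and the PID out of
every "BUG: soft lockup" line. The function name summarize_lockups is made up
for illustration, and the sketch assumes each report sits on a single line;
output that a mail client has re-wrapped would need to be joined first.

```shell
#!/bin/sh
# Read a kernel log on standard input and print one summary line per
# "BUG: soft lockup" report, for example:
#   cpu=1 secs=4097 comm=swapper pid=0
# Lines that do not contain such a report are dropped (sed -n ... p).
summarize_lockups() {
    sed -n 's/.*BUG: soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s! \[\(.*\):\([0-9][0-9]*\)\].*/cpu=\1 secs=\2 comm=\3 pid=\4/p'
}
```

It could be fed from the guest with `dmesg | summarize_lockups`, or from a saved
log file with `summarize_lockups < kern.log`.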

Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-29 Thread maximilian attems
On Fri, Aug 29, 2008 at 02:36:00PM +0200, Lukas Kolbe wrote:
 
 Sorry, my previous answer didn't make it through my mail setup. I was
 using 2.6.26-4snapshot.12144 when the crash happened. I'll try it with
 the current snapshot again, though the changelog doesn't say anything
 about actual changes :)

seeing if it is fixed in 2.6.27-rc5 might be more interesting.
thanks

-- 
maks






Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-29 Thread Lukas Kolbe

Sorry, my previous answer didn't make it through my mail setup. I was
using 2.6.26-4snapshot.12144 when the crash happened. I'll try it with
the current snapshot again, though the changelog doesn't say anything
about actual changes :)

-- 
Lukas







Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-29 Thread Lukas Kolbe
maximilian attems wrote:

On Fri, Aug 29, 2008 at 02:36:00PM +0200, Lukas Kolbe wrote:
 
 Sorry, my previous answer didn't make it through my mail setup. I was
 using 2.6.26-4snapshot.12144 when the crash happened. I'll try it with
 the current snapshot again, though the changelog doesn't say anything
 about actual changes :)

seeing if it is fixed in 2.6.27-rc5 might be more interesting.
thanks

2.6.27-rc5 has now been running fine in the guest for more than four
hours (and me restarting jboss every now and then). I'll report back
tomorrow evening, that would be the timeframe the bug should've
triggered.

-- 
maks

-- 
Lukas







Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-28 Thread Lukas Kolbe
Package: linux-image-2.6.26-1-amd64
Version: 2.6.26-3
Severity: important


Using kvm 72, the guest is started with:
kvm -smp 2 \
 -net nic,macaddr=DE:AD:BE:EF:21:71,model=virtio \
 -net tap,ifname=tap02,script=/etc/kvm/kvm-ifup \
 -net nic,macaddr=DE:AD:BE:EF:21:72,model=virtio \
 -net tap,ifname=tap12,script=/etc/kvm/kvm-ifup-web \
 -drive if=virtio,boot=on,file=/dev/vg0/web-root \
 -drive if=virtio,file=/dev/vg0/web-log \
 -drive if=virtio,file=/dev/vg0/web-srv \
 -drive if=virtio,file=/dev/vg0/web-swap \
 -m 3192 \
 -kernel /boot/vmlinuz-2.6.26-1-amd64 \
 -initrd /boot/initrd.img-2.6.26-1-amd64 \
 -append "root=/dev/vda ro" \
 -curses

On the guest, I run the latest JDK from Sun (1.6.0_07) and the latest JBoss from
Red Hat (4.2.3.GA) in the default configuration, except that it's started via runit:
 exec chpst -u jboss:jboss -e ./env -o 65536 /opt/jboss/bin/run.sh -b host

After restarting it a few times, I get the following crash of the guest kernel:

Aug 28 14:38:06 web-new kernel: [ 6677.939474] BUG: soft lockup - CPU#0 stuck for 4096s! [master:2066]
Aug 28 14:38:06 web-new kernel: [ 6677.941978] Modules linked in: ipv6 snd_pcsp snd_pcm parport_pc parport snd_timer serio_raw snd psmouse soundcore i2c_piix4 snd_page_alloc i2c_core button evdev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
Aug 28 14:38:06 web-new kernel: [ 6677.941978] CPU 0:
Aug 28 14:38:06 web-new kernel: [ 6677.941978] Modules linked in: ipv6 snd_pcsp snd_pcm parport_pc parport snd_timer serio_raw snd psmouse soundcore i2c_piix4 snd_page_alloc i2c_core button evdev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
Aug 28 14:38:06 web-new kernel: [ 6677.941978] Pid: 2066, comm: master Not tainted 2.6.26-1-amd64 #1
Aug 28 14:38:06 web-new kernel: [ 6677.941978] RIP: 0033:[7fcf1e3d4c0a]  [7fcf1e3d4c0a]
Aug 28 14:38:06 web-new kernel: [ 6677.941978] RSP: 002b:7fff264e18e0  EFLAGS: 0202
Aug 28 14:38:06 web-new kernel: [ 6677.941978] RAX: 0002 RBX: 0086 RCX:
Aug 28 14:38:06 web-new kernel: [ 6677.941978] RDX: 0430 RSI: 7fcf1e4dc903 RDI: 00401895
Aug 28 14:38:06 web-new kernel: [ 6677.941978] RBP: 005078bc R08: 7fcf1e4e4c98 R09:
Aug 28 14:38:06 web-new kernel: [ 6677.941978] R10:  R11: 0064 R12: 7fff264e1e50
Aug 28 14:38:06 web-new kernel: [ 6677.941978] R13: 7fff264e1ed0 R14: 7fcf1e298de0 R15: 7fff264e1f50
Aug 28 14:38:06 web-new kernel: [ 6677.941978] FS:  7fcf1e4d96d0() GS:8053b000() knlGS:
Aug 28 14:38:06 web-new kernel: [ 6677.941978] CS:  0010 DS:  ES:  CR0: 8005003b
Aug 28 14:38:06 web-new kernel: [ 6677.941978] CR2: 7f65642cb860 CR3: c4162000 CR4: 06e0
Aug 28 14:38:06 web-new kernel: [ 6677.941978] DR0:  DR1:  DR2:
Aug 28 14:38:06 web-new kernel: [ 6677.941978] DR3:  DR6: 0ff0 DR7: 0400
Aug 28 14:38:06 web-new kernel: [ 6677.941978]
Aug 28 14:38:06 web-new kernel: [ 6677.941978] Call Trace:
Aug 28 14:38:06 web-new kernel: [ 6677.941978]
Aug 28 14:38:06 web-new kernel: [ 6677.939462] BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174]
Aug 28 14:38:06 web-new kernel: [ 6677.939462] Modules linked in: ipv6 snd_pcsp snd_pcm parport_pc parport snd_timer serio_raw snd psmouse soundcore i2c_piix4 snd_page_alloc i2c_core button evdev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table processor thermal_sys raid1 raid0 md_mod atiixp ahci sata_nv sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_ scsi_mod xfs ext3 jbd ext2 mbcache reiserfs
Aug 28 14:38:06 web-new kernel: [ 6677.939462] CPU 1:
Aug 28 14:38:06 web-new kernel: [ 6677.939462] Modules linked in: ipv6 snd_pcsp snd_pcm parport_pc parport snd_timer serio_raw snd psmouse soundcore i2c_piix4 snd_page_alloc i2c_core button evdev dm_mirror dm_log dm_snapshot dm_mod ide_cd_mod cdrom ata_generic floppy piix ide_pci_generic thermal fan virtio_balloon virtio_pci virtio_ring virtio_rng rng_core virtio_net virtio_blk virtio freq_table

Bug#496917: BUG: soft lockup - CPU#1 stuck for 4096s! [java:3174] and crash (kvm guest)

2008-08-28 Thread maximilian attems
On Thu, Aug 28, 2008 at 02:54:56PM +, Lukas Kolbe wrote:
 Package: linux-image-2.6.26-1-amd64
 Version: 2.6.26-3
 Severity: important
 
 
 Using kvm 72, the guest is started with:
 kvm -smp 2 \
  -net nic,macaddr=DE:AD:BE:EF:21:71,model=virtio \
  -net tap,ifname=tap02,script=/etc/kvm/kvm-ifup \
  -net nic,macaddr=DE:AD:BE:EF:21:72,model=virtio \
  -net tap,ifname=tap12,script=/etc/kvm/kvm-ifup-web \
  -drive if=virtio,boot=on,file=/dev/vg0/web-root \
  -drive if=virtio,file=/dev/vg0/web-log \
  -drive if=virtio,file=/dev/vg0/web-srv \
  -drive if=virtio,file=/dev/vg0/web-swap \
  -m 3192 \
  -kernel /boot/vmlinuz-2.6.26-1-amd64 \
  -initrd /boot/initrd.img-2.6.26-1-amd64 \
  -append "root=/dev/vda ro" \
  -curses
 
 On the guest, I run  the latest JDK from Sun 1.6.0_07, and the latest JBoss from
 Redhat 4.2.3.GA, default configuration (except that it's started via runit:
  exec chpst -u jboss:jboss -e ./env -o 65536 /opt/jboss/bin/run.sh -b host
 

should be fixed in 2.6.26-4, should be available tomorrow in unstable.
otherwise find sid snapshots http://wiki.debian.org/DebianKernel

thanks

-- 
maks


