I've been fighting with this bug for 2 months now. Sometimes I get
an uptime of a couple of weeks, sometimes only a couple of days. It's very
weird, and it seems to persist across kernel changes.

My setup is a dual quad-core Xeon E5405 (6 MB cache) mounted on a
Supermicro X7DCL-i motherboard with 8 GB of DDR2 ECC RAM. The
hardware completed a 48-hour memtest successfully, so I'm quite
confident it's not a motherboard/RAM/hardware issue. The BIOS is the latest
available (8/18/2008 from Supermicro).

I've seen the bug randomly with both the 2.6.24-18-xen and 2.6.24-19-xen
versions of the kernel. The process that dies can be anything in dom0 or
a domU, and the crash seems unrelated to the actual process/module being
executed. It seems somewhat related to the order in which processes are
loaded (in my case the order of domU startup). The 2.6.24-18 kernel seems
a little more stable, but this could be a coincidence.

So far I've seen crashes in a simple ext2 format, in various processes in
different domUs, and in various processes in dom0. The offending process is
somewhat sticky (which led me to suspect a memory/hardware issue), but I
ruled that out above.

The latest incarnation is a clamd process that won't live longer than
a few hours without crashing:

[19240.984220] BUG: soft lockup - CPU#0 stuck for 11s! [clamd:2976]
[19240.984230] 
[19240.984233] Pid: 2976, comm: clamd Not tainted (2.6.24-18-xen #1)
[19240.984237] EIP: 0061:[<c0327677>] EFLAGS: 00000286 CPU: 0
[19240.984245] EIP is at _spin_lock+0x7/0x10
[19240.984248] EAX: c1c2898c EBX: 00000000 ECX: c1c28980 EDX: 00000d88
[19240.984251] ESI: 50425067 EDI: 00000001 EBP: c0477158 ESP: e7f49ef4
[19240.984254]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[19240.984260] CR0: 8005003b CR2: b70b1000 CR3: 28400000 CR4: 00002660
[19240.984264] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[19240.984267] DR6: ffff0ff0 DR7: 00000400
[19240.984271]  [<c01759d5>] mprotect_fixup+0x395/0x800
[19240.984284]  [<c013bb90>] autoremove_wake_function+0x0/0x40
[19240.984293]  [<c0175fcb>] sys_mprotect+0x18b/0x230
[19240.984299]  [<c0105832>] syscall_call+0x7/0xb
[19240.984305]  [<c0320000>] vcc_def_wakeup+0x30/0x60
[19240.984310]  =======================
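
To check whether the lockups really stick to particular processes, the
header lines of reports like the one above can be tallied. Below is a rough
sketch in Python, assuming the kernel log has been saved as plain text and
that every report starts with a "BUG: soft lockup" line in the format shown
above. The function name tally_soft_lockups is made up for illustration.

```python
import re
from collections import Counter

# Matches report header lines such as:
#   BUG: soft lockup - CPU#0 stuck for 11s! [clamd:2976]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
)


def tally_soft_lockups(log_text):
    """Count soft lockup reports per (process name, CPU number) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match:
            counts[(match.group("comm"), int(match.group("cpu")))] += 1
    return counts
```

Passing in the saved log text and looking at the result's most_common()
list shows which process names and CPUs come up most often.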

I'm currently running this particular domU with the 2.6.24-21-xen kernel
for testing. I will report if it crashes.

Here's the cpuinfo, which might be useful:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5405  @ 2.00GHz
stepping        : 6
cpu MHz         : 1999.999
cache size      : 6144 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic mca cmov pat pse36 clflush 
dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up arch_perfmon pebs 
bts pni monitor ds_cpl vmx tm2 ssse3 cx16 xtpr dca sse4_1 lahf_lm
bogomips        : 4004.96
clflush size    : 64

-- 
Server 8.04 LTS: soft lockup - CPU#1 stuck for 11s! [bond1:3795] - bond - bond0
https://bugs.launchpad.net/bugs/245779