I've been fighting with this bug for 2 months now. Sometimes I get an uptime of a couple of weeks, sometimes only a couple of days. It is very weird and seems to persist across kernel changes.
My setup is a dual quad-core Xeon E5405 (6 MB cache) on a Supermicro X7DCL-i motherboard with 8 GB of DDR2 ECC RAM. The hardware completed a 48-hour memtest successfully, so I'm quite confident it's not a motherboard/RAM/hardware issue. The BIOS is the latest available (8/18/2008 from Supermicro).

I've seen the bug randomly with both the 2.6.24-18-xen and 2.6.24-19-xen kernels. The process that dies can be anything in dom0 or a domU, and the crash seems unrelated to the actual process/module being executed. It seems somewhat related to the order in which processes are loaded (in my case, the order of domU startup). The 2.6.24-18 kernel seems a little more stable, but this could be a coincidence.

So far I've seen crashes in a simple ext2 format, in various processes in different domUs, and in various processes in dom0. The offending process is somewhat sticky, which would normally point to a memory/hardware issue, but I ruled that out above. The latest incarnation is a clamd process that won't live longer than a few hours without crashing:

[19240.984220] BUG: soft lockup - CPU#0 stuck for 11s! [clamd:2976]
[19240.984230]
[19240.984233] Pid: 2976, comm: clamd Not tainted (2.6.24-18-xen #1)
[19240.984237] EIP: 0061:[<c0327677>] EFLAGS: 00000286 CPU: 0
[19240.984245] EIP is at _spin_lock+0x7/0x10
[19240.984248] EAX: c1c2898c EBX: 00000000 ECX: c1c28980 EDX: 00000d88
[19240.984251] ESI: 50425067 EDI: 00000001 EBP: c0477158 ESP: e7f49ef4
[19240.984254] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[19240.984260] CR0: 8005003b CR2: b70b1000 CR3: 28400000 CR4: 00002660
[19240.984264] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[19240.984267] DR6: ffff0ff0 DR7: 00000400
[19240.984271] [<c01759d5>] mprotect_fixup+0x395/0x800
[19240.984284] [<c013bb90>] autoremove_wake_function+0x0/0x40
[19240.984293] [<c0175fcb>] sys_mprotect+0x18b/0x230
[19240.984299] [<c0105832>] syscall_call+0x7/0xb
[19240.984305] [<c0320000>] vcc_def_wakeup+0x30/0x60
[19240.984310] =======================

I'm currently running this particular domU with the 2.4.26-21-xen kernel for testing. I will report if it crashes.

Here's the CPU info, which might be useful:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
stepping        : 6
cpu MHz         : 1999.999
cache size      : 6144 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up arch_perfmon pebs bts pni monitor ds_cpl vmx tm2 ssse3 cx16 xtpr dca sse4_1 lahf_lm
bogomips        : 4004.96
clflush size    : 64

--
Server 8.04 LTS: soft lockup - CPU#1 stuck for 11s! [bond1:3795] - bond - bond0
https://bugs.launchpad.net/bugs/245779
