Hi Liu,
On Mon, Sep 13, 2010 at 8:49 PM, Liu Tao <[email protected]> wrote:
> Hello everyone,
>
> I'm porting coreboot v4 to a k8-rs780-sb710 based mainboard, and use
> amd/mahogany and amd/tilapia_fam10 codes as the reference. Now coreboot
> boots the board and filo loads linux,but the board crashes at a MCE error
> during booting process. I'm not very know the detail about the MCE, so any
> suggestions will be appreciated, thanks very much.
>
> The mainboard architecture:
> CPU: socket F Opteron 2210 EE get_cpu_rev EAX=0x40f13 (1 cpu, dual core)
> DIMM: DDR2 333M (x1 / x2)
> HT Link0: off
> HT Link1: RS780->SB710
> HT Link2: off
> VGA off
> GFX off
> PCIE off
>
> coreboot code revision: modified on r5692
>
> The MCE/panic message:
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception: 4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Machine check
> ------------[ cut here ]------------
> WARNING: at kernel/smp.c:331 smp_call_function_mask+0x32/0x1ec()
> Modules linked in:
> Supported: Yes
> Pid: 1, comm: swapper Tainted: G M 2.6.27.19-5-default #1
>
> Call Trace:
> [<ffffffff8020d9f9>] show_trace_log_lvl+0x41/0x58
> [<ffffffff80496a74>] dump_stack+0x69/0x6f
> [<ffffffff8023bfba>] warn_on_slowpath+0x51/0x77
> [<ffffffff8025b1c5>] smp_call_function_mask+0x32/0x1ec
> [<ffffffff8025b3a8>] smp_call_function+0x29/0x2e
> [<ffffffff8021a04a>] native_smp_send_stop+0x1a/0x26
> [<ffffffff80496b36>] panic+0xbc/0x169
> [<ffffffff80216366>] mce_log+0x0/0x7e
> [<ffffffff80216740>] do_machine_check+0x31e/0x3cd
> [<ffffffff8020d27f>] machine_check+0x7f/0x90
> [<ffffffff802126c8>] setup_trampoline+0x20/0x30
> [<ffffffff804919a5>] native_cpu_up+0x31e/0xc64
> [<ffffffff80493d17>] _cpu_up+0x9a/0x11c
> [<ffffffff80493df4>] cpu_up+0x5b/0x6f
> [<ffffffff8095b708>] kernel_init+0xe1/0x1eb
> [<ffffffff8020cf49>] child_rip+0xa/0x11
>
> ---[ end trace 4eaa2a86a8e2da22 ]---
>
> mcelog --k8 --ascii
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception: 4 Bank 0: f658a00000000833
> TSC 572507f34 ADDR 6000
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> HARDWARE ERROR
> CPU 0 0 data cache TSC 572507f34
> Data cache ECC error (syndrome b1)
> bit45 = uncorrected ecc error
> bit57 = processor context corrupt
> bit61 = error uncorrected
> bit62 = error overflow (multiple errors)
> bus error 'local node origin, request didn't time out
> data read mem transaction
> memory access, level generic'
> STATUS f658a00000000833 MCGSTATUS 4
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
>
> Attached is the detailed boot message.

I haven't worked with K8 in a while, but mcelog decodes this as an
uncorrected data cache ECC error, so it could be a real CPU problem. Do
you have another CPU to test with?

The other possibility is that an erratum workaround for your CPU is
missing. You could review the AMD K8 revision guide for cache and MCA/MCE
issues.

Please let us know what you find.

Marc

--
http://se-eng.com

--
coreboot mailing list: [email protected]
http://www.coreboot.org/mailman/listinfo/coreboot
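
For reference, the status word from the quoted report can be checked by hand against the mcelog decode. The following is a minimal Python sketch, not the mcelog implementation. The function name `decode_mc_status` and the flag table are made up for illustration. The top flag bits (63 down to 57) follow the common x86 MCA status layout. The bit 45/46 ECC flags and the 8-bit syndrome at bits 54:47 are assumptions about the K8 data cache bank, chosen because they reproduce "syndrome b1" and "bit45 = uncorrected ecc error" from the output above.

```python
# Hypothetical helper: split an MCi_STATUS value into the fields that the
# quoted mcelog output reports. Bit positions are assumptions (see lead-in).
STATUS_FLAGS = {
    63: "VAL (error valid)",
    62: "OVER (error overflow)",
    61: "UC (error uncorrected)",
    60: "EN (error reporting enabled)",
    59: "MISCV (misc register valid)",
    58: "ADDRV (address valid)",
    57: "PCC (processor context corrupt)",
    46: "CECC (corrected ECC error)",
    45: "UECC (uncorrected ECC error)",
}


def decode_mc_status(status):
    """Return the set flags, ECC syndrome and low 16-bit error code."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "syndrome": (status >> 47) & 0xFF,  # assumed K8 layout, bits 54:47
        "error_code": status & 0xFFFF,      # low 16 bits: MCA error code
    }


if __name__ == "__main__":
    info = decode_mc_status(0xF658A00000000833)  # value from the report
    for flag in info["flags"]:
        print(flag)
    print("syndrome:", hex(info["syndrome"]))
    print("error code:", hex(info["error_code"]))
```

Run on the reported value, the sketch gives syndrome 0xb1 and error code 0x833, and it lists the UC, PCC, OVER and UECC flags that mcelog printed. Treat any field beyond those as unverified until checked against the AMD documentation for this CPU revision.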

