Re: [XenPPC] Machine check: instruction-fetch TLB tablewalk
[NOTE: I'm assuming the decode here is correct]

(XEN) MACHINE CHECK: IS Recoverable
(XEN) SRR1: 0x000cf032
(XEN) 0b11: Exception caused by a hardware uncorrectable
(XEN) error (UE) detected while doing a reload of an
(XEN) instruction-fetch TLB tablewalk.
(XEN)
(XEN) DSISR: 0x0220

There was a parity error in the ITLB CAM array. The hardware won't recover this, but software can (blast the entry away, reload it -- or just blast all TLBs away, probably easier, and performance isn't an issue, this shouldn't happen often at all).

You should hardly ever see this.

If you add recovery routines, there are some special settings (in HID4 I think) that will introduce bit errors for you, it's almost impossible to test this stuff otherwise, unless you have serious hardware problems :-)

Segher

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
[XenPPC] Machine check: instruction-fetch TLB tablewalk
Just saw this on a JS21 blade (internal name is cso52):

(XEN) MACHINE CHECK: IS Recoverable
(XEN) [ Xen-3.0-unstable ]
(XEN) CPU: DOMID:
(XEN) pc 10046510 msr 000cf032
(XEN) lr 10063bf4 ctr 0fde93c0
(XEN) srr0 srr1
(XEN) r00: 10063be4 ffda9710 f7fcd470 0003
(XEN) r04: 100a9268 0001 0030 fefefeff
(XEN) r08: 0fecb4d8 100a 0019
(XEN) r12: 28242424 100a3300 1002
(XEN) r16: 100a 100a9008 100a 1008
(XEN) r20: 1006
(XEN) r24: 0008 100a9df8 100a
(XEN) r28: 100a9268 100a 10063bc0
(XEN) dar 0xffda9628, dsisr 0x0220
(XEN) hid4 0x
(XEN) ---[ backtrace ]---
(XEN) SP (ffda9710) is not in xen space
(XEN) SRR1: 0x000cf032
(XEN) 0b11: Exception caused by a hardware uncorrectable
(XEN) error (UE) detected while doing a reload of an
(XEN) instruction-fetch TLB tablewalk.
(XEN)
(XEN) DSISR: 0x0220
(XEN) program_exception: machine check
(XEN) machine_halt called: spinning
(XEN) machine_halt called