Package: mcelog
Version: 1.0~pre3-3
Severity: minor
Tags: patch
debian/control and debian/README.Debian claim that mcelog is only useful
on x86_64. This no longer seems to be the case: it is also useful on
some i686 CPU types. Below is a log from an AMD XP3000+ (AMD K7 CPU)
running Squeeze with the backported kernel 2.6.38-bpo.2-686-bigmem:
Jun 5 09:36:32 hub kernel: [453000.000016] [Hardware Error]: No human readable MCE decoding support on this CPU type.
Jun 5 09:36:32 hub kernel: [453000.002600] [Hardware Error]: Run the message through 'mcelog --ascii' to decode.
Jun 5 09:36:32 hub kernel: [453000.005012] Disabling lock debugging due to kernel taint
Jun 5 09:36:32 hub kernel: [453000.007421] [Hardware Error]: Machine check events logged
(hmm, not sure why the kernel thinks it is tainted. never mind for now.)
# aptitude install mcelog
Jun 6 12:16:17 hub mcelog: MCE 0
Jun 6 12:16:17 hub mcelog: CPU 0 BANK 1
Jun 6 12:16:17 hub mcelog: ADDR c102aff0
Jun 6 12:16:17 hub mcelog: TIME 1307358977 Mon Jun 6 12:16:17 2011
Jun 6 12:16:17 hub mcelog: STATUS 9400000000000151 MCGSTATUS 0
Jun 6 12:16:17 hub mcelog: MCGCAP 104 APICID 0 SOCKETID 0
Jun 6 12:16:17 hub mcelog: CPUID Vendor AMD Family 6 Model 10
Jun 6 12:16:17 hub mcelog: failed to prefill DIMM database from DMI data
Jun 6 12:16:17 hub mcelog: Kernel does not support page offline interface
Jun 6 12:16:17 hub mcelog: Unknown CPU type vendor 2 family 6 model a
Jun 6 12:16:17 hub mcelog: HARDWARE ERROR. This is *NOT* a software problem!
Jun 6 12:16:17 hub mcelog: Please contact your hardware vendor
Admittedly mcelog couldn't decode this (it reports missing DMI data and
an unknown CPU type), but it's still useful to know that I had a memory
problem. I hadn't installed mcelog on this server because I thought it
was still x86_64 only. Fortunately the MCE data was still in the kernel
buffer when I got the daily logcheck.
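For reference, the flag bits in the STATUS value above can still be read
by hand even though mcelog printed no decoded text. The following is a
minimal Python sketch, assuming the standard x86 machine-check layout of
the MCi_STATUS register (VAL in bit 63 down to PCC in bit 57, MCA error
code in the low 16 bits). The name decode_status is made up for this
illustration and is not part of mcelog, and the sketch does not attempt
any model-specific decoding.

```python
# Architectural flag bits in the upper part of an MCi_STATUS value.
# Assumption: the standard x86 machine-check register layout applies.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error record
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MCi_MISC holds valid data
    58: "ADDRV",  # MCi_ADDR holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_status(status_hex):
    """Split a hex STATUS string into flag names and the raw MCA error code."""
    status = int(status_hex, 16)
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "uncorrected": "UC" in flags,
        "mca_error_code": status & 0xFFFF,  # left undecoded here
    }


if __name__ == "__main__":
    # STATUS value taken from the mcelog output quoted in this report.
    info = decode_status("9400000000000151")
    print(info["flags"], hex(info["mca_error_code"]))
```

Under that assumption the UC bit is clear in the value logged here,
which would mean the hardware reported the error as corrected.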
I'm not sure when MCE logging was added to the i386 architecture, but
kernel logs for this area go back to 2005 and 2007, and it apparently
works (indeed, is required) on the current backported kernel from unstable.
So I suggest just removing the references to particular kernel versions
and x86 CPU types, unless you happen to know which ones now generate
MCE logs.
A proposed patch is attached.
Nick
Index: mcelog-1.0~pre3-66-g39f5b74/debian/README.Debian
===================================================================
--- mcelog-1.0~pre3-66-g39f5b74.orig/debian/README.Debian 2011-06-06 13:01:47.000000000 +0100
+++ mcelog-1.0~pre3-66-g39f5b74/debian/README.Debian 2011-06-06 13:03:40.000000000 +0100
@@ -1,9 +1,8 @@
mcelog for Debian
-----------------
-mcelog is only needed/useful on x86-64 platforms, ie AMD64 and EM64T
-hardware. You need to configure your kernel with CONFIG_X86_MCE=y (which
-is the default).
+mcelog is only needed/useful on modern x86 hardware. You need to configure
+your kernel with CONFIG_X86_MCE=y (which is the default).
mcelog can be run in one of two modes:
- as a daemon, this is the default mode
Index: mcelog-1.0~pre3-66-g39f5b74/debian/control
===================================================================
--- mcelog-1.0~pre3-66-g39f5b74.orig/debian/control 2011-06-06 13:01:39.000000000 +0100
+++ mcelog-1.0~pre3-66-g39f5b74/debian/control 2011-06-06 13:04:19.000000000 +0100
@@ -8,10 +8,9 @@
Package: mcelog
Architecture: i386 amd64
Depends: ${shlibs:Depends}, ${misc:Depends}, debconf (>= 0.5) | debconf-2.0, udev | makedev (>= 2.3.1-81)
-Description: x86-64 Machine Check Exceptions collector and decoder
- Starting with version 2.6.4, the Linux kernel for x86-64 no longer decodes
- and logs recoverable Machine Check Exception events to the kernel log on
- its own.
+Description: x86 Machine Check Exceptions collector and decoder
+ The Linux kernel for x86 no longer decodes and logs recoverable Machine
+ Check Exception events to the kernel log on its own.
.
Instead, the MCE data is kept in a buffer which can be read from userspace
via the /dev/mcelog device node.