MCE machine check exceptions - status, tools?

2010-07-28 Thread V. T. Mueller, Continum

Hello,

Searching the net, I could only find that better support is on its way
for 9.0. So I'd like to ask whether MCEs (such as ECC-related messages
from, say, Supermicro boards) are already being processed by the kernel.
Are there any tools, or plans for tools, to handle and process these
messages in userland?


The amount of memory and the number of memory modules keep increasing,
so MCE logging on non-A-brand hardware (A-brand machines signal such
errors through LEDs and/or firmware tools) is becoming more important,
too.


I'd be grateful for hints, URLs, tips etc.

If sent as private mails, I'll post a summary back to the list.

TIA,
Volker

--
Volker T. Mueller
Continum AG
Bismarckallee 7d
79098 Freiburg i. Br.
Tel. +49 761 21711171
Fax. +49 761 21711198
http://www.continum.net

Sitz der Gesellschaft: Freiburg im Breisgau
Registergericht: Amtsgericht Freiburg, HRB 6866
Vorstand: Rolf Mathis, Volker T. Mueller
Vorsitzender d. Aufsichtsrats: Prof. Dr. Karl-F. Fischbach

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: MCE machine check exceptions - status, tools?

2010-07-28 Thread Andriy Gapon
on 28/07/2010 12:31 V. T. Mueller, Continum said the following:
> Hello,
>
> Searching the net, I could only find that better support is on its way
> for 9.0. So I'd like to ask whether MCEs (such as ECC-related messages
> from, say, Supermicro boards) are already being processed by the kernel.
> Are there any tools, or plans for tools, to handle and process these
> messages in userland?
>
> The amount of memory and the number of memory modules keep increasing,
> so MCE logging on non-A-brand hardware (A-brand machines signal such
> errors through LEDs and/or firmware tools) is becoming more important,
> too.
>
> I'd be grateful for hints, URLs, tips etc.

MCA support is in current and stable/8.
I believe it is enabled by default, so there is not much to configure or
do except watch for MCE reports in the system log (or check hw.mca.count).
That applies to correctable MCEs, though; a non-correctable one would
result in a panic.
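
As a rough illustration of the "watch the log and hw.mca.count" approach,
here is a minimal Python sketch. It is not an existing tool. It assumes
that the kernel's MCA reports show up in the log as lines containing
"MCA: ", that the system log lives in /var/log/messages, and that
sysctl(8) is on the PATH; the helper names mca_lines and read_mca_count
are made up for this example.

```python
import re
import subprocess

# Assumption: kernel MCA reports contain the prefix "MCA: ".
MCA_LINE = re.compile(r"\bMCA: ")


def mca_lines(log_text):
    """Return the lines of log_text that look like MCA reports."""
    return [line for line in log_text.splitlines() if MCA_LINE.search(line)]


def read_mca_count():
    """Return hw.mca.count as an int, or None if it cannot be read."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "hw.mca.count"],
            capture_output=True, text=True, check=True,
        ).stdout
        return int(out.strip())
    except (OSError, subprocess.CalledProcessError, ValueError):
        # sysctl missing, OID unknown, or unexpected output.
        return None


if __name__ == "__main__":
    count = read_mca_count()
    print("hw.mca.count:", "unavailable" if count is None else count)
    try:
        # Assumption: default syslog destination for kernel messages.
        with open("/var/log/messages", errors="replace") as log:
            for line in mca_lines(log.read()):
                print(line)
    except OSError as exc:
        print("cannot read log:", exc)
```

Run from cron, something like this could mail a report whenever the
count is non-zero. It only covers correctable errors that the kernel has
already logged.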

See the code in sys/x86/x86/mca.c for details.
John Baldwin has a tool that produces a more human-friendly description of
the exceptions, should you ever get one.

-- 
Andriy Gapon