* >> Are those the only MCA errors you're seeing? The reason I ask is that
there's an errata in the X5600 series which can cause an "internal timer
error" MCA to be logged after another uncorrectable MCA occurs.*
About 90% of the failures are these MCA errors. For the remaining 10% there is
no log at all. For example, one of the Supermicro servers rebooted two days ago
but did not generate a crash dump under /var/crash, even though dumps are enabled in
*>>This seems to me like it would be a CPU failure. Can you try replacing
the CPU itself? I've seen this exact message on a different board, and
the cause was a failing CPU. *
We're thinking of replacing the X5690 CPUs with X5675s.
*>>Well, mcelog has this hardcoded and prints this for every MCA just as a
matter of course. It isn't selective but assumes every machine check is
a hardware error (which they are, though some are warnings for corrected
events that you can ignore as the hardware hasn't degraded enough to
warrant replacement. However, corrected events don't generate panics,
just messages in the logs, and only a subset of corrected events include
the "yellow / green" indicators for which you can ignore "green" events.
Even corrected ECC errors I would ignore if you get a few events with
a count of 1 that don't recur). *
Each time the MCA error occurs, the server goes down. Please advise how we
should tackle this issue.
*>> Depending on the CPU model, you can determine more info about the
error using the CPU manuals (for Intel the SDM). *
The CPU is an X5690. Is there a link where we can get the manual for the X5690
CPU in our Supermicro servers?
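
For anyone decoding the status word by hand against the SDM, here is a rough sketch of how the architectural fields of an IA32_MCi_STATUS value can be split out. It assumes the standard layout from the SDM's machine-check chapter (flag bits 63 down to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The function names are made up for illustration, and the example status value is hypothetical, not taken from the logs in this thread.

```python
# Sketch only: split a 64-bit IA32_MCi_STATUS value into its architectural
# fields. Names here are hypothetical helpers, not part of any tool.

def decode_mci_status(status: int) -> dict:
    """Return the architectural fields of an MCi_STATUS value."""
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),            # VAL: the register holds a valid error
        "overflow": bit(62),         # OVER: an earlier error was overwritten
        "uncorrected": bit(61),      # UC: the error was not corrected
        "enabled": bit(60),          # EN: reporting was enabled for this error
        "misc_valid": bit(59),       # MISCV: MCi_MISC has extra information
        "addr_valid": bit(58),       # ADDRV: MCi_ADDR holds an address
        "context_corrupt": bit(57),  # PCC: processor context may be corrupt
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


def is_internal_timer_error(status: int) -> bool:
    """True if the MCA error code is the simple 'internal timer error' code."""
    return (status & 0xFFFF) == 0x0400


# Hypothetical status value, used only to exercise the decoder.
example_status = 0xB200000000000400
fields = decode_mci_status(example_status)
for name, value in fields.items():
    print(f"{name}: {value}")
print("internal timer error:", is_internal_timer_error(example_status))
```

The model-specific code field still has to be looked up in the processor-specific part of the manual; the sketch only separates it from the architectural bits.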
Sent from the freebsd-current mailing list archive at Nabble.com.