We have two brand-new Dell PowerEdge R910 servers, each with four E7540 processors, running Red Hat Enterprise Linux 5.4 and Oracle, that keep logging CPU overheat errors in their MCE logs. There are no temperature warnings in the DRAC log, and according to DRAC the CPU temperatures are fine. All of these R910s are currently running BIOS version 1.0.1.
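To check whether the events cluster on particular logical CPUs (which might point at one socket or sensor rather than the whole chassis), something like the following rough sketch could tally them. It assumes Python 3 on whatever machine the log has been copied to, and that each event contains a line of the form "CPU <n> THERMAL EVENT" as mcelog prints it. The names count_thermal_events and report are made up for illustration, and the log path in the usage note is an assumption.

```python
import re
from collections import Counter

# Matches the "CPU <n> THERMAL EVENT" line that mcelog prints for a thermal trip.
THERMAL_RE = re.compile(r"\bCPU (\d+) THERMAL EVENT\b")


def count_thermal_events(log_text):
    """Return a Counter mapping logical CPU number -> number of thermal events."""
    return Counter(int(m.group(1)) for m in THERMAL_RE.finditer(log_text))


def report(path):
    """Print one line per logical CPU that logged at least one thermal event.

    `path` is whatever file mcelog output was saved to (hypothetical helper).
    """
    with open(path) as f:
        counts = count_thermal_events(f.read())
    for cpu, n in sorted(counts.items()):
        print("CPU %d: %d thermal event(s)" % (cpu, n))
```

Calling report("/var/log/mcelog") would print the tally, assuming that is where the log lives on your system. The logical CPU numbers can then be matched to sockets using the "physical id" field in /proc/cpuinfo.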
An example of the MCE log is as follows:

MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 21 THERMAL EVENT TSC 1d6f5a78c4a1eb [at 1995 Mhz 45 days 1:34:53 uptime (unreliable)]
Processor 21 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 880003cb MCGSTATUS 0

Has anyone seen a similar issue with their R910? I'm wondering whether we have defective CPUs with bad thermal sensors, or whether something else is wrong with these R910s...

Thanks

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
