On 2011-07-19 22:04, Kelsey Cummings wrote:
This isn't exactly a Scientific Linux issue, but I hope that folks here
may be more likely to be using IPMI than those on some of the other lists.
We have a series of Supermicro systems w/IPMI running RHEL 5.5. We're
using IPMI primarily to monitor PSU status and for the hardware watchdog
support teamed with the watchdog service. 3 or 4 out of 8 identical
systems have exhibited hardware watchdog triggered resets for no
apparent reason.
Best we can tell, despite the OS and hardware being perfectly healthy
(no other errors, and the systems work fine after the watchdog is
disabled), the hardware watchdog is triggering a reset on its own, and
worse, the boxes do not appear to come back from it.
Anyone else seen a similar issue or have any input?
I have seen it. Under OpenBSD, in fact. I ended up shutting off the
hardware watchdog after failing to find a fix.
--
Garrett Holmstrom