I've used the 4501 watchdog as follows:
My system uses a 2.4.19 kernel compiled with the watchdog driver (wdtsc),
after applying a hard-to-find patch to fix a hardware bug (it's old, but I can
forward it if you wish).
I created a node at major 10, minor 130 that appears as /dev/watchdog.
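For illustration, something like the following would create that node from
Python. It is only a sketch: it must run as root, the function names are made
up, and the 0600 mode is an assumption. Only the major/minor numbers and the
path come from the setup described here.

```python
import os
import stat

WATCHDOG_MAJOR = 10    # from the description above
WATCHDOG_MINOR = 130   # from the description above


def watchdog_device_number():
    """Combine major and minor into the device number mknod expects."""
    return os.makedev(WATCHDOG_MAJOR, WATCHDOG_MINOR)


def make_watchdog_node(path="/dev/watchdog", mode=0o600):
    """Create the character device node (needs root; mode is an assumption)."""
    os.mknod(path, stat.S_IFCHR | mode, watchdog_device_number())
```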
I wrote a daemon in Python that performs a data gathering function and deals
with the watchdog. Here's what was required:
-To start the watchdog, I opened /dev/watchdog as a file with write access.
It's important to only open it once - bad things happen the second time it's
opened.
-To keep the watchdog from rebooting the system, you "ping" the watchdog by
writing a character (any character except the "stop" character below) to it
every second or so. As I recall it takes 2-3 seconds for it to decide to
reboot the system, but I'm not sure.
-To stop the watchdog, write the "stop" character (I think it was "V"), then
flush and close the open file.
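Roughly, the three steps above look like this in Python. This is a sketch, not
the daemon itself: the Watchdog class, its method names, and the NUL ping byte
are made up for illustration, and the "V" stop character is only what I
recall, so check it against the driver source.

```python
STOP_CHAR = b"V"   # assumption: recalled, not verified against the driver
PING_CHAR = b"\0"  # assumption: any byte other than STOP_CHAR should do


class Watchdog:
    """Open the device once, ping it regularly, stop it cleanly."""

    def __init__(self, path="/dev/watchdog"):
        self.path = path
        self._dev = None

    def start(self):
        # Opening the device starts the countdown; never open it twice.
        if self._dev is not None:
            raise RuntimeError("watchdog is already open")
        # Unbuffered, so every write reaches the driver immediately.
        self._dev = open(self.path, "wb", buffering=0)

    def ping(self):
        # Must be called more often than the timeout, e.g. once a second.
        if self._dev is None:
            raise RuntimeError("watchdog is not open")
        self._dev.write(PING_CHAR)

    def stop(self):
        # Write the stop character, then flush and close the file.
        if self._dev is None:
            raise RuntimeError("watchdog is not open")
        self._dev.write(STOP_CHAR)
        self._dev.flush()
        self._dev.close()
        self._dev = None
```

In a daemon's main loop you would call start() once at startup and ping() once
per pass, provided each pass takes well under the 2-3 second timeout mentioned
above. If the daemon hangs or dies without calling stop(), the pings cease and
the board reboots.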
I included the watchdog because I'm not a good programmer, but so far it's only
been triggered by hardware problems. I syslog my 4501 to another machine and
I've seen a few cases where the drive controller errored out, causing problems
until it affected my daemon; then the watchdog rebooted the system and
corrected the problem.
Hope it helps!
Tim Sharpe
-----------------------
Thorsten Mühlfelder wrote:
Hmm, I guess you're talking about a software based solution. But the
Soekris boards have a hardware watchdog, which perhaps works the other way
round:
The kernel talks to the hardware through its driver. If the kernel freezes, the
hardware chip resets the machine because it no longer gets "pings" from the
kernel.
This would be more logical to me, but I really don't know ;-)
On Wed, 14 Jan 2009 18:08:14 -0500
"Michael Proto" <mike at jellydonut.org> wrote:
> I'm no authority on the subject (I've learned most of what I know simply by
> lurking this list and reading PHK's posts), but there are two pieces to
> watchdog-- the kernel-side piece and a userland daemon that acts as a dead
> man's switch. The userland watchdogd (or whatever Linux's equivalent is)
> enables the kernel bits and sends an "I'm still running" message to the
> kernel every few seconds. If the kernel doesn't get that message, it assumes
> userland is broken and institutes a reboot. This is why you can test with a
> kill -9 to the watchdogd program-- it doesn't have time to deactivate the
> in-kernel watchdog component before it dies.
>
>
> -Proto
_______________________________________________
Soekris-tech mailing list
[email protected]
http://lists.soekris.com/mailman/listinfo/soekris-tech