On Sun, 2007-04-01 at 12:30 -0500, Ted Phelps wrote:
> Bill Maas writes:
> > There still is the other issue, that of the watchdog reset during
> > boot. It happens on an OpenBSD box, maybe on Linux boxes too, if the
> > timeout value is set too low. The appropriate lines from /etc/rc (4.0
> > unpatched) are:
>
> Just FYI, the Linux watchdog timer driver (for the net4801, at least --
> I haven't investigated any other watchdog drivers) disables itself until
> the watchdog device is opened by a user-space program. This side-steps
> the reset during boot problem. Unfortunately it means that the watchdog
> timer won't be able to help if the system hangs during boot.

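For anyone who wants to see what "opened by a user-space program" amounts to, here is a rough sketch of the usual pattern on Linux: open the device (which arms the timer), write to it periodically to keep the timer from expiring, and write a "V" before closing so that drivers supporting the "magic close" feature disarm the timer again. The function name feed_watchdog, its parameters, and the default path /dev/watchdog are my own choices for illustration, and I have not checked whether the net4801 driver honours the magic close, so treat this as an assumption-laden example rather than a reference implementation.

```python
import os
import time


def feed_watchdog(path="/dev/watchdog", pings=3, interval=1.0):
    """Open a watchdog device, ping it a few times, then close it cleanly.

    Returns the number of pings written, or -1 if the device cannot be
    opened (for example, no driver loaded or insufficient permissions).
    The name and parameters are hypothetical, chosen for this sketch.
    """
    try:
        # On Linux, opening the device is what arms the timer.
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return -1

    written = 0
    try:
        for _ in range(pings):
            # Writing any data counts as a keepalive ping.
            os.write(fd, b"\0")
            written += 1
            time.sleep(interval)
        # "Magic close": drivers that support it disarm the timer when
        # the last byte written before close is 'V'. Assumed, not verified,
        # for any particular driver.
        os.write(fd, b"V")
    finally:
        os.close(fd)
    return written
```

A real daemon would loop indefinitely with an interval well below the configured timeout. Note that nothing here runs until user space is up, which is exactly why a hang earlier in the boot goes unwatched.
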
As I understand it, watchdog timers are designed for quick and dirty recovery from random lockups on unattended machines. If a boot-time lockup is more likely to be persistent (due to config errors, say), the Linux approach is correct: watching the boot process would only cause the system to be reset over and over again, which gains little or nothing on an unattended machine.

The fact that watchdogd is one of the last services to be brought up at boot time from /etc/rc suggests that the OpenBSD maintainers, too, decided that the system boot shouldn't be watched. From this point of view, the fact that the timer is started as soon as sysctl.conf is sourced is a bug, at least when the watchdog is not used in "auto" mode. In non-auto mode, the daemon itself should start the wdt (there's no use in running a wdt that isn't being maintained, even if only temporarily). In that case, if the daemon fails to come up, the machine isn't watched. Too bad, but in a High Availability environment the admin can surely be expected to monitor which services are running and which aren't.

Surprise: a bug report (5438) about this issue was submitted to the OpenBSD bug department just yesterday.

Interesting topic, this... may be continued.

Bill
