Package: watchdog
Followup-For: Bug #923254
X-Debbugs-Cc: [email protected]

I wanted to share what I'm doing locally to address this, and how I ran into it. I have not submitted Debian patches before, but I can try to figure it out if anyone would like.

It is possible to react differently when the service is restarted than when it is stopped or killed in some other way, by sending a different kill signal for restarts. To do that, remove the existing `ExecStopPost=` line and add the following:

```
RestartKillSignal=SIGINT
ExecStopPost=/bin/sh -c 'if [ "$SERVICE_RESULT" != "success" ] || [ "$EXIT_STATUS" != "INT" ]; then /bin/systemctl start --no-block wd_keepalive.service; fi'
```

With this, wd_keepalive is started on every stop except when watchdog ended with result `success` and exit status `INT`, which is the restart case. This directly starts the keepalive service when the watchdog is stopped instead of relying on `OnFailure=` for that, so it ought to be possible to remove `/bin/systemctl reset-failed` from wd_keepalive.service, too, as requested in #835496.

Until this is fixed, others who are affected can apply these changes in an override file (e.g. run `systemctl edit watchdog` and add a `[Service]` section with the lines above, preceded by an empty `ExecStopPost=` line so that the pre-existing `ExecStopPost=` line is ignored).

I encountered this bug as part of a series of troublesome issues on a Debian-based distribution. An automatic update caused systemd to re-execute, which restarted watchdog, which failed as described and caused wd_keepalive to start. That prevented the system from rebooting as I wanted it to when it lost connectivity, which happened because of a separate issue that I was seeing every time systemd re-executed. As a result, when the affected server updated, I became unable to access it again until I had physical access, which was the very problem I installed watchdog to avoid. (Guess I should have turned on automatic reboots for updates, too!)
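
For reference, the override described above might look roughly like the sketch below. It assumes the drop-in lives where `systemctl edit watchdog` normally writes it (`/etc/systemd/system/watchdog.service.d/override.conf`) and that the installed systemd is new enough to know `RestartKillSignal=` (as far as I know, version 244 or later). The comments are only annotations and are not needed.

```ini
# /etc/systemd/system/watchdog.service.d/override.conf
[Service]
# Send SIGINT instead of the default stop signal when the unit is restarted.
RestartKillSignal=SIGINT
# An empty assignment clears the ExecStopPost= line inherited from the packaged unit.
ExecStopPost=
# Start wd_keepalive on every stop except a clean SIGINT exit, i.e. except on restart.
ExecStopPost=/bin/sh -c 'if [ "$SERVICE_RESULT" != "success" ] || [ "$EXIT_STATUS" != "INT" ]; then /bin/systemctl start --no-block wd_keepalive.service; fi'
```

Running `systemctl cat watchdog` afterwards should show the packaged unit followed by this drop-in, which is a quick way to check that the override was picked up.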

