Public bug reported:

I was testing the watchdog daemon code on the development version of
16.04 today and found the associated systemd service files for this has
some bugs.

The first of these was a typo in /lib/systemd/system/watchdog.service
where there was a missing ['] character. This lead to an error message
"Unbalanced quoting, ignoring" which was easy to fix. The update is now
in the source-forge repository for the project as commit
http://sourceforge.net/p/watchdog/code/ci/38e6430f80907a84741c760ef48df69a679b294c/

However, I have not found the reason for the second problem where the
daemon has some fault at reboot time and goes in to "failed state" and
then it will not restart with the machine booting. The typical related
syslog entries are:

Jan 19 16:46:08 ubuntu watchdog[2066]: stopping daemon (5.14)
Jan 19 16:46:08 ubuntu systemd[1]: Stopping watchdog daemon...
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Control process exited, 
code=exited status=1
Jan 19 16:46:09 ubuntu systemd[1]: Stopped watchdog daemon.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Unit entered failed state.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Triggering OnFailure= 
dependencies.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed to enqueue 
OnFailure= job: Resource deadlock avoided
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed with result 
'exit-code'.

I am guessing this is related to the shut-down approach for the watchdog
daemon where normally the wd_keepalive daemon is started afterwards (to
prevent a reboot if the hardware module is configured for "no way out"
so the timer cannot be stopped).

$ lsb_release -rd
Description:    Ubuntu Xenial Xerus (development branch)
Release:        16.04

$ apt-cache policy watchdog
watchdog:
  Installed: 5.14-3
  Candidate: 5.14-3
  Version table:
 *** 5.14-3 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/universe i386 Packages
        100 /var/lib/dpkg/status

I have contacted the watchdog project maintainer with a view to working
out a solution to this. This bug report is more of a marker to let folks
know that there is an issue here if you plan on having high-availability
servers based on Ubuntu 16.04 (well, any systemd based system really..)
where watchdog-based fault recovery is an expectation.

** Affects: watchdog (Ubuntu)
     Importance: Undecided
         Status: New

** Package changed: rsyslog (Ubuntu) => watchdog (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to rsyslog in Ubuntu.
https://bugs.launchpad.net/bugs/1535854

Title:
  watchdog daemon going in to failed state on reboot

Status in watchdog package in Ubuntu:
  New

Bug description:
  I was testing the watchdog daemon code on the development version of
  16.04 today and found the associated systemd service files for this
  has some bugs.

  The first of these was a typo in /lib/systemd/system/watchdog.service
  where there was a missing ['] character. This lead to an error message
  "Unbalanced quoting, ignoring" which was easy to fix. The update is
  now in the source-forge repository for the project as commit
  
http://sourceforge.net/p/watchdog/code/ci/38e6430f80907a84741c760ef48df69a679b294c/

  However, I have not found the reason for the second problem where the
  daemon has some fault at reboot time and goes in to "failed state" and
  then it will not restart with the machine booting. The typical related
  syslog entries are:

  Jan 19 16:46:08 ubuntu watchdog[2066]: stopping daemon (5.14)
  Jan 19 16:46:08 ubuntu systemd[1]: Stopping watchdog daemon...
  Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Control process exited, 
code=exited status=1
  Jan 19 16:46:09 ubuntu systemd[1]: Stopped watchdog daemon.
  Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Unit entered failed 
state.
  Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Triggering OnFailure= 
dependencies.
  Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed to enqueue 
OnFailure= job: Resource deadlock avoided
  Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed with result 
'exit-code'.

  I am guessing this is related to the shut-down approach for the
  watchdog daemon where normally the wd_keepalive daemon is started
  afterwards (to prevent a reboot if the hardware module is configured
  for "no way out" so the timer cannot be stopped).

  $ lsb_release -rd
  Description:  Ubuntu Xenial Xerus (development branch)
  Release:      16.04

  $ apt-cache policy watchdog
  watchdog:
    Installed: 5.14-3
    Candidate: 5.14-3
    Version table:
   *** 5.14-3 500
          500 http://us.archive.ubuntu.com/ubuntu xenial/universe i386 Packages
          100 /var/lib/dpkg/status

  I have contacted the watchdog project maintainer with a view to
  working out a solution to this. This bug report is more of a marker to
  let folks know that there is an issue here if you plan on having high-
  availability servers based on Ubuntu 16.04 (well, any systemd based
  system really..) where watchdog-based fault recovery is an
  expectation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/watchdog/+bug/1535854/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to