Public bug reported:

The unit file for my service xxxx-33005.service:

[Unit]
Description=product xxxx 33005 service
Wants=some.mount
After=some.mount
PartOf=product.target
PartOf=someother.service
[Service]
Type=simple
ExecStart=/opt/product/bin/xxxx -conf=/etc/product.conf
Restart=on-failure
User=product
Group=product
LimitCORE=infinity
LimitNOFILE=10000
Slice=xxxx.slice
WatchdogSec=120
RestartSec=10
StartLimitInterval=120
StartLimitBurst=10
[Install]
WantedBy=product.target

specifies RestartSec=10, which is also what systemd sees:

$ systemctl show xxxx-33005.service
Type=simple
Restart=on-failure
NotifyAccess=main
RestartUSec=10s
...

However, today I saw in the journalctl log:
$ journalctl -u xxxx-33005.service
...
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Watchdog timeout 
(limit 2min)!
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Main process exited, 
code=exited, status=2/INVALIDARGUMENT
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Unit entered failed 
state.
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Failed with result 
'exit-code'.
Aug 15 23:28:39 a-hostname systemd[1]: xxxx-33005.service: Service hold-off 
time over, scheduling restart.
Aug 15 23:28:39 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:28:39 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:28:45 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:28:45 a-hostname systemd[1]: Stopping product xxxx 33005 service...
Aug 15 23:28:45 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:33:51 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:33:51 a-hostname xxxx[3629]: Get http://127.0.0.1:8081/: dial tcp 
127.0.0.1:8081: getsockopt: connection refused
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Main process exited, 
code=exited, status=2/INVALIDARGUMENT
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Unit entered failed 
state.
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Failed with result 
'exit-code'.
Aug 15 23:34:01 a-hostname systemd[1]: xxxx-33005.service: Service hold-off 
time over, scheduling restart.
Aug 15 23:34:02 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: xxxx-33005.service: Start request 
repeated too quickly.
Aug 15 23:34:02 a-hostname systemd[1]: Failed to start product xxxx 33005 
service.

Everything looks to me that systemd is behaving as expected until Aug 15
23:34:02 when it attempts to start xxxx-33005.service not at the rate I
specified in RestartSec, but possibly at the default 100ms rate.  This
excerpt shows both the expected 10s restart rate as well as the
unexpected restart rate.

$ lsb_release -rd
Description:    Ubuntu 16.04.1 LTS
Release:        16.04

$ uname -a
Linux a-hostname 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux

$ apt-cache policy systemd
systemd:
  Installed: 229-4ubuntu16
  Candidate: 229-4ubuntu19
  Version table:
     229-4ubuntu19 500
        500 http://internal-repo/ubuntu xenial-updates/main amd64 Packages
 *** 229-4ubuntu16 100
        100 /var/lib/dpkg/status
     229-4ubuntu10 500
        500 http://internal-repo/ubuntu xenial-security/main amd64 Packages
     229-4ubuntu4 500
        500 http://internal-repo/ubuntu xenial/main amd64 Packages

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New

** Description changed:

- The unit file for my service xxxx:
+ The unit file for my service xxxx-33005.service:
  
  [Unit]
  Description=product xxxx 33005 service
  Wants=some.mount
  After=some.mount
  PartOf=product.target
  PartOf=someother.service
  [Service]
  Type=simple
  ExecStart=/opt/product/bin/xxxx -conf=/etc/product.conf
  Restart=on-failure
  User=product
  Group=product
  LimitCORE=infinity
  LimitNOFILE=10000
  Slice=xxxx.slice
  WatchdogSec=120
  RestartSec=10
  StartLimitInterval=120
  StartLimitBurst=10
  [Install]
  WantedBy=product.target
  
  specifies RestartSec=10, which is also what systemd sees:
  
  $ systemctl show xxxx-33005.service
  Type=simple
  Restart=on-failure
  NotifyAccess=main
  RestartUSec=10s
  ...
  
  However, today I saw in the journalctl log:
  $ journalctl -u xxxx-33005.service
  ...
  Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Watchdog timeout 
(limit 2min)!
  Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Main process 
exited, code=exited, status=2/INVALIDARGUMENT
  Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Unit entered 
failed state.
  Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Failed with result 
'exit-code'.
  Aug 15 23:28:39 a-hostname systemd[1]: xxxx-33005.service: Service hold-off 
time over, scheduling restart.
  Aug 15 23:28:39 a-hostname systemd[1]: Stopped product xxxx 33005 service.
  Aug 15 23:28:39 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:28:45 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:28:45 a-hostname systemd[1]: Stopping product xxxx 33005 service...
  Aug 15 23:28:45 a-hostname systemd[1]: Stopped product xxxx 33005 service.
  Aug 15 23:33:51 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:33:51 a-hostname xxxx[3629]: Get http://127.0.0.1:8081/: dial tcp 
127.0.0.1:8081: getsockopt: connection refused
  Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Main process 
exited, code=exited, status=2/INVALIDARGUMENT
  Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Unit entered 
failed state.
  Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Failed with result 
'exit-code'.
  Aug 15 23:34:01 a-hostname systemd[1]: xxxx-33005.service: Service hold-off 
time over, scheduling restart.
  Aug 15 23:34:02 a-hostname systemd[1]: Stopped product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
  Aug 15 23:34:02 a-hostname systemd[1]: xxxx-33005.service: Start request 
repeated too quickly.
  Aug 15 23:34:02 a-hostname systemd[1]: Failed to start product xxxx 33005 
service.
  
  Everything looks to me that systemd is behaving as expected until Aug 15
  23:34:02 when it attempts to start xxxx-33005.service not at the rate I
  specified in RestartSec, but possibly at the default 100ms rate.  This
  excerpt shows both the expected 10s restart rate as well as the
  unexpected restart rate.
  
  $ lsb_release -rd
  Description:    Ubuntu 16.04.1 LTS
  Release:        16.04
  
  $ uname -a
  Linux a-hostname 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux
  
  $ apt-cache policy systemd
  systemd:
-   Installed: 229-4ubuntu16
-   Candidate: 229-4ubuntu19
-   Version table:
-      229-4ubuntu19 500
-         500 http://internal-repo/ubuntu xenial-updates/main amd64 Packages
-  *** 229-4ubuntu16 100
-         100 /var/lib/dpkg/status
-      229-4ubuntu10 500
-         500 http://internal-repo/ubuntu xenial-security/main amd64 Packages
-      229-4ubuntu4 500
-         500 http://internal-repo/ubuntu xenial/main amd64 Packages
+   Installed: 229-4ubuntu16
+   Candidate: 229-4ubuntu19
+   Version table:
+      229-4ubuntu19 500
+         500 http://internal-repo/ubuntu xenial-updates/main amd64 Packages
+  *** 229-4ubuntu16 100
+         100 /var/lib/dpkg/status
+      229-4ubuntu10 500
+         500 http://internal-repo/ubuntu xenial-security/main amd64 Packages
+      229-4ubuntu4 500
+         500 http://internal-repo/ubuntu xenial/main amd64 Packages

-- 
You received this bug notification because you are a member of Ubuntu
Desktop Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1711255

Title:
  systemd RestartSec does not seem to work consistently

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1711255/+subscriptions

-- 
desktop-bugs mailing list
desktop-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/desktop-bugs

Reply via email to