What happens (packages from -updates):

On Groovy: Assuming the same behaviour as Focal, based on the systemd
service file. Untested.

On Focal: The rabbitmq service starts and stays in the 'activating'
state until the daemon notifies systemd that it has started up
(Type=notify). Every 300 seconds (5 minutes) rabbitmq logs a failure to
synchronise the message queue until rabbitmq2 returns, but the daemon
never dies. TimeoutStartSec=3600 (one hour), so the daemon keeps waiting
for up to one hour, soft resetting every 5 minutes as queue
synchronisation timeouts occur. The service only changes to 'active'
once rabbitmq2 starts and the message queue is synced.

From what I understand, I don't think there are any problems on Focal or
Groovy. As long as rabbitmq2 comes up within an hour, things work. Note
that because of this bug, Groovy and upstream have now been changed to a
10 minute timeout, down from 1 hour.

On Eoan: Assuming the same behaviour as Focal, based on the systemd
service file. Untested.

On Bionic: The rabbitmq service starts and runs an ExecStartPost script
that waits on the rabbitmq daemon. If this ExecStartPost script times
out (which it seems to do after 90 seconds, even though the
documentation suggests an infinite timeout), it terminates with an error
exit code, and since the unit is Type=simple, systemd marks the service
as failed. There is no Restart=on-failure in Bionic's systemd unit, so
rabbitmq stays dead. Rabbitmq dies 90 seconds after boot and will never
rejoin the cluster by itself. The machine needs to be power cycled, or
someone has to ssh in and restart the rabbitmq services manually.

On Xenial: Assuming the same behaviour as Bionic, based on the systemd
service file. Untested.

Suggested actions:
For Bionic: From my understanding of the problem and my testing,
replacing the systemd service file with the one from Focal solves the
problem. The Focal file changes Type=simple to Type=notify, sets a 1 hour
timeout, and adds Restart=on-failure. Notes: I checked the source code,
and rabbitmq in Bionic does indeed support Type=notify, although we need
to add socat as a dependency of the package. See the commit below for
details:

commit: 2d6383bade61fea0b8652b72d25bb1a9f0d6133f
From: Alexey Lebedeff <alebe...@mirantis.com>
Date: Fri, 11 Mar 2016 17:42:15 +0300
Subject: Improve systemd integration
Link: https://github.com/rabbitmq/rabbitmq-server/commit/2d6383bade61fea0b8652b72d25bb1a9f0d6133f

GitHub issue for the above commit:
https://github.com/rabbitmq/rabbitmq-server/issues/664
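
For local testing on Bionic, the unit changes described above
(Type=notify, a one hour start timeout, Restart=on-failure) could be
sketched as a systemd drop-in. This is only a sketch under assumptions,
not the packaged fix, which replaces the whole service file: the drop-in
path is hypothetical, the unit is assumed to be named
rabbitmq-server.service, socat is assumed to be installed already, and
NotifyAccess=all is my assumption about what is needed for a notification
sent through a helper process to be accepted.

```ini
# Hypothetical drop-in path:
#   /etc/systemd/system/rabbitmq-server.service.d/override.conf
# Sketch only; the proposed fix replaces the whole unit file instead.
[Service]
# Wait for the daemon's readiness notification rather than treating
# the service as started as soon as the process is forked.
Type=notify
# Assumption: the notification is sent by a helper process (socat),
# so accept notifications from any process in the service.
NotifyAccess=all
# One hour start timeout, matching the value described for Focal.
TimeoutStartSec=3600
# Restart the daemon if it exits with a failure.
Restart=on-failure
```

After adding such a file, a `systemctl daemon-reload` followed by a
restart of the service would be needed for the settings to take effect.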

Xenial: I need to dig into this. We will likely follow the same path as
Bionic, but we need to be careful to ensure that Type=notify is
sufficiently supported in rabbitmq 3.5.7 before we SRU the change. It
will also likely need socat as a dependency, and maybe a backport of the
above commit.

** Bug watch added: github.com/rabbitmq/rabbitmq-server/issues #664
   https://github.com/rabbitmq/rabbitmq-server/issues/664

https://bugs.launchpad.net/bugs/1874075

Title:
  rabbitmq-server startup timeouts differ between SysV and systemd
