Bug#854197: marked as done (systemd: please handle the case where socket activation leads to restart loop better)

Debian Bug Tracking System Sat, 29 Aug 2020 08:51:38 -0700

Your message dated Sat, 29 Aug 2020 17:49:24 +0200
with message-id <[email protected]>
and subject line Re: systemd: please handle the case where socket activation 
leads to restart loop better
has caused the Debian Bug report #854197,
regarding systemd: please handle the case where socket activation leads to 
restart loop better
to be marked as done.


This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
854197: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854197
Debian Bug Tracking System
Contact [email protected] with problems

--- Begin Message ---

Package: systemd
Version: 232-15
Severity: wishlist

Dear maintainers,

while helping out on debian-mentors@ with #854192 I noticed that systemd
doesn't handle the case very well where dbus is installed but not
configured properly (this was due to a bug in the usbguard package, which
was missing a dependency on dbus). Trying to start a Type=dbus service
(that does DBus requests) in that state causes a nasty restart loop that
you can only get out of by stopping dbus.socket - but it's very
non-obvious that that is what you should do.

Steps to reproduce:

 - install a stretch system, minimal (tasksel empty), no DBus
 - Recreate a broken DBus installation:
      apt-get download libdbus-1-3 dbus
      dpkg --install libdbus-1-3_*.deb
      dpkg --unpack dbus_*.deb
 - Create a dummy service:
      cat > /etc/systemd/system/dummy.service
      [Service]
      BusName=org.example.dummy
      ExecStart=/usr/bin/dbus-monitor --system
      (Ctrl+D)
 - Try to start that service
      systemctl daemon-reload
      systemctl start dummy

The dbus-monitor startup will cause dbus.socket to be triggered, which
in turn will cause systemd to try to start dbus.service. Problem here is
that dbus's postinst won't have run yet, so the "messagebus" user won't
exist, so dbus-daemon won't start up properly.

Problem: this creates a restart loop, since systemd tries to restart
the service over and over again because there's data on the DBus socket.
I'm pretty sure you could also reproduce that with other services that
are socket activated, but this setup definitely reproduces it.

Doing systemctl stop dummy or systemctl stop dbus doesn't help here;
masking dbus.service or dummy.service doesn't either. journalctl doesn't
say anything useful except "Looping too fast" being printed every 1s or
so. systemctl daemon-reexec has no effect (it does reexec though). The
only way to get out of this problem is to stop dbus.socket, which is
not very obvious to a user - even I didn't think of that immediately,
and rebooted my test VM a couple of times while figuring this out. I
suspect users with less knowledge of systemd than I will not fare
better.
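
The way out described above can be sketched as a few commands. This is
only a minimal sketch for the setup in this report (dbus unpacked but not
configured): the helper name break_activation_loop is made up for
illustration, and finishing the installation with dpkg --configure is an
assumed follow-up step, not something verified here.

```shell
#!/bin/sh
# Hypothetical helper: stop the socket that keeps re-triggering a
# service which fails to start.
break_activation_loop() {
    socket_unit="$1"    # e.g. dbus.socket
    # Stopping the socket removes the activation source, so systemd no
    # longer tries to start the associated service.
    systemctl stop "$socket_unit" || return 1
    echo "stopped $socket_unit"
}

# Usage (as root), for the case in this report:
#   break_activation_loop dbus.socket
#   dpkg --configure dbus   # assumption: runs the postinst that creates "messagebus"
```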

What I would like to see is: systemd could maybe print a message when
a service (repeatedly) fails to start as a result of socket activation
(including which socket is responsible), so that users have an idea of
what they could do to make systemd cooperate again. Also, one could
think about a mode where a socket is stopped (in failed state)
automatically after the service associated with it has failed to start
more than N times (configurable in the socket's unit file), with N
defaulting to 30 or something similar. This would really help in this
kind of situation.
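
The second idea can be illustrated with a small simulation. This is only
a sketch of the proposed behaviour, not of anything systemd is known to
implement: try_start_service, max_failures and the state names are
invented for illustration, and the service is simply modelled as always
failing, like dbus-daemon without its "messagebus" user.

```shell
#!/bin/sh
max_failures=30          # the "N" from the proposal (hypothetical setting)
failures=0
socket_state=listening

# Stand-in for the socket-activated service; here it always fails to start.
try_start_service() { return 1; }

# Pending data on the socket keeps triggering start attempts for as long
# as the socket is listening.
while [ "$socket_state" = listening ]; do
    if try_start_service; then
        # Simplification: a successful start ends the simulation.
        failures=0
        break
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$max_failures" ]; then
        # Give up and stop listening instead of looping forever.
        socket_state=failed
    fi
done

echo "socket is $socket_state after $failures failed start attempts"
```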

Not sure about the severity of this bug, because the current behavior
of systemd does indeed work as designed (data on the socket -> try to
start service -> service fails -> service marked inactive -> systemd
looks at socket again -> data on the socket -> rinse and repeat ...),
but the consequences are rather nasty IMHO. I've filed it under
wishlist for now because of the "works as designed" argument, but my
annoyance level with this bug would easily make this 'normal' or
'important'. I'll leave this up to you.

Regards,
Christian

--- End Message ---

--- Begin Message ---

Am 03.09.19 um 13:35 schrieb Michael Biebl:
> Control: tags -1 + moreinfo
> 
> Hi Christian
> 
> On Sun, 5 Feb 2017 00:05:03 +0100 Christian Seiler <[email protected]>
> wrote:
>> while helping out on debian-mentors@ with #854192 I noticed that systemd
>> doesn't appear to handle the case very well when dbus is installed but not
>> configured properly (this was due to a bug in the usbguard package that
>> missed a dependency on dbus), trying to start a Type=dbus service (that
>> does DBus requests) will cause a nasty restart loop that you can only get
>> out of if you stop dbus.socket - but it's very non-obvious that that is
>> what you should do.
>>
>> Steps to reproduce:
>>
>>  - install a stretch system, minimal (tasksel empty), no DBus
>>  - Recreate a broken DBus installation:
>>       apt-get download libdbus-1-3 dbus
>>       dpkg --install libdbus-1-3_*.deb
>>       dpkg --unpack dbus_*.deb
>>  - Create a dummy service:
>>       cat > /etc/systemd/system/dummy.service
>>       [Service]
>>       BusName=org.example.dummy
>>       ExecStart=/usr/bin/dbus-monitor --system
>>       (Ctrl+D)
>>  - Try to start that service
>>       systemctl daemon-reload
>>       systemctl start dummy
>>
>> The dbus-monitor startup will cause dbus.socket to be triggered, which
>> in turn will cause systemd to try to start dbus.service. Problem here is
>> that dbus's postinst won't have run yet, so the "messagebus" user won't
>> exist, so dbus-daemon won't start up properly.
>>
>> Problem: this creates a restart loop, since systemd tries to restart
>> the service over and over again because there's data on the DBus socket.
>> I'm pretty sure you could also reproduce that with other services that
>> are socket activated, but this definitely reproduces this.
>>
> 
> It seems I can't reproduce the problem with the steps you provided in
> a stretch VM. This is what I get:
> 
>> Sep 03 13:29:45 debian systemd[1]: Listening on D-Bus System Message Bus Socket.
>> Sep 03 13:29:45 debian systemd[1]: Starting dummy.service...
>> Sep 03 13:29:45 debian systemd[1]: Started D-Bus System Message Bus.
>> Sep 03 13:29:45 debian dbus-daemon[856]: Failed to start message bus: Could not get UID and GID for username "messagebus"
>> Sep 03 13:30:10 debian systemd[1]: Failed to subscribe to NameOwnerChanged signal for 'org.example.dummy': Connection timed out
>> Sep 03 13:30:10 debian systemd[1]: Failed to subscribe to NameOwnerChanged signal for 'org.freedesktop.login1': Connection timed out
>> Sep 03 13:30:10 debian systemd[1]: Failed to subscribe to activation signal: Connection timed out
>> Sep 03 13:30:10 debian systemd[1]: Failed to register name: Connection timed out
>> Sep 03 13:30:10 debian systemd[1]: Failed to set up API bus: Connection timed out
>> Sep 03 13:30:10 debian systemd[1]: dbus.service: Main process exited, code=exited, status=1/FAILURE
>> Sep 03 13:30:10 debian systemd[1]: dbus.service: Unit entered failed state.
>> Sep 03 13:30:10 debian systemd[1]: dbus.service: Failed with result 'exit-code'.
>> Sep 03 13:31:15 debian systemd[1]: dummy.service: Start operation timed out. Terminating.
>> Sep 03 13:31:15 debian systemd[1]: Failed to start dummy.service.
>> Sep 03 13:31:15 debian systemd[1]: dummy.service: Unit entered failed state.
>> Sep 03 13:31:15 debian systemd[1]: dummy.service: Failed with result 'timeout'.
> 
> I.e. the usual 90s timeout kicks in, after which the systemctl start
> attempt fails.
> 
> Can you please double check that this is really still an issue with
> stretch (and ideally with buster as well)?


Let's assume this is fixed, so closing.

Regards,
Michael


--- End Message ---
