You probably want to alter StartLimitInterval= or StartLimitBurst= [1] in an
/etc/systemd/system/dhcpd.service file.
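
A minimal sketch of one way to do that, assuming a drop-in file rather than a full copy of the unit. The helper name, the drop-in file name and the values 30s/20 are illustrative assumptions, not recommendations. On the systemd version referenced in [1] these settings belong in [Service]; newer systemd releases moved them to [Unit].

```shell
#!/bin/bash
# Hypothetical helper: write a drop-in that relaxes the start rate limit
# for dhcpd.service.
# $1 = base unit directory (normally /etc/systemd/system)
write_dhcpd_start_limit_dropin() {
    local unit_dir=$1
    local dropin_dir="$unit_dir/dhcpd.service.d"
    mkdir -p "$dropin_dir" || return 1
    # Quoted heredoc: the content is written literally, no expansion.
    cat > "$dropin_dir/start-limit.conf" <<'EOF'
[Service]
# Illustrative values only: allow up to 20 starts per 30 seconds
StartLimitInterval=30s
StartLimitBurst=20
EOF
}

# Usage (as root), then reload systemd so it picks up the change:
#   write_dhcpd_start_limit_dropin /etc/systemd/system
#   systemctl daemon-reload
```

Files under /etc/systemd/system are not overwritten by package updates, so a change like this should not need to be reapplied after each dhcp update.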

An upstream bug report is probably also warranted, as this is unexpected behavior.

Pat


[1] http://www.freedesktop.org/software/systemd/man/systemd.service.html#StartLimitInterval=

On 07/07/2015 03:25 PM, Vladimir Mosgalin wrote:
Hello everybody.

At some point (I now understand it started after I added one more VLAN) this
SL7.1 server stopped running dhcpd at boot. After each reboot, once I notice
that clients aren't getting IPs, I have to start it manually, which I always
forget to do because reboots don't happen often. There are no errors from
dhcpd; it always starts when run manually but no longer starts at boot.

I have now looked into the situation, and this is what happens. Here is the
log of the dhcpd service (without dhcpd's own output, which is irrelevant):

-----
# journalctl -b -u dhcpd|grep systemd
июл 04 17:55:19 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:20 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 04 17:55:23 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 04 17:55:23 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:24 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 04 17:55:25 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 04 17:55:25 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:25 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 04 17:55:27 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 04 17:55:27 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:27 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 04 17:55:28 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 04 17:55:28 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:28 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 04 17:55:29 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 04 17:55:29 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 17:55:29 cherry.asgard systemd[1]: dhcpd.service start request repeated too quickly, refusing to start.
июл 04 17:55:29 cherry.asgard systemd[1]: Failed to start DHCPv4 Server Daemon.
июл 04 17:55:29 cherry.asgard systemd[1]: Unit dhcpd.service entered failed state.
июл 04 19:21:36 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 04 19:21:37 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 05 17:55:51 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 05 17:55:51 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 05 17:55:51 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 06 17:56:16 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 06 17:56:16 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 06 17:56:16 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
июл 07 17:56:41 cherry.asgard systemd[1]: Stopping DHCPv4 Server Daemon...
июл 07 17:56:42 cherry.asgard systemd[1]: Starting DHCPv4 Server Daemon...
июл 07 17:56:42 cherry.asgard systemd[1]: Started DHCPv4 Server Daemon.
-----

17:55 is the boot time.
19:21 is when I noticed that clients weren't working and started it manually.


So what happens? On boot, dhcpd and NetworkManager are started. There are a
bunch of interfaces, including VLANs and a ppp interface to the backup ISP
(which restarts once a day, as you can see in the log). There is this nice
script in /etc/NetworkManager/dispatcher.d/12-dhcpd:

-----
#!/bin/bash

INTERFACE=$1 # The interface which is brought up or down
STATUS=$2 # The new state of the interface

# whenever interface is brought up by NM (rhbz #565921)
if [ "$STATUS" = "up" ]; then
     # restart the services
     systemctl -q is-enabled dhcpd.service && systemctl restart dhcpd.service
     systemctl -q is-enabled dhcpd6.service && systemctl restart dhcpd6.service
fi

exit 0
-----

Basically, while setting up the interfaces one by one (which takes 1-2 seconds
per interface), NM restarts dhcpd after configuring each one, and does it so
many times that systemd thinks something is wrong and refuses to start the
service again.
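
The effect can be reproduced with a simplified model of a fixed-window start limit. This is a sketch with whole-second timestamps, not systemd's actual code; the function name is hypothetical, and the limits of 5 starts per 10 seconds are assumed to be the defaults in effect on this server.

```shell
#!/bin/bash
# Simplified model of a fixed-window start rate limit (not systemd's code).
# usage: first_refused_start INTERVAL BURST T1 T2 ...
# Prints the first timestamp whose start request would be refused, or "none".
first_refused_start() {
    local interval=$1 burst=$2
    shift 2
    local begin=-1 count=0 t
    for t in "$@"; do
        # Open a new window on the first request or once the old one expired.
        if (( begin < 0 || t > begin + interval )); then
            begin=$t
            count=0
        fi
        # The window already holds BURST starts: this request is refused.
        if (( count >= burst )); then
            echo "$t"
            return 0
        fi
        count=$((count + 1))
    done
    echo "none"
}

# Seconds within 17:55 of each "Starting DHCPv4 Server Daemon..." line above:
first_refused_start 10 5 19 23 25 27 28 29   # prints 29
```

Under these assumptions the sixth start request, at 17:55:29, is the one that gets refused, which matches the log.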

Obviously this is wrong. Systemd *should* know that the service isn't quitting
on its own, it's being restarted by a command, so why does it think the service
is failing? I mean, something is seriously wrong if whether dhcpd runs (it only
cares about a single local interface) depends on the number of interfaces I
need on the server.

I'm looking for a proper solution. I know there are tons of improper ones:
disabling NM; removing or modifying this script (but I'd have to remember to
do that again after each dhcp package update, which is *really* annoying; I
already have similar hacks on my systems that I have to reinstall after
package updates for other reasons, and I want to try as hard as possible to
avoid another one); forcing systemd not to give up on restarting services so
quickly; or disabling dispatcher scripts altogether (I need them, at least for
named). Can anyone suggest a nice solution, like changing some config file,
that solves this without breaking the way the system works?

Is this a bug in systemd? I mean, of course the dispatcher script is badly
written (it should consult a config with the list of interfaces or something),
but *why* does systemd think the service is failing and give up on it in the
first place?
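
For illustration, a dispatcher script that consults a list of interfaces might look roughly like the sketch below. The function name and the interface name eth0.10 are hypothetical placeholders; the packaged script does not do this.

```shell
#!/bin/bash
# Hypothetical variant of the dispatcher logic: only react to listed interfaces.

# Returns 0 if dhcpd should be restarted for this NM event, 1 otherwise.
# $1 = interface, $2 = new state, remaining args = interfaces dhcpd serves
should_restart_dhcpd() {
    local iface=$1 status=$2 served
    shift 2
    # Only an interface coming up is of interest.
    [ "$status" = "up" ] || return 1
    for served in "$@"; do
        [ "$iface" = "$served" ] && return 0
    done
    return 1
}

# Usage inside a dispatcher script; "eth0.10" is a placeholder interface name:
#   if should_restart_dhcpd "$1" "$2" eth0.10; then
#       systemctl -q is-enabled dhcpd.service && systemctl restart dhcpd.service
#   fi
```

With a check like this, events for unrelated interfaces such as the ppp link would no longer trigger a restart, but the script would still be replaced by a dhcp package update.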



--
Pat Riehecky
Scientific Linux developer

Fermi National Accelerator Laboratory
www.fnal.gov
www.scientificlinux.org
