Public bug reported:

I have two servers that run named.service, and I recently discovered
that (on both servers), when I reboot and then run "systemctl status
named.service" (or "journalctl -u named.service"), I see messages like
this:

Mar 18 21:03:05 mail named[859]: managed-keys-zone/xxx: Unable to fetch
DNSKEY set '.': failure

...where xxx is the view name, and this error is repeated for each view.
(I have many views.)

(OTOH, if the server is already up and running and I then start
named.service, it starts with no errors.)

By creating a shell script that ran various "ip" diagnostic commands,
and adding this to named.service as an "ExecStartPre" hook, I was able to
determine that the error above occurs because BIND is being started
before the network is available. (The network interfaces didn't even
have IP addresses at that time.)
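For anyone wanting to reproduce the check, a diagnostic hook along these lines should be enough. This is only a minimal sketch, not the exact script used for this report: the function name and the choice of "ip" subcommands are assumptions. It prints the interface addresses and the routing table at the moment the hook runs, so the output appears in the journal for named.service.

```shell
#!/bin/sh
# Hypothetical diagnostic hook (not the reporter's original script).
# Prints the current interface addresses and routes so that they show up
# in "journalctl -u named.service" right before named is launched.
log_network_state() {
    echo "=== network state before named starts ($(date '+%H:%M:%S')) ==="
    echo "--- ip -brief addr show ---"
    ip -brief addr show 2>&1 || echo "(command failed)"
    echo "--- ip route show ---"
    ip route show 2>&1 || echo "(command failed)"
    # Always succeed: a diagnostic hook should never block the service.
    return 0
}

log_network_state
```

Saved under a path of your choosing and made executable, it could be hooked in with an "ExecStartPre=-<path to script>" line in a drop-in file. The leading "-" tells systemd to ignore a non-zero exit status from the hook.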

I should highlight at this point that in spite of the error, as far as I
know BIND was running OK, serving DNS as normal. I can only guess that
it had cached copies of the required records or something like that?

Anyway I don't like seeing errors like this in my logs, so...

My initial attempt to solve this problem involved setting named.service
to start after network-online.target. Results were mixed. Sometimes
there were no errors on reboot, but more often than not the same errors
were there.

Then I worked out that network-online.target is based on systemd-
networkd-wait-online, which by default only waits until IP addresses are
assigned to interfaces. To solve the error above, I needed it to wait
for the operational status to become "routable". I was able to achieve
this by specifying the following in
/etc/systemd/system/named.service.d/override.conf (i.e. file content is
between the "-----" lines):

-----
[Unit]
After=network-online.target

[Service]
ExecStartPre=-/lib/systemd/systemd-networkd-wait-online --operational-state=routable --timeout=10 --quiet
-----

Effectively this causes systemd to delay starting named.service until
the network interfaces have addresses, and then when it does start
named.service, the ExecStartPre line above waits (for up to 10 seconds)
until network routes are added before BIND (i.e. /usr/sbin/named) is
launched.
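To illustrate the kind of bounded wait described above, here is a rough shell sketch. It is an assumption-laden approximation, not what systemd-networkd-wait-online actually does internally: it simply polls for a default route once per second and gives up after a timeout. The function name wait_for_default_route is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: poll until a default route exists or the timeout
# (in seconds, default 10) expires. This only approximates the "routable"
# check performed by systemd-networkd-wait-online.
wait_for_default_route() {
    timeout=${1:-10}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # "ip route show default" prints nothing when no default route is set.
        if ip route show default 2>/dev/null | grep -q '^default'; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Timed out without seeing a default route.
    return 1
}

# Example use (commented out so that sourcing this file has no side effects):
# wait_for_default_route 10 || echo "no default route after 10s" >&2
```

On systems managed by systemd-networkd, the wait-online invocation in the override above remains the simpler option. A sketch like this is mainly useful for seeing what "wait up to N seconds for routes" amounts to.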

Can I please request that the named.service definition in the bind9
package is updated to include the options above?

Final note: Although this bug would appear to be similar to 1909822 (
https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1909822 ), the
error message I observed is different, and so I've raised this as a
separate bug report. Having said that, I suspect the solution that I'm
offering above would fix both issues (and would be slightly better
than the one offered in 1909822).

System-specific information follows:
# lsb_release -rd
Description:    Ubuntu 21.10
Release:        21.10

# apt-cache policy bind9
bind9:
  Installed: 1:9.16.15-1ubuntu1.2
  Candidate: 1:9.16.15-1ubuntu1.2
  Version table:
 *** 1:9.16.15-1ubuntu1.2 500
        500 http://nz.archive.ubuntu.com/ubuntu impish-updates/main amd64 Packages
        500 http://nz.archive.ubuntu.com/ubuntu impish-security/main amd64 Packages
        100 /var/lib/dpkg/status
     1:9.16.15-1ubuntu1 500
        500 http://nz.archive.ubuntu.com/ubuntu impish/main amd64 Packages

Thanks,
Nick.

** Affects: bind9 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1965521

Title:
  named.service starts too early: Unable to fetch DNSKEY set '.':
  failure

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1965521/+subscriptions

