Bug#807132: [Pkg-dns-devel] Bug#807132: Bug#807132: Bug#807132: Related issue? unbound not restarted after upgrade

2016-06-10 Thread Robert Edmonds
Robert Edmonds wrote:
> Hi,
> 
> After looking into this problem some more, I think the minimal fix for
> jessie is going to look something like a drop-in that overrides the bad
> values from systemd-sysv-generator. Any chance anyone following this bug
> report could test the following drop-in? (Also attached.)
> 
> ==> /etc/systemd/system/unbound.service.d/sysv-generator-overrides.conf <==
> 
> [Service]
> Type=forking
> PIDFile=/run/unbound.pid
> RemainAfterExit=no
> 
> It works for me in a stretch VM when I run through the original bug
> submitter's sequence of commands.

Michael Biebl showed me an even more minimal fix, which is to put this
magic comment in /etc/init.d/unbound:

# pidfile: /run/unbound.pid

which is equivalent to the drop-in quoted above. (But the magic comment
must be *outside* of the LSB header.)
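
Since the placement matters, here is a rough way to check an init script for it. This is only a sketch: the function name pidfile_comment_outside_lsb is made up for illustration, and it assumes the LSB header is delimited by the usual "### BEGIN INIT INFO" and "### END INIT INFO" lines and that the comment uses the lowercase spelling shown above.

```shell
# Hypothetical helper: succeed (exit status 0) if FILE contains a
# "# pidfile:" comment that is NOT inside the LSB header block.
pidfile_comment_outside_lsb() {
    awk '
        /^### BEGIN INIT INFO/ { in_lsb = 1; next }   # entering the LSB header
        /^### END INIT INFO/   { in_lsb = 0; next }   # leaving the LSB header
        !in_lsb && /^#[ \t]*pidfile:/ { found = 1 }   # comment seen outside it
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Possible use:
#   pidfile_comment_outside_lsb /etc/init.d/unbound && echo "placement looks right"
```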

I'm planning on uploading 1.5.9-1 with this fix, and after the fix has
been proven in testing/unstable, backporting it to a stable update for
jessie. (And after that's done, adding native service unit files in a
subsequent upload.)

-- 
Robert Edmonds
edmo...@debian.org



Bug#807132: [Pkg-dns-devel] Bug#807132: Bug#807132: Related issue? unbound not restarted after upgrade

2016-06-03 Thread Robert Edmonds
There's a very important step missing in the text below, which is to
install the postfix package between steps 5 and 6. Sorry about that.

Robert Edmonds wrote:
> I've attached the unbound.service unit file that I've been working on
> that ports the functionality from the sysvinit script. I can reliably
> get this unit file to fail with the following steps:
> 
> 1) Start with a minimal installation of Debian testing in a virtual
> machine, with DHCP networking, and no MTA installed. /etc/resolv.conf
> should list the DNS resolvers learned from the DHCP server.
> 
> 2) Install unbound 1.5.8-1 from testing/unstable. This package uses the
> old sysvinit script. The default config listens on localhost only.
> 
> 3) Install resolvconf and reboot the VM. /etc/resolv.conf should now
> list the unbound server running on localhost.
> 
> 4) Copy the attached unbound.service file into /etc/systemd/system.
> Running "systemctl daemon-reload" afterwards should make systemd pick
> it up so that it takes over from the generated unit file.
> 
> 5) Reboot the VM. It should still work and /etc/resolv.conf should still
> list the unbound server as before.
> 
> 6) Run "systemctl stop unbound.service". It should stop normally and
> /etc/resolv.conf should switch back to the resolvers learned from the
> DHCP server.
> 
> 7) Run "systemctl start unbound.service". This command will hang for a
> few minutes and then print:
> 
> Job for unbound.service failed because a timeout was exceeded. See
> "systemctl status unbound.service" and "journalctl -xe" for details.
> 
> 8) While that command is hung, "ps axfwu" shows the following process
> tree (edited slightly) corresponding to the resolvconf hooks being run.
> These are ultimately being invoked by the ExecStartPost= in the
> unbound.service unit file.
> 
> [...] /bin/sh -e /usr/lib/unbound/package-helper resolvconf_start
> [...]  \_ run-parts --arg=-a --arg=lo.unbound /etc/resolvconf/update.d
> [...]  \_ run-parts /etc/resolvconf/update-libc.d
> [...]  \_ /bin/sh -e /etc/resolvconf/update-libc.d/postfix
> [...]  \_ /bin/sh -e /etc/init.d/postfix reload
> [...]  \_ /bin/systemctl --no-pager reload postfix.service
> 
> Also while the "start" command is hung, "systemctl list-jobs" shows the
> following output:
> 
> JOB UNIT  TYPE   STATE
> 283 nss-lookup.target start  waiting
> 284 postfix.service   reload waiting
> 226 unbound.service   start  running
> 
> 3 jobs listed.
> 
> postfix's resolvconf hook (/etc/resolvconf/update-libc.d/postfix) calls
> back into the init system to reload postfix when /etc/resolv.conf has
> been changed by resolvconf, and this resolvconf hook is itself running
> as a result of the init system starting unbound. This must be causing
> some sort of dependency cycle or deadlock somewhere.
> 
> That's as far as I've gotten.

-- 
Robert Edmonds
edmo...@debian.org



Bug#807132: [Pkg-dns-devel] Bug#807132: Related issue? unbound not restarted after upgrade

2016-05-22 Thread Robert Edmonds
Nicolas Braud-Santoni wrote:
> Hi,
> 
> I can confirm that this issue prevents systemd from detecting Unbound failing
> - either at startup (for instance due to bad configuration);
> - while running;
> - because it was stopped with unbound-control.
> 
> 
> Could you expand a bit on what is required, re: resolvconf and systemd,
>   and how would it be possible to help?

Hi, Nicolas:

Basically, in order to ship a native systemd unit file for unbound, we
need to have feature parity with the existing sysvinit script. In
1.5.7-2 I factored out all the functionality in the sysvinit script not
related to interfacing with sysvinit into a separate script
(/usr/lib/unbound/package-helper), so that that functionality could be
reused by the systemd unit file.

That functionality consists of:

  - Setting up the chroot.

  - Updating the DNSSEC root trust anchor.

  - Registering/unregistering with resolvconf.

The minimal unbound.service unit file I posted earlier on this bug
report doesn't have any of that functionality. It works just fine for
starting/stopping the daemon, etc. It just doesn't have the same (but
optional) functionality of the sysvinit script.

I've attached the unbound.service unit file that I've been working on
that ports the functionality from the sysvinit script. I can reliably
get this unit file to fail with the following steps:

1) Start with a minimal installation of Debian testing in a virtual
machine, with DHCP networking, and no MTA installed. /etc/resolv.conf
should list the DNS resolvers learned from the DHCP server.

2) Install unbound 1.5.8-1 from testing/unstable. This package uses the
old sysvinit script. The default config listens on localhost only.

3) Install resolvconf and reboot the VM. /etc/resolv.conf should now
list the unbound server running on localhost.

4) Copy the attached unbound.service file into /etc/systemd/system.
Running "systemctl daemon-reload" afterwards should make systemd pick
it up so that it takes over from the generated unit file.

5) Reboot the VM. It should still work and /etc/resolv.conf should still
list the unbound server as before.

6) Run "systemctl stop unbound.service". It should stop normally and
/etc/resolv.conf should switch back to the resolvers learned from the
DHCP server.

7) Run "systemctl start unbound.service". This command will hang for a
few minutes and then print:

Job for unbound.service failed because a timeout was exceeded. See
"systemctl status unbound.service" and "journalctl -xe" for details.

8) While that command is hung, "ps axfwu" shows the following process
tree (edited slightly) corresponding to the resolvconf hooks being run.
These are ultimately being invoked by the ExecStartPost= in the
unbound.service unit file.

[...] /bin/sh -e /usr/lib/unbound/package-helper resolvconf_start
[...]  \_ run-parts --arg=-a --arg=lo.unbound /etc/resolvconf/update.d
[...]  \_ run-parts /etc/resolvconf/update-libc.d
[...]  \_ /bin/sh -e /etc/resolvconf/update-libc.d/postfix
[...]  \_ /bin/sh -e /etc/init.d/postfix reload
[...]  \_ /bin/systemctl --no-pager reload postfix.service

Also while the "start" command is hung, "systemctl list-jobs" shows the
following output:

JOB UNIT  TYPE   STATE
283 nss-lookup.target start  waiting
284 postfix.service   reload waiting
226 unbound.service   start  running

3 jobs listed.

postfix's resolvconf hook (/etc/resolvconf/update-libc.d/postfix) calls
back into the init system to reload postfix when /etc/resolv.conf has
been changed by resolvconf, and this resolvconf hook is itself running
as a result of the init system starting unbound. This must be causing
some sort of dependency cycle or deadlock somewhere.

That's as far as I've gotten.
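
For anyone reproducing this, the waiting jobs can be pulled out of the list-jobs output with a small filter. This is only a diagnostic sketch: the function name waiting_jobs is made up for illustration, and it assumes the column layout shown above (unit name in the second column, state in the last).

```shell
# Hypothetical helper: read "systemctl list-jobs" output on stdin and print
# the unit name of every job whose STATE column is "waiting".
waiting_jobs() {
    awk '$NF == "waiting" { print $2 }'
}

# Possible use while "systemctl start unbound.service" is hung:
#   systemctl list-jobs | waiting_jobs
# For each unit printed, "systemctl show -p After -p Before UNIT" lists its
# ordering dependencies, which may help locate the cycle.
```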

-- 
Robert Edmonds
edmo...@debian.org

==> unbound.service <==

[Unit]
Description=Unbound DNS server
After=network.target
Before=nss-lookup.target
Wants=nss-lookup.target

[Service]
Type=simple
Restart=on-failure

EnvironmentFile=-/etc/default/unbound

ExecStartPre=-/usr/lib/unbound/package-helper chroot_setup
ExecStartPre=-/usr/lib/unbound/package-helper root_trust_anchor_update

ExecStart=/usr/sbin/unbound -d $DAEMON_OPTS

ExecStartPost=/usr/lib/unbound/package-helper resolvconf_start
ExecStopPost=/usr/lib/unbound/package-helper resolvconf_stop

ExecReload=/usr/sbin/unbound-control reload

[Install]
WantedBy=multi-user.target


Bug#807132: Related issue? unbound not restarted after upgrade

2016-05-16 Thread Nicolas Braud-Santoni
Control: found -1 1.5.8-1~bpo8+1

Hi,

I can confirm that this issue prevents systemd from detecting Unbound failing
- either at startup (for instance due to bad configuration);
- while running;
- because it was stopped with unbound-control.


Could you expand a bit on what is required, re: resolvconf and systemd,
  and how would it be possible to help?


Best,

  nicoo




Bug#807132: [Pkg-dns-devel] Bug#807132: Bug#807132: Related issue? unbound not restarted after upgrade

2016-04-18 Thread Robert Edmonds
Matthew Vernon wrote:
> Yep. Without that service file, service unbound restart fails to restart
> unbound.
> 
> With that service file installed, service unbound restart behaves as
> expected.
> 
> So I do think rolling this out to jessie would be good :)

Thanks for the confirmation.

Unfortunately, the unbound resolvconf integration also needs to somehow
be ported to systemd, and my experiments so far have run into deadlocks
in certain situations (I can reliably reproduce it with resolvconf and
postfix). Otherwise unbound in sid would already have a systemd unit
file.

I think for jessie we'll need a more targeted fix. Can you send more
details about your environment? I haven't run into this scenario
(without using "unbound-control") on my own jessie boxes.

-- 
Robert Edmonds
edmo...@debian.org



Bug#807132: [Pkg-dns-devel] Bug#807132: Related issue? unbound not restarted after upgrade

2016-04-18 Thread Matthew Vernon
Hi,

> Thank you for this follow-up. Up until now I had thought this behavior
> could only be triggered when using "unbound-control" to stop unbound.
> 
> Could you try using the "very basic native unbound.service unit file"
> from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=807132#10 as a
> drop-in, and see if the behavior goes away?

Yep. Without that service file, service unbound restart fails to restart
unbound.

With that service file installed, service unbound restart behaves as
expected.

So I do think rolling this out to jessie would be good :)

Thanks,

Matthew



Bug#807132: [Pkg-dns-devel] Bug#807132: Related issue? unbound not restarted after upgrade

2016-04-18 Thread Robert Edmonds
Matthew Vernon wrote:
> I have unbound & systemd on a jessie system, and every time there's an
> update to unbound, it ends up not running. I /think/ it's this bug
> biting us, but I'm not entirely sure.
> 
> For instance, we recently upgraded to 1.4.22-3+deb8u1 and unbound was no
> longer running. daemon.log output:
> 
> Apr 18 11:18:40 boarstall unbound[50877]: Stopping recursive DNS server:
> unbound.
> 
> [then follows info about processing times]
> 
> Apr 18 11:18:40 boarstall unbound-anchor: /var/lib/unbound/root.key has
> content
> Apr 18 11:18:40 boarstall unbound-anchor: success: the anchor is ok
> Apr 18 11:18:40 boarstall unbound[50885]: Starting recursive DNS server:
> unbound.
> 
> ...except at this point unbound is not running (I think that last entry
> is actually from systemd).
> 
> Is it plausible that this is caused by this bug? If so, a jessie update
> with a working systemd service file might be warranted - having your DNS
> go away on upgrade is annoying...

Hi, Matthew:

Thank you for this follow-up. Up until now I had thought this behavior
could only be triggered when using "unbound-control" to stop unbound.

Could you try using the "very basic native unbound.service unit file"
from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=807132#10 as a
drop-in, and see if the behavior goes away?

-- 
Robert Edmonds
edmo...@debian.org



Bug#807132: Related issue? unbound not restarted after upgrade

2016-04-18 Thread Sven Hartge
On 18.04.2016 13:07, Matthew Vernon wrote:

> I have unbound & systemd on a jessie system, and every time there's an
> update to unbound, it ends up not running. I /think/ it's this bug
> biting us, but I'm not entirely sure.

> Is it plausible that this is caused by this bug? If so, a jessie update
> with a working systemd service file might be warranted - having your DNS
> go away on upgrade is annoying...

Yes, it is very plausible that you are hit by this bug, as I see the
same problem here. "service unbound restart" does not work correctly
when used with systemd as PID1.

Regards,
Sven.





Bug#807132: Related issue? unbound not restarted after upgrade

2016-04-18 Thread Matthew Vernon
Hi,

I have unbound & systemd on a jessie system, and every time there's an
update to unbound, it ends up not running. I /think/ it's this bug
biting us, but I'm not entirely sure.

For instance, we recently upgraded to 1.4.22-3+deb8u1 and unbound was no
longer running. daemon.log output:

Apr 18 11:18:40 boarstall unbound[50877]: Stopping recursive DNS server:
unbound.

[then follows info about processing times]

Apr 18 11:18:40 boarstall unbound-anchor: /var/lib/unbound/root.key has
content
Apr 18 11:18:40 boarstall unbound-anchor: success: the anchor is ok
Apr 18 11:18:40 boarstall unbound[50885]: Starting recursive DNS server:
unbound.

...except at this point unbound is not running (I think that last entry
is actually from systemd).

Is it plausible that this is caused by this bug? If so, a jessie update
with a working systemd service file might be warranted - having your DNS
go away on upgrade is annoying...

Regards,

Matthew