OK - I was able to repro again, and this time with MAAS 2.6.

Here are the steps:

PREP WORK
1) Have 50 machines in Ready state, each with one enabled interface configured
as 'Auto assign' on the default VLAN PXE subnet (auto assign so that every
deploy/release causes MAAS to reload DNS)
2) Clear out any DNS entries in the PXE subnet (this forces nodes to send DNS 
queries to MAAS)
3) Settings -> Network Services -> DNS -> Upstream DNS -> enter valid upstream
DNS IP
4) Settings -> Network Services -> DNS -> DNSSEC -> Automatic (for some reason
this breaks Upstream DNS)
5) Verify that Upstream DNS is broken
   a) Put one machine into Rescue Mode
   b) ssh to the rescue machine
   c) dig www.google.com
   d) (dig should time out/fail)
   e) MAAS -> Settings -> Network Services -> DNS -> DNSSEC -> Disable
   f) dig www.google.com
   g) (dig should succeed)
   h) MAAS -> Settings -> Network Services -> DNS -> DNSSEC -> Automatic
   i) Release the rescue machine
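
If you would rather script the dig checks in step 5 than eyeball them, something
like the following could work. This is only a rough sketch, not part of
repro.py: the function name dig_resolves and its parameters are made up for
illustration, and it assumes dig is installed on the rescue machine.

```python
import subprocess


def dig_resolves(name, server=None, timeout_s=5, run=subprocess.run):
    """Return True if dig gets at least one answer record for `name`.

    `run` is injectable (hypothetical hook) so the logic can be exercised
    without a real dig binary.
    """
    cmd = ["dig", "+short", f"+time={timeout_s}", "+tries=1"]
    if server:
        cmd.append(f"@{server}")
    cmd.append(name)
    try:
        result = run(cmd, capture_output=True, text=True,
                     timeout=timeout_s + 5)
    except subprocess.TimeoutExpired:
        # dig itself never came back; treat as a failed lookup.
        return False
    # With +short, answers are printed one per line; lines starting
    # with ';' are diagnostics (for example a timeout notice).
    answers = [line for line in result.stdout.splitlines()
               if line.strip() and not line.startswith(";")]
    return result.returncode == 0 and bool(answers)


if __name__ == "__main__":
    print("resolves:", dig_resolves("www.google.com"))
```

Run on the rescue machine, this would be expected to report False at step d
and True at step g, if the setup behaves as described above.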

REPRO
1) Run repro.py (attached; WARNING: this code will use all machines available
to MAAS)
2) Wait up to 3 hours, checking whether bind9 is hung by regularly running
`sudo rndc status` on the MAAS server
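
The periodic `sudo rndc status` check in step 2 could be automated along these
lines. Again just a sketch, separate from repro.py: the function names, the
60-second interval and the 30-second timeout are my own choices, and it assumes
that a hung named shows up as rndc either timing out or exiting non-zero, and
that sudo does not prompt for a password.

```python
import subprocess
import time


def rndc_responds(timeout_s=30, run=subprocess.run):
    """Return True if `sudo rndc status` completes successfully in time."""
    try:
        result = run(["sudo", "rndc", "status"], capture_output=True,
                     text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Assumption: no reply within timeout_s means named is stuck.
        return False
    return result.returncode == 0


def wait_for_hang(max_wait_s=3 * 60 * 60, interval_s=60,
                  check=rndc_responds, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll check() until it fails or max_wait_s elapses.

    Returns the seconds elapsed when the first failure was seen,
    or None if every check passed within the window.
    """
    start = clock()
    while clock() - start < max_wait_s:
        if not check():
            return clock() - start
        sleep(interval_s)
    return None


if __name__ == "__main__":
    elapsed = wait_for_hang()
    if elapsed is None:
        print("rndc kept responding for the whole window")
    else:
        print(f"rndc stopped responding after about {elapsed:.0f}s")
```

Start it on the MAAS server right after launching repro.py; the 3-hour default
matches the wait suggested in step 2.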

MONITORING STEPS (optional)
To see DNS query activity, in one ssh window to MAAS run:
sudo tcpdump -i ens3 dst <your-rack-controller-ip> and dst port 53
To see DNS reloads, and why, in another ssh window to MAAS run:
sudo tail -f /var/log/maas/regiond.log | grep Reloaded -A 3

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1710278

Title:
  [2.3a1] named stuck on reload, DNS broken
