Hi,

I'm using unbound on my OpenWrt router. OpenWrt uses unbound-control to add the DHCP leases (from odhcp) to the DNS. This works fine, except that once in a while unbound-control seems to get stuck and never returns, and I end up with a large number of unbound-control processes:

# ps | grep unbound-control
 2335 root 5988 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
 2385 root 5988 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 2428 root 5988 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 2995 root 5988 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 3839 root 5988 S unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
 3970 root 5988 S unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
 4090 root 5988 S unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
25060 root 5964 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
25064 root 5964 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
28771 root 5984 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
29845 root 5984 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
30351 root 5984 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
30681 root 5968 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
30721 root 5968 S /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas

At that point, DNS resolution also becomes problematic:

$ dig aaaa google.es @192.168.1.1

; <<>> DiG 9.16.1-Ubuntu <<>> aaaa google.es @192.168.1.1
;; global options: +cmd
;; connection timed out; no servers could be reached

$ dig aaaa google.es @fd81:631b:716f:10::1

; <<>> DiG 9.16.1-Ubuntu <<>> aaaa google.es @fd81:631b:716f:10::1
;; global options: +cmd
;; connection timed out; no servers could be reached

Once I manually kill all those unbound-control processes, everything starts working again.

I have reported the problem on the OpenWrt forum:

https://forum.openwrt.org/t/issues-with-unbound-and-odhcp-setup/66354/26

The problem seems to be triggered by executing multiple unbound-control instances in parallel. OpenWrt now contains a workaround that uses a lockfile to prevent parallel invocations, but I suspect there is still some kind of bug in unbound itself. Maybe some kind of deadlock condition?
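
To illustrate what I mean by serializing the calls, here is a rough sketch. It is not the actual OpenWrt script; the lock path and the function name run_serialized are made up for this example, and it assumes a flock utility is available on the system.

```shell
#!/bin/sh
# Hypothetical lock path, NOT the one the OpenWrt scripts actually use.
LOCKFILE="${LOCKFILE:-/tmp/unbound-control.lock}"

# Run the given command while holding an exclusive lock on LOCKFILE,
# so that concurrent callers run one after another instead of in parallel.
# The exit status is that of the wrapped command.
run_serialized() {
    flock -x "$LOCKFILE" "$@"
}

# Example (assumes unbound-control is installed and configured):
#   run_serialized unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
```

Note that this only avoids the trigger. If a single unbound-control call still hangs, the later callers will queue up behind the lock instead.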

Any ideas what could be the problem here?

Jef
