I’m experiencing an issue where all DNS resolutions sent to dnsmasq
time out, but only after the dnsmasq service has been running
successfully for a while (anecdotally, a few weeks). After a lot
of digging, I’ve discovered that dnsmasq’s UDP socket file will eventually
“disappear”. The issue can be resolved by restarting the dnsmasq service.

I haven’t been able to reproduce it yet, but it has happened numerous times
on servers which are running dozens of docker containers. From what I know,
nothing should be removing this socket file, and I can’t find anything
relevant in the dnsmasq logs. Is anyone aware of any situations that can
cause socket files to disappear?

Environment

- Ubuntu 16.04.3 LTS
- 8 Cores, 16GB of RAM
- Dnsmasq 2.75-1ubuntu0.16.04.4

Background

I’m using dnsmasq to forward requests to Consul
<https://www.consul.io/docs/guides/forwarding.html>, which is used for
service discovery. The Consul agent listens on port 8600 and is configured
to bind to all interfaces (the relevant interface here is 172.17.0.1,
which docker creates).

Resolv.conf

```
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 127.0.0.1
```

Dnsmasq.conf

```
server=/consul/172.17.0.1#8600
server=/10.in-addr.arpa/172.17.0.1#8600
bind-dynamic
```

Systemd config for Docker

```
ExecStart=/usr/bin/dockerd --bip=172.17.0.1/24 --dns=172.17.0.1 -H fd://
```

While investigating the servers in the broken state, I observed the
following:

- nslookup / dig DNS resolutions are timing out
- Docker logs show containers are also timing out on DNS resolutions
- Systemd reports that dnsmasq is still running, pid still exists
- DNS resolutions sent directly to the consul agent (127.0.0.1:8600) succeed
- DNS resolutions sent to system[dnsmasq] (127.0.0.1:53) time out
- IPv6 UDP (::1) resolutions sent to dnsmasq succeed
- Netstat shows that the IPv4 UDP socket file for dnsmasq is missing
- No relevant messages in kernel log (specifically, no dnsmasq OOM kill
  events)
- File descriptor usage for the entire server was normal
- File descriptor usage for the individual dnsmasq process was normal
- CPU, RAM, and storage all look good

Thanks in advance for any discussion at all - I've been really struggling
with this one for a while now.

Zach
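
For anyone who wants to watch for the missing IPv4 socket without running
netstat by hand, here is a minimal Python sketch that reads the kernel's
IPv4 UDP table from /proc/net/udp and reports whether anything is bound to
port 53. The function names (`parse_proc_net_udp`, `udp_port_bound`,
`read_live_table`) are made up for illustration, and the address decoding
assumes a little-endian host such as x86. It only shows that a socket
exists on the port, not which process owns it.

```python
import socket
import struct


def parse_proc_net_udp(text):
    """Return (ip, port) pairs for each row of /proc/net/udp-style text."""
    sockets = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 2 or ":" not in fields[1]:
            continue
        hex_ip, hex_port = fields[1].split(":")
        # Assumption: little-endian host, so the hex word is byte-swapped.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        sockets.append((ip, int(hex_port, 16)))
    return sockets


def udp_port_bound(text, port=53):
    """True if any IPv4 UDP socket in the table is bound to `port`."""
    return any(p == port for _, p in parse_proc_net_udp(text))


def read_live_table(path="/proc/net/udp"):
    """Read the live table (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling `udp_port_bound(read_live_table())` from cron or a monitoring check
on an affected host should return False once the server is in the broken
state described above, which could help narrow down when the socket goes
away.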

_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
