Hello everyone,
I run Consul on my network, and services are registered as
xyz.service.consul when they start. All containers and bare-metal hosts are
running dnsmasq 2.80.
I noticed that if I restart one of the containers, one of the hosts keeps
failing to resolve the server hostname. I can see that dnsmasq is the culprit:
1. I can resolve the service against the standard DNS servers.
2. Dnsmasq on 127.0.0.1 is first in resolv.conf, and when I run tcpdump
against port 53 on lo I see it return NXDOMAIN for the service query.
3. If I restart dnsmasq, everything is back to normal again. Even weirder, if I send SIGHUP to dnsmasq, which only causes it to reread the /etc/hosts file, everything is back to normal as far as service
The weird thing is that it only happens on some hosts, with no pattern I can recognize. For example, I have two nodes with the same config, OS, kernel version, dnsmasq version, etc., and one of
them has the problem 100% of the time on service restart while the other does not.
Where do I start troubleshooting? Any ideas are welcome.
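To make the comparison in points 1 and 2 repeatable on a good host and a bad host, something like the following could be used. It is only a minimal Python sketch, not part of my setup: it sends the same A query over UDP to each server and reports the RCODE that comes back. The function names are made up for illustration, and 192.0.2.53 is a placeholder for one of the standard DNS servers. It does not check the response ID or handle truncated replies.

```python
import socket
import struct

# Common DNS response codes (RFC 1035); anything else is shown numerically.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def parse_rcode(response):
    """Return the RCODE name taken from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, "RCODE%d" % code)


def query_rcode(name, server, port=53, timeout=2.0):
    """Send one UDP query to the given server and return its RCODE name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        response, _ = sock.recvfrom(512)
    return parse_rcode(response)


def compare_servers(name, servers):
    """Query every server for the same name and collect the results."""
    results = {}
    for server in servers:
        try:
            results[server] = query_rcode(name, server)
        except OSError as exc:  # includes timeouts and refused connections
            results[server] = "error: %s" % exc
    return results


# Example call (192.0.2.53 is a placeholder upstream address):
# print(compare_servers("xyz.service.consul", ["127.0.0.1", "192.0.2.53"]))
```

Running this right after a container restart, before and after sending SIGHUP, would show whether only the local dnsmasq answers NXDOMAIN while the other server still resolves the name.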
Here is a standard dnsmasq configuration:
# If you don't want dnsmasq to read /etc/hosts, uncomment the
# following line.
# or if you want it to read another file, as well as /etc/hosts, use
# Set the cachesize here.
# If you want to disable negative caching, uncomment this.
# For debugging purposes, log each DNS query as it passes through
options timeout:1 attempts:1