Re: [Dnsmasq-discuss] NXDOMAIN on existing A record

2019-07-10 Thread Sasha Litvak

Petr,

Thank you very much for your help. I will follow your advice and report
my findings to the list.

___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss

Re: [Dnsmasq-discuss] NXDOMAIN on existing A record

2019-07-10 Thread Petr Mensik

Hello Alex,

I would try removing the all-servers and clear-on-reload statements. I
would use just one server for testing, then retest each of the others for
the same behaviour. When you do not know which server is used, it is hard
to debug.

I think the dots in server=/.X/ are not necessary and maybe even misleading.
Try it without them, just server=/X/ip
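
As a rough sketch of the two suggestions above, a test configuration might
look like the following. Which servers to keep is an assumption made here
for illustration (10.0.48.12 and 10.0.73.43 are simply the first ones listed
in the posted config). The la.consul and chi-pbx.consul entries would be
reduced in the same way.

```
# Test sketch only: one upstream per domain, no leading dots.
# all-servers and clear-on-reload are disabled for the test.
#all-servers
#clear-on-reload

# One general upstream (assumed choice: first server in the posted config).
server=10.0.48.12

# One consul upstream (assumed choice); repeat the test with
# 10.0.73.40 and 10.0.73.28 in turn to compare behaviour.
server=/consul/10.0.73.43
```

With a single server per domain, any NXDOMAIN seen on lo can be attributed
to one known upstream.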

I think a one-second timeout is too short. Use only localhost in
/etc/resolv.conf and debug what happens with dnsmasq. Record what
queries are sent to dnsmasq and what dnsmasq forwards to the configured
servers.
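
A possible setup for that test is sketched below. The timeout value is an
assumption picked for illustration, not a recommendation from this thread.
The two logging lines are the ones already present but commented out in the
posted dnsmasq config.

```
# /etc/resolv.conf for the test: only the local dnsmasq.
# timeout:5 is an assumed value, chosen only to be longer than 1 second.
options timeout:5 attempts:1
nameserver 127.0.0.1
```

```
# dnsmasq: log each query and the server it is forwarded to.
log-queries=extra
log-facility=/var/log/dnsmasq.log
```

With only 127.0.0.1 in resolv.conf, a failed lookup cannot be silently
answered by one of the other nameservers, so the dnsmasq log shows the whole
story.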

Note that I have already discovered that requests without the recursion
desired bit set are always forwarded and are not answered from any local
records. But that should not be the issue. Try dig +rec and dig +norec to
rule it out.

Regards,
Petr

-- 
Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemen...@redhat.com  PGP: 65C6C973

Re: [Dnsmasq-discuss] NXDOMAIN on existing A record

2019-07-07 Thread Alex Litvak

(lack of sleep, fixing some mistakes in text)

Hello everyone,

I run consul services on my network, where services are registered under
.service.consul when they start. All containers and bare metal hosts are
running dnsmasq 2.80.
I noticed that if I restart one of the containers, one of the hosts keeps
failing to resolve the service name. I assume that dnsmasq is the culprit
because:

1. I can resolve service xyz.service.consul against the standard dns servers
with dig.
2. Dnsmasq listening on 127.0.0.1 is the first line in resolv.conf, and
when I run tcpdump against port 53 on interface lo I see it return NXDOMAIN
on each A record query for the service in question.
3. If I restart dnsmasq everything is back to normal again. Even more weird,
if I send SIGHUP to dnsmasq, which only causes a reread of the /etc/hosts
file, everything is back to normal as far as service resolution goes.

I have this problem only on some hosts, without a pattern I can recognize.
For example, I have two nodes with the same config, os, kernel version,
dnsmasq version, etc ... and one of them has the problem 100% of the time
after service xyz.service.consul restarts and the other does not.

Where do I start troubleshooting? Any ideas are welcome.

Here is a standard dnsmasq configuration.

port=53
domain-needed
bogus-priv
interface=lo
listen-address=127.0.0.1
no-dhcp-interface=127.0.0.1
#bind-interfaces
no-resolv
all-servers
dns-forward-max=500

# If you don't want dnsmasq to read /etc/hosts, uncomment the
# following line.
#no-hosts
# or if you want it to read another file, as well as /etc/hosts, use
# this.
#addn-hosts=/etc/banner_add_hosts

#log-queries=extra
#log-facility=/var/log/dnsmasq.log
log-async=25

# Set the cachesize here.
cache-size=1
min-cache-ttl=5
#neg-ttl=3600

# If you want to disable negative caching, uncomment this.
#no-negcache

# For debugging purposes, log each DNS query as it passes through
# dnsmasq.
#log-queries
clear-on-reload

server=10.0.48.12
server=10.0.48.11
server=10.0.21.63
server=10.0.21.61

server=/.la.consul/10.0.73.43
server=/.la.consul/10.0.73.40
server=/.la.consul/10.0.73.28
server=/.chi-pbx.consul/10.1.73.1
server=/.chi-pbx.consul/10.1.73.2
server=/.chi-pbx.consul/10.1.73.3
server=/.consul/10.0.73.43
server=/.consul/10.0.73.40
server=/.consul/10.0.73.28

Resolver config

search ''
options timeout:1 attempts:1
nameserver 127.0.0.1
nameserver 10.0.48.11
nameserver 10.0.48.12
nameserver 10.0.21.63

Re: [Dnsmasq-discuss] NXDOMAIN on existing A record

2019-07-07 Thread Geert Stappers

On Sun, Jul 07, 2019 at 02:09:20PM -0500, Alex Litvak wrote:
> Hello everyone,
>
> I run consul services on my network where services are registered with
> xyz.service.consul when they start.  All containers and bare metal hosts are
> running dnsmasq 2.80.
> I noticed that if I restart one of the containers, one of the hosts keeps
> failing to resolve the server hostname.  I can see that dnsmasq is the culprit
> because:
>
> 1. I can resolve the service against standard dns servers
> 2. Dnsmasq on 127.0.0.1 is first in resolv.conf and when I run tcpdump
> against port 53 on lo I see it returns NXDOMAIN on the service query
> 3. If I restart dnsmasq everything is back to normal again.  Even more
> weird, if I send SIGHUP to dnsmasq, which only causes it to reread the
> /etc/hosts file, everything is back to normal as far as service resolution
> goes.
>
> The weird thing is that it only happens on some hosts, without a pattern I
> can recognize.  For example I have two nodes with the same config, os, kernel
> version, dnsmasq version, etc ... and one of them has the problem 100% on
> service restart and the other does not.
>
> Where do I start troubleshooting? Any ideas are welcome.

Draw a diagram / make a sketch / picture it

> Here is a standard dnsmasq configuration.
> 
> port=53
> domain-needed
> bogus-priv
> interface=lo
> listen-address=127.0.0.1
> no-dhcp-interface=127.0.0.1
> #bind-interfaces
> no-resolv
> all-servers
> dns-forward-max=500
> 
> # If you don't want dnsmasq to read /etc/hosts, uncomment the
> # following line.
> #no-hosts
> # or if you want it to read another file, as well as /etc/hosts, use
> # this.
> #addn-hosts=/etc/banner_add_hosts
> 
> #log-queries=extra
> #log-facility=/var/log/dnsmasq.log
> log-async=25
> 
> # Set the cachesize here.
> cache-size=1
> min-cache-ttl=5
> #neg-ttl=3600
> 
> # If you want to disable negative caching, uncomment this.
> #no-negcache
> 
> # For debugging purposes, log each DNS query as it passes through
> # dnsmasq.
> #log-queries
> clear-on-reload
> 
> server=10.0.48.12
> server=10.0.48.11
> server=10.0.21.63
> server=10.0.21.61
> 
> server=/.la.consul/10.0.73.43
> server=/.la.consul/10.0.73.40
> server=/.la.consul/10.0.73.28
> server=/.chi-pbx.consul/10.1.73.1
> server=/.chi-pbx.consul/10.1.73.2
> server=/.chi-pbx.consul/10.1.73.3
> server=/.consul/10.0.73.43
> server=/.consul/10.0.73.40
> server=/.consul/10.0.73.28
> 
> Resolver config
> 
> search ''
> options  timeout:1 attempts:1
> nameserver 127.0.0.1
> nameserver 10.0.48.11
> nameserver 10.0.48.12
> nameserver 10.0.21.63

-- 
Groeten
Geert Stappers
-- 
Leven en laten leven
