Subject: DNSMASQ failing to return SRV records with loss of communication to a 
single DNS server

Issue: We have SIP SRV records for a domain that can be served by either of two
DNS servers in our environment. During testing we noticed that if one of the
DNS servers is unreachable, a request for the SRV records via dnsmasq times
out.


This only happens when the query originates from outside the host where
dnsmasq is running. That is, if we issue the SRV query from the dnsmasq server
itself, the SRV records are returned. If we issue the request from a client VM
that is set to resolve queries against our dnsmasq host, the request times out.



Note: some of the information below, such as IP addresses and domain names,
has been changed or masked with xx for security reasons.



dnsmasq.conf has the following entries, which forward requests for
labdomain.net to 10.xx.xx.12 and 10.xx.xx.20:

server=/labdomain.net/10.xx.xx.12

server=/labdomain.net/10.xx.xx.20



The VM making the SRV queries is 10.xx.xx.99.





When we query for an SRV record through our dnsmasq server (10.xx.xx.5) with
the unreachable DNS server (10.xx.xx.12) commented out, we receive a response
to the SRV query:



#server=/labdomain.net/10.xx.xx.12

server=/labdomain.net/10.xx.xx.20





[labuser@f5-test ~]$ dig srv _sip._udp.scscf.sprout.lp.labdomain.net @10.xx.xx.5
;; Truncated, retrying in TCP mode.

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.5 <<>> srv _sip._udp.scscf.sprout.lp.labdomain.net @10.xx.xx.5
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14584
;; flags: qr aa; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 5

;; QUESTION SECTION:
;_sip._udp.scscf.sprout.lp.labdomain.net. IN SRV

;; ANSWER SECTION:
_sip._udp.scscf.sprout.lp.labdomain.net. 15 IN SRV 10 50 5054 ovpklp-viscscf-spn-05.labdomain.net.
_sip._udp.scscf.sprout.lp.labdomain.net. 15 IN SRV 10 50 5054 ovpklp-viscscf-spn-01.labdomain.net.
_sip._udp.scscf.sprout.lp.labdomain.net. 15 IN SRV 10 50 5054 ovpklp-viscscf-spn-02.labdomain.net.
_sip._udp.scscf.sprout.lp.labdomain.net. 15 IN SRV 10 50 5054 ovpklp-viscscf-spn-03.labdomain.net.
_sip._udp.scscf.sprout.lp.labdomain.net. 15 IN SRV 10 50 5054 ovpklp-viscscf-spn-04.labdomain.net.

;; ADDITIONAL SECTION:
ovpklp-viscscf-spn-05.labdomain.net. 43200 IN A 10.xx.xx.18
ovpklp-viscscf-spn-01.labdomain.net. 43200 IN A 10.xx.xx.14
ovpklp-viscscf-spn-02.labdomain.net. 43200 IN A 10.xx.xx.15
ovpklp-viscscf-spn-03.labdomain.net. 43200 IN A 10.xx.xx.16
ovpklp-viscscf-spn-04.labdomain.net. 43200 IN A 10.xx.xx.17

;; Query time: 2 msec
;; SERVER: 10.xx.xx.5#53(10.xx.xx.5)
;; WHEN: Mon Aug 13 16:34:40 2018
;; MSG SIZE  rcvd: 528





When we query for the same SRV record through our dnsmasq server (10.xx.xx.5)
with both the good and the unreachable DNS servers configured, the SRV query
times out. In this case 10.xx.xx.20 is fully capable of answering the SRV
query.


server=/labdomain.net/10.xx.xx.12        <-- not reachable

server=/labdomain.net/10.xx.xx.20



[labuser@f5-test ~]$ dig srv _sip._udp.scscf.sprout.lp.labdomain.net @10.xx.xx.5
;; Truncated, retrying in TCP mode.

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.5 <<>> srv _sip._udp.scscf.sprout.lp.labdomain.net @10.xx.xx.5
;; global options: +cmd
;; connection timed out; no servers could be reached
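
Both transcripts show dig receiving a truncated UDP reply and then retrying
over TCP, and in the failing case the timeout follows that retry. One way to
narrow this down might be to test the two transports separately. The sketch
below is only an illustration: the function names, the DRY_RUN switch and the
timeout values are our own choices, and the host and query name are the masked
values from this message.

```shell
#!/bin/sh
# Hypothetical helpers for comparing UDP-only and TCP-only SRV lookups.
# DNSMASQ_HOST and QNAME default to the masked values used in this email;
# replace them with real values before running.
DNSMASQ_HOST="${DNSMASQ_HOST:-10.xx.xx.5}"
QNAME="${QNAME:-_sip._udp.scscf.sprout.lp.labdomain.net}"

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# UDP only: +ignore tells dig not to retry over TCP on a truncated reply.
probe_udp() {
    run dig +notcp +ignore +time=5 +tries=1 srv "$QNAME" "@$DNSMASQ_HOST"
}

# TCP only, with a longer timeout to see whether an answer eventually arrives.
probe_tcp() {
    run dig +tcp +time=30 +tries=1 srv "$QNAME" "@$DNSMASQ_HOST"
}

# TCP query sent directly to one upstream server, bypassing dnsmasq.
# Usage: probe_upstream_tcp 10.xx.xx.20
probe_upstream_tcp() {
    run dig +tcp +time=30 +tries=1 srv "$QNAME" "@$1"
}
```

If probe_udp returns a truncated answer quickly while probe_tcp stalls, that
would point at the TCP path through dnsmasq rather than at the SRV data itself.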


Dnsmasq logging shows:

Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: forwarded _sip._udp.scscf.sprout.lp.labdomain.net to 10.xx.xx.12
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: forwarded _sip._udp.scscf.sprout.lp.labdomain.net to 10.xx.xx.20
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: nameserver 10.xx.xx.20 refused to do a recursive query
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5172]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:24 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5173]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:34 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5174]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
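
In the log excerpt, the "forwarded" lines appear only under PID 5161, while the
three later queries arrive under PIDs 5172 to 5174 with no forwarding logged.
That pattern would be consistent with the TCP retries being handled by separate
child processes, although this is only an inference from the excerpt. A small
sketch for summarising a longer log the same way follows; the function name
summarise_by_pid is our own, and it assumes the syslog line format shown above.

```shell
#!/bin/sh
# Count "query" and "forwarded" events per dnsmasq PID, in order of first
# appearance. Reads syslog-style dnsmasq log lines on standard input.
summarise_by_pid() {
    awk '
        match($0, /dnsmasq\[[0-9]+\]/) {
            # "dnsmasq[" is 8 characters; drop it and the closing "]".
            pid  = substr($0, RSTART + 8, RLENGTH - 9)
            # Skip the ": " that follows the PID to reach the message text.
            rest = substr($0, RSTART + RLENGTH + 2)
            if (!(pid in seen)) { seen[pid] = 1; order[++n] = pid }
            if (rest ~ /^query\[/)        q[pid]++
            else if (rest ~ /^forwarded /) f[pid]++
        }
        END {
            for (i = 1; i <= n; i++)
                printf "pid=%s queries=%d forwarded=%d\n", order[i], q[order[i]], f[order[i]]
        }
    '
}

# Example run on the excerpt from this message.
summarise_by_pid <<'EOF'
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: forwarded _sip._udp.scscf.sprout.lp.labdomain.net to 10.xx.xx.12
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: forwarded _sip._udp.scscf.sprout.lp.labdomain.net to 10.xx.xx.20
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5161]: nameserver 10.xx.xx.20 refused to do a recursive query
Aug 14 16:22:14 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5172]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:24 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5173]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
Aug 14 16:22:34 vsmslp-az2-dev-dnsmasq1-mgt dnsmasq[5174]: query[SRV] _sip._udp.scscf.sprout.lp.labdomain.net from 10.xx.xx.99
EOF
# prints:
# pid=5161 queries=1 forwarded=2
# pid=5172 queries=1 forwarded=0
# pid=5173 queries=1 forwarded=0
# pid=5174 queries=1 forwarded=0
```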


I could use some ideas on how to further troubleshoot this issue.




Andy Warner
Telecom Design Engineer
O: 406-752-3330 / M: 913-972-7521
andrew.c.war...@sprint.com
