I am researching an issue we are seeing with significant volumes of DNS traffic
being sent to non-local forwarders. I think I understand how the srtt algorithm
works, but I am seeing more traffic going to the non-local forwarders than I
would expect.
To give you some context, we have 2 forwarders in the UK and 2 in Hong Kong;
all 4 servers are responsible for outbound internet resolution. We also have a
number of resolving servers (in the UK and Hong Kong) that have these 4 servers
listed in their local "forwarders" statement, so I am expecting the HK
resolvers to forward mainly to the 2 local HK forwarders, with the occasional
query out to the 2 UK forwarders so that the rtt can be measured.
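
To make that expectation concrete, here is a toy model of lowest-srtt selection. It is only a sketch and not BIND's actual code: the smoothing weight (0.7), the per-query decay applied to servers that were not chosen (0.98), the starting srtt of 0, the server names and the round-number RTTs are all assumptions picked for illustration.

```python
# Toy model of "pick the forwarder with the lowest smoothed RTT".
# NOT BIND's implementation: smoothing weight, decay factor, initial
# srtt and server names are assumptions made for illustration only.

def simulate(true_rtt_us, queries, smooth=0.7, decay=0.98):
    """Return how many queries each server receives in the toy model."""
    srtt = {name: 0.0 for name in true_rtt_us}  # unknown servers start at 0
    sent = {name: 0 for name in true_rtt_us}
    for _ in range(queries):
        target = min(srtt, key=srtt.get)  # lowest srtt wins
        sent[target] += 1
        for name in srtt:
            if name == target:
                # Blend the new measurement into the chosen server's srtt.
                srtt[name] = smooth * srtt[name] + (1 - smooth) * true_rtt_us[name]
            else:
                # Slowly forgive servers that were not asked this time.
                srtt[name] *= decay
    return sent

# Hypothetical RTTs in microseconds: about 1 ms local, about 185 ms remote.
true_rtt_us = {"hk1": 1_000, "hk2": 1_000, "uk1": 185_000, "uk2": 185_000}
sent = simulate(true_rtt_us, queries=10_000)
uk_share = (sent["uk1"] + sent["uk2"]) / sum(sent.values())
print(sent)
print(f"share of queries sent to UK forwarders: {uk_share:.2%}")
```

Under these assumed parameters the UK forwarders are only re-probed once their srtt has decayed below the local one, so their share comes out on the order of 1%. That is the kind of figure I was expecting, not what I am actually seeing.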
When I do a packet capture on a Hong Kong resolver, over a 5 minute period, 22%
of all packets captured are DNS queries being forwarded to the local HK
forwarders, and 14% of the packets captured are being sent to the UK forwarders
- this seems high to me. I had always believed that the number of queries sent
to non-local forwarders would be a lot lower, but from looking into this in
detail this doesn't seem to be the case.
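
Expressed as a share of forwarded queries only (ignoring the other 64% of captured packets), the arithmetic looks like this. The variable names are mine; the two percentages are the ones from the capture above.

```python
# Percentages of ALL captured packets, taken from the 5 minute capture.
hk_pct = 22.0  # queries forwarded to the local HK forwarders
uk_pct = 14.0  # queries forwarded to the UK forwarders

# Share of forwarded queries that leave the region.
forwarded_pct = hk_pct + uk_pct
uk_share = uk_pct / forwarded_pct

print(f"forwarded queries sent to the UK: {uk_share:.1%}")  # 38.9%
```

So roughly 4 in every 10 forwarded queries are going to the UK.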
When I do a ping from Hong Kong, the rtt to the UK forwarders is 180-190ms; in
contrast, the local HK forwarder rtt is <1ms. I can see from dumping the cache
on the HK resolver that the srtt is indeed much lower for the HK servers:
; 10.<HK IP> [srtt 478560] [flags 00004000] [edns 146/5/4/4/4] [plain 0/0] [udpsize 2448] [ttl -1033437]
; 10.<HK IP> [srtt 648550] [flags 00004000] [edns 153/4/4/4/4] [plain 0/0] [udpsize 2270] [ttl -1033437]
; 10.<UK IP> [srtt 2774590] [flags 00004000] [edns 133/4/4/4/2] [plain 0/0] [udpsize 1160] [ttl -1033437]
; 10.<UK IP> [srtt 3477510] [flags 00004000] [edns 170/6/6/6/4] [plain 0/0] [udpsize 1012] [ttl -1033437]
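
For comparison, a small sketch that pulls the srtt values out of the dump lines and compares the two sites. The regular expression and the grouping on the masked "HK"/"UK" placeholder are my own choices for this example. I am only comparing ratios, so the unit of the srtt field does not matter here.

```python
import re
from statistics import mean

# The four entries from the cache dump, trimmed to the fields used here.
DUMP = """\
; 10.<HK IP> [srtt 478560] [flags 00004000]
; 10.<HK IP> [srtt 648550] [flags 00004000]
; 10.<UK IP> [srtt 2774590] [flags 00004000]
; 10.<UK IP> [srtt 3477510] [flags 00004000]
"""

# Capture the (masked) address and the srtt value from each entry.
ENTRY = re.compile(r"^;\s*(.+?)\s+\[srtt (\d+)\]", re.M)

by_site = {"HK": [], "UK": []}
for addr, srtt in ENTRY.findall(DUMP):
    site = "HK" if "HK" in addr else "UK"
    by_site[site].append(int(srtt))

means = {site: mean(values) for site, values in by_site.items()}
ratio = means["UK"] / means["HK"]

print(means)
print(f"UK srtt is about {ratio:.1f}x the HK srtt")
```

The UK entries come out at roughly 5.5 times the HK ones, whereas the ping times differ by a factor of at least 180, so the srtt gap is far narrower than the measured network rtt gap.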
I did some digging and came across this presentation:
This seems to imply on slide 16 that with lower query rates, BIND 9.8 has a
habit of sending fairly significant volumes to DNS servers with higher rtts. I
am wondering if this is still the case in BIND 9.10 or 9.11 and whether there
is anything that can be done about it?
In BIND 8 I think we could have used the topology statement to influence the
behaviour but I gather that is not an option in BIND 9?
Is there a solution to this? The slow responses back from the UK are
impacting application performance for users in HK.
We need to keep the UK servers as part of the configuration for
failover/redundancy, removing them is not an option.
Calleva Networks Ltd.