I am researching an issue we are seeing with significant volumes of DNS traffic 
being sent to non-local forwarders. I think I understand how the srtt algorithm 
works, but I am seeing more traffic going to the non-local forwarders than I 
was expecting.

To give you some context, we have 2 forwarders in the UK and 2 in Hong Kong; 
all 4 servers are responsible for outbound internet resolution. We also have a 
number of resolving servers (in the UK and Hong Kong) that list all 4 of these 
servers in their local "forwarders" statement. I am therefore expecting the HK 
resolvers to forward mainly to the 2 local HK forwarders, with only the 
occasional query out to the 2 UK forwarders so that the rtt can be measured.

When I do a packet capture on a Hong Kong resolver over a 5 minute period, 22% 
of all packets captured are DNS queries being forwarded to the local HK 
forwarders and 14% are queries being sent to the UK forwarders, which seems 
high to me. I had always believed that the number of queries sent to non-local 
forwarders would be a lot lower, but having looked into this in detail, that 
doesn't seem to be the case.
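
To put those two percentages in terms of forwarded queries only, here is a 
quick calculation. It is just a sketch: the function name is my own, and it 
assumes both percentages are taken from the same packet total and that each 
forwarded query appears once in the capture.

```python
def remote_share(local_pct: float, remote_pct: float) -> float:
    """Fraction of forwarded queries that went to the non-local forwarders.

    Both arguments are percentages of the same packet capture, so the
    capture total cancels out and only their ratio matters.
    """
    return remote_pct / (local_pct + remote_pct)


# Figures from the 5 minute capture on the HK resolver.
share = remote_share(local_pct=22.0, remote_pct=14.0)
print(f"Share of forwarded queries sent to the UK: {share:.1%}")
```

On those assumptions, roughly 39% of the forwarded queries are going to the UK.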

When I ping from Hong Kong, the rtt to the UK forwarders is 180-190ms, whereas 
the rtt to the local HK forwarders is <1ms. I can see from dumping the cache 
on the HK resolver that the srtt is indeed much lower for the HK servers:

;       10.<HK IP> [srtt 478560] [flags 00004000] [edns 146/5/4/4/4] [plain 0/0] [udpsize 2448] [ttl -1033437]
;       10.<HK IP> [srtt 648550] [flags 00004000] [edns 153/4/4/4/4] [plain 0/0] [udpsize 2270] [ttl -1033437]
;       10.<UK IP> [srtt 2774590] [flags 00004000] [edns 133/4/4/4/2] [plain 0/0] [udpsize 1160] [ttl -1033437]
;       10.<UK IP> [srtt 3477510] [flags 00004000] [edns 170/6/6/6/4] [plain 0/0] [udpsize 1012] [ttl -1033437]
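
For comparison, here is a small sketch that pulls the srtt values out of the 
dump entries above and compares the two sites. The regular expression and the 
helper name are my own, the addresses are redacted so only the HK/UK label is 
used, and I have not assumed any particular unit for the srtt field.

```python
import re
from statistics import mean

# The srtt fields from the four cache dump entries (other fields omitted).
DUMP = """\
; 10.<HK IP> [srtt 478560]
; 10.<HK IP> [srtt 648550]
; 10.<UK IP> [srtt 2774590]
; 10.<UK IP> [srtt 3477510]
"""

# Capture the (redacted) address and the srtt value from each entry.
ENTRY = re.compile(r"^;\s+(\S.*?)\s+\[srtt (\d+)\]", re.MULTILINE)


def srtt_by_site(dump: str) -> dict:
    """Group srtt values by the site label embedded in the redacted address."""
    sites = {}
    for address, srtt in ENTRY.findall(dump):
        site = "HK" if "HK" in address else "UK"
        sites.setdefault(site, []).append(int(srtt))
    return sites


sites = srtt_by_site(DUMP)
ratio = mean(sites["UK"]) / mean(sites["HK"])
print(f"Mean srtt, UK relative to HK: {ratio:.1f}x")
```

On these figures the mean UK srtt is only about 5.5 times the mean HK srtt, 
which is a far smaller gap than the ping times above would suggest.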

I did some digging and came across this presentation: 

Slide 16 seems to imply that, at lower query rates, BIND 9.8 has a habit of 
sending fairly significant volumes of queries to DNS servers with higher rtts. 
I am wondering whether this is still the case in BIND 9.10 or 9.11 and, if so, 
whether there is anything that can be done about it?

In BIND 8 I think we could have used the topology statement to influence the 
behaviour, but I gather that is not an option in BIND 9?

Is there a solution to this? The slow responses coming back from the UK are 
impacting application performance for users in HK.

We need to keep the UK servers as part of the configuration for 
failover/redundancy; removing them is not an option.



Paul Roberts
Calleva Networks Ltd.
Email: p...@callevanetworks.com
