Not sure if EDNS(0) extensions would make a difference here.

The real issue for caching is balancing load across many caches while 
restricting each service's content to as few caches as possible to maintain 
cache efficiency.  Too few DNS answers risk load piling up on a few caches and 
overrunning them (though this is unlikely except at very high throughput).  
Too many DNS answers (much more likely) spread your service’s content across 
too many caches, which increases cache churn and the risk of hitting cold 
caches, and so degrades service performance.

I spoke with our DNS team about a year ago about the EDNS(0) client subnet 
option (ECS), and they did not embrace it because it made their recursion jump 
by several orders of magnitude and broke the DNS system.  I'm not sure whether 
they plan to use EDNS(0) for other things, or how that would factor into the 
load on the caches and the need to spread that load via additional IP 
responses, so please educate me if you know something about this.

In an ideal world TR would monitor the popularity of a service based on 
incoming requests per second and expand or contract the number of IPs in the 
response accordingly.  Given DNS caching, popularity may be difficult to judge 
accurately, but we may be able to use it to differentiate between a “1” and a 
“4” response.  I thought I cut a request for that a while back, but I can’t 
find it, so I created a new one: 
https://github.com/apache/incubator-trafficcontrol/issues/1614
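
To make the idea concrete, here is a minimal sketch of mapping an observed 
request rate to an answer count between 1 and 4.  The thresholds and the 
function name are made up for illustration and are not existing TR settings.  
Since resolver caching hides real client demand, the rate is only a coarse 
signal.

```python
import bisect

# Hypothetical requests-per-second boundaries between 1, 2, 3 and 4 answers.
# These numbers are placeholders, not measured or recommended values.
RATE_THRESHOLDS = [100.0, 1000.0, 10000.0]


def answers_for_rate(requests_per_second, thresholds=RATE_THRESHOLDS):
    """Return how many IPs to hand out for a service at the given rate.

    A rate below the first threshold gets 1 answer; each threshold that is
    reached adds one more, up to len(thresholds) + 1.
    """
    return bisect.bisect_right(thresholds, requests_per_second) + 1


if __name__ == "__main__":
    for rate in (50, 500, 5000, 50000):
        print(rate, "req/s ->", answers_for_rate(rate), "answer(s)")
```

A real implementation would probably want to smooth the rate over a window 
before changing the answer count.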

Ryan Durfey    M | 303-524-5099
CDN Support (24x7): 866-405-2993 or 
[email protected]


From: "Eric Friedrich (efriedri)" <[email protected]>
Reply-To: <[email protected]>
Date: Monday, December 4, 2017 at 6:18 PM
To: <[email protected]>, <[email protected]>
Subject: Re: Changing max_dns_answers default

Does EDNS0 (which TR already supports) reduce the severity of this problem? If 
so, could TR auto-detect whether the sending resolver supports EDNS0 when 
deciding how big to make the response?
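
A rough sketch of what that detection might look like, assuming the request 
parser can hand over the UDP payload size from the resolver's OPT record (or 
nothing if there was no OPT record).  The function name and the server-side 
cap are assumptions for illustration, not TR code.  RFC 6891 says advertised 
sizes below 512 are treated as 512.

```python
from typing import Optional

CLASSIC_UDP_LIMIT = 512   # RFC 1035 limit when the resolver sends no OPT record
SERVER_MAX_UDP = 1232     # assumed local cap; pick whatever the deployment trusts


def response_budget(advertised_size: Optional[int],
                    server_max: int = SERVER_MAX_UDP) -> int:
    """Return the byte budget for a UDP response to this resolver.

    advertised_size is the payload size from the request's OPT record,
    or None when the request carried no OPT record (no EDNS0).
    """
    if advertised_size is None:
        return CLASSIC_UDP_LIMIT
    # Sizes below 512 are treated as 512; never exceed the local cap.
    return min(max(advertised_size, CLASSIC_UDP_LIMIT), server_max)


if __name__ == "__main__":
    for size in (None, 300, 1000, 4096):
        print(size, "->", response_budget(size))
```

This only reflects what the resolver advertises.  It would not help clients 
sitting behind DNS infrastructure that mishandles large response packets 
regardless of what is advertised.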

—Eric

On Dec 4, 2017, at 5:31 PM, Jason Tucker <[email protected]> wrote:
HTTP-routing seems to go to the opposite end of the spectrum - the default
is to use a dispersion of "1", which gives best cache efficiency as Ryan
mentions. I think the behavior in this regard should be somewhat similar
between HTTP and DNS routing.
__Jason
On Mon, Dec 4, 2017 at 10:19 PM, Durfey, Ryan <[email protected]> wrote:
I like the idea of code that always keeps the response under the threshold,
and I think this is a good feature to add, but from a practical perspective
we always want the max DNS response to be the minimum viable for cache
efficiency.  Most of our services (95%+) should be set to 1, 2, 3, or 4,
correlated to the throughput of the service.  Setting the default to as many
as possible ensures that unless you are paying close attention you will
have terrible cache efficiency.  I would advocate for 2 or 3 since this
would cover the majority of our services, keep cache efficiency reasonable,
and work for most other applications as well.  I would also advocate adding
the threshold check in case someone goes too high or sets it to 0.
Ryan Durfey    M | 303-524-5099
CDN Support (24x7): 866-405-2993 or
[email protected]
From: Jason Tucker <[email protected]>
Reply-To: <[email protected]>, <[email protected]>
Date: Monday, December 4, 2017 at 3:10 PM
To: Phil Sorber <[email protected]>
Cc: <[email protected]>
Subject: Re: Changing max_dns_answers default
I can't comment on the development effort for that (or the compute /
latency overhead that it might add to TR), but I think having a default
variable that could be set per TC installation doesn't seem unreasonable.
__Jason
On Mon, Dec 4, 2017 at 9:11 PM, Phil Sorber <[email protected]> wrote:
What about adding code that would count the bytes dynamically and make
sure it keeps under the threshold? Maybe even make that the behavior for
the current default of 0.
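
As a rough illustration of the byte counting, here is a minimal sketch using
the wire sizes from RFC 1035 (12-byte header, 16 bytes per A record when the
owner name is a compression pointer).  It assumes A records only, empty
authority and additional sections, and made-up function names; it is not how
TR builds responses, and a real response would have less room than this
estimate.

```python
DNS_HEADER_BYTES = 12
CLASSIC_UDP_LIMIT = 512      # RFC 1035 UDP limit without EDNS0
A_RECORD_FIXED_BYTES = 14    # TYPE + CLASS + TTL + RDLENGTH (10) + IPv4 RDATA (4)


def encoded_name_len(fqdn):
    """Wire length of an uncompressed name: per-label length bytes plus root."""
    name = fqdn.rstrip(".")
    if not name:
        return 1
    return sum(1 + len(label) for label in name.split(".")) + 1


def max_a_answers_that_fit(qname, limit=CLASSIC_UDP_LIMIT, compressed=True):
    """Estimate how many A records fit in a response of at most `limit` bytes."""
    question = encoded_name_len(qname) + 4            # QTYPE + QCLASS
    owner = 2 if compressed else encoded_name_len(qname)
    budget = limit - DNS_HEADER_BYTES - question
    return max(budget // (owner + A_RECORD_FIXED_BYTES), 0)


def effective_answer_count(configured_max, available, qname,
                           limit=CLASSIC_UDP_LIMIT):
    """Clamp the configured maximum to what fits; 0 means 'as many as fit'."""
    wanted = available if configured_max == 0 else min(configured_max, available)
    return min(wanted, max_a_answers_that_fit(qname, limit))


if __name__ == "__main__":
    name = "edge.ds.cdn.example.com"   # hypothetical delivery service name
    print(max_a_answers_that_fit(name))
    print(effective_answer_count(0, 40, name))
```

With this shape, a setting of 0 stops meaning "unlimited" and instead means
"whatever fits", while an explicit value that is too high gets clamped.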
On Mon, Dec 4, 2017 at 2:06 PM Jason Tucker <[email protected]> wrote:
Yes, this is the UDP thing. We've had customers with clients that sit
behind DNS infrastructure that has problems with large response packets.
The "max" is going to be installation dependent, though. Variables such as
edge hostname convention and CDN DNS domain suffixes are going to cause
that threshold to vary from installation to installation. If you have
short FQDNs, you can fit many of them in a single UDP response.
__Jason
On Mon, Dec 4, 2017 at 9:00 PM, Phil Sorber <[email protected]> wrote:
You say it causes issues with "large cache groups". What is "large" in this
context? Maybe we should pick a default that puts us slightly below that.
Reading a little into your comment here, I assume the "problems" stem from
the number of answers that fit in a UDP packet. Maybe we should just make
the default below that threshold so we get as close to the max as possible
without causing said problems?
Thanks.
On Mon, Dec 4, 2017 at 12:52 PM Volz, Dylan <[email protected]> wrote:
Hi All,
max_dns_answers currently defaults to 0, which means an unlimited number
of answers and causes issues for deployments with large cache groups. I
opened a PR (1611
<https://github.com/apache/incubator-trafficcontrol/pull/1611>) to change
the default from 0 to 5, which is hopefully a sensible value for most
deployments. If this doesn’t seem like a sensible default, please respond
with alternatives.
Thanks,
Dylan

