+1 on 5. IMO we're overthinking this: 5 isn't unreasonably large, it isn't the end of the world if TCP gets triggered, and anyone it matters for will change the default value.

On Tue, Dec 5, 2017 at 8:33 AM, Dave Neuman <[email protected]> wrote:

> Hey Dylan,
> I think since we currently default to 0 (all) and we don't want to
> re-invent the wheel right now, I think 5 sounds like a reasonable default.
>
> Thanks,
> Dave
>
> On Tue, Dec 5, 2017 at 8:21 AM, Durfey, Ryan <[email protected]> wrote:
>
> > Not sure if EDNS(0) extensions would make a difference here.
> >
> > The real issue for caching is balancing load across many caches while
> > restricting content to as few caches as possible to maintain cache
> > efficiency. Too few DNS answers risks load piling up on a few caches and
> > overrunning them (though this is unlikely except in the case of very high
> > throughput). Too many DNS answers (much more likely) spreads your
> > service’s content across too many caches and increases the cache churn
> > and risk of hitting cold caches and having poor service performance.
> >
> > I spoke with our DNS team about a year ago about EDNS(0) relative to
> > client sub-netting (ECS) and it was not embraced due to the fact that it
> > made their recursion jump by several orders of magnitude and broke the
> > DNS system. Not sure if they plan to use EDNS(0) for other things, but not
> > sure how that would factor into the load on the caches and need to spread
> > that load via additional IP responses, but please educate me if you know
> > something about this.
> >
> > In an ideal world TR monitors the popularity of a service based on
> > incoming request counts per second and potentially expands or contracts
> > IP response. Given DNS caching that may be difficult to judge accurately,
> > but we may be able to use it to differentiate between a “1” and “4”
> > response.
> > I thought I cut a request for that a while back, but I can’t find it so I
> > created a new one:
> > https://github.com/apache/incubator-trafficcontrol/issues/1614
> >
> > Ryan Durfey M | 303-524-5099
> > CDN Support (24x7): 866-405-2993 or [email protected]
> >
> > From: "Eric Friedrich (efriedri)" <[email protected]>
> > Reply-To: "[email protected]" <[email protected]>
> > Date: Monday, December 4, 2017 at 6:18 PM
> > To: "[email protected]" <[email protected]>,
> >     "[email protected]" <[email protected]>
> > Subject: Re: Changing max_dns_answers default
> >
> > Does EDNS0 (which TR already supports) reduce the severity of this
> > problem? If so, could TR do an auto detection on if the sending resolver
> > supports EDNS0 when deciding how big to make the response?
> >
> > --Eric
> >
> > On Dec 4, 2017, at 5:31 PM, Jason Tucker <[email protected]> wrote:
> >
> > HTTP-routing seems to go to the opposite end of the spectrum - the
> > default is to use a dispersion of "1", which gives best cache efficiency
> > as Ryan mentions. I think the behavior in this regard should be somewhat
> > similar between HTTP and DNS routing.
> >
> > __Jason
> >
> > On Mon, Dec 4, 2017 at 10:19 PM, Durfey, Ryan <[email protected]> wrote:
> >
> > I like the idea of code that makes it always under the threshold and I
> > think this is a good feature to add, but from a practical perspective we
> > always want the max dns response to be the minimum viable for cache
> > efficiency. Most of our services (95%+) should be set to 1, 2, 3, or 4
> > correlated to throughput of the service. Making the default set to as
> > many as possible ensures that unless you are paying close attention you
> > will have terrible cache efficiency.
> > I would advocate for 2 or 3 since this would cover the majority of our
> > services, keep cache efficiency reasonable, and work for most other
> > applications as well. I would also advocate to add the threshold check in
> > case someone goes too high or sets it to 0.
> >
> > *Ryan Durfey* M | 303-524-5099
> > CDN Support (24x7): 866-405-2993 or [email protected]
> >
> > *From: *Jason Tucker <[email protected]>
> > *Reply-To: *"[email protected]" <[email protected]>,
> >     "[email protected]" <[email protected]>
> > *Date: *Monday, December 4, 2017 at 3:10 PM
> > *To: *Phil Sorber <[email protected]>
> > *Cc: *"[email protected]" <[email protected]>
> > *Subject: *Re: Changing max_dns_answers default
> >
> > I can't comment on the development effort for that (or the compute /
> > latency overhead that it might add to TR), but I think having a default
> > variable that could be set per TC installation doesn't seem unreasonable.
> >
> > __Jason
> >
> > On Mon, Dec 4, 2017 at 9:11 PM, Phil Sorber <[email protected]> wrote:
> >
> > What about adding code that would count the bytes dynamically and make
> > sure it keeps under the threshold? Maybe even make that the behavior for
> > the current default of 0.
> >
> > On Mon, Dec 4, 2017 at 2:06 PM Jason Tucker <[email protected]> wrote:
> >
> > Yes, this is the UDP thing. We've had customers with clients that sit
> > behind DNS infrastructure that has problems with large response packets.
> > However, the "max" is going to be installation dependent, though.
> > Variables such as edge hostname convention, and CDN DNS domain suffixes
> > are going to cause that threshold to vary from installation to
> > installation. If you have short FQDNs, you can fit many of them in a
> > single UDP response.
> >
> > __Jason
> >
> > On Mon, Dec 4, 2017 at 9:00 PM, Phil Sorber <[email protected]> wrote:
> >
> > You say it causes issues with "large cache groups". What is "large" in
> > this context? Maybe we should pick a default that puts us slightly below
> > that.
> >
> > Reading a little into your comment here, I assume the "problems" stem
> > from the number of answers that fit in a UDP packet. Maybe we should just
> > make the default below that threshold so we get as close to the max
> > without causing said problems?
> >
> > Thanks.
> >
> > On Mon, Dec 4, 2017 at 12:52 PM Volz, Dylan <[email protected]> wrote:
> >
> > Hi All,
> >
> > The max_dns_answers has been defaulted to 0, which is an unlimited
> > number of answers, which causes issues for deployments with large cache
> > groups. I opened a PR (1611
> > <https://github.com/apache/incubator-trafficcontrol/pull/1611>) to change
> > the default from 0 to 5 which is hopefully a sensible value for most
> > deployments. If this doesn’t seem like a sensible default please respond
> > with alternatives.
> >
> > Thanks,
> > Dylan
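
For anyone who wants to put rough numbers on the "count the bytes dynamically" idea quoted above, here is a minimal sketch. It is not Traffic Router code: every function name is hypothetical, and it assumes the response carries only the question plus IPv4 A records owned by the queried name, with no authority or additional section and no EDNS0 OPT record. The 12-byte header and the 512-byte UDP limit come from RFC 1035. As Jason notes, the real threshold depends on each installation's naming, so treat the result as an estimate only.

```python
# Sketch only: estimates how many A records fit in one DNS response over UDP.
# All names below are hypothetical; none of this is taken from Traffic Router.

DNS_HEADER_BYTES = 12    # fixed DNS header size (RFC 1035)
CLASSIC_UDP_LIMIT = 512  # maximum UDP message size without EDNS0 (RFC 1035)


def encoded_name_length(fqdn: str) -> int:
    """Wire length of an uncompressed ASCII name.

    Each label costs one length byte plus its characters, and the name
    ends with a single zero byte for the root label.
    """
    labels = [label for label in fqdn.rstrip(".").split(".") if label]
    return sum(len(label) + 1 for label in labels) + 1


def max_a_records(qname: str, limit: int = CLASSIC_UDP_LIMIT,
                  compress: bool = True) -> int:
    """Estimate how many A records for qname fit under limit bytes."""
    # Question section: QNAME plus 2 bytes QTYPE and 2 bytes QCLASS.
    question = encoded_name_length(qname) + 4
    # Owner name: a 2-byte compression pointer, or the full name again.
    owner = 2 if compress else encoded_name_length(qname)
    # TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2) = 10, plus 4 bytes of IPv4.
    record = owner + 10 + 4
    remaining = limit - DNS_HEADER_BYTES - question
    return max(0, remaining // record)


def effective_max_answers(configured: int, qname: str,
                          limit: int = CLASSIC_UDP_LIMIT) -> int:
    """Assumed policy: 0 means 'as many as fit'; other values are capped."""
    fits = max_a_records(qname, limit)
    return fits if configured == 0 else min(configured, fits)
```

Under those assumptions, a made-up name such as edge.ds.cdn.example.com leaves room for 29 compressed A records in 512 bytes, so a default of 5 sits well below the point where truncation and TCP retry would kick in. A response that repeats long hostnames in each record would fit far fewer.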
