On 2020-03-06 8:17 a.m., Fredrik Pettai via dnsdist wrote:
Hi Remi,

Thanks for your clarifications (see inline below)

On 6 Mar 2020, at 14:26, Remi Gacogne via dnsdist 
<[email protected]> wrote:

Hi,

On 3/6/20 8:09 AM, Fredrik Pettai via dnsdist wrote:
On 6 Mar 2020, at 05:42, Michael Van Der Beek <[email protected]> wrote:
Have you noticed this setting in dnsdist?
setUDPTimeout(num)

Yes, I did, but I didn’t play around with that before I sent the email to the 
mailing list.

Set the maximum time dnsdist will wait for a response from a backend over UDP, 
in seconds. Defaults to 2
I'm not sure if timeouts are classified as drops. My guess is that they probably 
are, because the query didn't get a response in time.

Yes they are.

"Drops", as reported by dnsdist, are almost always caused by the backend
not responding fast enough. On some setups, dealing with 100k+ qps, it
might also be caused by dnsdist not processing the responses fast
enough, but that's very easy to spot because at least one of the dnsdist
threads will use ~100% of one core.

Since your backend is a recursor, there are times when the recursor cannot 
reach an authoritative server, or encounters one that is not responding. Unbound 
has an exponential backoff when querying such servers. I think it starts with 10s.
https://nlnetlabs.nl/documentation/unbound/info-timeout/

I would suggest you set setUDPTimeout(10) in dnsdist. Frankly, if Unbound cannot 
respond to you in < 10 seconds, most likely the target authoritative server is 
not responding.

Good point. While I didn’t turn to the Unbound documentation (thanks for the 
pointer), I played around with the UDP timeout setting yesterday, first 
increasing it to setUDPTimeout(5), which yielded better results in terms of 
Drops (and increased the latency), and then later to 15, just to be sure that 
Unbound really should be done with the queries. I noticed that the Drops became 
a lot fewer (and latency increased again). But as you suggest, setUDPTimeout(10) 
is probably the ultimate setting.
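
For reference, the change discussed here is a single line in the dnsdist 
configuration file. A minimal sketch; the value 10 is simply the one suggested 
in this thread, not a general recommendation:

```lua
-- dnsdist configuration fragment (Lua)
-- Maximum time, in seconds, that dnsdist waits for a UDP response from a backend.
-- The default is 2; 10 is the value suggested in this thread.
setUDPTimeout(10)
```

A larger value only changes when dnsdist gives up on a backend. It does not make 
the backend answer any faster.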

OK so that settles it, your backends are not responding fast enough to
some queries. I would really advise you to try to understand why the
backend is taking so long to respond, instead of tuning dnsdist via
setUDPTimeout(), because a latency greater than 2s is going to cause a
lot of issues anyway.

Right, in this case the #1 source of queries that don’t make it under 2s is 
some MX servers and the software running on them.
A lot of crappy stuff out on the Internet is in contact with those 
servers/services, so broken reverse zones and badly set up domains that send 
spam are what I see in topSlow() all the time.
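
For anyone following along, topSlow() is run from the dnsdist console. A small 
sketch, assuming the optional arguments are the number of entries followed by a 
threshold in milliseconds (worth checking against the console documentation for 
your version):

```lua
-- dnsdist console
topSlow()          -- slowest queries, using the built-in defaults
topSlow(20, 2000)  -- top 20 among queries slower than 2000 ms (assumed argument order)
```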


You may see some benefit from lowering network-timeout (if you use PowerDNS Recursor) or similar settings. Most people have found little legitimate use in waiting 1500ms or more for a query response before trying another name server in the set.


This brings back one of the (last) questions in my original email, which was:
Is there a simple way to move those long tail queries / DNS clients into a “slow 
pool”?
Or maybe I should rephrase it as:
From a dnsdist PoV, would it be a good idea to move clients that ask lots 
of questions about badly functioning domains to their own worker pool?

I can’t seem to find any ready-to-use Rule/Action for sending clients that are 
causing X amount of SERVFAILs (or timeouts) to a pool via PoolAction.
(Although I see there's a possibility to block clients with such a query pattern 
(SERVFAIL/s), that’s not the right solution or service in this case.)
(I’m guessing “anything can be done” with some clever Lua scripting, but that’s not 
really the same as “simple”.)

I thought of using an NMG to statically map such clients (the MX servers) into 
their own worker pool, but I didn’t get that to work :(
(Perhaps I did it wrong, or I misinterpreted the function of an NMG.)
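
A static mapping along those lines might look roughly like the sketch below. The 
addresses and the pool name "slow" are placeholders, not taken from any real 
setup, and this is only one way to do it:

```lua
-- dnsdist configuration fragment (Lua)

-- Backends: one recursor in the default pool, one reserved for the slow clients.
-- Addresses and the pool name "slow" are placeholders.
newServer({address="192.0.2.10:53"})
newServer({address="192.0.2.11:53", pool="slow"})

-- Netmask group holding the source addresses of the MX servers (placeholders).
local mxClients = newNMG()
mxClients:addMask("198.51.100.25/32")
mxClients:addMask("198.51.100.26/32")

-- Match on the source address (second argument true) and send matching
-- queries to the "slow" pool.
addAction(NetmaskGroupRule(mxClients, true), PoolAction("slow"))
```

Rules are evaluated in order, so this rule needs to come before any other rule 
that would pick a pool for the same clients. For a single netmask, passing the 
string straight to addAction(), as in addAction("198.51.100.25/32", 
PoolAction("slow")), should have the same effect.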

Re,
Fredrik


_______________________________________________
dnsdist mailing list
[email protected]
https://mailman.powerdns.com/mailman/listinfo/dnsdist

