On 2020-03-06 8:17 a.m., Fredrik Pettai via dnsdist wrote:
Hi Remi,
Thanks for your clarifications (see inline below)
On 6 Mar 2020, at 14:26, Remi Gacogne via dnsdist
<[email protected]> wrote:
Hi,
On 3/6/20 8:09 AM, Fredrik Pettai via dnsdist wrote:
On 6 Mar 2020, at 05:42, Michael Van Der Beek <[email protected]> wrote:
Have you noticed this setting in dnsdist?
setUDPTimeout(num)
Yes, I did, but I hadn’t played around with it before I sent the email to the
mailing list.
Set the maximum time dnsdist will wait for a response from a backend over UDP,
in seconds. Defaults to 2.
I'm not sure if timeouts are classified as drops. My guess is that they probably
are, because dnsdist didn't get a response in time.
Yes they are.
"Drops", as reported by dnsdist, are almost always caused by the backend
not responding fast enough. On some setups, dealing with 100k+ qps, it
might also be caused by dnsdist not processing the responses fast
enough, but that's very easy to spot because at least one of the dnsdist
threads will use ~100% of one core.
Since your backend is a recursor, there are times when the recursor cannot
reach an authoritative server or finds it non-responsive. Unbound has an
exponential backoff when querying such servers. I think it starts with 10s.
https://nlnetlabs.nl/documentation/unbound/info-timeout/
I would suggest you set setUDPTimeout(10) in dnsdist. Frankly, if Unbound cannot
respond to you in < 10 seconds, most likely the target authoritative server is
not responding.
Good point. While I didn’t turn to the unbound documentation (thanks for the
pointer), I played around with the UDP timeout setting yesterday,
first increasing it to setUDPTimeout(5), which yielded better results in terms of
Drops (and increased the latency), and then later to 15, just to be sure that
unbound really should be done with the queries, and noticed that the Drops became
far fewer (and latency increased again). But as you suggest, setUDPTimeout(10) is
probably the ultimate setting.
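
For reference, the setting discussed here is a single line in the dnsdist configuration file. A minimal sketch, assuming the 10 second value suggested in this thread (it is not a general recommendation, and it raises worst-case latency as noted above):

```lua
-- dnsdist configuration (Lua).
-- Maximum time, in seconds, dnsdist waits for a UDP response from a backend.
-- The default is 2; 10 is only the value proposed in this thread.
setUDPTimeout(10)
```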
OK so that settles it, your backends are not responding fast enough to
some queries. I would really advise you to try to understand why the
backend is taking so long to respond, instead of tuning dnsdist via
setUDPTimeout(), because a latency greater than 2s is going to cause a
lot of issues anyway.
Right, in this case the #1 source of queries that don’t complete within 2s is
some MX servers and the software running on them.
A lot of crappy stuff out on the Internet is in contact with those
servers/services, so broken reverse zones and badly set-up domains that send spam
are what I see in topSlow() all the time.
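
As an illustration of the check mentioned above, topSlow() can be run from the dnsdist console with an explicit count and latency threshold. A small sketch; the values 20 and 2000 are arbitrary choices for this example, with 2000 ms picked to match the default UDP timeout:

```lua
-- Run in the dnsdist console.
-- First argument: how many entries to print.
-- Second argument: only consider queries slower than this many milliseconds.
topSlow(20, 2000)
```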
You may see some benefit from adjusting network-timeout (if you use
powerdns recursor) or similar settings to lower values. Most people have
found little legitimate use in waiting 1500ms or more for a query
response before trying another name server in the set.
This brings back one of the (last) questions in my original email, which was:
Is there a simple way to move those long-tail queries / DNS clients into a “slow
pool”?
Or maybe I should rephrase it as:
From a dnsdist PoV, would it be a good idea to move clients that ask lots
of questions about badly functioning domains to their own worker pool?
I can’t seem to find any ready-to-use Rule/Action for sending clients that are
causing X amount of SERVFAILs (or timeouts) to a PoolAction.
(Although I see there's a possibility to block clients with such a query pattern
(SERVFAIL/s), that’s not the right solution or service in this case.)
(I’m guessing “anything can be done” with some clever Lua scripting, but that’s not
really the same as “simple”.)
I thought of using an NMG to statically map such clients (the MX servers) into
their own worker pool, but I didn’t get that to work :(
(perhaps I did it wrong or I misinterpreted the function of an NMG)
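
One way the static NMG mapping described above could look in the dnsdist configuration is sketched below. This is not a confirmed fix for the setup in this thread. The backend addresses, the client addresses and the pool name "slow" are all hypothetical placeholders (documentation address ranges are used), and the sketch assumes no earlier rule already routes these clients elsewhere:

```lua
-- dnsdist configuration (Lua). All addresses below are placeholders.

-- Regular backend in the default pool.
newServer({address="192.0.2.53:53"})
-- Backend that only serves the hypothetical "slow" pool.
newServer({address="192.0.2.54:53", pool="slow"})

-- Netmask group holding the clients to isolate (e.g. the MX servers).
local mxClients = newNMG()
mxClients:addMask("198.51.100.25/32")
mxClients:addMask("2001:db8::25/128")

-- NetmaskGroupRule matches on the query's source address by default;
-- matching queries are sent to the "slow" pool.
addAction(NetmaskGroupRule(mxClients), PoolAction("slow"))
```

Rules are evaluated in the order they are added, so this rule needs to come before any broader rule that would otherwise match the same clients. A pool with no server assigned to it cannot answer, so at least one newServer() entry has to name the pool.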
Re,
Fredrik
_______________________________________________
dnsdist mailing list
[email protected]
https://mailman.powerdns.com/mailman/listinfo/dnsdist