September 22, 2021 3:03 PM, "Andrey Sedletsky via Pdns-users" 
<pdns-users@mailman.powerdns.com>
wrote:

> Good afternoon!

Hi Andrey,

> After restarting the pdns-recursor process, the number of "outgoing 
> query timeout" and "over capacity drops" sharply increases, which leads 
> to serious degradation of the service.
> This behavior manifests itself at times of high load on the server (more 
> than 400 thousand requests per second). With a lower load, restarting 
> the process does not lead to such consequences.

Have you considered the possibility that 400 thousand queries per second is 
simply a load that taxes your server to the brink of resource exhaustion? That 
is a lot of queries. According to 
https://pc.nanog.org/static/published/meetings/NANOG77/2142/20191029_Spacek_Lightning_Talk_Dns_v2.pdf
they were able to achieve a lot less than that in 2019.

> We are interested in what could be the reason for this behavior

On the hunch that your setup might be overloaded, I followed 
'over-capacity-drops' in the code and ended up at 
https://github.com/PowerDNS/pdns/blob/97a4cff6fc7b3da1ff44d42b950cfc17d2fd95cf/pdns/pdns_recursor.cc#L3146
so it seems that you have exhausted your thread capacity when that happens. 
See https://doc.powerdns.com/recursor/performance.html on how to tune the 
recursor. However, if that is real-world traffic rather than benchmark 
traffic, I would strongly suggest getting more servers installed.
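
As a rough illustration of why thread capacity can run out at this query rate, 
here is a back-of-the-envelope sketch based on Little's law (average number of 
queries in flight = arrival rate x mean time a query spends being resolved). 
It is not taken from the PowerDNS source. The thread count, the per-thread 
slot limit and the latencies are hypothetical numbers chosen for illustration, 
and the function names are my own.

```python
def in_flight(qps: float, mean_resolve_seconds: float) -> float:
    """Little's law: average concurrent queries = arrival rate * mean time in system."""
    return qps * mean_resolve_seconds


def over_capacity(qps: float, mean_resolve_seconds: float,
                  threads: int, slots_per_thread: int) -> bool:
    """True if the estimated concurrency exceeds the assumed total number of slots."""
    capacity = threads * slots_per_thread
    return in_flight(qps, mean_resolve_seconds) > capacity


# Hypothetical values, for illustration only.
QPS = 400_000            # query rate mentioned in this thread
THREADS = 8              # assumed number of worker threads
SLOTS_PER_THREAD = 2048  # assumed limit on concurrent queries per thread

for latency_ms in (1, 20, 100):
    seconds = latency_ms / 1000
    print(latency_ms, "ms:",
          round(in_flight(QPS, seconds)), "in flight,",
          "over capacity" if over_capacity(QPS, seconds, THREADS, SLOTS_PER_THREAD)
          else "within capacity")
```

The point of the sketch is only that at a fixed query rate the number of 
queries in flight grows linearly with the time each one takes, so anything 
that keeps queries waiting longer (for example outgoing queries that time out) 
can push the same load over the limit.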

The SERVFAIL response is just what I would expect in such a case. See 
https://www.rfc-editor.org/rfc/rfc1035.html#section-4.1.1.

Kind regards,

 Stefan