Otto,

Thanks for your assistance. Since these were set up with private IPs, I wasn't sure how useful the config would be; however, I have included it below.

# rec_control dump-throttlemap -
; throttle map dump follows
; remote IP     qname                 qtype  count  ttd                  reason
10.0.196.197    0.10.in-addr.arpa     A      2      2025-04-18T18:44:22  RCodeRefused
10.0.196.197    10.10.in-addr.arpa    A      3      2025-04-18T18:44:25  RCodeRefused
10.0.196.197    255.10.in-addr.arpa   A      1      2025-04-18T18:44:23  RCodeRefused
10.0.62.244     0.10.in-addr.arpa     A      2      2025-04-18T18:44:22  RCodeRefused
10.0.62.244     10.10.in-addr.arpa    A      3      2025-04-18T18:44:25  RCodeRefused
10.0.62.244     255.10.in-addr.arpa   A      2      2025-04-18T18:44:23  RCodeRefused
dump-throttlemap: dumped 6 records

# rec_control dump-failedservers -
I removed any count 1 or 2 for brevity since this email is already a long read.
; failed servers dump follows
; remote IP     count  timestamp
203.119.25.5    8      2025-04-18T18:43:44
203.119.26.5    8      2025-04-18T18:43:42
203.119.27.5    8      2025-04-18T18:43:41
203.119.28.5    8      2025-04-18T18:43:39
203.119.29.5    8      2025-04-18T18:43:45
200.189.41.10   7      2025-04-18T18:42:46
200.219.148.10  6      2025-04-18T18:39:47
200.219.154.10  6      2025-04-18T18:42:43
200.219.159.10  7      2025-04-18T18:42:45
200.192.233.10  7      2025-04-18T18:42:40
200.229.248.10  4      2025-04-18T18:42:42
203.119.95.53   3      2025-04-18T18:39:30
203.119.86.101  1229   2025-04-18T18:40:03
35.173.255.124  4895   2025-04-18T18:36:21
dump-failedservers: dumped 43 records

Config(s)

Please note that one of the forwarded zones is 'split-brained' because of a legacy setup. The zone consists of a private Active Directory environment and a separately maintained public zone. The configuration forwards to the private AD servers, and I believe the Lua script drops queries that have no match in that zone. The public zone is slowly being phased out.

While reviewing the previous server configs, I found a comment about this value but no context for the specific reasoning. This may explain the values you noted, but I would like to understand the implications of removing it. It doesn't seem like something that should have been enabled.

# https://github.com/PowerDNS/pdns/issues/6186
max-negative-ttl=0

/etc/pdns-recursor/recursor.conf

---
dnssec:
  validation: validate
incoming:
  allow_from:
  - 127.0.0.1/8
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
  - 'fd00::/8'
  - '2607:B600::/32'
  listen:
  - 0.0.0.0
  max_tcp_clients: 128
  max_tcp_per_client: 0
  max_tcp_queries_per_connection: 0
  port: 53
  tcp_timeout: 2
outgoing:
  dont_query: []
  max_qperq: 50
  network_timeout: 1500
packetcache:
  max_entries: 1000000
recordcache:
  max_entries: 1000000
  max_negative_ttl: 0
  max_ttl: 86400
recursor:
  daemon: false
  forward_zones:
  - zone: momentumbusiness.com
    recurse: false
    forwarders:
    - 10.255.255.76
    - 10.1.3.228
  - zone: 10.in-addr.arpa
    recurse: false
    forwarders:
    - 10.0.196.197
    - 10.0.62.244
  - zone: 168.192.in-addr.arpa
    recurse: false
    forwarders:
    - 10.0.196.197
    - 10.0.62.244
  - zone: 16.172.in-addr.arpa
    recurse: false
    forwarders:
    - 10.0.196.197
    - 10.0.62.244
  lua_dns_script: /etc/pdns-recursor/momentumbusiness_com.lua
  max_recursion_depth: 40
  max_total_msec: 7000
  minimum_ttl_override: 1
  server_id: nsres01.momentumtelecom.com
  setgid: pdns-recursor
  setuid: pdns-recursor
webservice:
  address: 0.0.0.0
  allow_from:
  - 192.168.9.164
  - 192.168.21.134
  - 192.168.20.0/24
  api_key: <sanitized>
  port: 8080
  webserver: true
logging:
  loglevel: 3
...

/etc/pdns-recursor/momentumbusiness_com.lua

pdnslog("Lua NXDomain filter for momentumbusiness.com loading...", pdns.loglevels.Notice)
nxdomainsuffix=newDN("momentumbusiness.com")

function nxdomain(dq)
  if dq.qname:isPartOf(nxdomainsuffix) then
    dq.appliedPolicy.policyKind = pdns.policykinds.Drop
    return true
  end
  return false
end

On Fri, Apr 18, 2025 at 9:39 AM Otto Moerbeek <o...@drijf.net> wrote:

> On Fri, Apr 18, 2025 at 08:28:48AM -0400, Scott Crace via Pdns-users wrote:
>
> Hi,
>
> Please include your config. That said:
>
> You seem to have pretty low cache hit ratio, a high number of outgoing
> queries. How is your cache configged?
>
> Also some throttling is going on.
> I suspect rec has trouble contacting
> one or more auths or forwarders. The throttling tables can be viewed
> using
>
> rec_control dump-throttlemap -
> rec_control dump-failedservers -
>
> Also, what happens *during* the trace can be very relevant. If one
> auth (or forwarder) does not respond, rec will turn to another one,
> but only after the timeout of 1500ms by default.
>
> -Otto
>
> > Hello all,
> > Long time lurker on the message list and would like some performance
> > and/or tuning advice.
> > We've been using pdns-recursor as internal recursive nameservers for quite
> > some time now.
> > The original implementer of pdns departed and I was recently tasked with
> > replacing or upgrading all of the servers with newer RHEL9 versions. I
> > opted to build fresh and migrate the configuration to the latest 5.2
> > release.
> >
> > I'm hearing occasional complaints about odd issues and/or clients cycling
> > through their DNS servers rapidly (timeouts?). Manual testing DNS works but
> > I am reading through the metrics and performance documentation. I am hoping
> > someone with a more experienced eye could take a look at a sampling of the
> > periodic statistics report (below) and provide some insight or
> > prioritization on any urgent issues I should focus on studying first.
> >
> > My observations:
> > * I do note that the performance documentation talks about
> > firewalld/stateful firewalls impact but the legacy servers were using the
> > same basic setup. If the firewall is the problem is there a way to validate
> > this (other than stopping firewalld and waiting)?
> > * The "worker" threads seem evenly distributed to my novice eye and our qps
> > (queries per second) rate is low as I would expect since the name servers
> > are internal only resources.
> > * I ran a few pcaps and rec_control trace-regex for specific domain items
> > being reported as problematic.
> > Everything seemed to be working with the
> > trace-regex always showing "Step3 Final resolve: No Error/6 or 8".
> >
> > Thank you in advance for your time and consideration.
> >
> > Sincerely,
> > Scotsie
> >
> > ```
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic statistics report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" cache-entries="23666" negcache-entries="497" questions="6831695" record-cache-acquired="286931329" record-cache-contended="64414" record-cache-contended-perc="0.02" record-cache-hitratio-perc="0.87"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic statistics report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" packetcache-acquired="16887684" packetcache-contended="1019" packetcache-contended-perc="0.01" packetcache-entries="7112" packetcache-hitratio-perc="37.75"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic statistics report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" edns-entries="38" failed-host-entries="50" non-resolving-nameserver-entries="0" nsspeed-entries="968" saved-parent-ns-sets-entries="65" throttle-entries="8"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic statistics report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" concurrent-queries="1" dot-outqueries="0" idle-tcpout-connections="0" outgoing-timeouts="36594" outqueries="14668546" outqueries-per-query-perc="214.71" tcp-outqueries="3131" throttled-queries-perc="1.90"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic statistics report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" taskqueue-expired="0" taskqueue-pushed="540" taskqueue-size="0"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Queries handled by thread" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" count="3470098" thread="0" tname="worker"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Queries handled by thread" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.170" count="3360836" thread="1" tname="worker"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Queries handled by thread" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.171" count="764" thread="2" tname="tcpworker"
> > Apr 17 16:07:28 nsrecdns01-1 pdns-recursor[1092]: msg="Periodic QPS report" subsystem="stats" level="0" prio="Info" tid="0" ts="1744920448.171" averagedOver="1800" qps="117"
> > ```
> >
> > _______________________________________________
> > Pdns-users mailing list
> > Pdns-users@mailman.powerdns.com
> > https://mailman.powerdns.com/mailman/listinfo/pdns-users
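
For anyone wanting to cross-check the percentages in the quoted statistics report, the raw counters can be recomputed by hand. The following is only a minimal Python sketch: the helper names `parse_stats` and `ratios` are made up for illustration, and the assumption that outqueries-per-query-perc is simply outqueries divided by questions is mine (it does reproduce the reported 214.71). The timeout ratio is a derived figure that the report itself does not print.

```python
import re

# Counters copied from the periodic statistics report quoted above.
REPORT = (
    'questions="6831695" outqueries="14668546" outgoing-timeouts="36594" '
    'record-cache-hitratio-perc="0.87" packetcache-hitratio-perc="37.75"'
)


def parse_stats(text):
    """Return the numeric key="value" pairs of a structured log line as floats."""
    return {key: float(value) for key, value in re.findall(r'([\w-]+)="([\d.]+)"', text)}


def ratios(stats):
    """Derive percentages from the raw counters (hypothetical helper)."""
    return {
        # Assumed formula: outgoing queries sent per client question.
        "outqueries_per_query_perc": 100 * stats["outqueries"] / stats["questions"],
        # Share of outgoing queries that ended in a timeout.
        "timeouts_per_outquery_perc": 100 * stats["outgoing-timeouts"] / stats["outqueries"],
    }


stats = parse_stats(REPORT)
derived = ratios(stats)
print(f"outqueries per query:  {derived['outqueries_per_query_perc']:.2f}%")  # 214.71%
print(f"timeouts per outquery: {derived['timeouts_per_outquery_perc']:.2f}%")
```

Read this way, each client question triggered a little over two outgoing queries on average, which fits the low record cache hit ratio in the same report.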