In my experience, pdns_recursor (admittedly, I have only tested older versions) does not retry fast enough for users not to notice the outage.
I moved to BGP for my internal auth addresses. The auths check themselves and announce their service IP only when they are ready to answer. If you can't move to BGP, give dnsdist a try. In my experience it does a very good job of figuring out whether a server is up or not. Both options complicate your setup.

You could also experiment with server-down-max-fails and server-down-throttle-time to minimize the number of queries lost to unresponsive nameservers. But that's dangerous too, because these settings apply to all servers, not only your internal auths. And remember, pdns_recursor does no background checking of whether a nameserver is alive. Background checking is only done by dnsdist, afaik.

Cheers
Thomas

On 08.02.22 at 13:08, Prochazka via Pdns-users wrote:
Hello,

using pdns-recursor 4.5.7-1pdns.bullseye I am having a problem with DNS redundancy for records with an expired TTL (best seen with a low TTL). Forward zones are used for internal domains only.

Our clients have 3 recursors configured (resolv.conf), and every recursor connects to any of the four auth servers for our domains. All subdomains are delegated to their own zones but reside on the same auth servers; the extra step is using forward-zones.

I thought it depended on the configured order, so I set it to use the same location first and the remote location last (avoiding the firewall where possible).

Pdns recursor config:

...
forward-zones=
forward-zones+=some.domain.tld=AUTH1_ipv6
forward-zones+=some.domain.tld=AUTH1_ipv4
forward-zones+=some.domain.tld=AUTH2_ipv6
forward-zones+=some.domain.tld=AUTH2_ipv4
forward-zones+=some.domain.tld=AUTH3_ipv6
forward-zones+=some.domain.tld=AUTH3_ipv4
forward-zones+=some.domain.tld=AUTH4_ipv6
forward-zones+=some.domain.tld=AUTH4_ipv4
...

AAAA dns query:

;; QUESTION SECTION:
;host.some.domain.tld. IN AAAA

;; ANSWER SECTION:
host.some.domain.tld. 60 IN CNAME host1.some.domain.tld.
host1.some.domain.tld. 3600 IN AAAA host1_ipv6

Problem:
When there is maintenance on, for example, AUTH4 (server is offline):

Client <-> Recursor:
233336 2022-02-08 01:57:58,031241 client_ipv6 REC1_ipv6 DNS 106 Standard query 0x7f30 AAAA host.some.domain.tld
233337 2022-02-08 01:57:58,031241 client_ipv6 REC1_ipv6 DNS 106 Standard query 0xb42e A host.some.domain.tld
233442 2022-02-08 01:57:59,902472 REC1_ipv6 client_ipv6 DNS 106 Standard query response 0x7f30 Server failure AAAA host.some.domain.tld
233443 2022-02-08 01:57:59,902577 REC1_ipv6 client_ipv6 DNS 106 Standard query response 0xb42e Server failure A host.some.domain.tld

Recursor <-> Auth:
196982 2022-02-08 01:57:58,031733 REC1_ipv4 AUTH4_ipv4 DNS 97 Standard query 0xedac AAAA host.some.domain.tld OPT
196983 2022-02-08 01:57:58,031981 REC1_ipv4 AUTH4_ipv4 DNS 97 Standard query 0x1246 A host.some.domain.tld OPT
...
197989 2022-02-08 01:58:13,667275 REC1_ipv4 AUTH1_ipv4 DNS 107 Standard query 0xf4e9 A host.some.domain.tld.domain.tld OPT
197990 2022-02-08 01:58:13,667542 REC1_ipv4 AUTH1_ipv4 DNS 107 Standard query 0xff8c AAAA host.some.domain.tld.domain.tld OPT
197991 2022-02-08 01:58:13,671010 AUTH1_ipv4 REC1_ipv4 DNS 154 Standard query response 0xf4e9 No such name A host.some.domain.tld.domain.tld SOA ns.domain.tld OPT
197992 2022-02-08 01:58:13,671222 AUTH1_ipv4 REC1_ipv4 DNS 154 Standard query response 0xff8c No such name AAAA host.some.domain.tld.domain.tld SOA ns.domain.tld OPT
...
218012 2022-02-08 02:02:03,229271 REC1_ipv4 AUTH4_ipv4 DNS 97 Standard query 0xce1c A host.some.domain.tld OPT
218013 2022-02-08 02:02:03,229359 REC1_ipv4 AUTH4_ipv4 DNS 97 Standard query 0xccf5 AAAA host.some.domain.tld OPT
218014 2022-02-08 02:02:03,232700 AUTH4_ipv4 REC1_ipv4 DNS 140 Standard query response 0xce1c A host.some.domain.tld CNAME host1.some.domain.tld A host1_ipv4 OPT
218015 2022-02-08 02:02:03,232700 AUTH4_ipv4 REC1_ipv4 DNS 152 Standard query response 0xccf5 AAAA host.some.domain.tld CNAME host1.some.domain.tld AAAA host1_ipv6 OPT

It looks as if the recursor keeps querying the same auth server for such a record until the server is back up. How can I change this setup so that maintenance doesn't break resolving?

Thanks.

_______________________________________________
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/pdns-users
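
As a rough sketch of the two throttling settings named in the reply above, a recursor.conf fragment could look like the following. The values are illustrative assumptions, not recommendations from the thread, and as the reply warns, both settings apply to every remote server the recursor talks to, not only the internal auths.

```
# recursor.conf (sketch; the values below are assumptions, tune them for your setup)

# Number of consecutive timeouts or I/O failures before a server is treated as down.
# A lower value marks an offline auth as down sooner.
server-down-max-fails=8

# Number of seconds queries to a server are throttled after it has been marked as down.
server-down-throttle-time=60
```

Because the recursor does no background health checking, a throttled server is only retried once the throttle time has passed, so some queries can still be lost each time it is probed again.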