In my experience pdns_recursor (okay, I have only tested older versions)
does not retry fast enough to hide a dead auth server from your users.

I moved my internal auth addresses to BGP. The auths check themselves
and announce their service IP only when they are ready to answer.
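
A rough sketch of what such a self-check could look like, not my actual
script. It assumes ExaBGP as the BGP speaker (any daemon with a similar
announce/withdraw hook would do); the function names and the prefix
192.0.2.53/32 are placeholders made up for illustration. The check sends
one SOA query to the local auth and only reports "announce" if a proper
answer comes back.

```python
import socket
import struct

def build_query(name: str, qtype: int = 6, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (qtype 6 = SOA, class IN, RD bit not set)."""
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def is_healthy(response: bytes, txid: int = 0x1234) -> bool:
    """Healthy: matching ID, QR bit set, RCODE NOERROR, at least one answer."""
    if len(response) < 12:
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return rid == txid and bool(flags & 0x8000) and (flags & 0x000F) == 0 and ancount > 0

def check_auth(address: str, zone: str, timeout: float = 1.0) -> bool:
    """Send one SOA query over UDP to the auth and judge the reply."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(zone), (address, 53))
            data, _ = sock.recvfrom(4096)
        except OSError:  # timeouts and unreachable errors both count as down
            return False
    return is_healthy(data)

def bgp_command(healthy: bool, service_prefix: str) -> str:
    """ExaBGP-style text command (assumption: ExaBGP reads it from stdout)."""
    action = "announce" if healthy else "withdraw"
    return f"{action} route {service_prefix} next-hop self"
```

In practice you would call check_auth() every few seconds and print
bgp_command() whenever the result changes, so the service IP disappears
from the network before the recursors start timing out on it.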

If you don't have the chance to move to BGP, give dnsdist a try. In my
experience it does a very good job of figuring out whether a server is
up or not.

Both options complicate your setup. You could experiment with
server-down-max-fails and server-down-throttle-time to minimize the
number of queries lost to unresponsive nameservers. But that's
dangerous too, because these settings apply to all servers, not only
your internal auths.
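
To make the trade-off concrete, here is a simplified model of how I
understand the two settings to interact. It is not the recursor's real
code; the class and method names are invented for illustration. After
max_fails consecutive failures a server is skipped for throttle_time
seconds, then it gets one more try, and a further failure throttles it
again.

```python
class ThrottleModel:
    """Toy model of server-down-max-fails / server-down-throttle-time."""

    def __init__(self, max_fails: int, throttle_time: float) -> None:
        self.max_fails = max_fails
        self.throttle_time = throttle_time
        self.fails: dict[str, int] = {}
        self.throttled_until: dict[str, float] = {}

    def usable(self, server: str, now: float) -> bool:
        """A server may be queried once its throttle window has passed."""
        return now >= self.throttled_until.get(server, 0.0)

    def record_failure(self, server: str, now: float) -> None:
        """Count a query that got no response; throttle at the limit."""
        self.fails[server] = self.fails.get(server, 0) + 1
        if self.fails[server] >= self.max_fails:
            self.throttled_until[server] = now + self.throttle_time

    def record_success(self, server: str) -> None:
        """Any response clears the failure count and the throttle."""
        self.fails.pop(server, None)
        self.throttled_until.pop(server, None)
```

A low max_fails means fewer queries are wasted on a dead auth, but the
same low limit also applies to every nameserver on the Internet, so a
few dropped packets can get a perfectly healthy server throttled.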

But remember, pdns_recursor does not check in the background whether a
nameserver is alive. As far as I know, only dnsdist does background
health checking.

Cheers Thomas

On 08.02.22 at 13:08, Prochazka via Pdns-users wrote:
Hello,

Using pdns-recursor 4.5.7-1pdns.bullseye, I am having a problem with DNS
redundancy for records whose TTL has expired (best seen with a low TTL).
Forward zones are used for internal domains only. Our clients have 3
recursors configured (resolv.conf), and every recursor can connect to
any of the four auth servers for our domains. All subdomains are
delegated to their own zones but reside on the same auth servers; the
extra step is using forward-zones. I thought the choice of server
depended on the configured order, so I listed the servers in the same
location first and the remote location last (avoiding the firewall
where possible).

Pdns recursor config:

...
forward-zones=
forward-zones+=some.domain.tld=AUTH1_ipv6
forward-zones+=some.domain.tld=AUTH1_ipv4
forward-zones+=some.domain.tld=AUTH2_ipv6
forward-zones+=some.domain.tld=AUTH2_ipv4
forward-zones+=some.domain.tld=AUTH3_ipv6
forward-zones+=some.domain.tld=AUTH3_ipv4
forward-zones+=some.domain.tld=AUTH4_ipv6
forward-zones+=some.domain.tld=AUTH4_ipv4
...

AAAA dns query:
;; QUESTION SECTION:
;host.some.domain.tld.    IN    AAAA

;; ANSWER SECTION:
host.some.domain.tld. 60    IN    CNAME    host1.some.domain.tld.
host1.some.domain.tld. 3600 IN    AAAA    host1_ipv6

Problem:
When there is maintenance on, for example, AUTH4 (the server is offline):

Client <-> Recursor:
233336    2022-02-08 01:57:58,031241    client_ipv6    REC1_ipv6    DNS    106    Standard query 0x7f30 AAAA host.some.domain.tld
233337    2022-02-08 01:57:58,031241    client_ipv6    REC1_ipv6    DNS    106    Standard query 0xb42e A host.some.domain.tld
233442    2022-02-08 01:57:59,902472    REC1_ipv6    client_ipv6    DNS    106    Standard query response 0x7f30 Server failure AAAA host.some.domain.tld
233443    2022-02-08 01:57:59,902577    REC1_ipv6    client_ipv6    DNS    106    Standard query response 0xb42e Server failure A host.some.domain.tld

Recursor <-> Auth:
196982    2022-02-08 01:57:58,031733    REC1_ipv4    AUTH4_ipv4    DNS    97    Standard query 0xedac AAAA host.some.domain.tld OPT
196983    2022-02-08 01:57:58,031981    REC1_ipv4    AUTH4_ipv4    DNS    97    Standard query 0x1246 A host.some.domain.tld OPT
...
197989    2022-02-08 01:58:13,667275    REC1_ipv4    AUTH1_ipv4    DNS    107    Standard query 0xf4e9 A host.some.domain.tld.domain.tld OPT
197990    2022-02-08 01:58:13,667542    REC1_ipv4    AUTH1_ipv4    DNS    107    Standard query 0xff8c AAAA host.some.domain.tld.domain.tld OPT
197991    2022-02-08 01:58:13,671010    AUTH1_ipv4    REC1_ipv4    DNS    154    Standard query response 0xf4e9 No such name A host.some.domain.tld.domain.tld SOA ns.domain.tld OPT
197992    2022-02-08 01:58:13,671222    AUTH1_ipv4    REC1_ipv4    DNS    154    Standard query response 0xff8c No such name AAAA host.some.domain.tld.domain.tld SOA ns.domain.tld OPT
...
218012    2022-02-08 02:02:03,229271    REC1_ipv4    AUTH4_ipv4    DNS    97    Standard query 0xce1c A host.some.domain.tld OPT
218013    2022-02-08 02:02:03,229359    REC1_ipv4    AUTH4_ipv4    DNS    97    Standard query 0xccf5 AAAA host.some.domain.tld OPT
218014    2022-02-08 02:02:03,232700    AUTH4_ipv4    REC1_ipv4    DNS    140    Standard query response 0xce1c A host.some.domain.tld CNAME host1.some.domain.tld A host1_ipv4 OPT
218015    2022-02-08 02:02:03,232700    AUTH4_ipv4    REC1_ipv4    DNS    152    Standard query response 0xccf5 AAAA host.some.domain.tld CNAME host1.some.domain.tld AAAA host1_ipv6 OPT

It looks as if the recursor keeps querying the same auth server for such
a record until that server is back up. How can I change this setup so
that maintenance doesn't break resolving?

Thanks.

_______________________________________________
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/pdns-users