Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Otto Moerbeek via Pdns-users
On Thu, Jan 26, 2023 at 10:57:21PM +0100, Arien Vijn wrote:

> 
> > On 26 Jan 2023, at 19:00, Otto Moerbeek  wrote:
> 
> [...]
> 
> > I expect the aggressive cache workaround to function.
> 
> It seems so indeed.
> 
> > [...]
> 
> Thanks for the explanation! This is really useful because KPN pointed to our 
> DNS servers.
> 
> We also saw this with other (KPN hosted) 'gslb-domains', which also show no 
> trouble anymore after disabling the
> aggressive cache. So, if we go the NTA-way then I am afraid that we'll have 
> to add a series of NTAs then :/
> 
> At any rate, I am really glad with this explanation. I hope that KPN, and the 
> parties they outsourced their DNS service to, will appreciate this too :)
> 
> -- Arien

This gives background information and a link to a remedy to be
employed on the load balancer side.

https://en.blog.nic.cz/2019/07/10/error-in-dnssec-implementation-on-f5-big-ip-load-balancers/

-Otto



___
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/pdns-users


Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Arien Vijn via Pdns-users

> On 26 Jan 2023, at 19:00, Otto Moerbeek  wrote:

[...]

> I expect the aggressive cache workaround to function.

It seems so indeed.

> What is happening is that a query of a non-existent type (e.g. )
> for xdsl-c-serviceweb.gslb.kpn.com
> 
> $ dig @ns1gslb.kpn.com.  xdsl-c-serviceweb.gslb.kpn.com  +dnssec
> 
> produces an NSEC3 record that denies all types except TXT and RRSIG:
> 
> cq026lgcduus730qu6cbhtrt7qpr2jnu.gslb.kpn.com. 86400 IN NSEC3 1 0 1 
> 19623DE58C1E7E40 CQ026LGCDUUS730QU6CBHTRT7QPR2JNV TXT RRSIG
> 
> So when the A record expires and somebody has done an  query in
> between, the aggressive cache concludes that the wanted A record does
> not exist and does not even ask the auth for it.
> 
> Then after a cache wipe it works because when the (aggressive) cache is
> empty for that zone, there is also no NSEC3 record denying anything.
> 
> So in the end this is a misconfigured domain. Completely disabling the
> aggressive cache is a bit of a big hammer, you can also add an NTA for
> the specific problem domain, something like:
> 
> addNTA('gslb.kpn.com', 'Invalid NSEC3 record served for 
> xdsl-c-serviceweb.gslb.kpn.com')
> 
> in your Lua config file. This effectively does disable DNSSEC for the
> domain. And please also report this to KPN.

Thanks for the explanation! This is really useful because KPN pointed to our 
DNS servers.

We also saw this with other (KPN hosted) 'gslb-domains', which also show no 
trouble anymore after disabling the
aggressive cache. So, if we go the NTA-way then I am afraid that we'll have to 
add a series of NTAs then :/
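
If a series of NTAs is needed, the lines can be generated rather than typed by hand. A minimal Python sketch, assuming the addNTA(name, reason) form quoted in this thread; the helper names are illustrative and the second zone in the list is a made-up placeholder, not a known problem domain:

```python
def lua_quote(text: str) -> str:
    """Escape backslashes and single quotes for a single-quoted Lua string."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


def nta_lines(zones, reason):
    """Return one addNTA(...) line per zone for the recursor's Lua config file."""
    return [
        f"addNTA('{lua_quote(zone.rstrip('.'))}', '{lua_quote(reason)}')"
        for zone in zones
    ]


# gslb.kpn.com comes from this thread; gslb.example.net is a hypothetical placeholder.
zones = ["gslb.kpn.com", "gslb.example.net"]
for line in nta_lines(zones, "Invalid NSEC3 records served for this zone"):
    print(line)
```

Each NTA switches off DNSSEC validation for the whole zone it names, so the list is best kept as short as possible.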

At any rate, I am really glad with this explanation. I hope that KPN, and the 
parties they outsourced their DNS service to, will appreciate this too :)

-- Arien







Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Otto Moerbeek via Pdns-users
On Thu, Jan 26, 2023 at 05:37:12PM +0100, Arien Vijn via Pdns-users wrote:

> Hi Peter,
> 
> > On 26 Jan 2023, at 17:28, Peter van Dijk via Pdns-users 
> >  wrote:
> 
> [...]
> 
> > After some brief investigation we somewhat suspect this is aggressive
> > NSEC caching. Can you see if aggressive-nsec-cache-size=0 makes the
> > problem go away?
> 
> Thanks! I'll add this line to the configuration right away :)
> 
> -- Ariën
> 

I expect the aggressive cache workaround to function.

What is happening is that a query of a non-existent type (e.g. )
for xdsl-c-serviceweb.gslb.kpn.com

$ dig @ns1gslb.kpn.com.  xdsl-c-serviceweb.gslb.kpn.com  +dnssec 

produces an NSEC3 record that denies all types except TXT and RRSIG:

cq026lgcduus730qu6cbhtrt7qpr2jnu.gslb.kpn.com. 86400 IN NSEC3 1 0 1 
19623DE58C1E7E40 CQ026LGCDUUS730QU6CBHTRT7QPR2JNV TXT RRSIG
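
For reference, the owner label of an NSEC3 record is the base32hex-encoded hash of a name (RFC 5155), so it is possible to check which name a given NSEC3 record is about. A minimal Python sketch, assuming hash algorithm 1 (SHA-1) and taking the salt and iteration count from the record above; the function name is illustrative only:

```python
import base64
import hashlib


def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """Compute the RFC 5155 NSEC3 hash (SHA-1) of a name as a base32hex label."""
    # Canonical wire format: lowercase labels, each prefixed by its length,
    # terminated by the zero-length root label.
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    wire = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    salt = bytes.fromhex(salt_hex)

    # One initial hash, then `iterations` additional rounds, salt appended each time.
    digest = hashlib.sha1(wire + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()

    # Translate standard base32 to the base32hex alphabet used by NSEC3.
    standard = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
    extended_hex = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
    encoded = base64.b32encode(digest).decode("ascii")
    return encoded.translate(str.maketrans(standard, extended_hex)).lower()


# Parameters "1 0 1 19623DE58C1E7E40": algorithm 1, flags 0, 1 iteration, this salt.
print(nsec3_hash("xdsl-c-serviceweb.gslb.kpn.com", "19623DE58C1E7E40", 1))
```

If the printed label equals the owner label of the NSEC3 record, the record matches the queried name itself, and its type bitmap is then read as the complete list of types at that name.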

So when the A record expires and somebody has done an  query in
between, the aggressive cache concludes that the wanted A record does
not exist and does not even ask the auth for it.

Then after a cache wipe it works because when the (aggressive) cache is
empty for that zone, there is also no NSEC3 record denying anything.
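
The sequence described above can be sketched as a toy model. This is not PowerDNS code; it is a simplified Python illustration, assuming an aggressive cache that stores one NSEC3 type bitmap per name and a record cache that is expired by hand. The class, the AAAA query type and the 192.0.2.1 address are assumptions made for the example:

```python
class ToyResolver:
    """Toy model of a record cache plus an aggressive NSEC3 cache."""

    def __init__(self, auth_data, served_bitmap):
        self.auth_data = auth_data          # what the auth really has: {(name, qtype): rdata}
        self.served_bitmap = served_bitmap  # types the auth lists in its NSEC3 bitmap
        self.records = {}                   # record cache
        self.nsec3 = {}                     # aggressive cache: name -> set of types

    def resolve(self, name, qtype):
        if (name, qtype) in self.records:
            return self.records[(name, qtype)]
        # Aggressive use: a cached NSEC3 for this name whose bitmap lacks
        # the type is taken as proof that the type does not exist.
        if name in self.nsec3 and qtype not in self.nsec3[name]:
            return "NODATA (from aggressive cache, auth not asked)"
        answer = self.auth_data.get((name, qtype))
        if answer is None:
            # The denial comes with the (incomplete) NSEC3 bitmap, which gets cached.
            self.nsec3[name] = set(self.served_bitmap)
            return "NODATA (from auth)"
        self.records[(name, qtype)] = answer
        return answer

    def expire(self, name, qtype):
        self.records.pop((name, qtype), None)

    def wipe(self):
        self.records.clear()
        self.nsec3.clear()


name = "xdsl-c-serviceweb.gslb.kpn.com."
resolver = ToyResolver({(name, "A"): "192.0.2.1"}, {"TXT", "RRSIG"})
print(resolver.resolve(name, "A"))     # fetched from the auth and cached
print(resolver.resolve(name, "AAAA"))  # denied by the auth; faulty NSEC3 now cached
resolver.expire(name, "A")
print(resolver.resolve(name, "A"))     # denied from the aggressive cache
resolver.wipe()
print(resolver.resolve(name, "A"))     # fetched from the auth again
```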

So in the end this is a misconfigured domain. Completely disabling the
aggressive cache is a bit of a big hammer, you can also add an NTA for
the specific problem domain, something like:

addNTA('gslb.kpn.com', 'Invalid NSEC3 record served for 
xdsl-c-serviceweb.gslb.kpn.com')

in your Lua config file. This effectively does disable DNSSEC for the
domain. And please also report this to KPN.

-Otto





Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Arien Vijn via Pdns-users
Hi Peter,

> On 26 Jan 2023, at 17:28, Peter van Dijk via Pdns-users 
>  wrote:

[...]

> After some brief investigation we somewhat suspect this is aggressive
> NSEC caching. Can you see if aggressive-nsec-cache-size=0 makes the
> problem go away?

Thanks! I'll add this line to the configuration right away :)

-- Ariën





Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Arien Vijn via Pdns-users
Hi Otto,

Thanks for checking. Here is the configuration:

local-address=0.0.0.0, ::
local-port=53
query-local-address=::,0.0.0.0
threads=8

allow-from=127.0.0.0/8, ::1/128, 10.0.0.0/8, 87.251.42.0/26, 2001:7b8:650::/48

dnssec=validate

lua-config-file=/etc/powerdns/recursor.lua

webserver=yes
webserver-port=8082
webserver-address=::
webserver-password=
webserver-allow-from=0.0.0.0/32,::/0
api-key=

pdns-distributes-queries=true
reuseport=yes
any-to-tcp=yes
root-nx-trust=no
version-string=powerdns
max-ns-per-resolve=5

-- Ariën


> On 26 Jan 2023, at 17:04, Otto Moerbeek  wrote:
> 
> Hi,
> 
> Please show your configuration.
> 
> I do not think your analysis is to the point.
> If I repeat a scenario, I see a correct retrieval of the A record.
> 
> So we have to find out what is different in your case.
> 
>   -Otto
> 
> 
> On Thu, Jan 26, 2023 at 01:30:54PM +0100, Arien Vijn via Pdns-users wrote:
> 
>> Greetings,
>> 
>> We recently upgraded pdns_recursor from version 4.4.5 to 4.8.0. It seems 
>> that we have run into the following issue ever since.
>> 
>> [...]




Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Peter van Dijk via Pdns-users
Hi Arien,

On Thu, 2023-01-26 at 13:30 +0100, Arien Vijn via Pdns-users wrote:
> Greetings,
> 
> We recently upgraded pdns_recursor from version 4.4.5 to 4.8.0. It seems that 
> we have run into the following issue ever since.
> 
> 1/ Client queries for an A-record for xdsl-serviceweb.kpn.com.
> 2/ Recursor queries the domain tree and receives the CNAME-record that points 
> to: xdsl-c-serviceweb.gslb.kpn.com. from the authoritative DNS server.
> 3/ Recursor queries and receives the subsequent A-record from the 
> authoritative DNS server for that A-record.
> 4/ Recursor answers the client mentioned in 1/.
> 
> So far so good, until the A-record of xdsl-c-serviceweb.gslb.kpn.com. expires 
> out of the 'main record cache' but not from the 'main packet cache'. The 
> CNAME remains in both caches. Please note this excerpt from: rec_control 
> dump-cache below:

After some brief investigation we somewhat suspect this is aggressive
NSEC caching. Can you see if aggressive-nsec-cache-size=0 makes the
problem go away?
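
To confirm that the setting is present in the file the recursor reads, a small check can help. A minimal Python sketch, assuming the key=value recursor.conf syntax used in this thread and '#' as comment marker; the function names are illustrative:

```python
def parse_settings(text):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def aggressive_cache_disabled(text):
    """True when aggressive-nsec-cache-size is explicitly set to 0."""
    return parse_settings(text).get("aggressive-nsec-cache-size") == "0"


sample = """
dnssec=validate
aggressive-nsec-cache-size=0
"""
print(aggressive_cache_disabled(sample))
```

The recursor has to be restarted (or its configuration reloaded, where supported) before a changed file has any effect.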

Kind regards,
-- 
Peter van Dijk
PowerDNS.COM BV - https://www.powerdns.com/



Re: [Pdns-users] pdns_recursor issue

2023-01-26 Thread Otto Moerbeek via Pdns-users
Hi,

Please show your configuration.

I do not think your analysis is to the point.
If I repeat a scenario, I see a correct retrieval of the A record.

So we have to find out what is different in your case.

-Otto


On Thu, Jan 26, 2023 at 01:30:54PM +0100, Arien Vijn via Pdns-users wrote:

> Greetings,
> 
> We recently upgraded pdns_recursor from version 4.4.5 to 4.8.0. It seems that 
> we have run into the following issue ever since.
> 
> [...]



[Pdns-users] pdns_recursor issue

2023-01-26 Thread Arien Vijn via Pdns-users
Greetings,

We recently upgraded pdns_recursor from version 4.4.5 to 4.8.0. It seems that 
we have run into the following issue ever since.

1/ Client queries for an A-record for xdsl-serviceweb.kpn.com.
2/ Recursor queries the domain tree and receives the CNAME-record that points 
to: xdsl-c-serviceweb.gslb.kpn.com. from the authoritative DNS server.
3/ Recursor queries and receives the subsequent A-record from the 
authoritative DNS server for that A-record.
4/ Recursor answers the client mentioned in 1/.

So far so good, until the A-record of xdsl-c-serviceweb.gslb.kpn.com. expires 
out of the 'main record cache' but not from the 'main packet cache'. The CNAME 
remains in both caches. Please note this excerpt from: rec_control dump-cache 
below:

   ; main record cache dump follows
   ;
   xdsl-serviceweb.kpn.com. 300 -224 IN CNAME xdsl-c-serviceweb.gslb.kpn.com. ; (Secure) auth=1 zone=kpn.com from=194.151.228.10 nm= rtag= ss=0
   ; negcache dump follows

   [...]

   ; main packet cache dump from thread follows
   ;
   xdsl-c-serviceweb.gslb.kpn.com. -1803 A  ; tag 0 udp

   [...]

   ; main packet cache dump from thread follows
   ;
   xdsl-serviceweb.kpn.com. -470 A  ; tag 0 udp
   xdsl-serviceweb.kpn.com. 111 A  ; tag 0 udp
   xdsl-serviceweb.kpn.com. 111   ; tag 0 udp
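
Entries like these can be picked out of a dump mechanically. A minimal Python sketch, assuming packet cache lines have the shape "name ttl type ; tag ..." as in the excerpt above and that a negative number means the entry is past its expiry; both assumptions are read off this dump, not taken from the documentation:

```python
def expired_packet_entries(dump_text):
    """Return (name, ttl, qtype) for packet cache lines with a negative TTL."""
    expired = []
    for raw in dump_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment/section header
        fields = line.split(";", 1)[0].split()
        # Packet cache lines carry a name, a TTL and (usually) a type;
        # record cache lines have more fields and are skipped here.
        if len(fields) not in (2, 3):
            continue
        try:
            ttl = int(fields[1])
        except ValueError:
            continue
        if ttl < 0:
            qtype = fields[2] if len(fields) == 3 else ""
            expired.append((fields[0], ttl, qtype))
    return expired


sample = """
   ; main packet cache dump from thread follows
   ;
   xdsl-c-serviceweb.gslb.kpn.com. -1803 A  ; tag 0 udp
   xdsl-serviceweb.kpn.com. 111 A  ; tag 0 udp
"""
for entry in expired_packet_entries(sample):
    print(entry)
```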


From that point on, pdns_recursor replies to queries for the A-record with the 
SOA-record of the domain of the said A-record:

   ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> xdsl-c-serviceweb.gslb.kpn.com. @localhost
   ;; global options: +cmd
   ;; Got answer:
   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36347
   ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

   ;; OPT PSEUDOSECTION:
   ; EDNS: version: 0, flags:; udp: 512
   ;; QUESTION SECTION:
   ;xdsl-c-serviceweb.gslb.kpn.com. IN  A

   ;; AUTHORITY SECTION:
   gslb.kpn.com.   79407   IN  SOA ns2gslb.kpn.com. netmaster.gslb.kpn.com. 2023011702 10800 3600 604800 86400

   ;; Query time: 0 msec
   ;; SERVER: ::1#53(::1)
   ;; WHEN: Thu Jan 26 12:10:13 CET 2023
   ;; MSG SIZE  rcvd: 113
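
A response with status NOERROR, an empty answer section and a SOA record in the authority section is a NODATA answer: the name exists, but not with the requested type. A minimal Python sketch of that classification; the function and its arguments are illustrative and not part of any DNS library:

```python
def classify(status, answer_count, authority_types):
    """Classify a DNS response from its header status and section contents."""
    if status == "NXDOMAIN":
        return "NXDOMAIN"  # the name does not exist at all
    if status == "NOERROR" and answer_count > 0:
        return "ANSWER"
    if status == "NOERROR" and answer_count == 0 and "SOA" in authority_types:
        return "NODATA"    # the name exists, the requested type does not
    return "OTHER"         # e.g. referral or server failure


# Values taken from the dig output above: NOERROR, ANSWER: 0, SOA in AUTHORITY.
print(classify("NOERROR", 0, ["SOA"]))
```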


This situation causes actual people to complain and is being resolved by 
removing the domain tree for the subdomain gslb.kpn.com. from the caches. 
From then on the story starts again.

That the A-record xdsl-c-serviceweb.gslb.kpn.com. remains in the packet cache 
does not seem good to me, but I don't know enough about DNS and pdns_recursor 
to be sure. What could trigger this behaviour, or is it perhaps a configuration 
issue because we made such a large jump in versions when we upgraded? Last but 
not least, we see the same behaviour with at least one other hostname.

-- Ariën


