Re: [dns-operations] Cache efficiency (was: Re: DNS .com/.net resolution problems in the Asia/Pacific region)

2023-07-20 Thread Paul Vixie via dns-operations
--- Begin Message ---



Robert Edmonds wrote on 2023-07-20 14:50:

Mark Andrews wrote:

...


Yes, there are lookups that can take a long time to perform with a cold
cache. By putting lots of users behind large, centralized caches we can
insulate users from a lot of cold cache lookups, but these centralized
resolvers then become concentrated points of failure, convenient
monitoring points, etc.


i have also found the cache hit performance to be notably worse when 
talking to an off-campus recursive than when talking on-campus. it is 
not only the cache miss performance (as driven by population size) which 
should concern us. 10ms vs 1ms doesn't matter for windowing workflows or 
for human perception but it adds up in ask-and-wait workflows not driven 
by human tentacles and optics.



Personally, I'd like to see the "full resolver"
role be re-distributed and move out as close as possible to the
endpoints, given that the original justification for the stub
resolver/full resolver split was a lack of resources at the endpoints --
in the 1980's. But if you have full resolvers running on individual
endpoints, or on network elements that serve individual households, etc.
you're much more likely to run into cold cache lookups and it would be
nice to be able to accelerate or avoid those cold lookups.


those are unpleasant extremes. can we consider a middle ground, like 
campus-level and isp-level name servers as were common before the 
opendns era, as having population sizes that led to good cache reuse? 
notably, another advantage from that rdns granularity is that the cache 
miss shares connectivity with subsequent transactions, obviating some of 
the complexities like EDNS client subnet which only came into existence 
because of the global anycast trend (quad this, quad that, quad etc.)



Here are some random ideas for improving the efficiency of cold or
lukewarm caches.

1. Cache occlusion rather than replacement of outranked data. The DNS
protocol reuses the same record type (NS) for both the non-authoritative
delegation nameserver record set served by the parent zone as well as
the authoritative nameserver record set served by the child zone. RFC
2181 § 5.4.1 says that resolvers "should replace" the data from the
parent zone when they receive authoritative data from the child zone,
but the parent zone often has a much longer TTL on the records that it
serves. (E.g., the twitter.com data from the .com zone has a 2-day TTL,
while the twitter.com NS record set from the twitter.com zone has a <4
hour TTL.)

If resolver caches were able to retain the longer-lived NS records from
the parent and "occlude" them when a shorter-lived NS record from the
child is cached, then utilize them again when they become unoccluded
upon the expiration of the child NS record, it would avoid sending
unnecessary queries to the parent. It would also be arguably more
compatible with the lower case "should replace" text in RFC 2181 § 5.4.1
than a "parent-centric" resolver implementation.


draft-ietf-dnsop-ns-revalidation could be tuned to make that happen.


2. Persist some or all of the resolver's cached NS and nameserver
address records to disk. These are typically long-lived records and I'd
gladly trade a few tens of MB of disk space in exchange for better P99+
resolution latency after a restart. Perhaps this could also include the
RTTs, EDNS capabilities, etc. that is sometimes called the
"infrastructure cache".


while i don't think i have any "disks" left, i agree with what you mean. 
we had cache dump/restore on shutdown/startup in bind4 but pulled it out 
in bind8 as a complexity vs. utility tradeoff.



3. You mention CNAME chains, but NS delegations are another source of
indirection that may require additional upstream lookups, especially if
the nameserver names are in several different TLDs (as a reliability
hedge?). There are a couple of things that could be done here:

a) Delegations within the same organization often reflect internal
organizational boundaries. One team may want to give control over part
of the namespace to another team, without handing over write permissions
for the whole zone, so the typical solution is to carve out a child zone
for the other team, and host that zone on the same provider as the
parent zone. If the cloud-based DNS providers that many organizations
use offered a more granular, less than whole zone permissions model, it
would cut down on the number of child zones that are created solely to
reflect intra-organizational boundaries.


i'd hate to see us adopt a cloud-centric model. whatever we do to 
improve NS-chain performance -- and i think your first two suggestions 
would do this -- should also benefit the normal delegation, notify, and 
transfer system.



b) Make nameserver address indirection *optional* without requiring a
backwards-incompatible protocol change.


*cough*.

--
P Vixie

--- End Message ---
___
dns-operations mailing list

[dns-operations] Cache efficiency (was: Re: DNS .com/.net resolution problems in the Asia/Pacific region)

2023-07-20 Thread Robert Edmonds
Mark Andrews wrote:
> Lookups take enormous numbers of queries these days.  A support customer
> was asking why a lookup wasn’t completing within 3 seconds.  The resolution
> process took 48 queries with a cold cache.  It involved several CDNs and
> required fetching nameserver addresses in several different TLDs.  There
> were no retries in that count.
>
> CNAME chains are expensive but we have a whole industry that has fallen in
> love with them.
>
> Yes, we do have query limits but they need to be large to handle this sort
> of stuff.

Yes, there are lookups that can take a long time to perform with a cold
cache. By putting lots of users behind large, centralized caches we can
insulate users from a lot of cold cache lookups, but these centralized
resolvers then become concentrated points of failure, convenient
monitoring points, etc. Personally, I'd like to see the "full resolver"
role be re-distributed and move out as close as possible to the
endpoints, given that the original justification for the stub
resolver/full resolver split was a lack of resources at the endpoints --
in the 1980's. But if you have full resolvers running on individual
endpoints, or on network elements that serve individual households, etc.
you're much more likely to run into cold cache lookups and it would be
nice to be able to accelerate or avoid those cold lookups.

Here are some random ideas for improving the efficiency of cold or
lukewarm caches.

1. Cache occlusion rather than replacement of outranked data. The DNS
protocol reuses the same record type (NS) for both the non-authoritative
delegation nameserver record set served by the parent zone as well as
the authoritative nameserver record set served by the child zone. RFC
2181 § 5.4.1 says that resolvers "should replace" the data from the
parent zone when they receive authoritative data from the child zone,
but the parent zone often has a much longer TTL on the records that it
serves. (E.g., the twitter.com data from the .com zone has a 2-day TTL,
while the twitter.com NS record set from the twitter.com zone has a <4
hour TTL.)

If resolver caches were able to retain the longer-lived NS records from
the parent and "occlude" them when a shorter-lived NS record from the
child is cached, then utilize them again when they become unoccluded
upon the expiration of the child NS record, it would avoid sending
unnecessary queries to the parent. It would also be arguably more
compatible with the lower case "should replace" text in RFC 2181 § 5.4.1
than a "parent-centric" resolver implementation.
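
A minimal sketch of the occlusion idea in Python, assuming a resolver
that tracks parent-side and child-side NS sets separately. The class and
method names (OccludingNSCache, put, get) are hypothetical and do not
come from any existing resolver; RFC 2181 ranking is reduced here to
"child outranks parent".

```python
import time
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class NSEntry:
    names: Tuple[str, ...]  # nameserver host names in the NS RRset
    expires: float          # absolute expiry time, in seconds


class OccludingNSCache:
    """Hypothetical cache that keeps parent and child NS sets side by side."""

    def __init__(self) -> None:
        self._parent: Dict[str, NSEntry] = {}
        self._child: Dict[str, NSEntry] = {}

    def put(self, zone: str, names: Iterable[str], ttl: int,
            from_child: bool, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # Child data never overwrites parent data; it is stored next to it.
        table = self._child if from_child else self._parent
        table[zone] = NSEntry(tuple(names), now + ttl)

    def get(self, zone: str,
            now: Optional[float] = None) -> Optional[Tuple[str, ...]]:
        now = time.time() if now is None else now
        # Prefer the child set while it is live. Once it expires, the
        # parent set becomes visible again ("unoccluded").
        for table in (self._child, self._parent):
            entry = table.get(zone)
            if entry is not None:
                if entry.expires > now:
                    return entry.names
                del table[zone]
        return None
```

With a 2-day parent TTL and a 4-hour child TTL, get() returns the child
set for the first four hours and then falls back to the parent set
without a new query to the parent.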

2. Persist some or all of the resolver's cached NS and nameserver
address records to disk. These are typically long-lived records and I'd
gladly trade a few tens of MB of disk space in exchange for better P99+
resolution latency after a restart. Perhaps this could also include the
RTTs, EDNS capabilities, etc. that is sometimes called the
"infrastructure cache".

Compare to modern web browsers which allow websites to store an enormous
amount of data on every user's disk. (If you use Chrome, check
chrome://settings/content/all and sort by "Data stored". According to
[0], Chrome apparently believes that individual web origins are entitled
to use "up to 60%" of your disk space.) A tiny fraction of that disk
space could store a very large amount of the most frequently used DNS
infrastructure records.

[0] https://web.dev/storage-for-the-web/#how-much

I believe some resolver implementations e.g. Knot Resolver already store
their entire cache on disk in an LMDB database.
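
A rough sketch of what persisting infrastructure records across a
restart could look like, assuming records are kept as plain dicts with
an absolute "expires" timestamp. The function names and the JSON file
layout are illustrative assumptions, not how Knot Resolver or any other
implementation stores its cache.

```python
import json
import os
import tempfile
import time
from typing import List, Optional


def save_records(path: str, records: List[dict]) -> None:
    """Write records to a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(records, fh)
    # os.replace swaps the file in one step, so a crash mid-write
    # leaves the previous dump intact.
    os.replace(tmp, path)


def load_records(path: str, now: Optional[float] = None) -> List[dict]:
    """Load a previous dump, dropping anything that expired while down."""
    now = time.time() if now is None else now
    try:
        with open(path) as fh:
            records = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return []  # no usable dump: start with a cold cache
    return [r for r in records if r["expires"] > now]
```

Storing absolute expiry times rather than remaining TTLs means the time
spent shut down is counted against each record when it is reloaded.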

3. You mention CNAME chains, but NS delegations are another source of
indirection that may require additional upstream lookups, especially if
the nameserver names are in several different TLDs (as a reliability
hedge?). There are a couple of things that could be done here:

a) Delegations within the same organization often reflect internal
organizational boundaries. One team may want to give control over part
of the namespace to another team, without handing over write permissions
for the whole zone, so the typical solution is to carve out a child zone
for the other team, and host that zone on the same provider as the
parent zone. If the cloud-based DNS providers that many organizations
use offered a more granular, less than whole zone permissions model, it
would cut down on the number of child zones that are created solely to
reflect intra-organizational boundaries.

b) Make nameserver address indirection *optional* without requiring a
backwards-incompatible protocol change.

One could stand up "stunt" nameservers that return A or AAAA records for
an IP address embedded in the QNAME, e.g.:

;; QUESTION SECTION:
;198.51.100.1.ipv4-literal.example. IN A

;; ANSWER SECTION:
198.51.100.1.ipv4-literal.example. 86400 IN A 198.51.100.1

and

;; QUESTION SECTION:
;2001:db8::1.ipv6-literal.example. IN AAAA

;; AUTHORITY SECTION:
2001:db8::1.ipv6-literal.example. 86400 IN AAAA 2001:db8::1
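
The answer synthesis such a "stunt" server would perform can be sketched
in Python as follows. This assumes the two suffixes and the 86400-second
TTL from the examples above; the function name synthesize and the tuple
it returns are hypothetical, and a real server would also need to handle
wire-format encoding and name escaping.

```python
import ipaddress
from typing import Optional, Tuple

# Suffixes taken from the examples above; "example." is a placeholder.
V4_SUFFIX = ".ipv4-literal.example."
V6_SUFFIX = ".ipv6-literal.example."
TTL = 86400


def synthesize(qname: str, qtype: str) -> Optional[Tuple[str, int, str, str]]:
    """Return (owner, ttl, type, rdata) for a literal QNAME, else None."""
    name = qname.lower()
    if not name.endswith("."):
        name += "."
    try:
        if qtype == "A" and name.endswith(V4_SUFFIX):
            addr = ipaddress.IPv4Address(name[: -len(V4_SUFFIX)])
        elif qtype == "AAAA" and name.endswith(V6_SUFFIX):
            addr = ipaddress.IPv6Address(name[: -len(V6_SUFFIX)])
        else:
            return None
    except ipaddress.AddressValueError:
        # The part before the suffix is not a valid address literal.
        return None
    return (name, TTL, qtype, str(addr))
```

The answer depends only on the query name, so the server needs no zone
data and a resolver could in principle compute the same address locally.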

Then, 

Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Mark Andrews
On a similar issue, why aren’t the root servers all implementing DNS COOKIES as 
it provides clients protection from spoofed referrals?

-- 
Mark Andrews

> On 21 Jul 2023, at 03:16, David Conrad wrote:
>
> Hi,
>
>> On Jul 20, 2023, at 7:29 AM, Viktor Dukhovni wrote:
>> Finally, for the RSSAC (yes, not the right forum to formally lodge the
>> question), should the root zone DS TTL still be 1 day?  Would a change
>> to one hour be acceptable (aligning it with the practice of many
>> TLDs and aiding in more timely recovery from mistakes)?
>
>
> Haven’t thought about the implications enough to comment on the idea;
> however, instead of RSSAC, this sounds to me like a question for RZERC to
> (eventually) weigh in on. In the Byzantine world of ICANN, it would need
> to be brought to RZERC by "any of [RZERC’s] members, PTI staff, or by the
> Customer Standing Committee (CSC)", many of which are on this mailing list.
> 
> Regards,
> -drc
> 
> ___
> dns-operations mailing list
> dns-operations@lists.dns-oarc.net
> https://lists.dns-oarc.net/mailman/listinfo/dns-operations


___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations


Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread David Conrad
Hi,

On Jul 20, 2023, at 7:29 AM, Viktor Dukhovni wrote:
> Finally, for the RSSAC (yes, not the right forum to formally lodge the
> question), should the root zone DS TTL still be 1 day?  Would a change
> to one hour be acceptable (aligning it with the practice of many
> TLDs and aiding in more timely recovery from mistakes)?


Haven’t thought about the implications enough to comment on the idea;
however, instead of RSSAC, this sounds to me like a question for RZERC to
(eventually) weigh in on. In the Byzantine world of ICANN, it would need
to be brought to RZERC by "any of [RZERC’s] members, PTI staff, or by the
Customer Standing Committee (CSC)", many of which are on this mailing list.

Regards,
-drc





Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Viktor Dukhovni
On Thu, Jul 20, 2023 at 07:25:17AM -0400, Hugo Salgado wrote:

> They are aware and working on this. Thanks!

The final working state is still somewhat suboptimal:

- The KSKs are 4096-bit RSA.  This is pointless: the DS RRset from
  the root is signed with a 2048-bit RSA key.  The additional bits
  are just packet size and computational bloat.

- The ZSK need not (and so in practice should not) also sign the DNSKEY
  RRset; the KSK signatures alone are sufficient.
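
To put rough numbers on the packet-size point, here is a small Python
calculation of RDATA sizes for RSA keys and signatures. It assumes the
common public exponent 65537 (3 octets), which has not been checked
against the actual .ve keys; the function names are made up for this
sketch. The fixed field sizes follow the DNSKEY and RRSIG RDATA layouts
in RFC 4034 and the RSA key encoding in RFC 3110.

```python
def wire_name_bytes(name: str) -> int:
    """Uncompressed wire length of a domain name."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(len(label) + 1 for label in labels) + 1  # +1 for the root


def rsa_signature_bytes(modulus_bits: int) -> int:
    """An RSA signature is as long as the modulus."""
    return (modulus_bits + 7) // 8


def rsa_dnskey_rdata_bytes(modulus_bits: int, exponent_bytes: int = 3) -> int:
    # flags (2) + protocol (1) + algorithm (1), then a one-octet exponent
    # length, the exponent, and the modulus.
    return 4 + 1 + exponent_bytes + rsa_signature_bytes(modulus_bits)


def rrsig_rdata_bytes(modulus_bits: int, signer: str) -> int:
    # 18 octets of fixed fields, then the signer name and the signature.
    return 18 + wire_name_bytes(signer) + rsa_signature_bytes(modulus_bits)


for bits in (2048, 4096):
    print(bits, rsa_dnskey_rdata_bytes(bits), rrsig_rdata_bytes(bits, "ve."))
```

Each 4096-bit key or signature carries 256 more octets than its 2048-bit
counterpart, and a ZSK signature over the DNSKEY RRset adds one more
RRSIG on top of that.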

Finally, for the RSSAC (yes, not the right forum to formally lodge the
question), should the root zone DS TTL still be 1 day?  Would a change
to one hour be acceptable (aligning it with the practice of many
TLDs and aiding in more timely recovery from mistakes)?

-- 
Viktor.


Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Stephane Bortzmeyer
On Thu, Jul 20, 2023 at 07:25:17AM -0400,
 Hugo Salgado  wrote 
 a message of 148 lines which said:

> They are aware and working on this. Thanks!

It works now.

$ dig NS ve

; <<>> DiG 9.18.14 <<>> NS ve
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40942
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;ve.                    IN      NS

;; ANSWER SECTION:
ve. 18000   IN  NS  ns3.nic.ve.
ve. 18000   IN  NS  ns4.nic.ve.
ve. 18000   IN  NS  a.lactld.org.
ve. 18000   IN  NS  ns5.nic.ve.
ve. 18000   IN  NS  ssdns-tld.nic.cl.
ve. 18000   IN  NS  ns6.nic.ve.

;; Query time: 780 msec
;; SERVER: ::1#53(::1) (UDP)
;; WHEN: Thu Jul 20 12:54:31 UTC 2023
;; MSG SIZE  rcvd: 163


https://dnsviz.net/d/ve/ZLknmA/dnssec/


Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Hugo Salgado
They are aware and working on this. Thanks!

Hugo


On July 20, 2023 3:40:06 AM GMT-04:00, Stephane Bortzmeyer wrote:
>On Thu, Jul 20, 2023 at 09:37:10AM +0200,
> Stephane Bortzmeyer  wrote 
> a message of 6 lines which said:
>
>> https://dnsviz.net/d/ve/ZLjinw/dnssec/
>> 
>> The DS goes to a key which does not sign (and there is no DS for the
>> key which is actually signing.)
>
>Any contact not in .ve to tell them? My email server uses a validating
>resolver :-(


Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Yasuhiro Orange Morishita / 森下泰宏
It looks like one of the USGBKR cases...
cf. https://lists.dns-oarc.net/pipermail/dns-operations/2014-March/011399.html

Before: https://dnsviz.net/d/ve/ZLZ8ng/dnssec/
After: https://dnsviz.net/d/ve/ZLjinw/dnssec/

-- Yasuhiro Orange Morishita

From: Stephane Bortzmeyer 
Subject: [dns-operations] [DNSSEC] Venezuela ccTLD broken
Date: Thu, 20 Jul 2023 09:37:10 +0200

> https://dnsviz.net/d/ve/ZLjinw/dnssec/
> 
> The DS goes to a key which does not sign (and there is no DS for the
> key which is actually signing.)


Re: [dns-operations] [Ext] Re: [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Benjamin Farine
Hi Stephane,

I just sent them (nic.ve) an email from a non-validating resolver.
I hope they'll be able to check their email.

-- 
Benjamin Farine 




On 20/07/2023, 09:47, "dns-operations on behalf of Stephane Bortzmeyer"
<dns-operations-boun...@dns-oarc.net on behalf of bortzme...@nic.fr> wrote:


On Thu, Jul 20, 2023 at 09:37:10AM +0200,
Stephane Bortzmeyer <bortzme...@nic.fr> wrote
a message of 6 lines which said:


> https://dnsviz.net/d/ve/ZLjinw/dnssec/
> 
> The DS goes to a key which does not sign (and there is no DS for the
> key which is actually signing.)


Any contact not in .ve to tell them? My email server uses a validating
resolver :-(


Re: [dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Stephane Bortzmeyer
On Thu, Jul 20, 2023 at 09:37:10AM +0200,
 Stephane Bortzmeyer  wrote 
 a message of 6 lines which said:

> https://dnsviz.net/d/ve/ZLjinw/dnssec/
> 
> The DS goes to a key which does not sign (and there is no DS for the
> key which is actually signing.)

Any contact not in .ve to tell them? My email server uses a validating
resolver :-(


[dns-operations] [DNSSEC] Venezuela ccTLD broken

2023-07-20 Thread Stephane Bortzmeyer
https://dnsviz.net/d/ve/ZLjinw/dnssec/

The DS goes to a key which does not sign (and there is no DS for the
key which is actually signing.)
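
For readers who want to check this kind of breakage by hand, here is a
simplified Python sketch. key_tag follows the checksum in RFC 4034
Appendix B (which does not apply to algorithm 1). The two helper
functions are hypothetical and compare key tags only; a real validator
also matches the algorithm and verifies the DS digest against the
DNSKEY.

```python
from typing import Iterable, Set


def key_tag(dnskey_rdata: bytes) -> int:
    """Key tag over the DNSKEY RDATA, per RFC 4034 Appendix B."""
    acc = 0
    for i, octet in enumerate(dnskey_rdata):
        # Even-indexed octets are the high byte of each 16-bit word.
        acc += (octet << 8) if i % 2 == 0 else octet
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def dangling_ds_tags(ds_tags: Iterable[int],
                     signing_tags: Iterable[int]) -> Set[int]:
    """DS key tags for which no RRSIG over the DNSKEY RRset exists."""
    return set(ds_tags) - set(signing_tags)


def chain_is_broken(ds_tags: Iterable[int],
                    signing_tags: Iterable[int]) -> bool:
    """True when no key referenced by a DS signs the DNSKEY RRset."""
    return not (set(ds_tags) & set(signing_tags))
```

Here ds_tags would be the key tags listed in the parent's DS RRset and
signing_tags the key tags found in the RRSIGs covering the child's
DNSKEY RRset. The situation described above is the one where
chain_is_broken returns True.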

