Re: [DNSOP] scanning doesn't scale, draft-thomassen-dnsop-generalized-dns-notify and draft-ietf-dnsop-dnssec-bootstrapping

Peter Thomassen Mon, 16 Oct 2023 10:43:03 -0700

John,

On 10/16/23 18:19, John R Levine wrote:

On Mon, 16 Oct 2023, Peter Thomassen wrote:

3. the parent obtains a copy of a signaling zone and walks the signaling 
records published there (at _signal.$NS, such as _signal.jo.ns.cloudflare.com),


If you think about it for a moment,


I did :-)

#3 doesn't work very well, since every parent needs to scan all the signalling 
zones for all the nameservers they might be using.


For bystanders' context, we're discussing Section 4.3 of 
draft-ietf-dnsop-dnssec-bootstrapping.

No parent needs to scan anything. They might want to do so, if they think it's 
feasible for them, or they can find other solutions that are better for them, 
such as outlined below. (Or they might of course not do DS automation at all.)

A DNS operator and a parent may make an arrangement and agree on some method 
for how the parent can learn about the set of domains under the parent that 
have new signaling records. This method may be that the parent AXFR's the 
operator's signaling zone, which is but one of many options. Such an approach 
seems reasonable for large DNS operators which have many domains changing every 
day; a parent certainly wouldn't make such arrangements with every DNS operator 
in the world.

We're getting into a quite interesting topic space here, but it is entirely out 
of scope for this document. The document is concerned with how the parent can 
authenticate CDS/CDNSKEY records *after learning about their existence*; it 
does not deal with how to learn about their existence (beyond the very 
high-level, illustrative list given in Section 4.3).

Cloudflare hosts at least a million zones which I'm sure are in every public 
TLD, so that means we have a thousand TLDs and/or a thousand registrars all 
scanning Cloudflare's signalling zone which is not going to be small.  How's 
that going to scale?  For that matter, how's Cloudflare going to give the TSIG 
or whatever to the thousand scanners to let them do it?


There are a bunch of very specific assumptions in here, "to make it not work". 
Consider the following:

- DNS operators like Cloudflare could provide a copy of the contents under, 
say, se._signal.jo.ns.cloudflare.com to the .se registry. If done via a 
separate zone, there will be no pollution from other TLDs. [*]

- Alternatively, DNS operators can provide TLD-specific catalog zones, or 
CSV-like feeds, even blockchain. Actually, the method of obtaining the list 
doesn't matter for the argument in Section 4.3.

- Someone like Cloudflare could allowlist the registry's IP addresses, and 
nobody needs to set up TSIG.

- The parties can agree to restrict this to only keep entries younger than a 
certain time period (e.g. 1 day + buffer), and in turn process it regularly. 
This makes a very well-tailored list of bootstrapping requests.

- It is conceivable that the processing be done by the registry, not the 
registrar (as is the case for today's deployments), so there is only one 
ingesting party that needs to be allowlisted (per parent).

- Alternatively, if the registrar does the processing, some registrar endpoint 
signaling is needed (as is pondered in the NOTIFY draft). But surely this is 
not in scope of this document.
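
To illustrate the TLD-specific and age-limited feed ideas from the list above, here is a minimal sketch. It is not taken from either draft: the entry format (child name plus publication time), the `tld_feed` helper, and the six-hour buffer are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "1 day + buffer" is modelled as 30 hours; the buffer size is arbitrary.
MAX_AGE = timedelta(days=1, hours=6)


def tld_feed(entries, tld, now, max_age=MAX_AGE):
    """Return the child domains under `tld` whose entries are younger than `max_age`.

    `entries` is a hypothetical iterable of (child_domain, published_at) pairs,
    as an operator might keep them for its own bookkeeping.
    """
    suffix = "." + tld.strip(".").lower()
    cutoff = now - max_age
    return sorted(
        name.rstrip(".").lower()
        for name, published in entries
        if name.rstrip(".").lower().endswith(suffix) and published >= cutoff
    )


now = datetime(2023, 10, 16, 12, 0, tzinfo=timezone.utc)
entries = [
    ("example.se.", now - timedelta(hours=3)),  # recent, under .se: kept
    ("old.se", now - timedelta(days=3)),        # too old: dropped
    ("example.ch", now - timedelta(hours=1)),   # other TLD: dropped
]
print(tld_feed(entries, "se", now))
```

How such a list reaches the parent (zone transfer, catalog zone, CSV-like feed) is left open here, as it is in the bullets above.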

The main point here is that *none* of this has anything to do with what the 
parent does *after* the trigger (whichever one) has fired -- which is what the 
document describes.


Why talk about triggers at all in Section 4.3? -- The parent needs to consider 
that, depending on the trigger mechanism, the trigger context will or will not 
convey reliable knowledge of the delegation's NS RRset, so for some trigger 
types it will have to match against its local delegation database before 
proceeding. Pointing this out is relevant for the authentication process, and 
is the purpose of Section 4.3.
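
The matching step can be sketched roughly as follows. This is an illustration under stated assumptions, not text from the draft: `delegation_db` is a hypothetical mapping from child name to the NS hostnames in the parent's delegation, `trigger_ns` stands for whatever NS hostnames the trigger context happened to mention, and giving up on a mismatch is a choice made for this sketch.

```python
def normalize(name):
    """Lower-case a DNS name and drop the trailing dot for comparison."""
    return name.rstrip(".").lower()


def nameservers_to_check(child, delegation_db, trigger_ns=None):
    """Return the NS hostnames whose signaling domains should be queried.

    The local delegation data is treated as the source of truth. NS hostnames
    from the trigger context are only used as a consistency check: if any of
    them is not part of the delegation, an empty set is returned and no
    bootstrapping attempt is made (an assumption of this sketch).
    """
    delegated = {normalize(ns) for ns in delegation_db.get(normalize(child), ())}
    if trigger_ns is None:
        # The trigger carried no NS information; rely on the delegation alone.
        return delegated
    hinted = {normalize(ns) for ns in trigger_ns}
    if hinted and hinted <= delegated:
        return delegated
    return set()


# Hypothetical delegation data; keys are stored in normalized form.
db = {"example.se": ["ns1.example.net.", "ns2.example.org."]}
print(nameservers_to_check("example.se", db, ["ns1.example.net"]))
print(nameservers_to_check("example.se", db, ["ns.attacker.example"]))
```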


We've digressed quite a bit from your criticism, which was that this document 
"says to scan for DS bootstrap", which it clearly does not. I hope the 
cognitive dissonance has been cleared up :-)

As for which triggers are appropriate, this indeed seems to be a gap where 
additional standards development could help, and I share some 
(not all) of your opinions. Let's put it into the NOTIFY draft, or a more 
general one that discusses other triggers as well (e.g. the feed-like mechanism 
hinted at above, or something based on catalog zones). But please let's not do 
it here.

2. the parent chooses to do a scan,


This is no better.  For CDS scanning the scan is a single query to a single 
known server only for zones that are already signed.  But for this, it has to 
scan all the unsigned zones, and since the NS can be anywhere, each scan is a 
full recursive lookup.  That's a lot of traffic to a lot of servers all over 
the net.


This does not make sense to me: for a conventional CDS lookup, you have to 
query the child's nameserver(s), whose hostnames you need to resolve before 
you can query them. The signaling records live under subdomains of these 
hostnames, so they add little in terms of resolution overhead (unless you 
construct a case where the signaling zone is delegated to random other NS -- 
but why would that be the case). -- So, commonly, most of the resolution path 
will already be cached from the resolution performed for the apex CDS query.
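
To make the shared resolution path concrete, here is a small sketch of the names involved. It assumes the signaling name layout discussed in this thread (the `_dsboot` prefix from the footnote, then the child name, under `_signal.$NS`); `example.se` and the helper names are hypothetical.

```python
def signaling_name(child, ns_host):
    """Build the signaling name for `child` under the nameserver `ns_host`."""
    child = child.rstrip(".").lower()
    ns_host = ns_host.rstrip(".").lower()
    return f"_dsboot.{child}._signal.{ns_host}."


def query_plan(child, ns_hosts):
    """List (qname, qtype) pairs for the apex lookups and the signaling lookups."""
    apex = child.rstrip(".").lower() + "."
    plan = [(apex, qtype) for qtype in ("CDS", "CDNSKEY")]
    for ns in ns_hosts:
        for qtype in ("CDS", "CDNSKEY"):
            plan.append((signaling_name(child, ns), qtype))
    return plan


for qname, qtype in query_plan("example.se", ["jo.ns.cloudflare.com"]):
    print(qtype, qname)
```

Every signaling name ends in the nameserver hostname that had to be resolved anyway for the apex query, which is why the extra lookups mostly hit paths that are already cached.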

I suggest adjusting the bootstrap draft saying to send NOTIFY(DS) to
the parent of a delegated name to tell it to do the bootstrap rather
than scanning.


The draft is not at all concerned about how the child triggers the parent, but 
only with what the parent does *once the parent has determined* that it is 
going to attempt DS initialization.


Right.  We need to fix that gap now, when we can do it easily by updating the 
drafts in progress.  It's totally normal to have two drafts progressing that 
depend on each other.


The dependency is artificial. People already do have CDS/CDNSKEY automation for 
some TLDs, and .ch/.li & Cloudflare (amongst others) have implemented this 
authentication protocol on the parent and child side, respectively.

This protocol does not care what trigger mechanism .ch and Cloudflare are using 
between each other. That's not to say that it shouldn't be standardized; it 
just shows that the claimed dependency does not exist.

It also neither introduces nor endorses scanning. What it does is fix an 
authentication problem for parties processing CDS/CDNSKEY records. I'm not sure 
why this fix should wait for anything.

As can be seen in this message, the matter of triggers is complicated, and it 
might take quite some time to arrive at a good system that has consensus. Let's 
have this important discussion, but let's not hold up things that stand on 
their own and have no technical dependency -- even more so, as there's running 
code, and also code running.

I wouldn't forbid the other approaches but I would note that they are not 
likely to scale well to large zones like popular TLDs.


A note in that direction seems reasonable. How about adding the following 
paragraph after the bullet list in Section 4.3:

NEW
   The remainder of this section is concerned with how triggers may
   differ with respect to the context they provide, and in particular
   whether it is safe to infer a Child's set of Signaling Domains from
   the trigger context.  The triggers listed above are intended for
   illustration, and this specification does not recommend any
   particular one.  It is noted that concerns have been expressed
   regarding the scalability of parent-side scans.

Best,
Peter

[*]: This was discussed last year when the structure of the signaling names was 
figured out. It was noted that when se._signal.$NS is a separate zone, then it 
can't be a signaling name at the same time (because the semantics of the 
CDS/CDNSKEY record there would be ambiguous). That's why the _dsboot prefix was 
added.

--
https://desec.io/

_______________________________________________
DNSOP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dnsop
