Hi all,
On 27/03/2023 10:24, Stephane Bortzmeyer wrote:
> * Unbound implementation is not ready, but I let Yorgos elaborate on
> this point.
The Unbound implementation is far from ready but the hackathon time was
well spent to identify needed changes to Unbound to cleanly support
unilateral probing and to look closely at the draft.
I will continue with the development in the future and report back here
with the results. Some initial notes for that if you are interested:
- The feature is going to be off by default;
- When turned on, the default further probing configuration will be to
actively probe new servers in an attempt to ease testing;
- Retaining data across reset as per section 4.5 will not be included,
at least in the initial implementation.
Now on with my comments for the draft, sorry for the wall of text :)
## A - ALPN
In section 4.4 there is mention of ALPN for the resolver (a MUST if I
read it correctly) but there is no mention of ALPN for the authoritative
side in the document.
## B - Resolver source IP
Section 4.5.1 describes keeping state based on the resolver's own source
IP. This is to support the guidance from section 3.1 where it says:
    To avoid incurring additional minor timeouts for such a recursive
    resolver, the pool operator SHOULD either:
    * ensure that all members of the pool enable the same encrypted
      transport(s) within the span of a few seconds, or
    * ensure that the load balancer maps client requests to pool
      members based on client IP addresses.
My interpretation of this text is that the first bullet point is for
offering the same transport service with a slight hiccup during update,
whereas the second bullet point is for offering different transport
services on individual servers of the pool.
The worst case for the former is that the pool is going to be labeled as
supporting encryption at most 1 day (damping variable) later, based on
which servers are reached from the pool.
This looks fine to me and no extra state keeping (i.e., the resolver's
own source IP) is needed.
I find keeping extra state per resolver source IP for the latter case
particularly challenging, especially if the resolver is not configured
with explicit outgoing interfaces and so uses the default route. It then
needs to observe its own source address from the reply, and that address
may not be available next time around, giving bind()/send() errors and
introducing retry code paths.
All this while the measure does not guarantee to solve the
different-transport-service-behind-a-single-IP case as it depends
heavily on the network.
I understand that partial rollout is meant to test the waters for an
authoritative operator but I believe using a separate IP for enabling
DoT and/or DoQ for testing would make things simpler for both sides.
I don't have an operator's hat but is a pool with variable transport
services something that we actively want to support?
## C - Failure identification
There is mention in the draft about successful and unsuccessful DNS replies.
SERVFAIL is used as an example of an unsuccessful DNS reply.
Following the pseudocode in the draft, a SERVFAIL answer on all the
transports, which IMHO is an already usable DNS answer for the resolver,
will make the resolver wait for all the transport replies before
considering using the SERVFAIL as the final answer.
My opinion is that any RCODE in the reply is a successful DNS answer (of
course with matching ID, qname, etc).
Otherwise we introduce something like a per-transport health check: see
which transport replies "better" and use that.
I believe this aligns with Stephane's observation during the hackathon
about different answers on 53 and 853 and needs addressing in section 3
to clearly state that a nameserver's reply to a given query must be the
same regardless of the transport used (maybe not the best text if TC is
also to be considered but I hope I get my message across :)
Maybe also define an unsuccessful "reply" as timeout/connection shutdown
instead of non-preferable RCODEs? There is already logic in resolvers to
handle different RCODEs.
What I am trying to say is to not base the usability of the encrypted
transport on the DNS replies themselves. IMHO as long as there are DNS
replies there, the encrypted transport is usable and preferable.
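
To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only: the names `Outcome`, `Query`, `Reply` and `transport_usable` are hypothetical and come from neither the draft nor Unbound. It assumes a reply is "successful" for transport purposes when it matches the outstanding query (ID and qname), whatever its RCODE, and "unsuccessful" only on a timeout or connection shutdown.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Outcome(Enum):
    """What happened on the transport (hypothetical names)."""
    REPLY = auto()              # a DNS message came back
    TIMEOUT = auto()            # nothing came back in time
    CONNECTION_CLOSED = auto()  # the connection was shut down


@dataclass
class Query:
    msg_id: int
    qname: str


@dataclass
class Reply:
    msg_id: int
    qname: str
    rcode: int  # 0 = NOERROR, 2 = SERVFAIL, 5 = REFUSED, ...


def transport_usable(query: Query, outcome: Outcome,
                     reply: Optional[Reply]) -> bool:
    """True if the transport delivered a DNS reply matching the query.

    The RCODE is deliberately ignored: SERVFAIL or REFUSED still show
    that the encrypted transport itself works.
    """
    if outcome is not Outcome.REPLY or reply is None:
        return False
    # DNS names compare case-insensitively.
    return (reply.msg_id == query.msg_id
            and reply.qname.lower() == query.qname.lower())
```

With this rule a matching SERVFAIL over DoT counts in favour of the transport, while a timeout does not. What the resolver then does with the RCODE is left to its existing logic.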
## D - Wording nit
In sections 4.6.2 and 4.6.9 the following is said:
    If R is successful:
    - Return R to the requesting client
It may well be the case that the R is to an internal query and there is
no requesting client waiting for an answer. Would the following work better?
    If R is successful:
    - R is further processed by the resolver
## E - Possible bug
In sections 4.6.2 and 4.6.9 the following is said after receiving a
successful reply:
    - If Q is in N-queries[X]:
      - Remove Q from N-queries[X]
I believe this is a bug and needs to be removed since future, slower
replies from the N transport will not be allowed to update the relevant
metrics as section 4.6.9 will stop further processing by the following text:
    If Q is not in E-queries[X]:
    - Discard R and process it no further (do not respond to a encrypted
      response to a query that is not outstanding)
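
The effect can be shown with a deliberately simplified model in Python. This is not the draft's pseudocode: the `outstanding` and `successes` tables, the `send` and `on_reply` helpers and the `drop_from_others` flag are all hypothetical. The only assumption carried over from the draft is that a reply for a query that is no longer outstanding on its transport is discarded before anything else happens.

```python
from collections import defaultdict

# Outstanding query IDs per transport, and a per-transport success counter
# standing in for "the relevant metrics".
outstanding = defaultdict(set)
successes = defaultdict(int)


def send(query_id, transports):
    """Record the query as outstanding on every transport it was sent on."""
    for transport in transports:
        outstanding[transport].add(query_id)


def on_reply(query_id, transport, drop_from_others):
    """Handle a successful reply. Returns False if it was discarded."""
    if query_id not in outstanding[transport]:
        # Not outstanding: discarded before any metric is updated.
        return False
    outstanding[transport].discard(query_id)
    successes[transport] += 1
    if drop_from_others:
        # The step under discussion: forget the query on all other transports.
        for other in outstanding:
            outstanding[other].discard(query_id)
    return True


# With the removal step, the slower reply is discarded and never counted.
send(1, ["do53", "dot"])
on_reply(1, "do53", drop_from_others=True)
late = on_reply(1, "dot", drop_from_others=True)   # False

# Without it, the slower reply still updates its transport's counter.
send(2, ["do53", "dot"])
on_reply(2, "do53", drop_from_others=False)
kept = on_reply(2, "dot", drop_from_others=False)  # True
```

In this model the only difference between the two runs is the removal step, and only the second run lets the slower transport record its success.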
In general I support the idea of the draft but I believe we need to iron
out the expectations on both sides, also regarding Florian's recent
comments about per-zone answers and threat-intelligence systems' behavior.
Thanks for considering and best regards,
-- Yorgos
Some questions were raised about the draft, given the experience with
PowerDNS Recursor:
* If the ADoT server replies but the reply indicates an error,
such as SERVFAIL or REFUSED, should the resolver retry without
DoT? PowerDNS Recursor does it, but it seems it would make more
sense to accept the reply, and just to remind system
administrators that port 853 and 53 should deliver consistent
answers. The draft seems clear on the first point (as long as
there is a properly formatted DNS request, regard the server as
DoT-enabled) but not on the second (no clear reminder for
authoritative name servers).
* What should be the criteria to select an authoritative name
server to query? Should we prefer a fast insecure server or a slow
encrypted one? The draft does not mention it, because it is local
policy. (PowerDNS recursor has apparently no way to change its
default policy, which is to use the fastest one, DoT or
not.) The draft does not mandate such a knob in the authoritative
server, again, IETF typically does not tell endpoints how they have
to be configured.
_______________________________________________
dns-privacy mailing list
dns-privacy@ietf.org
https://www.ietf.org/mailman/listinfo/dns-privacy