Jeremie Courreges-Anglas([email protected]) on 2020.07.26 22:52:47 +0200:
> On Sat, Jul 25 2020, Sebastian Benoit <[email protected]> wrote:
> > Jeremie Courreges-Anglas([email protected]) on 2020.07.25 14:01:06 +0200:
> >> On Fri, Jul 17 2020, Jesper Wallin <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I recently decided to add SSHFP records for my servers, since I never
> >> > memorize or write down my key fingerprints. I learned that if I
> >> > want ssh(1) to trust these records, DNSSEC needs to be enabled for my
> >> > zone. To validate these records, ssh(1) is using getrrsetbyname(3),
> >> > which checks if the AD (Authentic Data) flag is set in the response.
> >> >
> >> > To get a response with the AD flag set, the request itself also needs
> >> > to have the AD flag set. It turns out that getrrsetbyname(3) doesn't
> >> > set this and will therefore never get a validated response, unless the
> >> > resolver is configured to always send it, no matter what the client
> >> > requested.
> >> >
> >> > It seems like unwind(8) behaves this way but it also responds with the
> >> > RRSIG records, which is extra overhead and ignored by getrrsetbyname(3).
> >> >
> >> > This was mentioned a few years ago [0] and the solution suggested was
> >> > to add the RES_USE_DNSSEC to _res.options, which also makes the resolver
> >> > respond with the extra RRSIG records.
> >> >
> >> > Instead, by only setting the AD flag, both the request and the response
> >> > has the same size as without the flag set. The patch below will add
> >> > RES_USE_AD as an option to _res.options and set it by default.
> >> > This is also the default behaviour in dig(1), which I understand is a
> >> > bit different, but that sure added some confusion while debugging this.
> >> >
> >> > This let you run unbound(8) or any other validating resolver on your
> >> > local network and getrrsetbyname(3) will trust it. Do read the CAVEATS
> >> > in the manual of getrrsetbyname(3) though.
> >> >
> >> > As a side note, I noticed that the default value of _res.options was the
> >> > same value as RES_DEFAULT, so I changed it to RES_DEFAULT instead, for
> >> > the sake of consistency.
> >> >
> >> > Thoughts?
> >>
> >> Thanks for addressing this longstanding issue.
> >>
> >> I think using the AD bit in queries is a good idea. IIRC Peter J.
> >> Philipp (cc'd) suggested using it but I was not thrilled because:
> >>
> >> 1. the meaning of the AD bit in queries is relatively recent (2013
> >>    I think[0])
> >> 2. getrrsetbyname also collects signature records, and for this you need
> >>    EDNS + the DO bit set. Implementing this in was not 100% trivial,
> >>    I think we had something working but Eric or I were not 100% happy
> >>    with it.
> >>
> >> 1. is probably not a concern, after all you're supposed to use
> >> a trustworthy resolver, which should be modern and understand the
> >> purpose of the AD bit in queries.
> >>
> >> 2. is probably not a concern either. I guess that all getrrsetbyname(3)
> >> callers only care about the target records and the AD bit, not about the
> >> signature records. (Why would they use it for anyway?). In the base
> >> system, only ssh(1) and traceroute(8) use getrrsetbyname(3).
> >> AFAIK no other system provides getrrsetbyname(3), and ISC has removed
> >> getrrsetbyname and the whole lwres API in 2017[1]. So I'd say we're
> >> free to improve our version of getrrsetbyname(3) as we see fit.
> >
> > This is a concern for the stub resolver, because edns and AD does not work
> > everywhere.
> > Indeed when we switched unbound to validate by default we
> > learned that this is not a good idea for everyone - which lead to the
> > development of unwind(8).
>
> Do you by chance have any data regarding fallout because of the AD bit
> set in queries? I would expect it to be ignored when not supported.
> EDNS and the DNSSEC DO bit is a different story indeed.

If I remember correctly, the fallout was caused by EDNS, but I might be
wrong. The unbound commit caused a developer some head-scratching because
his upstream network did not work with such packets. That led to an
immediate backout of the change, because a default config that does not
work is not good.

In that regard I worry about the resolv.conf option: it has the potential
to break people's DNS in non-obvious ways. At least don't advertise it as
a way to "enable dnssec" for the whole system; instead, point people to
unwind(8).
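
For reference, here is a rough sketch of why the two cases differ on the
wire. It is not the libc code and not the proposed patch, just a
hand-rolled query builder in Python. The helper names (encode_name,
build_query), the query id, the host name and the 4096-byte UDP size are
all made up for illustration. Setting AD only flips one bit in the
existing header flags, while EDNS with DO appends an OPT pseudo-record to
the additional section, which is the part some middleboxes and upstream
resolvers have been known to choke on.

```python
import struct

RD_FLAG = 0x0100     # Recursion Desired, header flags
AD_FLAG = 0x0020     # Authentic Data, header flags
DO_FLAG = 0x8000     # DNSSEC OK, lives in the EDNS flags of the OPT record
TYPE_SSHFP = 44
TYPE_OPT = 41
CLASS_IN = 1


def encode_name(name):
    """Encode a host name as length-prefixed DNS labels (hypothetical helper)."""
    out = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype, ad=False, do=False, qid=0x1234, udp_size=4096):
    """Build a single-question DNS query (hypothetical helper, illustration only)."""
    flags = RD_FLAG | (AD_FLAG if ad else 0)
    arcount = 1 if do else 0
    # Header: id, flags, qdcount, ancount, nscount, arcount
    packet = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, arcount)
    packet += encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    if do:
        # OPT pseudo-RR: root name, type, class (= UDP payload size),
        # TTL split into extended rcode / version / flags, empty rdata.
        packet += b"\x00" + struct.pack(
            "!HHBBHH", TYPE_OPT, udp_size, 0, 0, DO_FLAG, 0)
    return packet


plain = build_query("host.example.com", TYPE_SSHFP)
with_ad = build_query("host.example.com", TYPE_SSHFP, ad=True)
with_do = build_query("host.example.com", TYPE_SSHFP, do=True)

print(len(plain), len(with_ad), len(with_do))
```

The AD query has the same length as the plain one, and only the flags
word differs. The DO query carries the extra OPT record, and a validating
resolver will then also put RRSIG records in the answer.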
