On Mon, Apr 13, 2020 at 03:04:12PM -0400, Viktor Dukhovni wrote:
> On Mon, Apr 13, 2020 at 02:35:22PM -0400, Rich Felker wrote:
> 
> > > The problem can be partly resolved by setting the "AD" bit in the
> > > outgoing DNS query header sent by the musl-libc stub resolver.  Then
> > > the local iterative resolver will return the AD bit in its response.
> > > 
> > > However, lack of support for retrying truncated responses over TCP
> > > or support for disabling RES_DEFNAMES and RES_DNSRCH remain as issues.
> > 
> > musl's stub resolver intentionally speaks only rfc1035 udp,
> 
> Lack of TCP support and ignoring the TC bit means that large responses
> get truncated, possibly breaking FCrDNS and triggering false positives
> via reject_unknown_client_hostname.
> 
> It is also not uncommon for applications that use SRV records to
> encounter large RRsets (e.g. Windows Domain controller lists for
> large Active-Directory domains in MIT Kerberos or Heimdal).

The justification here has always been that a number of clients are in
environments where they can't perform TCP queries, e.g. their
nameservers only support UDP and possibly only plain rfc1035. Of course
such an environment is incompatible with validating DNSSEC, but from
the perspective of the domain defining the records, having so many or
such long records (not counting signatures) that they can't be
delivered to such clients without truncation means the domain has
accessibility problems.

Falling back to TCP on TC would also yield very bad performance for
users who are not running a local nameserver whenever they look up
names with ridiculous numbers of A/AAAA records, where the truncated
response certainly suffices (except in your FCrDNS example).

It's possible that some of these choices can be revisited over time,
but they were made for good reasons, not at random.

> > and the intent has always been that DNSSEC validation and policy be
> > the responsibility of the nameserver running on localhost, not the
> > stub resolver or the calling application.
> 
> But some applications need to see the AD bit returned by the local
> resolver in order to distinguish between validated and non-validated
> results.  Recursive Nameservers (BIND, Unbound, ...) will only set
> (when appropriate) the AD bit in replies if it is set in the incoming
> query.  The AD bit is part of the standard DNS header:

Is the AD bit valid as part of a query? I couldn't find where this is
documented, and it's almost certainly not supported (possibly
rejected/dropped) by servers that aren't aware of it.

>     The basic DNS header flags word is a mixture of flag bits and numbers,
>     <https://tools.ietf.org/html/rfc2535#section-6.1>:
>     
>      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
>      |QR|   Opcode  |AA|TC|RD|RA| Z|AD|CD|   RCODE   |
>      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
> 
> and a one-line change in the musl-libc stub resolver can set the AD bit
> when the target resolver is local (127.0.0.0/8 or ::1/128).

I'm confused about whether you're saying it should be set in the
outgoing query or forged in the response. The latter sounds like a
really bad idea. If the former, I don't see why it would be done
conditionally on the resolver being local (and "local" need not mean
127.0.0.1 or ::1; it can be a public address of the local host or a
lot of other things, e.g. a tunnel out of a container to the actual
host, depending on network setup). Is the idea just that you assume a
local resolver would support it, whereas for a remote one it might be
unknown? I don't think this kind of policy decision belongs in the
stub resolver; for instance it would break in the other direction if
you implemented a nameserver on 127.0.0.1 (e.g. just to avoid needing
a resolv.conf file) via an iptables rule redirecting to the real
nameserver.
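
For what it's worth, the mechanical part of what you're describing is
tiny. Roughly something like this sketch against the wire-format
header (hypothetical code, not the actual musl source):

    /* Set the AD bit in an already-serialized query.  Byte 2 of the
     * 12-byte header holds QR/Opcode/AA/TC/RD and byte 3 holds
     * RA/Z/AD/CD/RCODE, so AD is mask 0x20 of byte 3. */
    #include <stddef.h>

    static void set_ad_bit(unsigned char *query, size_t len)
    {
        if (len >= 4)
            query[3] |= 0x20;
    }

So the question isn't the size of the change, it's where the policy
belongs.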

> > What is relevant, as far as I can tell, is that Postfix wants a way to
> > perform an EDNS0 query that lets it distinguish between a valid signed
> > result and a valid unsigned result.
> 
> No, Postfix just wants the AD bit, but sadly the traditional resolver
> API does not have RES_USE_ADBIT, it only has RES_USE_DNSSEC which sets
> the DO bit in the EDNS(0) extended header.  If (see above) you just
> set the AD bit for all requests to local resolvers, Postfix will get
> all the DNSSEC support it needs.

I think just adding a resolv.conf option for setting the AD bit might
be appropriate. One issue that makes this more complicated, though, is
how the API is factored. res_mkquery in theory doesn't/shouldn't
depend on the particular nameservers, but should just serialize a
query that can be used with any server (e.g. my implementation of
host(1) does this to send to the server you give it on the command
line). But such an option is configuration specific to the particular
nameservers in use.
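
To make the factoring issue concrete: res_mkquery just serializes a
query into a caller-provided buffer and has no idea which server the
result will be sent to, so usage like the following sketch (names and
flow purely illustrative) is valid against any server:

    #include <stdio.h>
    #include <resolv.h>
    #include <arpa/nameser.h>

    int main(void)
    {
        unsigned char buf[512];
        /* Serialize a query; nothing here knows which server it's
         * destined for. */
        int len = res_mkquery(ns_o_query, "example.com", ns_c_in,
                              ns_t_a, 0, 0, 0, buf, sizeof buf);
        if (len < 0)
            return 1;
        /* The caller can send buf anywhere, e.g. to a server named on
         * a host(1) command line rather than one from resolv.conf. */
        printf("query is %d bytes\n", len);
        return 0;
    }

A per-nameserver "set the AD bit" knob doesn't have an obvious home in
that interface.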

> > From my perspective, what would work best with what's always been the
> > intended DNSSEC usage model of musl would be if Postfix supported use
> > of DANE with smtp_dns_support_level=enabled, i.e. outsourcing all
> > DNSSEC functionality to the nameserver.
> 
> Sorry, we actually need to know which records were validated in
> signed domains, and which are "insecure" responses from unsigned
> domains.  That's what the AD bit is for, and you're not setting
> it in requests, and so it does not come back in the response.

Can you describe why? Is it only for the sake of not using TLSA
records in unsigned domains? That kind of policy can be implemented at
the resolver level, and my intent was always that, if desired, it
would be implemented there. But I can understand that you may not want
that, or that there may be other reasons it doesn't suffice. I still
think it would be useful to allow the user to configure such a
setting; it's certainly better than DANE not working.
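
For context on the application side: assuming the stub does set AD (or
DO) in the query as discussed above, the consumer only has to look at
one bit of the response header. A rough sketch (the query name, the
hardcoded TLSA type, and the flow are illustrative, not Postfix's
actual code):

    #include <stdio.h>
    #include <resolv.h>
    #include <arpa/nameser.h>

    int main(void)
    {
        unsigned char ans[4096];

        res_init();
        /* 52 = TLSA; a T_TLSA constant may be missing from older
         * headers, so the number is used directly here. */
        int len = res_query("_25._tcp.mail.example.com", ns_c_in, 52,
                            ans, sizeof ans);
        if (len < 4)
            return 1;
        /* AD is bit 5 of the second flags byte, mask 0x20. */
        puts(ans[3] & 0x20 ? "validated" : "not validated");
        return 0;
    }

Whether that bit can be trusted is of course exactly the policy
question above.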

Rich
