On Sat, Aug 10, 2019 at 9:14 AM <internet-dra...@ietf.org> wrote:
>
>
> A New Internet-Draft is available from the on-line Internet-Drafts
> directories.
> This draft is a work item of the Domain Name System Operations WG of the IETF.
>
>         Title           : Extended DNS Errors
>         Authors         : Warren Kumari
>                           Evan Hunt
>                           Roy Arends
>                           Wes Hardaker
>                           David C Lawrence
>         Filename        : draft-ietf-dnsop-extended-error-07.txt
>         Pages           : 13
>         Date            : 2019-08-09
>
> Abstract:
>    This document defines an extensible method to return additional
>    information about the cause of DNS errors. Though created primarily
>    to extend SERVFAIL to provide additional information about the cause
>    of DNS and DNSSEC failures, the Extended DNS Errors option defined in
>    this document allows all response types to contain extended error
>    information.
>

I went to talk to Quad9. Here is the reply they sent.
Fwd:

1) I see at least one more model that needs to be supported, which is how to handle EDNS extended codes that are generated by a remote server, i.e. passthrough. Layering multiple forwarding resolvers behind each other is common, and some way to notify the end user that the originating message was not generated by the first resolver would be important. I don't know if there needs to be some way to indicate how "deep" the error was away from the end user; having given it only minor thought, it seems just two levels (locally generated or non-locally generated) would be sufficient.

Re: 1) This is a good point, but implementation will likely run afoul of existing standards or else require duplicative response codes or use of an additional flag in the INFO-CODES section. Perhaps a new flag type, similar to AA, could be used to say that this recursor will return this result reliably/deterministically. Attempting to provide depth is perhaps unlikely, but flags for stub/forwarder/recursive/intermediate recursive or a subset of those might make sense. Perhaps a nondescript flag such as 'DR' for Deterministic Response. Obviously INFO-CODES can support many different flags, of which IR (Intermediate Resolver) or such could be included at the point of response generation. The last server in the chain providing actual data would be the one to authoritatively set the flag, which then must not be modified by further downstream resolvers in the process of returning the response.

2) SERVFAIL needs another error code to indicate the difference between a network error (unexpected network response like ICMP, or TCP error such as connection refused) versus timeout of the remote auth server, as that is often a confusing issue.

Re: 2) Specifics are given as an item in the list below.

3) Really, I'd like to see a definition of some of the EXTRA-TEXT strings here, since that will be almost immediately an issue that would need to be sorted out before this could be useful.
There have been some discussions (sorry, don't know if it's a draft or just talking) about browsers consuming "extra" data in DNS responses that can do a number of things. As an example that is important to Quad9 (or any blocking-based DNS service), it might be the case that upon receiving a request for a "blocked" qname/qtype, we would hand back a forged answer that leads to a splash page as the default result. However, if the request was made from a resolver stack that had the EDNS extensions, we might include the "real" result in the EXTRA-TEXT field, as well as a URL that points the user to an explanation of why that particular qname/qtype was blocked. Or we might add a risk factor, or type of risk ("risk=100, risktype=phishing") or the like. This allows a single query to be digestible by "dumb" stacks that we want to have do the safest thing, but also allows "smart" resolver stacks to present a set of options to the end user.

Re: 3) Seems reasonable.

4) I'm confused as to why a "blocked" or "censored" result would have a retry as mandatory. The resolver gave a canonical answer from the point of view of policy.

Re: 4) See the notes below.

Potential inclusions/Adjustments:

184.108.40.206: A use case exists where a stale answer should attempt a retry. A declarative setting for the Retry bit should not be specified here; instead, guidance on whether or not the R bit should be set should be included. For example, when using a front-end load balancer, if the recursive backends are temporarily inaccessible but are expected to recover in time to handle a subsequent query, it would be prudent to include the R bit. No additional load would be generated towards the Authoritatives in this case, and the Intermediate Recursor may choose to set the R bit or not based on whether the failure mode appears to be temporary.

4.1.5: Another area where guidance should be provided.
Some recursive resolvers process requests out of order, asynchronously, or will retry alternative authoritatives post-processing as part of infrastructure table management, and thus may respond to a subsequent query where the initial query failed, likely due to timeouts. In our specific case, due to our use of multiple recursive backend technologies, a subsequent query failing DNSSEC validation has a significant chance of being answered by an alternative recursor. See also 4.2.1.

4.1.6: Synthesized Answer: This response could be considered a sub-case of forged. An example of this would be the id.server or version.bind queries: they cannot be considered forged, but also no authority truly holds them.

4.2.11: SERVFAIL - Network: The SERVFAIL response is being generated due to what is clearly identifiable to the answering server as a network issue. R bit should be set.

4.4.3: Abusive: The answering system considers the query in question to be abusive for reasons other than load, indicating that the specific requests are undesired. This could provide hints to Network Operators or simply poorly configured client implementations that the specific queries may be part of an amplification or other attack and should be inspected.

4.4.4: Excessive: The answering system considers the query volume of the client to be excessive, indicating that it is the volume and not the content of the queries being refused and that it may be willing to answer if volume is reduced. This could provide hints to Network Operators or poorly configured client systems that they need to add additional endpoints or reduce their request volume to restore service.

4.4.5: Go Away: The answering system considers further queries from the client/network to have exceeded thresholds by large margins or excessive durations, and further queries are likely to be dropped. This message is an attempt to limit the continued use of resources terminating queries which will not be answered.
This may simply be a sub-case of Abusive/Excessive, but it is not intended to be sent for each query, only intermittently, and to bypass the need for lengthy troubleshooting efforts when drop rules cause a recursor to seem to have vanished.

4.5.1: The R flag being set here implies that there are potentially multiple policies in use and that a retry might receive an answer - which should not be the case with a single intermediate recursive service. A client, knowing that it has multiple recursive services with differing policies, might retry against a different recursive service (e.g. 220.127.116.11 instead of 18.104.22.168), but this effectively defeats the policies of the initial recursor, rendering it ineffective. The use of a specific server as a delineation is also confusing - it should instead specify that the answering entity, be it a single server or larger entity, has blocked this response. Also, blocked should be further defined to avoid collision with the definition of the Censored response code. Blocked in this case would be used as a catch-all for anything not otherwise categorized.

4.5.2: See 4.5.1. Censoring is inherently a governmental action and this should be reserved for that due to the severity and legal repercussions of attempts to bypass it. R bits should not be set. Censored should be defined in the document to avoid confusion.

4.5.3: Filtered: Differentiated from Blocked/Censored in that this content has been specifically redacted at the perceived behest of the client - may include ad-blockers, dnsbl, or other specific cases - intended to be used by those systems. Would potentially include corporate IT policies.

4.5.4: Malicious: Differentiated from Blocked and Filtered in that the answering server believes the response to be actively malicious and harmful to the requesting systems or applications, and not merely undesired or offensive. R bits should not be set.
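
To make the retry guidance in 4.5.1 through 4.5.4 concrete, here is a minimal Python sketch of how an answering server might look it up. Everything in it is hypothetical: the names PolicyCode, RETRY_GUIDANCE and should_set_r_bit, and the operator_default parameter, are illustrative only, and the section numbers are just the labels used in these notes, not assigned code points. Where the notes give no firm rule (Blocked, Filtered), the sketch falls back to a local operator default, which is an assumption and not something the draft specifies.

```python
from enum import Enum
from typing import Optional


class PolicyCode(Enum):
    """Hypothetical labels for the policy-related codes discussed above."""
    BLOCKED = "4.5.1"
    CENSORED = "4.5.2"
    FILTERED = "4.5.3"
    MALICIOUS = "4.5.4"


# True/False: the notes give an explicit recommendation for the R bit.
# None: the notes give no firm rule, so the decision is left to the operator.
RETRY_GUIDANCE: dict[PolicyCode, Optional[bool]] = {
    PolicyCode.BLOCKED: None,     # notes question setting R, but state no rule
    PolicyCode.CENSORED: False,   # "R bits should not be set"
    PolicyCode.FILTERED: None,    # no retry guidance given
    PolicyCode.MALICIOUS: False,  # "R bits should not be set"
}


def should_set_r_bit(code: PolicyCode, operator_default: bool = False) -> bool:
    """Return the R bit value for a policy response.

    Uses the explicit recommendation when one exists; otherwise falls back
    to operator_default (an assumed local configuration knob).
    """
    guidance = RETRY_GUIDANCE[code]
    return operator_default if guidance is None else guidance
```

With this table, a Censored or Malicious response never carries the R bit regardless of local configuration, while Blocked and Filtered follow whatever the operator has chosen.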
4.5.5: Malicious Upstream - The upstream entity is considered malicious by the answering server and thus a refusal to respond has been returned. Details should be included within the INFO-CODE and potentially EXTRA-TEXT. This is differentiated from Malicious in that in this case, it is the actual upstream server that is having all responses blocked, not the content itself - for instance a revoked or unexpected certificate (such as due to a CAA record) - from which no responses will be accepted. The R bit being set here depends on whether the server believes that the specific path is compromised - if all authoritatives have failed, then a retry will not help. If only one has, then a retry will help to get to the non-compromised server. In the absence of data, the R bit should be set.

Other Notes:

INFO-CODE: It would seem best to include a basic recommendation for a standard DNS-specific RWhois/CRL-like endpoint which could provide local (non-IANA) information about returned codes, potentially at a well-known URI, or even within the DNS itself via TXT records or even within the EXTRA-TEXT field itself.

It may make sense to create an extension of the R bit, via an additional flag or other field, which adds context to the retry declaration, such as whether the client should retry the same recursor or should instead immediately move to and try the next available one.

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop