-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Patch applied. Thank you.
And thank you for the comprehensive documentation.
The original change was made as part of the DNSSEC stuff, and I have
a nagging feeling that there was some theoretical situation that
could occur in conjunction with DNSSEC which prompted the change.
Despite much puzzling, I can't come up with what this might have been,
so I've applied the patch. If it breaks with DNSSEC, we'll find out,
but I suspect I'm chasing shadows.
Cheers,
Simon.
On 01/02/17 22:54, Baptiste Jonglez wrote:
> From: Baptiste Jonglez
>
> This effectively reverts most of 51967f9807 ("SERVFAIL is an
> expected error return, don't try all servers.") and 4ace25c5d6
> ("Treat REFUSED (not SERVFAIL) as an unsuccessful upstream
> response").
>
> With the current behaviour, as soon as dnsmasq receives a SERVFAIL
> from an upstream server, it stops trying to resolve the query and
> simply returns SERVFAIL to the client. With this commit, dnsmasq
> will instead try to query other upstream servers upon receiving a
> SERVFAIL response.
>
> According to RFC 1034 and 1035, SERVFAIL signals a temporary error
> condition. Recursive resolvers are expected to encounter network or
> resource issues from time to time, and will respond with SERVFAIL
> in this case. Similarly, if a validating DNSSEC resolver [RFC 4033]
> encounters issues when checking signatures (unknown signing
> algorithm, missing signatures, expired signatures because of a
> wrong system clock, etc.), it will respond with SERVFAIL.
>
> Note that all those behaviours are entirely different from a
> negative response, which would provide a definite indication that
> the requested name does not exist. In our case, if an upstream
> server responds with SERVFAIL, another upstream server may well
> provide a positive answer for the same query.
>
> Thus, this commit will increase robustness whenever some upstream
> servers encounter temporary issues or are misconfigured.
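
As a rough illustration of the fallback behaviour described above, here is a minimal Python sketch. It is not dnsmasq's C implementation: `resolve`, `ask` and the `RETRYABLE` set are hypothetical names, and treating REFUSED the same way as SERVFAIL is an assumption made for the example.

```python
from typing import Callable, Iterable

# RCODE values from RFC 1035, Section 4.1.1.
NOERROR, SERVFAIL, NXDOMAIN, REFUSED = 0, 2, 3, 5

# Assumption: these codes mean "this server could not help", not "the name
# does not exist", so another upstream is worth trying.
RETRYABLE = frozenset({SERVFAIL, REFUSED})


def resolve(query: str,
            upstreams: Iterable[str],
            ask: Callable[[str, str], int]) -> int:
    """Return the first definite RCODE, or the last failure seen.

    `ask(server, query)` is a hypothetical callback that sends the query to
    one upstream server and returns the RCODE of its response.
    """
    last = SERVFAIL  # returned as-is if there are no upstreams at all
    for server in upstreams:
        rcode = ask(server, query)
        if rcode not in RETRYABLE:
            return rcode  # NOERROR or NXDOMAIN: a definite answer, stop here
        last = rcode      # temporary failure: fall through to the next server
    return last
```

A definite answer (NOERROR or NXDOMAIN) ends the loop at once. The client sees a failure only when every upstream has failed.
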
>
> Quoting RFC 1034, Section 4.3.1. "Queries and responses":
>
>     If recursive service is requested and available, the recursive
>     response to a query will be one of the following:
>
>     - The answer to the query, possibly preface by one or more CNAME
>       RRs that specify aliases encountered on the way to an answer.
>
>     - A name error indicating that the name does not exist. This may
>       include CNAME RRs that indicate that the original query name
>       was an alias for a name which does not exist.
>
>     - A temporary error indication.
>
> Here is Section 5.2.3. of RFC 1034, "Temporary failures":
>
>     In a less than perfect world, all resolvers will occasionally be
>     unable to resolve a particular request. This condition can be
>     caused by a resolver which becomes separated from the rest of the
>     network due to a link failure or gateway problem, or less often by
>     coincident failure or unavailability of all servers for a
>     particular domain.
>
> And finally, RFC 1035 specifies RCODE 2 for this usage, which is
> now more widely known as SERVFAIL (RFC 1035, Section 4.1.1. "Header
> section format"):
>
>     RCODE   Response code - this 4 bit field is set as part of
>             responses. The values have the following
>             interpretation: (...)
>
>             2   Server failure - The name server was unable to
>                 process this query due to a problem with the
>                 name server.
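
For readers who want to see where that 4 bit field sits, here is a small Python sketch that extracts the RCODE from a raw DNS message. The function name `rcode_of` and the `RCODE_NAMES` table are illustrative only. The layout follows the RFC 1035 header: a 16-bit ID followed by a 16-bit flags word whose low four bits are the RCODE.

```python
import struct

# Mnemonics for the RCODE values defined in RFC 1035, Section 4.1.1.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def rcode_of(message: bytes) -> int:
    """Return the 4-bit RCODE from the header of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    # Network byte order: 16-bit ID, then the 16-bit flags word.
    _msg_id, flags = struct.unpack_from("!HH", message, 0)
    return flags & 0x000F  # RCODE is the low four bits of the flags word
```

For instance, a response whose flags word is 0x8182 (QR, RD and RA set) carries RCODE 2, which `RCODE_NAMES` maps to "SERVFAIL".
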
>
> For the DNSSEC-related usage of SERVFAIL, here is RFC 4033 Section
> 5. "Scope of the DNSSEC Document Set and Last Hop Issues":
>
>     A validating resolver can determine the following 4 states: (...)
>
>     Insecure: The validating resolver has a trust anchor, a chain of
>     trust, and, at some delegation point, signed proof of the
>     non-existence of a DS record. This indicates that subsequent
>     branches in the tree are provably insecure. A validating resolver
>     may have a local policy to mark parts of the domain space as
>     insecure.
>
>     Bogus: The validating resolver has a trust anchor and a secure
>     delegation indicating that subsidiary data is signed, but the
>     response fails to validate for some reason: missing signatures,
>     expired signatures, signatures with unsupported algorithms, data
>     missing that the relevant NSEC RR says should be present, and so
>     forth. (...)
>
>     This specification only defines how security-aware name servers
>     can signal non-validating stub resolvers that data was found to be
>     bogus (using RCODE=2, "Server Failure"; see [RFC4035]).
>
> Notice the difference between a definite negative answer
> ("Insecure" state), and an indefinite error condition ("Bogus"
> state). The second type of error may be specific to a recursive
> resolver, for instance because its system clock has been
> incorrectly set, or because it does not implement newer
> cryptographic primitives. Another recursive resolver may succeed
> for the same query.
>
> There are other similar situations in which the specified