Hello all,

Thanks for preparing this, Joe.

I should say that the finding that some Cisco devices entered a reboot loop due to this, in 2026, was shocking.

I should note that this scenario (the CNAME coming after the records) is exactly the same as the one in the famous CVE-2002-0400 [1], affecting BIND < 9.2.1 and the libresolv library. The CVE description itself is not very specific about the actual contents of the "malformed DNS packet", but it meant one where "CNAME records appear after A/AAAA records in the answer section", as noted by djb in [2].

My first comment on the draft is that we should add a reference to CVE-2002-0400 as a relevant precedent.

Second, regarding the current wording of the draft, I am against the way it is phrased. When it states:

> the section MUST be treated as an ordered list of RRSets

it basically means that when a DNS server sends:

;; ANSWER SECTION:
cloudflare.com. 300 IN A 192.0.2.9
cloudflare.com. 300 IN AAAA 2001:db8::1
cloudflare.com. 300 IN A 192.0.2.3
cloudflare.com. 300 IN AAAA 2001:db8::36
cloudflare.com. 300 IN A 192.0.2.11
cloudflare.com. 300 IN AAAA 2001:db8::25
cloudflare.com. 300 IN A 192.0.2.4

the records cannot be reordered, which is against the intent of RFC 1034/1035 as well as all their predecessors, and would surely have big implications for DNS caches. (And in case you wonder, there is no requirement to group the records by RRset, either.)

I don't think that's really the intention of that paragraph, but it can certainly be read that way.

However, I am in favor of adding text such as the following:

>
> The answer section of a DNS response carries RRs which directly
> answer the query (RFC 1034 section 3.7).
> These RRs may generally be presented in an arbitrary order. However,
> when the answer section contains RRs with different owner names, the
> records MUST appear grouped by owner name in the order in which it
> was being redirected.
> Currently, the only records that are able to perform a redirection
> and thus change the owner name being answered are CNAME and DNAME.
> When an answer contains both a DNAME and a synthesized CNAME, the
> DNAME MUST be presented before the CNAME.
> DNS servers MUST send the answer RRs in redirect chain order as
> specified above. DNS clients MAY ignore RRs in the answer section
> with an unknown owner name at the position they are inserted (those
> that appear before an RR that introduces them), interpreting the
> answer as if they had not been included.
>

Accordingly, the current content of sections 4 and 5 would be completely different.

I believe this is the same position expressed by Warren and Philip.

As for why to specify this, instead of requiring all clients to accept them in any order (John Levine), there are multiple reasons:

* It is already semi-specified that way in the spec:

> the recursive response
> to a query will be one of the following:
>
> - The answer to the query, possibly preface by one or more CNAME
> RRs that specify aliases encountered on the way to an answer

(RFC 1034 section 4.3.1)

While a bit unclear, that seems to mean that you first receive the CNAMEs, and then the RRs of other types.

* Another place in the specification that favors the interpretation that the CNAMEs should appear in order is:

> name servers may choose to restart the query at the canonical name in certain cases

(RFC 1035 section 3.3.1)

The process is laid out in RFC 1034 section 3.6.2:

> it checks to see if the resource set consists of a CNAME record with a matching class. If so, the name server includes the CNAME record in the response and restarts the query...

without specifying the order in which the records or the CNAMEs shall appear. But if the CNAMEs came out of order and were large enough, there might not be enough space to hold everything.
Since

> When a response is so long that truncation is required, the
> truncation should start at the end of the response and work forward
> in the datagram.

(RFC 1035 section 6.2)

it's debatable whether such truncated data should be used at all (RFC 1035 seems to consider that an option). An incomplete redirect chain could still be continued (since there can only be one CNAME per owner name, a complete CNAME RR is usable), but records referring to an unknown domain name because the needed CNAME got truncated would be useless.

Moreover, the TC bit does not even need to be set in such a case, since

> The TC bit should be set in responses only when an RRSet is required
> as a part of the response, but could not be included in its entirety.
> The TC bit should not be set merely because some extra information
> could have been included, but there was insufficient room.

(RFC 2181 section 9)

and restarting the query is optional ("may choose to").

At the same time, this is precisely the scenario where placing those records _first_ would make sense, if the client just wanted an A record and were going to ignore the CNAME leading to it. However, it shares the same problems as a stub picking A records without checking which domain they belong to.

* More importantly, it is de facto that way already. We can consider it ossified. Every server has been sending them that way since the inception of DNS. If there were resolvers not doing it like that, this would have broken much earlier.

* It is also compatible with any existing client, both those that are able to pick the answers in any order and those that expect CNAMEs to appear first.

* It is the 'reasonable'/'obvious' path to send a redirect chain in order, and likewise for the recursive resolver to present them in order.
Regarding the recent Cloudflare case [3], I don't see why they chose to merge the CNAME chain by appending the CNAMEs to an existing entry that already had the A/AAAA records, instead of expanding the existing entry for the CNAME to append the final result (maybe there's a Rust-specific reason for that?).

It is consistent, and it does not require going out of one's way to support this.

* Ensuring that DNS answers come in order greatly simplifies life for resolvers. They can simply track which is the current domain and parse the reply in order, picking the records of the requested type. Crucially, processing them in order automatically removes the possibility of a CNAME loop (when not launching a second query, of course), in addition to being more efficient.

* Finally, if the chains are not sent in order, "the internet breaks" :)

And fourth, in the security considerations section, maybe we should mention that crashing the program if the RRs are not presented in the expected order is not an option, since that would cause a Denial of Service? It seems obvious, but seeing how some software has behaved when presented with these out of order, perhaps it is not as evident as expected. :/

# Other record types

Finally, we might want to test whether there are other cases where order is significant in the wild. I was under the impression that an RRSIG had to appear *after* the RRset it signs, but that doesn't seem to be stated anywhere. Canonical ordering is relevant for signing / verifying, but there's no mention of on-the-wire ordering. Might there be clients that will (incorrectly) fail validation if the answer had the RRSIG first?
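
Coming back to the CNAME ordering: the in-order processing described in the list above (track the current owner name, pick the records of the requested type, ignore RRs whose owner name has not been introduced yet) can be sketched in Python as follows. This is only an illustration under assumptions: `RR` and `walk_answer` are hypothetical names, RRs are modelled as plain text fields rather than wire format, and a DNAME is simply skipped on the assumption that the synthesized CNAME follows it.

```python
from typing import NamedTuple


class RR(NamedTuple):
    """Hypothetical, simplified resource record (presentation format)."""
    owner: str
    rtype: str
    rdata: str


def _norm(name: str) -> str:
    # DNS names compare case-insensitively; ignore the trailing dot.
    return name.lower().rstrip(".")


def walk_answer(qname: str, qtype: str, answer: list[RR]) -> tuple[str, list[str]]:
    """Follow the redirect chain in a single forward pass.

    Returns the last owner name reached and the rdata of every RR of
    type `qtype` found at the owner name that was current at that point.
    """
    current = _norm(qname)
    results: list[str] = []
    for rr in answer:
        if _norm(rr.owner) != current:
            # Owner name not introduced (yet) by the chain: treat the RR
            # as if it had not been included. A DNAME lands here too,
            # since its owner is an ancestor of the current name.
            continue
        if rr.rtype == qtype:
            results.append(rr.rdata)
        elif rr.rtype == "CNAME":
            # Redirect: later RRs are matched against the target name.
            current = _norm(rr.rdata)
    return current, results
```

Each RR is visited exactly once, so a CNAME loop inside a single response cannot make the pass run forever. With an answer whose A record precedes the CNAME that introduces its owner name, the walk ends on the canonical name with no addresses, which is the "as if they had not been included" reading; the client would then have to query again or report no data.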
(Validating resolvers, which are the ones that would care about RRSIG placement, are much less prevalent, though.)

Best regards

[1] https://www.cve.org/CVERecord?id=CVE-2002-0400
[2] https://cr.yp.to/djbdns/res-disaster.html (oddly, I cannot find the original mail he quotes in the archive, although the next one is at https://marc.info/?l=djbdns&m=102614840819438&w=2)
[3] https://blog.cloudflare.com/cname-a-record-order-dns-standards/#the-logic-change

_______________________________________________
DNSOP mailing list -- [email protected]
To unsubscribe send an email to [email protected]
