> On 9 Jul 2020, at 00:50, Marek Majkowski <[email protected]> wrote:
>
> On Wed, Jul 8, 2020 at 10:01 AM <[email protected]> wrote:
>>
>> DNSOP WG,
>>
>> Paul Vixie and I submitted draft-ietf-dnsop-avoid-fragmentation-00.
>> Please review it.
>
> Hi!
>
>> UDP requestors and responders SHOULD send DNS responses with
>> IP_DONTFRAG / IPV6_DONTFRAG [RFC3542] options, which will yield
>> either a silent timeout, or a network (ICMP) error, if the path
>> MTU is exceeded.
>
> When MTU is exceeded the sender might also receive plain old EMSGSIZE
> error on sendto(). I would love to see an example on what
> IP_MTU_DISCOVER settings authors expect. This option is notoriously
> hard to get right.
>
>> The maximum buffer size offered by an EDNS0 initiator SHOULD be
>> no larger than the estimated maximum DNS/UDP payload size...
>
> This seems to indicate that EDNS0 over TCP should have a small buffer
> size as well. Consider wording like "...buffer size offered by an
> EDNS0 initiator over UDP...".
>
>> Fragmented DNS/UDP messages may be dropped without IP reassembly
>
> Not sure what it has to do with the draft. Are we worried about
> request fragmentation and allowing the DNS server to drop fragmented
> requests? Are we worried about response fragmentation?
>
> I have two problems with this proposal. First, it doesn't mention IPv4
> vs IPv6 differences at all. In IPv4 landscape fragmentation, while a
> security issue, is generally fine. In the IPv6 world, fragmentation is
> disastrous - packets with extension headers are known to be dropped.

Not really. UNKNOWN extensions tend to get dropped but the fragmentation
header is a KNOWN extension header.

> Second, this proposal assumes that path MTU detection works correctly.
> This is surprisingly optimistic. Let's consider IPv6 - in IPv6 the
> smaller path MTU < 1500 is very common.

Which is why IPV6_USE_MIN_MTU exists (RFC 3542). USE THE SOCKET OPTION.
It was put there specifically to support DNS over UDP and other
applications like that. I know this as I proposed the predecessor option
back in 1999 which became IPV6_USE_MIN_MTU. If the OS hosting your DNS
server doesn’t support this option 17 years after it was defined, throw
it in the bin. IPV6_USE_MIN_MTU also helps with TCP. DNS does not need
to suffer from PMTUD issues.

> Let's say a DNS auth server sent an IPv6 DNS response packet exceeding
> path MTU. An intermediate router will drop the offending packet and
> one of three scenarios will happen:
>
> - (A) No ICMP PTB message is sent back.
>
> - (B) ICMP PTB message is sent back, but fails to be delivered.
>
> - (C) ICMP PTB message is sent back and delivered correctly to the server.
>
> All three scenarios are disastrous on the practical internet. The
> proposal assumes (A) and (B) will rarely happen, and puts the
> responsibility on the DNS client to retry over TCP. This will cause
> unnecessary timeouts and degrade the overall quality of the service.
>
> But perhaps most importantly even option (C) will *not* result in good
> service. Consider a setup with multiple DNS servers behind an ECMP
> router, or another L4 load balancer. Even if the return ICMP will hit
> back the correct server - which is far from obvious - the ICMP will
> update the Path MTU on *one server*. If a client attempts to retry the
> query, as suggested by the proposal, it will most likely hit another
> server, which is not aware of non standard Path MTU.
>
> These days DNS Auth installations use ECMP routing for load balancing.
> A single physical box serving important DNS is a rare occurrence.
>
> In this proposal all three (A), (B), and (C) scenarios will result in
> dropped responses. DNS client needs to wait for timeout, retry over
> UDP, wait more and eventually retry over TCP. This is bad.
>
> We could fix (C) by making the DNS server to capture the ICMP PTB in
> DNS server code. The ICMP payload often has enough context for the DNS
> server to prepare another reply. This reply of course should be sent
> with lowered MTU.
>
> In other words, I'm asking for capturing ICMP PTB in DNS servers,
> unpacking the payload and treating it as another request to be handled.
> This would solve (C). Also, note this opens an interesting DDoS
> vector, but this is another story.
>
> On Linux it is possible to capture the ICMP PTB without privileges, by
> setting IP_RECVERR and inspecting MSG_ERRQUEUE. In IPv4 the PTB
> messages often have 520 bytes of payload and in IPv6 1184 bytes. This
> is enough context to build another response, without having to wait
> for any timeout.
>
> Cheers,
> Marek
>
> _______________________________________________
> DNSOP mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/dnsop

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: [email protected]
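
For anyone who wants to try the IPV6_USE_MIN_MTU advice, here is a
minimal sketch in Python. The helper name enable_use_min_mtu is made up
for illustration. Python's socket module only exposes IPV6_USE_MIN_MTU
on platforms whose C headers define it, so the helper returns False
where the constant is missing rather than guessing a numeric value.
The value 1 is the RFC 3542 setting for "always send at the minimum
MTU"; how a given OS then fragments larger datagrams is not covered
here.

```python
import socket


def enable_use_min_mtu(sock):
    """Ask the stack to send on `sock` at the IPv6 minimum MTU.

    Returns True if the option was set, False if this platform does not
    expose IPV6_USE_MIN_MTU or rejects it.
    """
    # The constant is only present when the platform headers define it.
    opt = getattr(socket, "IPV6_USE_MIN_MTU", None)
    if opt is None:
        return False
    try:
        # 1 = always use the minimum MTU (RFC 3542); no path MTU discovery.
        sock.setsockopt(socket.IPPROTO_IPV6, opt, 1)
    except OSError:
        return False
    return True
```

The intended use would be to call it on the AF_INET6 UDP (or TCP)
socket right after creating it, before the first send, and to log a
warning when it returns False.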

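The 520 and 1184 byte figures quoted above for the PTB payload can be
reproduced with simple header arithmetic. The sketch below is one way to
arrive at them, not necessarily how they were originally derived. It
assumes the ICMP error is capped at 576 bytes for IPv4 and at the 1280
byte minimum MTU for IPv6, that the quoted packet is UDP, and that there
are no IPv4 options or IPv6 extension headers. The function name is
invented for illustration.

```python
# Header sizes in bytes, assuming no IPv4 options and no IPv6 extension headers.
IPV4_HDR, IPV6_HDR, ICMP_HDR, UDP_HDR = 20, 40, 8, 8


def quoted_dns_bytes(max_error_size, ip_hdr):
    """DNS bytes that fit inside an ICMP error of at most max_error_size bytes."""
    # Room left for the offending packet after the error's own IP + ICMP headers.
    quoted_packet = max_error_size - ip_hdr - ICMP_HDR
    # The offending packet carries its own IP and UDP headers before the DNS data.
    return quoted_packet - ip_hdr - UDP_HDR


print(quoted_dns_bytes(576, IPV4_HDR))    # 520
print(quoted_dns_bytes(1280, IPV6_HDR))   # 1184
```

Under those assumptions, a query plus a truncated copy of the response
fits comfortably in either case, which is what makes rebuilding a
smaller reply from the error payload plausible.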