On Wed, Jul 8, 2020 at 10:01 AM <[email protected]> wrote:
>
> DNSOP WG,
>
> Paul Vixie and I submitted draft-ietf-dnsop-avoid-fragmentation-00.
> Please review it.

Hi!

> UDP requestors and responders SHOULD send DNS responses with
> IP_DONTFRAG / IPV6_DONTFRAG [RFC3542] options, which will yield
> either a silent timeout, or a network (ICMP) error, if the path
> MTU is exceeded.

When the MTU is exceeded, the sender might also receive a plain old
EMSGSIZE error from sendto(). I would love to see an example of the
IP_MTU_DISCOVER settings the authors expect. This option is
notoriously hard to get right.
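
To make the question concrete, here is a minimal sketch of one
possible setting on Linux for IPv4. It is an assumption about what the
draft might intend, not something the draft states. The numeric
fallbacks are the values from <linux/in.h>, and the helper names
make_df_udp_socket() and send_response() are made up for illustration.

```python
import errno
import socket

# Linux values from <linux/in.h>. Python's socket module may not export
# these names on every version, hence the getattr fallbacks.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)


def make_df_udp_socket():
    """Hypothetical helper: IPv4 UDP socket that always sets DF."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_PMTUDISC_DO: set DF on every datagram, never fragment locally.
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    return sock


def send_response(sock, payload, addr):
    """Return True if the kernel accepted the datagram, False if it
    refused it with EMSGSIZE (larger than the MTU it knows about)."""
    try:
        sock.sendto(payload, addr)
        return True
    except OSError as exc:
        if exc.errno == errno.EMSGSIZE:
            return False  # caller could send a truncated (TC=1) reply
        raise
```

With IP_PMTUDISC_DO the EMSGSIZE error only appears once the kernel
already knows a smaller MTU for that destination, so the first
oversized response can still vanish silently. Other values such as
IP_PMTUDISC_PROBE behave differently, which is part of why the
expected setting should be spelled out.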

> The maximum buffer size offered by an EDNS0 initiator SHOULD be
> no larger than the estimated maximum DNS/UDP payload size...

This seems to imply that EDNS0 over TCP should use a small buffer
size as well. Consider wording like "...buffer size offered by an
EDNS0 initiator over UDP...".

> Fragmented DNS/UDP messages may be dropped without IP reassembly

Not sure what this has to do with the draft. Are we worried about
request fragmentation, and about allowing the DNS server to drop
fragmented requests? Or are we worried about response fragmentation?



I have two problems with this proposal. First, it doesn't mention IPv4
vs IPv6 differences at all. In the IPv4 landscape, fragmentation,
while a security issue, generally works. In the IPv6 world,
fragmentation is disastrous - fragments carry a Fragment extension
header, and packets with extension headers are known to be dropped.

Second, this proposal assumes that path MTU detection works correctly.
This is surprisingly optimistic. Let's consider IPv6 - in IPv6 a path
MTU smaller than 1500 is very common.

Let's say a DNS auth server sent an IPv6 DNS response packet exceeding
path MTU. An intermediate router will drop the offending packet and
one of three scenarios will happen:

- (A) No ICMP PTB message is sent back.

- (B) ICMP PTB message is sent back, but fails to be delivered.

- (C) ICMP PTB message is sent back and delivered correctly to the server.

All three scenarios are disastrous on the practical internet. The
proposal assumes (A) and (B) will rarely happen, and puts the
responsibility on the DNS client to retry over TCP. This will cause
unnecessary timeouts and degrade the overall quality of the service.

But perhaps most importantly, even scenario (C) will *not* result in
good service. Consider a setup with multiple DNS servers behind an
ECMP router, or another L4 load balancer. Even if the returning ICMP
message reaches the correct server - which is far from guaranteed - it
will update the Path MTU on *one server* only. If a client retries the
query, as suggested by the proposal, it will most likely hit another
server, which is not aware of the non-standard Path MTU.

These days DNS Auth installations use ECMP routing for load balancing.
A single physical box serving important DNS is a rare occurrence.

Under this proposal all three scenarios, (A), (B), and (C), will
result in dropped responses. The DNS client needs to wait for a
timeout, retry over UDP, wait some more, and eventually retry over
TCP. This is bad.

We could fix (C) by having the DNS server itself capture the ICMP PTB.
The ICMP payload often has enough context for the DNS server to
prepare another reply. This reply should of course be sized to fit the
lowered MTU.

In other words, I'm asking for DNS servers to capture the ICMP PTB,
unpack the payload, and treat it as another request to be handled.
This would solve (C). Note that this also opens an interesting DDoS
vector, but that is another story.

On Linux it is possible to capture the ICMP PTB without privileges, by
setting IP_RECVERR and reading the socket error queue with
MSG_ERRQUEUE. In IPv4 the PTB messages often carry 520 bytes of
payload, and in IPv6 1184 bytes. This is enough context to build
another response, without having to wait for any timeout.
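
A minimal sketch of that Linux mechanism for IPv4 follows. The helper
names are hypothetical, the numeric fallbacks are the values from the
Linux headers, and the struct layout is that of sock_extended_err in
<linux/errqueue.h>. It only collects the PTB events; how a server
rebuilds and resends the reply is left out.

```python
import errno
import socket
import struct

# Linux values; getattr fallbacks for Python builds lacking the names.
IP_RECVERR = getattr(socket, "IP_RECVERR", 11)
MSG_ERRQUEUE = getattr(socket, "MSG_ERRQUEUE", 0x2000)
SO_EE_ORIGIN_ICMP = 2  # from <linux/errqueue.h>

# struct sock_extended_err: ee_errno, ee_origin, ee_type, ee_code,
# ee_pad, ee_info, ee_data (16 bytes, host byte order).
_EXT_ERR = struct.Struct("=IBBBBII")


def enable_error_queue(sock):
    """Ask the kernel to queue ICMP errors for this IPv4 UDP socket."""
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVERR, 1)


def parse_extended_err(cmsg_data):
    """Decode the leading sock_extended_err of an IP_RECVERR cmsg."""
    ee_errno, origin, ee_type, code, _pad, info, _data = \
        _EXT_ERR.unpack_from(cmsg_data)
    # For EMSGSIZE errors ee_info carries the discovered MTU.
    return {"errno": ee_errno, "origin": origin, "type": ee_type,
            "code": code, "mtu": info}


def drain_ptb_errors(sock, bufsize=2048):
    """Return (quoted_payload, original_destination, mtu) for every
    queued ICMP "packet too big" error. Never blocks."""
    results = []
    while True:
        try:
            data, ancdata, _flags, addr = sock.recvmsg(
                bufsize, 512, MSG_ERRQUEUE)
        except BlockingIOError:
            return results  # error queue is empty
        for level, ctype, cdata in ancdata:
            if level != socket.IPPROTO_IP or ctype != IP_RECVERR:
                continue
            err = parse_extended_err(cdata)
            if (err["errno"] == errno.EMSGSIZE
                    and err["origin"] == SO_EE_ORIGIN_ICMP):
                results.append((data, addr, err["mtu"]))
```

The data returned from the error queue is the quoted part of the
datagram the server itself sent, and the address is its original
destination. Whether that is always enough to rebuild the answer
depends on how much of the response the router quoted.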

Cheers,
Marek

