I ran into an unexpected interaction between DLPMTUD and QUIC's idle
timeout in my implementation. I've implemented the "Probing using padding
data" described in the DLPMTUD draft: I send a single PING frame and then
use PADDING frames to achieve the desired probe packet size.

Assume that you're sending MTU probe packets on a timer, and that timer
fires a long time after the connection became idle (but before the idle
timeout). Peer A will therefore send a probe packet to its peer, B. Since
the probe packet is ack-eliciting, this resets the idle timeout timer for
A. It now happens that the probe packet is too large for the path, and the
packet is dropped. Therefore, B's idle timeout is not reset, and A and B
will have a large disagreement about the start and end of the idle period.
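
To make the disagreement concrete, here is a minimal sketch of this
scenario in Python. The Endpoint class, the 30 second idle timeout and the
probe time are assumptions made up for illustration, and network delay is
ignored. The timer rule is modeled on how I read the transport draft:
restart on receiving a packet, and on sending the first ack-eliciting
packet since the last receive.

```python
from dataclasses import dataclass

IDLE_TIMEOUT = 30.0  # seconds; illustrative value only


@dataclass
class Endpoint:
    """Hypothetical model of one endpoint's idle timer."""
    name: str
    idle_deadline: float = IDLE_TIMEOUT  # connection went idle at t=0
    ack_eliciting_sent_since_receive: bool = False

    def on_send_ack_eliciting(self, now: float) -> None:
        # Only the first ack-eliciting packet sent since the last
        # received packet restarts the idle timer.
        if not self.ack_eliciting_sent_since_receive:
            self.idle_deadline = now + IDLE_TIMEOUT
            self.ack_eliciting_sent_since_receive = True

    def on_receive(self, now: float) -> None:
        # Any successfully processed packet restarts the idle timer.
        self.idle_deadline = now + IDLE_TIMEOUT
        self.ack_eliciting_sent_since_receive = False


def probe_lost_scenario(probe_time: float):
    """A sends an MTU probe at probe_time; the path drops it."""
    a, b = Endpoint("A"), Endpoint("B")
    a.on_send_ack_eliciting(probe_time)
    # The probe is too large for the path, so b.on_receive() never runs.
    return a.idle_deadline, b.idle_deadline


if __name__ == "__main__":
    a_deadline, b_deadline = probe_lost_scenario(probe_time=25.0)
    print(f"A times out at t={a_deadline}, B at t={b_deadline}")
```

With the probe sent at t=25, A's deadline moves to t=55 while B's stays at
t=30, so the two endpoints disagree by almost a full idle period.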

The root cause of this disagreement is that MTU probe packets are treated
differently than other ack-eliciting packets from a loss recovery
standpoint: they are not retransmitted, but their loss is interpreted as a
signal that the path doesn't support that particular MTU.

Note that A not resetting the idle timer when sending a probe packet
doesn't solve the problem. There's another case where this fails: Assume
this time the probe packet is received by B, but the ACK for that packet is
lost. Now B will have reset its idle timer when it received the probe
packet, but A will not, leading again to a large disagreement about the
start and end of the idle period, this time in the other direction.
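
A small sketch of why skipping the reset on send only flips the problem
around. The function name, its parameters and the 30 second timeout are
hypothetical, chosen only for illustration; network delay is ignored and
both endpoints are assumed to have gone idle at t=0.

```python
IDLE_TIMEOUT = 30.0  # seconds; illustrative value only


def idle_deadlines(probe_time, reset_on_probe_send, probe_delivered,
                   ack_delivered):
    """Return (A's deadline, B's deadline) after one MTU probe from A."""
    a_deadline = b_deadline = IDLE_TIMEOUT  # both idle since t=0

    if reset_on_probe_send:
        # A treats the probe like any other ack-eliciting packet.
        a_deadline = probe_time + IDLE_TIMEOUT

    if probe_delivered:
        # B restarts its idle timer on receiving the probe.
        b_deadline = probe_time + IDLE_TIMEOUT
        if ack_delivered:
            # A restarts its idle timer on receiving B's ACK.
            a_deadline = probe_time + IDLE_TIMEOUT

    return a_deadline, b_deadline


if __name__ == "__main__":
    # A does not reset on send, probe arrives, ACK is lost.
    print(idle_deadlines(25.0, False, True, False))
    # A resets on send, probe is dropped by the path.
    print(idle_deadlines(25.0, True, False, False))
```

In the first call B ends up with the later deadline, in the second call A
does. Only when both the probe and its ACK arrive do the deadlines agree.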

I can see multiple solutions to this:

   1. Don't send MTU probe packets on a timer. Only send them when
   application data is sent. This avoids sending packets during periods of
   quiescence (which might be good, since it doesn't wake up the network
   interface), but it also means that we're not using those periods of
   quiescence, when plenty of congestion window is available.
   2. Retransmit the PING frame from the probe packet in a normal size
   packet, until it is acknowledged. This is sad, since it will cause an
   additional packet to be sent every time a probe packet is lost.
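
Option 2 could look roughly like the sketch below. This is not taken from
any real implementation: SentPacket, ProbeLossHandler and their fields are
hypothetical names, and the 1200 byte starting MTU is just an example. The
idea is that a lost probe still counts as an MTU signal, but its PING is
queued for a normal-sized packet that goes through regular loss recovery.

```python
from dataclasses import dataclass, field


@dataclass
class SentPacket:
    packet_number: int
    size: int
    is_mtu_probe: bool = False


@dataclass
class ProbeLossHandler:
    """Hypothetical loss handling for MTU probes (option 2)."""
    current_mtu: int = 1200
    failed_probe_sizes: list[int] = field(default_factory=list)
    pending_frames: list[str] = field(default_factory=list)

    def on_packet_lost(self, pkt: SentPacket) -> None:
        if pkt.is_mtu_probe:
            # Loss of a probe is read as "the path can't carry this size".
            self.failed_probe_sizes.append(pkt.size)
            # Don't resend the padding, only the PING, so that the peer
            # still sees an ack-eliciting packet in a normal-sized packet.
            self.pending_frames.append("PING")
        # Non-probe packets would go through normal loss recovery,
        # which is omitted here.

    def on_packet_acked(self, pkt: SentPacket) -> None:
        if pkt.is_mtu_probe:
            # The path carried the probe, so the larger size is usable.
            self.current_mtu = max(self.current_mtu, pkt.size)


if __name__ == "__main__":
    handler = ProbeLossHandler()
    handler.on_packet_lost(SentPacket(packet_number=7, size=1400,
                                      is_mtu_probe=True))
    print(handler.pending_frames, handler.failed_probe_sizes)
```

The cost mentioned above is visible here: every lost probe puts one extra
PING into the send queue.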

Any thoughts on how to best deal with this?

Cheers,
Marten
