On Thu, Oct 16, 2025 at 10:05:13PM -0700, Christian Huitema wrote:
> By the way, there is another issue beyond just "the receiver cannot cope".
> Packets for a stream may be received out of order. The stack can only
> deliver them to the application in order. Suppose that the stack has
> increased "max stream data" to allow a thousand packets on the stream. If
> the first packet is lost, the stack may have to buffer 999 packets until it
> receives the correction. So there is a direct relation between "max stream
> data" times number of streams and the max memory that the stack will need.
> On a small device, one has to be cautious. And if you have a memory budget,
> then it makes sense to just enforce it using "max data".

In H2 (which doesn't have the out-of-order issue), I have been working on
a complex algorithm to try to determine how much to advertise so as to keep
the minimum amount of data between the demux and the application layer. That
was particularly complex (measuring ordering of queue/dequeue) and looked a
bit like congestion control algorithms. Then I figured that everything could
easily fall apart with bursty parallel streams, and that it was actually
much easier to simply allocate a budget for the whole connection and
assign a share of it to the streams that are present. I don't remember the
exact details but basically there was a large percentage of the budget that
was equally shared between streams that were expected to receive data, and
a small part for future streams. This allows new streams to work (albeit
possibly not fast) when others are already receiving data, but as soon as
a new stream also needs to receive data, the other ones see their share
shrink, and won't refill their rx window until what was previously
advertised has been consumed down to their new share.

While I found that pretty naive, it happened to work surprisingly well,
allowing us to multiply single-POST performance by 30 or so, and to
get rid of HOL blocking when merging multiple downloading client
connections to the same server over H2. We eventually managed to port it
to QUIC as well (with
some adaptations that I don't remember). I only remember that it was
harder with QUIC since you cannot benefit from the TCP stack's ability
to compensate for your excessive advertisements, but overall it's ok.

The key here is to never allocate everything to a given stream so that
new ones still have something to start to work with, and let the
distribution rebalance by itself as transfers progress.

In the end I simply gave up on the initial design, which would probably
only have let us save extra memory in optimal cases; as you said above,
with out-of-order delivery you don't gain anything anymore if you have to
buffer the last 999/1000ths of the content waiting for the first packet
to arrive before delivering it.

Willy
