Ian, MT, thank you for sharing your thoughts. I have implemented QMux-00, and based on that experience I agree that implementing QMux without records is not hard.
That said, I still think that using records is the better direction. The main reasons are:

* it avoids requiring the decoder of every frame type to support trial or incremental decoding to handle truncated input;
* the overhead is identical when sending large data; and
* it naturally reduces the risk of blocking caused by frames crossing TLS record boundaries.

On the first point, with direct QUIC-v1 framing over a byte stream, the decoder of every frame type needs to handle receipt of incomplete frames gracefully, either by implementing trial decoding or by implementing incremental decoding. In the case of trial decoding, the receive core needs to retain the partial frame image and re-invoke the decoder as more bytes arrive. In the implementation strategy I presented at IETF 124, this meant making frame decoders return `PARTIAL` on truncated input, repeatedly invoking them until `PARTIAL` is returned, and retaining the remaining bytes until more bytes are received [1]. In other words, the complexity is not just in the receive loop; it also affects the contract of the frame decoders themselves. With records, that complexity can stay at the record layer, while frame encoding and decoding remain identical to QUIC v1.

On the second point, for large data, the two approaches are equally wire-efficient. Martin's preferred design prohibits STREAM frames without a length field, so in that case both approaches carry a length anyway; the difference is just where the length is carried. As already noted in a comment on issue #21 [2], for large transfers the alternative is no more space-efficient.

On the third point, at first glance, one might think that using QMux records increases buffering, because a receiver would wait until a complete QMux record is received before processing it. I do not think that is the right comparison, however.
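To make the first point concrete, here is a minimal sketch (in Python, purely illustrative; the names `PARTIAL`, `decode_frame`, and `Receiver` are mine, not from any draft, and the frame layout is simplified to varint type, varint length, payload) of the trial-decoding strategy described above, where every decoder must signal truncated input and the receive core retains the unconsumed tail:

```python
PARTIAL = object()  # sentinel: more bytes are needed before this frame can be decoded

def decode_varint(buf, pos):
    """QUIC-v1 variable-length integer; returns (value, new_pos) or PARTIAL."""
    if pos >= len(buf):
        return PARTIAL
    prefix = buf[pos] >> 6          # top two bits encode the encoded length
    length = 1 << prefix            # 1, 2, 4, or 8 bytes
    if pos + length > len(buf):
        return PARTIAL
    value = buf[pos] & 0x3F
    for i in range(1, length):
        value = (value << 8) | buf[pos + i]
    return value, pos + length

def decode_frame(buf):
    """Decode one length-delimited frame: varint type, varint length, payload.
    Returns ((type, payload), bytes_consumed), or PARTIAL on truncated input."""
    r = decode_varint(buf, 0)
    if r is PARTIAL:
        return PARTIAL
    ftype, pos = r
    r = decode_varint(buf, pos)
    if r is PARTIAL:
        return PARTIAL
    flen, pos = r
    if pos + flen > len(buf):
        return PARTIAL               # payload not fully available yet
    return (ftype, bytes(buf[pos:pos + flen])), pos + flen

class Receiver:
    """Receive core: re-invokes the decoder as bytes arrive, keeping the tail."""
    def __init__(self):
        self.pending = bytearray()
        self.frames = []

    def on_bytes(self, data):
        self.pending += data
        while True:
            r = decode_frame(self.pending)
            if r is PARTIAL:
                break                # retain the partial frame image for later
            frame, consumed = r
            self.frames.append(frame)
            del self.pending[:consumed]
```

Note that the `PARTIAL`-checking obligation appears in every decoder, not just the outer loop, which is the contract change the text refers to; a record layer would instead hand complete records to unmodified QUIC-v1 decoders.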
Because TLS would be used as the underlying transport, and because the default maximum size of a QMux record (16382 bytes plus the length prefix) naturally matches that of a TLS record (16384 bytes), senders can align QMux and TLS record boundaries in many cases. That in turn means that receivers are more likely to observe complete QMux records at natural processing boundaries. Even if a receiver does not implement incremental processing, such buffering becomes a rarer case in the frame-decoding path, and the risk that a partially available STREAM frame remains blocked until the next TLS record becomes fully available is reduced.

By contrast, without records, frames are more likely to cross TLS record boundaries. That means receivers either need to implement incremental decoding of STREAM frames, **and also add plumbing to pass the partially received payload of the STREAM frame to the application**, or accept additional latency while waiting for the trailing TLS record to become fully available.

Put differently, I think QMux records provide a structure that is easier to implement efficiently across QMux stacks.

[1] https://datatracker.ietf.org/meeting/124/materials/slides-124-quic-qmux-00 p.16
[2] https://github.com/quicwg/qmux/issues/21#issuecomment-4107952458

On Mon, Mar 23, 2026 at 8:52 Martin Thomson <[email protected]> wrote:

> I also looked. Like Ian, there were a few issues that came out of the
> review I did.
>
> The question of the two-layer framing is one that I've thought about.
> I've concluded that I strongly prefer the design where STREAM frames are
> always length-delimited. It's a miniscule amount more efficient. Either
> version includes a length, but the version with the length on one more
> STREAM frame will have a smaller value. And any time that there is no
> unterminated STREAM frame, the length is pure waste.
>
> It is true that the engineering cost of managing stream reads is
> significant. I can attest to that.
> Variable-length integers are pretty
> bad for this sort of processing, but many stacks already need the code for
> their HTTP/3 implementation, which has exactly this problem already.
>
> It would have been nice to put the code to handle undelivered-but-promised
> data in a single place, but that is only possible if you impose a
> performance penalty. A stack could use another framing layer to shield
> frame processing from having to block/pause when data is not yet
> available. But that leads to delays in processing. Any framing layer that
> bundles multiple frames would not be processed, even if the data for
> leading frames is present.
>
> That increases the costs of a reliable QMux implementation, which
> undercuts some of the promise that was made, but my view is that frame
> processing is such a small part of any implementation that it doesn't
> matter much.
>
> ~Martin
>
> On Sat, Mar 21, 2026, at 21:25, Ian Swett wrote:
> > Thanks for the progress and I apologize for missing the session. I
> > tried to provide helpful reviews where possible.
> >
> > The two-layer encoding Issue is the only one where I felt we were going
> > in the wrong direction, but I recognize I might not fully understand
> > the issues.
> >
> > Thanks, Ian
> >
> > On Fri, Mar 20, 2026 at 4:21 AM Lucas Pardue <[email protected]> wrote:
> >> Hi folks,
> >>
> >> At the 125 session, Kazuho presented several QMux open issues with
> >> associated PRs. This email serves to summarize the outcome of the
> >> discussion and confirm the feeling in the room on the list.
> >>
> >> Since there has seemed to be strong emerging consensus on GitHub and in
> >> the room, the authors would like to promptly follow up on the outcomes.
> >> If you disagree with them, please let that be known ASAP and before
> >> 2026-03-26, ideally on the issue or PR itself.
> >>
> >> As noted in the session, we'd like to schedule a virtual interim for
> >> QMux before IETF 126, targetting an EMEA friendly timeslot.
> >> Look out for a follow up email on that topic; in the meantime you can
> >> express your interest directly to the chairs.
> >>
> >> • Two-layer encoding
> >>   • Issues: https://github.com/quicwg/qmux/issues/21 and
> >>     https://github.com/quicwg/qmux/issues/24
> >>   • PR: https://github.com/quicwg/qmux/pull/26
> >>   • Outcome: merge the PR to close the issues
> >> • Deadlock and flow control
> >>   • Issue: https://github.com/quicwg/qmux/issues/9
> >>   • PR: https://github.com/quicwg/qmux/pull/27
> >>   • Outcome: merge the PR to close the issue
> >> • TLS Profile - negotiating application protocol when using TLS
> >>   • Issues: https://github.com/quicwg/qmux/issues/12 and
> >>     https://github.com/quicwg/qmux/issues/25
> >>   • PR: https://github.com/quicwg/qmux/pull/33
> >>   • Outcome: merge the PR to close the issue
> >> • TLS Profile - QMux transport params (TPs) in TLS handshake
> >>   • Issue: https://github.com/quicwg/qmux/issues/18
> >>   • PR: https://github.com/quicwg/qmux/pull/28
> >>   • Outcome: merge the PR and close the issue (the PR adds improved
> >>     text, but we will keep TPs in QMux frames)
> >> • Implicit acks & ping
> >>   • Issue: https://github.com/quicwg/qmux/issues/22
> >>   • PR: https://github.com/quicwg/qmux/pull/23
> >>   • Outcome: merge the PR to close the issue
> >> • Multipath TCP
> >>   • Issue: https://github.com/quicwg/qmux/issues/5
> >>   • PR: https://github.com/quicwg/qmux/pull/29
> >>   • Outcome: do not merge the PR, close with no action
> >>
> >> Cheers,
> >> Lucas & Matt
> >> QUIC WG Chairs

-- 
Kazuho Oku
