On Mon, Mar 9, 2026 at 9:03 AM Simon Baatz via B4 Relay
<[email protected]> wrote:
>
> From: Simon Baatz <[email protected]>
>
> By default, the Linux TCP implementation does not shrink the
> advertised window (RFC 7323 calls this "window retraction") with the
> following exceptions:
>
> - When an incoming segment cannot be added due to the receive buffer
>   running out of memory. Since commit 8c670bdfa58e ("tcp: correct
>   handling of extreme memory squeeze") a zero window will be
>   advertised in this case. It turns out that reaching the required
>   memory pressure is easy when window scaling is in use. In the
>   simplest case, sending a sufficient number of segments smaller than
>   the scale factor to a receiver that does not read data is enough.
>
> - Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by
>   allowing the tcp window to shrink") addressed the "eating memory"
>   problem by introducing a sysctl knob that allows shrinking the
>   window before running out of memory.
>
> However, RFC 7323 does not only state that shrinking the window is
> necessary in some cases, it also formulates requirements for TCP
> implementations when doing so (Section 2.4).
>
> This commit addresses the receiver-side requirements: After retracting
> the window, the peer may have a snd_nxt that lies within a previously
> advertised window but is now beyond the retracted window. This means
> that all incoming segments (including pure ACKs) will be rejected
> until the application happens to read enough data to let the peer's
> snd_nxt be in window again (which may be never).
>
> To comply with RFC 7323, the receiver MUST honor any segment that
> would have been in window for any ACK sent by the receiver and, when
> window scaling is in effect, SHOULD track the maximum window sequence
> number it has advertised. This patch tracks that maximum window
> sequence number rcv_mwnd_seq throughout the connection and uses it in
> tcp_sequence() when deciding whether a segment is acceptable.
>
> rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in
> tcp_select_window(). If we count tcp_sequence() as fast path, it is
> read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's
> cacheline group.
>
> The logic for handling received data in tcp_data_queue() is already
> sufficient and does not need to be updated.
>
> Signed-off-by: Simon Baatz <[email protected]>
...

> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index f0ebcc7e287173be6198fd100130e7ba1a1dbf03..c86910d147f2394bf414d7691d8f90ed41c1b0e3 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -293,6 +293,7 @@ static u16 tcp_select_window(struct sock *sk)
>                 tp->pred_flags = 0;
>                 tp->rcv_wnd = 0;
>                 tp->rcv_wup = tp->rcv_nxt;
> +               tcp_update_max_rcv_wnd_seq(tp);

Presumably we do not need tcp_update_max_rcv_wnd_seq() here?

Otherwise patch looks good, thanks.
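
For readers following the thread, here is a minimal userspace sketch of the idea described in the commit message: remember the highest right window edge ever advertised and accept segments against that edge, not the current (possibly retracted) one. It is an illustration under assumptions, not the code from the patch: struct rx_state, advertise_window() and segment_acceptable() are hypothetical names, and the acceptance check is deliberately simplified compared to the kernel's tcp_sequence().

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe sequence comparisons, in the style of the kernel's before()/after(). */
static inline bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static inline bool seq_after(uint32_t a, uint32_t b)
{
	return seq_before(b, a);
}

/* Hypothetical, simplified receiver state; not the kernel's struct tcp_sock. */
struct rx_state {
	uint32_t rcv_nxt;      /* next sequence number expected */
	uint32_t rcv_wup;      /* rcv_nxt at the time of the last window update */
	uint32_t rcv_wnd;      /* currently advertised window */
	uint32_t rcv_mwnd_seq; /* highest right window edge ever advertised */
};

/* Advertise a new window and remember the maximum right edge seen so far. */
static void advertise_window(struct rx_state *rx, uint32_t new_wnd)
{
	rx->rcv_wup = rx->rcv_nxt;
	rx->rcv_wnd = new_wnd;
	if (seq_after(rx->rcv_wup + rx->rcv_wnd, rx->rcv_mwnd_seq))
		rx->rcv_mwnd_seq = rx->rcv_wup + rx->rcv_wnd;
}

/*
 * Simplified acceptance check: reject segments that end before rcv_wup
 * and segments that start beyond the maximum edge ever advertised.
 */
static bool segment_acceptable(const struct rx_state *rx,
			       uint32_t seq, uint32_t end_seq)
{
	if (seq_before(end_seq, rx->rcv_wup))
		return false;
	if (seq_after(seq, rx->rcv_mwnd_seq))
		return false;
	return true;
}
```

In this sketch, a zero-window advertisement has a right edge equal to rcv_nxt. Provided rcv_nxt never advances past a previously advertised edge, the stored maximum does not change in that case.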

