On Mon, May 4, 2026 at 7:53 AM Ankit Jain <[email protected]> wrote:
>
> When an application locks SO_RCVBUF, it expects strict memory bounds and
> disables TCP window auto-tuning. However, recent TCP memory fragmentation
> optimizations still apply dynamic truesize penalties to the `scaling_ratio`
> of these locked sockets.
>
> For workloads processing small, fragmented packets (like Java's Tomcat),
> this penalty drops the scaling_ratio to 1. This shrinks the dynamically
> calculated advertised window, leading to Silly Window Syndrome (SWS)
> deadlocks and 504 Gateway Timeouts.
>
> This patch fixes the issue by bypassing the truesize penalty for sockets
> with `SOCK_RCVBUF_LOCK` set. To ensure the kernel still defends against
> memory exhaustion from large aggregate payloads (e.g., GRO), the penalty
> is still applied if `skb->len` exceeds the advertised MSS.
>
> Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
> Reported-by: Karen Badiryan <[email protected]>
> Signed-off-by: Ankit Jain <[email protected]>
> ---
>  net/ipv4/tcp_input.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index d5c9e65d9760..569299dafa88 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -240,8 +240,14 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
>                 /* Note: divides are still a bit expensive.
>                  * For the moment, only adjust scaling_ratio
>                  * when we update icsk_ack.rcv_mss.
> +                *
> +                * Protect locked SO_RCVBUF from Silly Window Syndrome
> +                * due to truesize penalties on small packets. Allow
> +                * penalty if aggregate payload (e.g., GRO) exceeds MSS.
>                  */
> -               if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
> +               if (unlikely(len != icsk->icsk_ack.rcv_mss &&
> +                            (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK) ||
> +                             skb->len > tcp_sk(sk)->advmss))) {

Testing tp->advmss is not doing what you want, I think.

A remote peer can send GRO packets with tiny segments, regardless of
tp->advmss.

If GRO is what you are looking for, why not test (skb->len > len)?

>                         u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
>                         u8 old_ratio = tcp_sk(sk)->scaling_ratio;
>
> --
> 2.53.0
>
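
To illustrate the difference between the two tests, here is a minimal
user-space sketch. It is not kernel code: struct pkt and the helper names are
hypothetical, and it assumes that `len` in tcp_measure_rcv_mss() is the
per-segment size (gso_size when the skb is aggregated, skb->len otherwise).

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the skb fields involved. */
struct pkt {
	unsigned int skb_len;	/* total payload carried by the skb */
	unsigned int gso_size;	/* per-segment size, 0 if not aggregated */
};

/* Assumed to mirror "len = gso_size ?: skb->len" in tcp_measure_rcv_mss(). */
unsigned int seg_len(const struct pkt *p)
{
	return p->gso_size ? p->gso_size : p->skb_len;
}

/* Test as posted in the patch: compare the payload to the advertised MSS. */
bool is_aggregate_advmss(const struct pkt *p, unsigned int advmss)
{
	return p->skb_len > advmss;
}

/* Suggested test: the skb is aggregated if it carries more than one segment. */
bool is_aggregate_len(const struct pkt *p)
{
	return p->skb_len > seg_len(p);
}
```

With advmss = 1460, an aggregate of ten 100-byte segments (skb_len = 1000,
gso_size = 100) is missed by the advmss test because 1000 <= 1460, but it is
caught by the (skb->len > len) test because 1000 > 100. A single 100-byte
packet is not flagged by either test.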

