tcp_bpf_strp_read_sock() rolls tp->copied_seq back by the SK_PASS bytes
parked on the psock ingress_msg queue; tcp_bpf_recvmsg_parser() repays it
as those bytes are delivered. When the socket leaves the sockmap, bytes
still queued are purged undelivered and nothing repays the rollback, so
copied_seq is left behind the head of sk_receive_queue and the native
tcp_recvmsg() warns:

  TCP recvmsg seq # bug: copied 66913561, seq 6691356A, rcvnxt 66913572, fl 40
  WARNING: net/ipv4/tcp.c:2733 at tcp_recvmsg_locked+0x2d0/0x1270
   tcp_recvmsg+0xba/0x340
   inet_recvmsg+0x7a/0x370
   sock_recvmsg+0xef/0x110
   __sys_recvfrom+0x132/0x1e0

Settle copied_seq to the parser's consume point as the socket leaves the
sockmap so it cannot trail the receive queue.

Fixes: 36b62df5683c ("bpf: Fix wrong copied_seq calculation")
Signed-off-by: Sechang Lim <[email protected]>
---
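For reviewers, a minimal user-space sketch of the accounting described
above. It is a simplified model under stated assumptions, not kernel code:
struct model_sock and the model_*() helpers are hypothetical names, and
seq_before() only mirrors the wrap-safe comparison that the kernel's
before() performs on 32-bit sequence numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is before b" on 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical stand-ins for tcp_sk(sk)->copied_seq and psock->copied_seq. */
struct model_sock {
	uint32_t tcp_copied_seq;   /* what the native recvmsg path has consumed */
	uint32_t psock_copied_seq; /* the stream parser's consume point */
};

/*
 * The parser consumes len SK_PASS bytes and parks them on ingress_msg.
 * Only the parser's consume point advances, so tcp_copied_seq now
 * trails it by len (the "rollback").
 */
static void model_parse_pass(struct model_sock *s, uint32_t len)
{
	s->psock_copied_seq += len;
}

/* The receive path delivers len parked bytes, repaying the rollback. */
static void model_deliver(struct model_sock *s, uint32_t len)
{
	s->tcp_copied_seq += len;
}

/*
 * The socket leaves the sockmap: parked bytes are purged, so settle
 * tcp_copied_seq to the parser's consume point if it still trails.
 */
static void model_restore(struct model_sock *s)
{
	if (seq_before(s->tcp_copied_seq, s->psock_copied_seq))
		s->tcp_copied_seq = s->psock_copied_seq;
}
```

In this model, calling model_parse_pass() and then model_restore() with no
model_deliver() in between is the purge case the patch handles. In the
warning quoted above, copied trails seq by 0x6691356A - 0x66913561 = 9
bytes, which is the kind of gap the settle step closes.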
 net/ipv4/tcp_bpf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index cc0bd73f36b6..918f8da02c39 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -715,6 +715,15 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
        }
 
        if (restore) {
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
+               /*
+                * Settle the copied_seq rollback for the now-discarded
+                * ingress_msg data so it cannot trail the receive queue
+                */
+               if (sk_psock_test_state(psock, SK_PSOCK_RX_STRP_ENABLED) &&
+                   before(tcp_sk(sk)->copied_seq, psock->copied_seq))
+                       WRITE_ONCE(tcp_sk(sk)->copied_seq, psock->copied_seq);
+#endif
                if (inet_csk_has_ulp(sk)) {
                        /* TLS does not have an unhash proto in SW cases,
                         * but we need to ensure we stop using the sock_map
-- 
2.43.0

