tcp_bpf_strp_read_sock() rolls tp->copied_seq back by the SK_PASS bytes
parked on the psock ingress_msg queue; tcp_bpf_recvmsg_parser() repays it
as those bytes are delivered. When the socket leaves the sockmap they are
purged undelivered and nothing repays the rollback, so copied_seq is left
behind sk_receive_queue and the native tcp_recvmsg() warns:

  TCP recvmsg seq # bug: copied 66913561, seq 6691356A, rcvnxt 66913572, fl 40
  WARNING: net/ipv4/tcp.c:2733 at tcp_recvmsg_locked+0x2d0/0x1270
   tcp_recvmsg+0xba/0x340
   inet_recvmsg+0x7a/0x370
   sock_recvmsg+0xef/0x110
   __sys_recvfrom+0x132/0x1e0

Settle copied_seq to the parser's consume point as the socket leaves the
sockmap so it cannot trail the receive queue.
Fixes: 36b62df5683c ("bpf: Fix wrong copied_seq calculation")
Signed-off-by: Sechang Lim <[email protected]>
---
net/ipv4/tcp_bpf.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index cc0bd73f36b6..918f8da02c39 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -715,6 +715,15 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
 	}
 
 	if (restore) {
+#if IS_ENABLED(CONFIG_BPF_STREAM_PARSER)
+		/*
+		 * Settle the copied_seq rollback for the now-discarded
+		 * ingress_msg data so it cannot trail the receive queue
+		 */
+		if (sk_psock_test_state(psock, SK_PSOCK_RX_STRP_ENABLED) &&
+		    before(tcp_sk(sk)->copied_seq, psock->copied_seq))
+			WRITE_ONCE(tcp_sk(sk)->copied_seq, psock->copied_seq);
+#endif
 		if (inet_csk_has_ulp(sk)) {
 			/* TLS does not have an unhash proto in SW cases,
 			 * but we need to ensure we stop using the sock_map
--
2.43.0