Add a tracepoint to monitor TCP congestion window adjustments through the
tcp_cwnd_reduction() function. This tracepoint helps track:
  - TCP window size fluctuations
  - Active socket behavior
  - Congestion window reduction events

Meta has been using BPF programs to monitor this function for years. By
adding a proper tracepoint, we provide a stable API for all users who need
to monitor TCP congestion window behavior.

The tracepoint captures:
  - Socket source and destination IPs
  - Number of newly acknowledged packets
  - Number of newly lost packets
  - Packets in flight

Here is an example of a tracepoint when viewed in the trace buffer:

tcp_cwnd_reduction: src=[2401:db00:3021:10e1:face:0:32a:0]:45904 dest=[2401:db00:3021:1fb:face:0:23:0]:5201 newly_lost=0 newly_acked_sacked=27 in_flight=34

CC: Yonghong Song <yonghong.s...@linux.dev>
CC: Song Liu <s...@kernel.org>
CC: Martin KaFai Lau <martin....@kernel.org>
Signed-off-by: Breno Leitao <lei...@debian.org>
---
 include/trace/events/tcp.h | 34 ++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_input.c       |  2 ++
 2 files changed, 36 insertions(+)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index a27c4b619dffd7dcc72fffa71bf0fd5e34fe6681..b3a636658b39721cca843c0000eaa573cf4d09d5 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -259,6 +259,40 @@ TRACE_EVENT(tcp_retransmit_synack,
 		  __entry->saddr_v6, __entry->daddr_v6)
 );
 
+TRACE_EVENT(tcp_cwnd_reduction,
+
+	TP_PROTO(const struct sock *sk, const int newly_acked_sacked,
+		 const int newly_lost, const int flag),
+
+	TP_ARGS(sk, newly_acked_sacked, newly_lost, flag),
+
+	TP_STRUCT__entry(
+		__array(__u8, saddr, sizeof(struct sockaddr_in6))
+		__array(__u8, daddr, sizeof(struct sockaddr_in6))
+		__field(int, in_flight)
+
+		__field(int, newly_acked_sacked)
+		__field(int, newly_lost)
+	),
+
+	TP_fast_assign(
+		const struct inet_sock *inet = inet_sk(sk);
+		const struct tcp_sock *tp = tcp_sk(sk);
+
+		memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+		memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+
+		TP_STORE_ADDR_PORTS(__entry, inet, sk);
+		__entry->in_flight = tcp_packets_in_flight(tp);
+		__entry->newly_lost = newly_lost;
+		__entry->newly_acked_sacked = newly_acked_sacked;
+	),
+
+	TP_printk("src=%pISpc dest=%pISpc newly_lost=%d newly_acked_sacked=%d in_flight=%d",
+		  __entry->saddr, __entry->daddr, __entry->newly_lost,
+		  __entry->newly_acked_sacked, __entry->in_flight)
+);
+
 #include <trace/events/net_probe_common.h>
 
 TRACE_EVENT(tcp_probe,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4811727b8a02258ec6fa1fd129beecf7cbb0f90e..fc88c511e81bc12ec57e8dc3e9185a920d1bd079 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
 	if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
 		return;
 
+	trace_tcp_cwnd_reduction(sk, newly_acked_sacked, newly_lost, flag);
+
 	tp->prr_delivered += newly_acked_sacked;
 	if (delta < 0) {
 		u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +

---
base-commit: 96e12defe5a8fa3f3a10e3ef1d20fee503245a10
change-id: 20250120-cwnd_tracepoint-2e11c996a9cb

Best regards,
-- 
Breno Leitao <lei...@debian.org>
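
For anyone who wants to consume the event from user space, here is a
minimal C sketch that pulls the three counters out of one trace buffer
line. It assumes the line layout shown in the example output above. The
names cwnd_event, read_int_field() and parse_cwnd_event() are hypothetical
helpers for illustration only and are not part of this patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the integer fields printed by TP_printk(). */
struct cwnd_event {
	int newly_lost;
	int newly_acked_sacked;
	int in_flight;
};

/*
 * Find " key=" in the line and parse the decimal integer that follows.
 * The key is passed with its leading space and trailing '=' so that a
 * field name cannot match inside another token.
 * Returns 0 on success, -1 if the key is missing or not followed by an int.
 */
static int read_int_field(const char *line, const char *key, int *out)
{
	const char *p = strstr(line, key);

	if (!p)
		return -1;
	return sscanf(p + strlen(key), "%d", out) == 1 ? 0 : -1;
}

/*
 * Parse one line of trace output for the tcp_cwnd_reduction event.
 * Returns 0 on success, -1 if the line is a different event or malformed.
 */
static int parse_cwnd_event(const char *line, struct cwnd_event *ev)
{
	if (!strstr(line, "tcp_cwnd_reduction:"))
		return -1;
	if (read_int_field(line, " newly_lost=", &ev->newly_lost))
		return -1;
	if (read_int_field(line, " newly_acked_sacked=",
			   &ev->newly_acked_sacked))
		return -1;
	if (read_int_field(line, " in_flight=", &ev->in_flight))
		return -1;
	return 0;
}
```

Feeding the example line from the commit message to parse_cwnd_event()
should fill the struct with the newly_lost, newly_acked_sacked and
in_flight values printed there; lines from other events are rejected.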