On 8/23/2025 5:23 AM, Andrii Nakryiko wrote:
On Sun, Aug 3, 2025 at 7:27 PM Xu Kuohai <xukuo...@huaweicloud.com> wrote:
From: Xu Kuohai <xukuo...@huawei.com>
In overwrite mode, the producer does not wait for the consumer, so the
consumer is responsible for handling conflicts. An optimistic method
is used to resolve the conflicts: the consumer first reads consumer_pos,
producer_pos and overwrite_pos, then calculates a read window and copies
data in the window from the ring buffer. After copying, it checks the
positions to decide if the data in the copy window have been overwritten
by be the producer. If so, it discards the copy and tries again. Once
success, the consumer processes the events in the copy.
I don't mind adding BPF_F_OVERWRITE mode to BPF ringbuf (it seems like
this will work fine) itself, but I don't think retrofitting it to this
callback-based libbpf-side API is a good fit.
For one, I don't like that extra memory copy and potentially a huge
allocation that you do. I think for some use cases user logic might be
totally fine with using ringbuf memory directly, even if it can be
overwritten at any point. So it would be unfair to penalize
sophisticated users for such cases. Even if not, I'd say allocating
just enough to hold the record would be a better approach.
Another downside is that the user doesn't really have much visibility
right now into whether any samples were overwritten.
I've been mulling over the idea of adding an iterator-like API for BPF
ringbuf on the libbpf side for a while now. I'm still debating some
API nuances with Eduard, but I think we'll end up adding something
pretty soon. Iterator-based API seems like a much better fit for
overwritable mode here.
But all that is not really overwrite-specific and is broader, so I
think we can proceed with finalizing kernel-side details of overwrite
and not block on libbpf side of things for now, though.
Sounds great! Looking forward to the new iterator-based API. Clearly, no
memory allocation or a samll allocation is better than a huge allocation.
I'll focus on the kernel side before the new API is introduced.
Signed-off-by: Xu Kuohai <xukuo...@huawei.com>
---
tools/lib/bpf/ringbuf.c | 103 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 102 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 9702b70da444..9c072af675ff 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -27,10 +27,13 @@ struct ring {
ring_buffer_sample_fn sample_cb;
void *ctx;
void *data;
+ void *read_buffer;
unsigned long *consumer_pos;
unsigned long *producer_pos;
+ unsigned long *overwrite_pos;
unsigned long mask;
int map_fd;
+ bool overwrite_mode;
};
[...]