On Wed, 2025-10-08 at 19:08 +0200, Kumar Kartikeya Dwivedi wrote: > On Wed, 8 Oct 2025 at 18:27, Alexei Starovoitov > <alexei.starovoi...@gmail.com> wrote: > > > > On Wed, Oct 8, 2025 at 7:41 AM Leon Hwang <hffilw...@gmail.com> wrote: > > > > > > > > > > > > On 2025/10/7 14:14, Menglong Dong wrote: > > > > On 2025/10/2 10:03, Alexei Starovoitov wrote: > > > > > On Fri, Sep 26, 2025 at 11:12 PM Menglong Dong > > > > > <menglong8.d...@gmail.com> wrote: > > > > > > > > > > > > Introduce the function bpf_prog_report_probe_violation(), which is > > > > > > used > > > > > > to report the memory probe fault to the user by the BPF stderr. > > > > > > > > > > > > Signed-off-by: Menglong Dong <menglong.d...@linux.dev> > > > > > > [...] > > > > > > > > > > > > > Interesting idea, but the above message is not helpful. > > > > > Users cannot decipher a fault_ip within a bpf prog. > > > > > It's just a random number. > > > > > > > > Yeah, I have noticed this too. What useful is the > > > > bpf_stream_dump_stack(), which will print the code > > > > line that trigger the fault. > > > > > > > > > But stepping back... just faults are common in tracing. > > > > > If we start printing them we will just fill the stream to the max, > > > > > but users won't know that the message is there, since no one > > > > > > > > You are right, we definitely can't output this message > > > > to STDERR directly. We can add an extra flag for it, as you > > > > said below. > > > > > > > > Or, maybe we can introduce a enum stream_type, and > > > > the users can subscribe what kind of messages they > > > > want to receive. > > > > > > > > > expects it. arena and lock errors are rare and arena faults > > > > > were specifically requested by folks who develop progs that use arena. > > > > > This one is different. These faults have been around for a long time > > > > > and I don't recall people asking for more verbosity. > > > > > We can add them with an extra flag specified at prog load time, > > > > > but even then. Doesn't feel that useful. > > > > > > > > Generally speaking, users can do invalid checking before > > > > they do the memory reading, such as NULL checking. And > > > > the pointer in function arguments that we hook is initialized > > > > in most case. So the fault is someting that can be prevented. > > > > > > > > I have a BPF tools which is writed for 4.X kernel and kprobe > > > > based BPF is used. Now I'm planing to migrate it to 6.X kernel > > > > and replace bpf_probe_read_kernel() with bpf_core_cast() to > > > > obtain better performance. Then I find that I can't check if the > > > > memory reading is success, which can lead to potential risk. > > > > So my tool will be happy to get such fault event :) > > > > > > > > Leon suggested to add a global errno for each BPF programs, > > > > and I haven't dig deeply on this idea yet. > > > > > > > > > > Yeah, as we discussed, a global errno would be a much more lightweight > > > approach for handling such faults. > > > > > > The idea would look like this: > > > > > > DEFINE_PER_CPU(int, bpf_errno); > > > > > > __bpf_kfunc void bpf_errno_clear(void); > > > __bpf_kfunc void bpf_errno_set(int errno); > > > __bpf_kfunc int bpf_errno_get(void); > > > > > > When a fault occurs, the kernel can simply call > > > 'bpf_errno_set(-EFAULT);'. > > > > > > If users want to detect whether a fault happened, they can do: > > > > > > bpf_errno_clear(); > > > header = READ_ONCE(skb->network_header); > > > if (header == 0 && bpf_errno_get() == -EFAULT) > > > /* handle fault */; > > > > > > This way, users can identify faults immediately and handle them > > > gracefully. > > > > > > Furthermore, these kfuncs can be inlined by the verifier, so there would > > > be no runtime function call overhead. > > > > Interesting idea, but errno as-is doesn't quite fit, > > since we only have 2 (or 3 ?) cases without explicit error return: > > probe_read_kernel above, arena read, arena write. > > I guess we can add may_goto to this set as well. > > But in all these cases we'll struggle to find an appropriate errno code, > > so it probably should be a custom enum and not called "errno". > > Yeah, agreed that this would be useful, particularly in this case. I'm > wondering how we'll end up implementing this. > Sounds like it needs to be tied to the program's invocation, so it > cannot be per-cpu per-program, since they nest. Most likely should be > backed by run_ctx, but that is unavailable in all program types. Next > best thing that comes to mind is reserving some space in the stack > frame at a known offset in each subprog that invokes this helper, and > use that to signal (by finding the program's bp and writing to the > stack), the downside being it likely becomes yet-another arch-specific > thing. Any other better ideas?
Another option is to lower probe_read to a BPF_PROBE_MEM instruction and generate a special kind of exception handler, that would set r0 to -EFAULT. (We don't do this already, right? Don't see anything like that in verifier.c or x86/../bpf_jit_comp.c). This would avoid any user-visible changes and address performance concern. Not so convenient for a chain of dereferences a->b->c->d, though.