Re: bpf_errno. Was: [PATCH RFC bpf-next 1/3] bpf: report probe fault to BPF stderr

Eduard Zingerman Wed, 08 Oct 2025 13:11:58 -0700

On Wed, 2025-10-08 at 19:08 +0200, Kumar Kartikeya Dwivedi wrote:
> On Wed, 8 Oct 2025 at 18:27, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> > 
> > On Wed, Oct 8, 2025 at 7:41 AM Leon Hwang <hffilw...@gmail.com> wrote:
> > > 
> > > 
> > > 
> > > On 2025/10/7 14:14, Menglong Dong wrote:
> > > > On 2025/10/2 10:03, Alexei Starovoitov wrote:
> > > > > On Fri, Sep 26, 2025 at 11:12 PM Menglong Dong 
> > > > > <menglong8.d...@gmail.com> wrote:
> > > > > > 
> > > > > > Introduce the function bpf_prog_report_probe_violation(), which is 
> > > > > > used
> > > > > > to report the memory probe fault to the user by the BPF stderr.
> > > > > > 
> > > > > > Signed-off-by: Menglong Dong <menglong.d...@linux.dev>
> > > 
> > > [...]
> > > 
> > > > > 
> > > > > Interesting idea, but the above message is not helpful.
> > > > > Users cannot decipher a fault_ip within a bpf prog.
> > > > > It's just a random number.
> > > > 
> > > > Yeah, I have noticed this too. What useful is the
> > > > bpf_stream_dump_stack(), which will print the code
> > > > line that trigger the fault.
> > > > 
> > > > > But stepping back... just faults are common in tracing.
> > > > > If we start printing them we will just fill the stream to the max,
> > > > > but users won't know that the message is there, since no one
> > > > 
> > > > You are right, we definitely can't output this message
> > > > to STDERR directly. We can add an extra flag for it, as you
> > > > said below.
> > > > 
> > > > Or, maybe we can introduce a enum stream_type, and
> > > > the users can subscribe what kind of messages they
> > > > want to receive.
> > > > 
> > > > > expects it. arena and lock errors are rare and arena faults
> > > > > were specifically requested by folks who develop progs that use arena.
> > > > > This one is different. These faults have been around for a long time
> > > > > and I don't recall people asking for more verbosity.
> > > > > We can add them with an extra flag specified at prog load time,
> > > > > but even then. Doesn't feel that useful.
> > > > 
> > > > Generally speaking, users can do invalid checking before
> > > > they do the memory reading, such as NULL checking. And
> > > > the pointer in function arguments that we hook is initialized
> > > > in most case. So the fault is someting that can be prevented.
> > > > 
> > > > I have a BPF tools which is writed for 4.X kernel and kprobe
> > > > based BPF is used. Now I'm planing to migrate it to 6.X kernel
> > > > and replace bpf_probe_read_kernel() with bpf_core_cast() to
> > > > obtain better performance. Then I find that I can't check if the
> > > > memory reading is success, which can lead to potential risk.
> > > > So my tool will be happy to get such fault event :)
> > > > 
> > > > Leon suggested to add a global errno for each BPF programs,
> > > > and I haven't dig deeply on this idea yet.
> > > > 
> > > 
> > > Yeah, as we discussed, a global errno would be a much more lightweight
> > > approach for handling such faults.
> > > 
> > > The idea would look like this:
> > > 
> > > DEFINE_PER_CPU(int, bpf_errno);
> > > 
> > > __bpf_kfunc void bpf_errno_clear(void);
> > > __bpf_kfunc void bpf_errno_set(int errno);
> > > __bpf_kfunc int bpf_errno_get(void);
> > > 
> > > When a fault occurs, the kernel can simply call
> > > 'bpf_errno_set(-EFAULT);'.
> > > 
> > > If users want to detect whether a fault happened, they can do:
> > > 
> > > bpf_errno_clear();
> > > header = READ_ONCE(skb->network_header);
> > > if (header == 0 && bpf_errno_get() == -EFAULT)
> > >         /* handle fault */;
> > > 
> > > This way, users can identify faults immediately and handle them 
> > > gracefully.
> > > 
> > > Furthermore, these kfuncs can be inlined by the verifier, so there would
> > > be no runtime function call overhead.
> > 
> > Interesting idea, but errno as-is doesn't quite fit,
> > since we only have 2 (or 3 ?) cases without explicit error return:
> > probe_read_kernel above, arena read, arena write.
> > I guess we can add may_goto to this set as well.
> > But in all these cases we'll struggle to find an appropriate errno code,
> > so it probably should be a custom enum and not called "errno".
> 
> Yeah, agreed that this would be useful, particularly in this case. I'm
> wondering how we'll end up implementing this.
> Sounds like it needs to be tied to the program's invocation, so it
> cannot be per-cpu per-program, since they nest. Most likely should be
> backed by run_ctx, but that is unavailable in all program types. Next
> best thing that comes to mind is reserving some space in the stack
> frame at a known offset in each subprog that invokes this helper, and
> use that to signal (by finding the program's bp and writing to the
> stack), the downside being it likely becomes yet-another arch-specific
> thing. Any other better ideas?


Another option is to lower probe_read to a BPF_PROBE_MEM instruction
and generate a special kind of exception handler, that would set r0 to
-EFAULT. (We don't do this already, right? Don't see anything like that
in verifier.c or x86/../bpf_jit_comp.c).

This would avoid any user-visible changes and address performance
concern. Not so convenient for a chain of dereferences a->b->c->d,
though.

Re: bpf_errno. Was: [PATCH RFC bpf-next 1/3] bpf: report probe fault to BPF stderr

Reply via email to