On Fri, 19 Sep 2025 19:56:20 -0700
Alexei Starovoitov <[email protected]> wrote:

> On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <[email protected]> wrote:
> >
> > When I use bpf_program__attach_kprobe_multi_opts to hook a BPF program that 
> > contains the bpf_get_stackid function on the arm64 architecture,
> > I find that the stack trace cannot be obtained. The trace->nr in 
> > __bpf_get_stackid is 0, and the function returns -EFAULT.
> >
> > For example:
> > diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c 
> > b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > index 9e1ca8e34913..844fa88cdc4c 100644
> > --- a/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > +++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > @@ -36,6 +36,15 @@ __u64 kretprobe_test6_result = 0;
> >  __u64 kretprobe_test7_result = 0;
> >  __u64 kretprobe_test8_result = 0;
> >
> > +typedef __u64 stack_trace_t[2];
> > +
> > +struct {
> > +       __uint(type, BPF_MAP_TYPE_STACK_TRACE);
> > +       __uint(max_entries, 1024);
> > +       __type(key, __u32);
> > +       __type(value, stack_trace_t);
> > +} stacks SEC(".maps");
> > +
> >  static void kprobe_multi_check(void *ctx, bool is_return)
> >  {
> >         if (bpf_get_current_pid_tgid() >> 32 != pid)
> > @@ -100,7 +109,9 @@ int test_kretprobe(struct pt_regs *ctx)
> >  SEC("kprobe.multi")
> >  int test_kprobe_manual(struct pt_regs *ctx)
> >  {
> > +       int id = bpf_get_stackid(ctx, &stacks, 0);
> 
> ftrace_partial_regs() supposed to work on x86 and arm64,
> but since multi-kprobe is the only user...

It should be able to unwind the stack. It saves sp, pc, lr, and fp:

        regs->sp = afregs->sp;
        regs->pc = afregs->pc;
        regs->regs[29] = afregs->fp;
        regs->regs[30] = afregs->lr;

> I suspect the arm64 implementation wasn't really tested.
> Or maybe there is some other issue.

It depends on how bpf_get_stackid() works. Some registers that
function needs may not be saved.

If it returns -EFAULT, that means get_perf_callchain() returned NULL.

struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
                   u32 max_stack, bool crosstask, bool add_mark)
{
...
        entry = get_callchain_entry(&rctx);
        if (!entry)
                return NULL;


Thus get_callchain_entry(&rctx) returns NULL. But if so, this is not
related to ftrace_partial_regs(), because get_callchain_entry() only
returns the per-cpu callchain working buffer for the current context;
it does not decode the stack.

struct perf_callchain_entry *get_callchain_entry(int *rctx)
{
        int cpu;
        struct callchain_cpus_entries *entries;

        *rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
        if (*rctx == -1)
                return NULL;

        entries = rcu_dereference(callchain_cpus_entries);
        if (!entries) {
                put_recursion_context(this_cpu_ptr(callchain_recursion), *rctx);
                return NULL;
        }

        cpu = smp_processor_id();

        return (((void *)entries->cpu_entries[cpu]) +
                (*rctx * perf_callchain_entry__sizeof()));
}

What recursion context does BPF expect here, and how does it detect it?

Thank you,

-- 
Masami Hiramatsu (Google) <[email protected]>
