On Fri, 19 Sep 2025 19:56:20 -0700 Alexei Starovoitov <[email protected]> wrote:
> On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <[email protected]> wrote:
> >
> > When I use bpf_program__attach_kprobe_multi_opts to hook a BPF program that
> > contains the bpf_get_stackid function on the arm64 architecture,
> > I find that the stack trace cannot be obtained. The trace->nr in
> > __bpf_get_stackid is 0, and the function returns -EFAULT.
> >
> > For example:
> > diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > index 9e1ca8e34913..844fa88cdc4c 100644
> > --- a/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > +++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > @@ -36,6 +36,15 @@ __u64 kretprobe_test6_result = 0;
> >  __u64 kretprobe_test7_result = 0;
> >  __u64 kretprobe_test8_result = 0;
> >
> > +typedef __u64 stack_trace_t[2];
> > +
> > +struct {
> > +	__uint(type, BPF_MAP_TYPE_STACK_TRACE);
> > +	__uint(max_entries, 1024);
> > +	__type(key, __u32);
> > +	__type(value, stack_trace_t);
> > +} stacks SEC(".maps");
> > +
> >  static void kprobe_multi_check(void *ctx, bool is_return)
> >  {
> >  	if (bpf_get_current_pid_tgid() >> 32 != pid)
> > @@ -100,7 +109,9 @@ int test_kretprobe(struct pt_regs *ctx)
> >  SEC("kprobe.multi")
> >  int test_kprobe_manual(struct pt_regs *ctx)
> >  {
> > +	int id = bpf_get_stackid(ctx, &stacks, 0);
>
> ftrace_partial_regs() supposed to work on x86 and arm64,
> but since multi-kprobe is the only user...

It should be able to unwind the stack. It saves sp, pc, lr, and fp:

	regs->sp = afregs->sp;
	regs->pc = afregs->pc;
	regs->regs[29] = afregs->fp;
	regs->regs[30] = afregs->lr;

> I suspect the arm64 implementation wasn't really tested.
> Or maybe there is some other issue.

It depends on how bpf_get_stackid() works. Some registers used by that
function may not be saved. If it returns -EFAULT, get_perf_callchain()
has returned NULL (a rough sketch of how bpf_get_stackid() reaches that
point is at the end of this mail):

struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
		   u32 max_stack, bool crosstask, bool add_mark)
{
...
	entry = get_callchain_entry(&rctx);
	if (!entry)
		return NULL;

Thus `get_callchain_entry(&rctx)` returns NULL. But if so, this is not
related to ftrace_partial_regs(), because get_callchain_entry() only
returns the per-CPU callchain working buffer for the context; it does
not decode the stack:

struct perf_callchain_entry *get_callchain_entry(int *rctx)
{
	int cpu;
	struct callchain_cpus_entries *entries;

	*rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
	if (*rctx == -1)
		return NULL;

	entries = rcu_dereference(callchain_cpus_entries);
	if (!entries) {
		put_recursion_context(this_cpu_ptr(callchain_recursion), *rctx);
		return NULL;
	}

	cpu = smp_processor_id();

	return (((void *)entries->cpu_entries[cpu]) +
		(*rctx * perf_callchain_entry__sizeof()));
}

What context does BPF expect, and how does it detect it?

Thank you,

--
Masami Hiramatsu (Google) <[email protected]>
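
P.S. For reference, this is roughly how bpf_get_stackid() ends up
returning -EFAULT on this path. It is a simplified sketch along the
lines of kernel/bpf/stackmap.c, not the exact source; argument lists
and details can differ between kernel versions:

/* simplified sketch of kernel/bpf/stackmap.c:bpf_get_stackid() */
BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
	   u64, flags)
{
	u32 max_depth = map->value_size / stack_map_data_size(map);
	u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
	bool user = flags & BPF_F_USER_STACK;
	struct perf_callchain_entry *trace;
	bool kernel = !user;

	... /* flag validation and max_depth clamping elided */

	/* unwind starting from @regs (for multi-kprobe, the pt_regs
	 * filled in by ftrace_partial_regs()) */
	trace = get_perf_callchain(regs, 0, kernel, user, max_depth,
				   false, false);
	if (unlikely(!trace))
		/* couldn't fetch the stack trace */
		return -EFAULT;

	/* __bpf_get_stackid() in turn returns -EFAULT when
	 * trace->nr <= skip, i.e. the unwinder stored no usable
	 * entries. */
	return __bpf_get_stackid(map, trace, flags);
}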
