On Tue, Aug 27, 2024 at 12:29:38AM +0200, Oleg Nesterov wrote:
> On 08/27, Jiri Olsa wrote:
> >
> > did you just bpftrace-ed bpftrace? ;-) on my setup I'm getting:
> >
> > [root@qemu ex]# ../bpftrace/build/src/bpftrace -e 'kprobe:uprobe_register {
> > printf("%s\n", kstack); }'
> > Attaching 1 probe...
> >
> > uprobe_register+1
>
> so I guess you are on tip/perf/core which killed uprobe_register_refctr()
> and changed bpf_uprobe_multi_link_attach() to use uprobe_register
>
> > bpf_uprobe_multi_link_attach+685
> > __sys_bpf+9395
> > __x64_sys_bpf+26
> > do_syscall_64+128
> > entry_SYSCALL_64_after_hwframe+118
> >
> >
> > I'm not sure what's bpftrace version in fedora 40, I'm using upstream build:
>
> bpftrace v0.20.1
>
> > [root@qemu ex]# ../bpftrace/build/src/bpftrace --info 2>&1 | grep
> > uprobe_multi
> > uprobe_multi: yes
>
> Aha, I get
>
> uprobe_multi: no
>
> OK. So, on your setup bpftrace uses bpf_uprobe_multi_link_attach()
> and this implies ->ret_handler = uprobe_multi_link_ret_handler()
> which calls uprobe_prog_run() which does
>
> if (link->task && current->mm != link->task->mm)
> return 0;
>
> So, can you reproduce the problem reported by Tianyi on your setup?
yes, I can repduce the issue with uretprobe on top of perf event uprobe
running 2 tasks of the test code:
int func() {
return 0;
}
int main() {
printf("pid: %d\n", getpid());
while (1) {
sleep(2);
func();
}
}
and running 2 instances of bpftrace (each with separate pid):
[root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1018 -e
'uretprobe:./test:func { printf("%d\n", pid); }'
Attaching 1 probe...
1018
1017
1018
1017
[root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1017 -e
'uretprobe:./test:func { printf("%d\n", pid); }'
Attaching 1 probe...
1017
1018
1017
1018
will execute bpf program twice for each bpftrace instance, like:
sched-in 1018
perf_trace_add
-> uprobe-hit
handle_swbp
handler_chain
{
for_each_uprobe_consumer {
// consumer for task 1019
uprobe_dispatcher
uprobe_perf_func
uprobe_perf_filter return false
// consumer for task 1018
uprobe_dispatcher
uprobe_perf_func
uprobe_perf_filter return true
-> could run bpf program, but none is configured
}
prepare_uretprobe
}
-> uretprobe-hit
handle_swbp
uprobe_handle_trampoline
handle_uretprobe_chain
{
for_each_uprobe_consumer {
// consumer for task 1019
uretprobe_dispatcher
uretprobe_perf_func
-> runs bpf program
// consumer for task 1018
uretprobe_dispatcher
uretprobe_perf_func
-> runs bpf program
}
}
sched-out 1019
perf_trace_del
and I think the same will happen for perf record in this case where instead of
running the program we will execute perf_tp_event
I think the uretprobe_dispatcher could call filter as suggested in the original
patch.. but I'm not sure we need to remove the uprobe from
handle_uretprobe_chain
like we do in handler_chain.. maybe just to save the next uprobe hit which would
remove the uprobe?
jirka