On Tue, Aug 27, 2024 at 03:07:01PM +0200, Jiri Olsa wrote:
> On Tue, Aug 27, 2024 at 12:29:38AM +0200, Oleg Nesterov wrote:
> > On 08/27, Jiri Olsa wrote:
> > >
> > > did you just bpftrace-ed bpftrace? ;-) on my setup I'm getting:
> > >
> > > [root@qemu ex]# ../bpftrace/build/src/bpftrace -e 'kprobe:uprobe_register
> > > { printf("%s\n", kstack); }'
> > > Attaching 1 probe...
> > >
> > > uprobe_register+1
> >
> > so I guess you are on tip/perf/core which killed uprobe_register_refctr()
> > and changed bpf_uprobe_multi_link_attach() to use uprobe_register
> >
> > > bpf_uprobe_multi_link_attach+685
> > > __sys_bpf+9395
> > > __x64_sys_bpf+26
> > > do_syscall_64+128
> > > entry_SYSCALL_64_after_hwframe+118
> > >
> > >
> > > I'm not sure what's bpftrace version in fedora 40, I'm using upstream
> > > build:
> >
> > bpftrace v0.20.1
> >
> > > [root@qemu ex]# ../bpftrace/build/src/bpftrace --info 2>&1 | grep
> > > uprobe_multi
> > > uprobe_multi: yes
> >
> > Aha, I get
> >
> > uprobe_multi: no
> >
> > OK. So, on your setup bpftrace uses bpf_uprobe_multi_link_attach()
> > and this implies ->ret_handler = uprobe_multi_link_ret_handler()
> > which calls uprobe_prog_run() which does
> >
> > if (link->task && current->mm != link->task->mm)
> > return 0;
> >
> > So, can you reproduce the problem reported by Tianyi on your setup?
>
> yes, I can repduce the issue with uretprobe on top of perf event uprobe
>
> running 2 tasks of the test code:
>
> int func() {
> return 0;
> }
>
> int main() {
> printf("pid: %d\n", getpid());
> while (1) {
> sleep(2);
> func();
> }
> }
>
> and running 2 instances of bpftrace (each with separate pid):
>
> [root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1018 -e
> 'uretprobe:./test:func { printf("%d\n", pid); }'
> Attaching 1 probe...
> 1018
> 1017
> 1018
> 1017
>
> [root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1017 -e
> 'uretprobe:./test:func { printf("%d\n", pid); }'
> Attaching 1 probe...
> 1017
> 1018
> 1017
> 1018
>
> will execute bpf program twice for each bpftrace instance, like:
>
> sched-in 1018
> perf_trace_add
>
> -> uprobe-hit
> handle_swbp
> handler_chain
> {
> for_each_uprobe_consumer {
>
> // consumer for task 1019
> uprobe_dispatcher
> uprobe_perf_func
> uprobe_perf_filter return false
>
> // consumer for task 1018
> uprobe_dispatcher
> uprobe_perf_func
> uprobe_perf_filter return true
> -> could run bpf program, but none is configured
> }
>
> prepare_uretprobe
> }
>
> -> uretprobe-hit
> handle_swbp
> uprobe_handle_trampoline
> handle_uretprobe_chain
> {
>
> for_each_uprobe_consumer {
>
> // consumer for task 1019
> uretprobe_dispatcher
> uretprobe_perf_func
> -> runs bpf program
>
> // consumer for task 1018
> uretprobe_dispatcher
> uretprobe_perf_func
> -> runs bpf program
>
> }
> }
>
> sched-out 1019
ugh... should be 'sched-out 1018'
jirka
> perf_trace_del
>
>
> and I think the same will happen for perf record in this case where instead of
> running the program we will execute perf_tp_event
>
> I think the uretprobe_dispatcher could call filter as suggested in the
> original
> patch.. but I'm not sure we need to remove the uprobe from
> handle_uretprobe_chain
> like we do in handler_chain.. maybe just to save the next uprobe hit which
> would
> remove the uprobe?
>
> jirka