On Wed, Feb 04, 2026 at 08:06:50AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 4, 2026 at 4:36 AM Jiri Olsa <[email protected]> wrote:
> >
> > On Tue, Feb 03, 2026 at 03:17:05PM -0800, Alexei Starovoitov wrote:
> > > On Tue, Feb 3, 2026 at 1:38 AM Jiri Olsa <[email protected]> wrote:
> > > >
> > > > hi,
> > > > as an option to Meglong's change [1] I'm sending proposal for 
> > > > tracing_multi
> > > > link that does not add static trampoline but attaches program to all 
> > > > needed
> > > > trampolines.
> > > >
> > > > This approach keeps the same performance but has some drawbacks:
> > > >
> > > >  - when attaching 20k functions we allocate and attach 20k trampolines
> > > >  - during attachment we hold each trampoline mutex, so for above
> > > >    20k functions we will hold 20k mutexes during the attachment,
> > > >    should be very prone to deadlock, but haven't hit it yet
> > >
> > > If you check that it's sorted and always take them in the same order
> > > then there will be no deadlock.
> > > Or just grab one global mutex first and then grab trampolines mutexes
> > > next in any order. The global one will serialize this attach operation.
> > >
> > > > It looks the trampoline allocations/generation might not be big a 
> > > > problem
> > > > and I'll try to find a solution for holding that many mutexes. If 
> > > > there's
> > > > no better solution I think having one read/write mutex for tracing multi
> > > > link attach/detach should work.
> > >
> > > If you mean to have one global mutex as I proposed above then I don't see
> > > a downside. It only serializes multiple libbpf calls.
> >
> > we also need to serialize it with standard single trampoline attach,
> > because the direct ftrace update is now done under trampoline->mutex:
> >
> >   bpf_trampoline_link_prog(tr)
> >   {
> >     mutex_lock(&tr->mutex);
> >     ...
> >     update_ftrace_direct_*
> >     ...
> >     mutex_unlock(&tr->mutex);
> >   }
> >
> > for tracing_multi we would link the program first (with tr->mutex)
> > and do the bulk ftrace update later (without tr->mutex)
> >
> >   {
> >     for each involved trampoline:
> >       bpf_trampoline_link_prog
> >
> >     --> and here we could race with some other thread doing single
> >         trampoline attach
> >
> >     update_ftrace_direct_*
> >   }
> >
> > note the current version locks all tr->mutex instances all the way
> > through the update_ftrace_direct_* update
> >
> > I think we could use global rwsem and take read lock on single
> > trampoline attach path and write lock on tracing_multi attach,
> >
> > I thought we could take direct_mutex early, but that would mean
> > different order with trampoline mutex than we already have in
> > single attach path
> 
> I feel we're talking past each other.
> I meant:
> 
> For multi:
> 1. take some global mutex
> 2. take N tramp mutexes in any order
> 
> For single:
> 1. take that 1 specific tramp mutex.

ah ok, I understand, it's to prevent the lockup but keep holding all
the trampolines locks.. the rwsem I mentioned was for the 'fix', where
we do not take all the trampolines locks

thanks,
jirka

Reply via email to