On Thu, Jul 2, 2026 at 4:20 AM Jiri Olsa <[email protected]> wrote:
>
> On Wed, Jul 01, 2026 at 04:13:26PM -0700, Andrii Nakryiko wrote:
> > On Wed, Jul 1, 2026 at 4:13 AM Jiri Olsa <[email protected]> wrote:
> > >
> > > hi,
> > > Andrii reported an issue with optimized uprobes [1] that can clobber
> > > redzone area with call instruction storing return address on stack
> > > where user code may keep temporary data without adjusting rsp.
> > >
> > > Fixing this by moving the optimized uprobes on top of 10-bytes nop
> > > instruction, so we can squeeze another instruction to escape the
> > > redzone area before doing the call.
> > >
> > > Note we need upstream update first for patch 3 (github.com/libbpf/usdt),
> > > if we decide to take this change.
> > >
> > > thanks,
> > > jirka
> > >
> > >
> > > v1: https://lore.kernel.org/bpf/[email protected]/
> > > v2: https://lore.kernel.org/bpf/[email protected]/
> > > v3: https://lore.kernel.org/bpf/[email protected]/
> > > v4: https://lore.kernel.org/bpf/[email protected]/
> > >
> > > v5 changes:
> > > - several selftests changes and reviewed-by tags [Jakub]
> > > - add more comments in int3_update_unoptimize [Andrii]
> > > - several other minor changes and acks [Oleg]
> > > - move insn_decode out of uprobe_init_insn to simplify the code
> > > - align uprobe_red_zone_test to 64 to make sure nop10 is not on page 
> > > boundary
> > >
> > > v4 changes:
> > > - do not use 2nd int3 (ont +5 offset) because the call instruction
> > >   is allways the same for the given nop10 address [Andrii/Peter]
> > > - unmap unused trampoline vma after unsuccesfull optimization [sashiko]
> > > - small change to patch#2 moved user_64bit_mode earlier in the path
> > >   and pass/use mm_struct pointer directly from arch_uprobe_optimize
> > >   instead of gettting current->mm
> > >   Andrii, keeping your ack, please shout otherwise
> > >
> > > v3 changes:
> > > - use nop10 update suggested by Peter in [2]
> > > - remove struct uprobe_trampoline object, use vma objects directly instead
> > > - selftests fixes [sashiko]
> > > - ack from Andrii
> > >
> > > v2 changes:
> > > - several selftest fixes [sashiko]
> > > - consolidate is_lea_insn and is_call_insn insto single check [Jakub 
> > > Sitnicki]
> > > - use proper mm_struct object in __in_uprobe_trampoline check [sashiko]
> > > - allow to copy uprobe trampolines vma objects on fork [sashiko]
> > > - change uprobe syscall detection error from -ENXIO to -EPROTO [Andrii]
> > > - added fork/clone tests
> > > - I kept the selftest changes and nop5->nop10 changes in separate
> > >   commits for easier review, we can squash them later if we want to keep
> > >   bisect working properly
> > >
> > >
> > > [1] https://lore.kernel.org/bpf/[email protected]/
> > > [2] 
> > > https://lore.kernel.org/bpf/[email protected]/#t
> > > ---
> >
> > ASAN-enabled test_progs runs are not happy in CI, can you please check?
>
> I failed to release link in test_uprobe_fork_optimized, fix is below
> I can send new version or separate fix

yeah, please fix the test, adjust comments as pointed out by AI and
send v6. Seems like Peter wants to pick it up through tip, I don't
mind.

>
>
> also there's 2 things to solve/discuss once kernel changes are acked:
> - selftest changes depend on:
>   selftests/bpf: Emit nop,nop10 instructions combo for x86_64 arch
>   that is taken from libbpf/usdt, I pushed the PR in here [1]
>

merged that one, we are good

> - as bots complained the patchset breaks bisection, because kernel
>   changes break selftests.. not sure what's prefered solution, as for
>   me I'd keep it that way rather than mixing kernel/user space changes

I think it's fine to keep them separate

>
> thanks,
> jirka
>
>
> [1] https://github.com/libbpf/usdt/pull/16
> ---
> diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c 
> b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> index eb067f029a9f..e193206fc5d2 100644
> --- a/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c
> @@ -988,7 +988,6 @@ static noreturn int child_func(void *arg)
>  static void test_uprobe_fork_optimized(bool clone_vm)
>  {
>         struct uprobe_syscall_executed *skel = NULL;
> -       struct bpf_link *link = NULL;
>         unsigned long offset;
>         int pid, status, err;
>         char stack[65535];
> @@ -1001,9 +1000,9 @@ static void test_uprobe_fork_optimized(bool clone_vm)
>         if (!ASSERT_OK_PTR(skel, "open_and_load"))
>                 goto cleanup;
>
> -       link = bpf_program__attach_uprobe_opts(skel->progs.test_uprobe,
> -                               -1, "/proc/self/exe", offset, NULL);
> -       if (!ASSERT_OK_PTR(link, "attach_uprobe"))
> +       skel->links.test_uprobe = 
> bpf_program__attach_uprobe_opts(skel->progs.test_uprobe,
> +                                       -1, "/proc/self/exe", offset, NULL);
> +       if (!ASSERT_OK_PTR(skel->links.test_uprobe, "attach_uprobe"))
>                 goto cleanup;
>
>         skel->bss->pid = getpid();

Reply via email to