x86: Add support to optimize uprobes

Oleg Nesterov Sun, 27 Apr 2025 10:12:50 -0700

I didn't actually read this patch yet, but let me ask anyway...

On 04/21, Jiri Olsa wrote:
>
> +static int swbp_optimize(struct vm_area_struct *vma, unsigned long vaddr, 
> unsigned long tramp)
> +{
> +     struct write_opcode_ctx ctx = {
> +             .base = vaddr,
> +     };
> +     char call[5];
> +     int err;
> +
> +     relative_call(call, vaddr, tramp);
> +
> +     /*
> +      * We are in state where breakpoint (int3) is installed on top of first
> +      * byte of the nop5 instruction. We will do following steps to overwrite
> +      * this to call instruction:
> +      *
> +      * - sync cores
> +      * - write last 4 bytes of the call instruction
> +      * - sync cores
> +      * - update the call instruction opcode
> +      */
> +
> +     text_poke_sync();


Hmm. I would like to understand why exactly we need at least this first
text_poke_sync() before "write last 4 bytes of the call instruction".


And... I don't suggest to do this right now, but I am wondering if we can
use mm_cpumask(vma->vm_mm) later, I guess we don't care if we race with
switch_mm_irqs_off() which can add another CPU to this mask...

> +void arch_uprobe_optimize(struct arch_uprobe *auprobe, unsigned long vaddr)
> +{
> +     struct mm_struct *mm = current->mm;
> +     uprobe_opcode_t insn[5];
> +
> +     /*
> +      * Do not optimize if shadow stack is enabled, the return address hijack
> +      * code in arch_uretprobe_hijack_return_addr updates wrong frame when
> +      * the entry uprobe is optimized and the shadow stack crashes the app.
> +      */
> +     if (shstk_is_enabled())
> +             return;

Not sure I fully understand the comment/problem, but what if
prctl(ARCH_SHSTK_ENABLE) is called after arch_uprobe_optimize() succeeds?

Oleg.

Re: [PATCH perf/core 10/22] uprobes/x86: Add support to optimize uprobes

Reply via email to