> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c

[ ... ]

> +     /* Relative offset of dummy_tramp_addr wrt start of long branch stub */
> +     tramp_idx = long_branch_stub_idx + 7;
> +     /*
> +      * Image layout need not be considered 8-byte aligned.
> +      * Lower 3 bits must be clear for 8-bytes alignment.
> +      * Adjust offset for padding NOP before dummy_tramp_addr
> +      */
> +     tramp_needs_align = (((unsigned long)&image[tramp_idx]) & 7) != 0;
> +     tramp_load_offset = tramp_needs_align ? 24 : 20;

Can the alignment check produce different results between the sizing
pass and the code generation pass?

During the sizing pass, bpf_int_jit_compile() calls
bpf_jit_build_epilogue(NULL, &cgctx), so image is NULL here.
&image[tramp_idx] evaluates to (tramp_idx * 4), and the alignment
decision depends only on the parity of tramp_idx.

During code generation, image is code_base, which may or may not be
8-byte aligned. bpf_jit_binary_pack_alloc() uses alignment=4 with a
random start offset, so roughly half the time code_base will have
bit 2 set. When it does, the alignment check flips:

    sizing pass (image=NULL, tramp_idx odd):
        ((tramp_idx * 4) & 7) = 4, tramp_needs_align = true

    code gen pass (code_base bit 2 set, tramp_idx odd):
        (((unsigned long)code_base + tramp_idx * 4) & 7) = 0,
        tramp_needs_align = false

This means the sizing pass may account for a NOP that the code
generation pass omits (or vice versa), so the emitted stub ends up a
different size than what was computed during sizing.
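
A minimal sketch of the mismatch, outside the kernel. The helper names
and the sample addresses are hypothetical; base 0 models the NULL image
of the sizing pass, and image is assumed to be a u32 pointer so that
&image[idx] is base + idx * 4:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the patch's alignment test. 'base' plays the
 * role of 'image' (0 models the NULL image of the sizing pass) and
 * 'tramp_idx' is the instruction index of dummy_tramp_addr.
 */
static bool tramp_needs_align(uintptr_t base, unsigned int tramp_idx)
{
	assert((base & 3) == 0);	/* instructions are 4-byte aligned */

	/* &image[tramp_idx] with a u32 *image is base + tramp_idx * 4 */
	return ((base + (uintptr_t)tramp_idx * 4) & 7) != 0;
}

/* True when the sizing pass and the code generation pass disagree. */
static bool passes_disagree(uintptr_t code_base, unsigned int tramp_idx)
{
	return tramp_needs_align(0, tramp_idx) !=
	       tramp_needs_align(code_base, tramp_idx);
}
```

With these helpers, passes_disagree() is true for every tramp_idx
whenever code_base has bit 2 set, and false whenever code_base is
8-byte aligned.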

> +     EMIT(PPC_RAW_LL(_R12, _R12, tramp_load_offset));
> +     EMIT(PPC_RAW_MTCTR(_R12));
> +     EMIT(PPC_RAW_MTLR(_R11));
> +     EMIT(PPC_RAW_BCTR());
> +
> +     /* align dummy_tramp_addr to 8 bytes */
> +     if (tramp_needs_align)
> +             EMIT(PPC_RAW_NOP());
> +
> +     if (image)
> +             *((unsigned long *)&image[ctx->idx]) = (unsigned long)dummy_tramp;
> +
> +     ctx->idx += SZL / 4;
> +
>       if (!bpf_jit_ool_stub) {
>               bpf_jit_ool_stub = (ctx->idx - ool_stub_idx) * 4;
>               bpf_jit_long_branch_stub = (ctx->idx - long_branch_stub_idx) * 4;
>       }

Because the conditional NOP makes the stub size variable,
bpf_jit_long_branch_stub now depends on whether tramp_needs_align
was true when it was first set. This global is set once during the
first program's sizing pass and reused for all subsequent programs.

Since each BPF program has a different body size (and therefore a
different long_branch_stub_idx), and each allocation may have different
code_base alignment, the stub size can vary per program. But
bpf_arch_text_poke() uses the fixed global to compute the branch
target:

    create_branch(&old_inst, ip,
                  bpf_func_end - bpf_jit_long_branch_stub,
                  branch_flags);

If this program's actual stub size differs from the global by 4
bytes, the branch lands one instruction before or after the intended
mflr r11 at the stub entry.

In the old code, the stub had a fixed layout with no conditional
instructions, so bpf_jit_long_branch_stub was the same for all
programs.
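
To make the 4-byte skew concrete, here is a rough model of the
arithmetic. All names are hypothetical, SZL is assumed to be 8 (64-bit),
and the stub is assumed to consist of the seven instructions listed in
the stub comment, an optional NOP, and the SZL-sized address slot:

```c
#include <assert.h>
#include <stdint.h>

#define SZL_ASSUMED	8	/* assumption: 64-bit build, 8-byte long */

/* Stub size in bytes: 7 instructions, 0 or 1 pad NOPs, address slot. */
static unsigned int stub_size(unsigned int pad_nops)
{
	assert(pad_nops <= 1);
	return (7 + pad_nops) * 4 + SZL_ASSUMED;
}

/* Mirrors 'bpf_func_end - bpf_jit_long_branch_stub' from the poke path. */
static uintptr_t poke_target(uintptr_t func_end, unsigned int global_stub_size)
{
	return func_end - global_stub_size;
}

/*
 * Signed byte distance from this program's real stub entry to the
 * target computed from the global. 0 means the branch hits mflr r11.
 */
static long target_skew(unsigned int global_stub_size,
			unsigned int actual_stub_size)
{
	return (long)actual_stub_size - (long)global_stub_size;
}
```

Under these assumptions the stub is 36 or 40 bytes, so a global recorded
with the NOP applied to a program without it gives a skew of -4, and the
opposite combination gives +4.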

[ ... ]

> @@ -1161,6 +1180,8 @@ static void do_isync(void *info __maybe_unused)
>  *    mtctr   r12
>  *    mtlr    r11 // needed to retain ftrace ABI
>  *    bctr
> + *   nop         // optinonal, mem align dummy_tramp field
> + *   .long   <branch_addr>/<dummy_tramp>

Typo: "optinonal" should be "optional".

Also, the comment block above this still shows the old load offset:

 *      ld      r12, -16(r12)

This should reflect the new forward offset (20 or 24).

[ ... ]

>       /*
>        * Long branch stub:
>        *      mflr    r11
>        *      bcl     20,31,$+4
> -      *      mflr    r12
> -      *      ld      r12, -8-SZL(r12)
> +      *      mflr    r12             // lr/r12 stores current pc
> +      *      ld      r12, 20(r12)    // offset(dummy_tramp_addr) from prev inst. is 20

The hardcoded 20 in this comment is only correct when no alignment
NOP is needed. When tramp_needs_align is true, the offset is 24.
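
Both values follow from the stub layout. A small sketch of the
derivation, with hypothetical constant and function names, counting
instructions from the stub entry:

```c
#include <assert.h>

/*
 * Hypothetical constants for the long branch stub layout, counted in
 * instructions from the stub entry (mflr r11 is index 0).
 */
#define STUB_MFLR_R12_IDX	2	/* bcl 20,31,$+4 leaves this address in LR */
#define STUB_INSNS		7	/* mflr, bcl, mflr, ld, mtctr, mtlr, bctr */

/* Byte offset from 'mflr r12' to dummy_tramp_addr, given 0 or 1 pad NOPs. */
static unsigned int tramp_load_offset(unsigned int pad_nops)
{
	assert(pad_nops <= 1);
	return (STUB_INSNS + pad_nops - STUB_MFLR_R12_IDX) * 4;
}
```

tramp_load_offset(0) gives 20 and tramp_load_offset(1) gives 24, which
matches the two values assigned to tramp_load_offset in the patch.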


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22369534987

AI-authorship-score: low
AI-authorship-explanation: Domain-specific code with detailed hardware 
debugging output in the cover letter suggests manual engineering work.
issues-found: 3
issue-severity-score: high
issue-severity-explanation: Alignment check uses NULL image pointer during 
sizing pass, causing variable stub sizes that break the 
bpf_jit_long_branch_stub global used by bpf_arch_text_poke to compute branch 
targets, potentially causing incorrect JIT code execution.
