> diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
> b/arch/powerpc/net/bpf_jit_comp64.c
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c

[ ... ]

> @@ -287,6 +285,22 @@
>                * program(main prog) as third arg
>                */
>               EMIT(PPC_RAW_MR(_R1, _R5));
> +             /*
> +              * Exception callback reuses the stack frame of exception 
> boundary.
> +              * But BPF stack depth of exception callback and exception 
> boundary
> +              * don't have to be same. If BPF stack depth is different, 
> adjust the
> +              * stack frame size considering BPF stack depth of exception 
> callback.
> +              * The non-volatile register save area remains unchanged. These 
> non-
> +              * volatile registers are restored in exception callback's 
> epilogue.
> +              */
> +             EMIT(PPC_RAW_LD(bpf_to_ppc(TMP_REG_1), _R5, 0));
> +             EMIT(PPC_RAW_SUB(bpf_to_ppc(TMP_REG_2), _R1, 
> bpf_to_ppc(TMP_REG_1)));

Are the operands to PPC_RAW_SUB reversed here? At this point R1 holds
boundary_sp (set by the MR above) and TMP_REG_1 holds prev_sp (loaded
from [R5+0]). Since the stack grows downward, boundary_sp < prev_sp.

PPC_RAW_SUB(d, a, b) computes d = a - b, confirmed by how BPF_SUB is
emitted in bpf_jit_build_body():

    case BPF_ALU64 | BPF_SUB | BPF_X: /* dst -= src */
        EMIT(PPC_RAW_SUB(dst_reg, dst_reg, src_reg));

So this computes TMP_REG_2 = R1 - TMP_REG_1 = boundary_sp - prev_sp,
which is a negative value. After the ADDI subtracts BPF_PPC_EXC_STACKFRAME,
the result is still negative (a large unsigned value near 2^64).

The subsequent CMPLDI unsigned comparison against ctx->stack_size (at
most 512) can therefore never be equal, making the conditional branch
dead code. The fallthrough path always creates a new frame.

The intended computation appears to be
prev_sp - boundary_sp = frame_size, which requires swapping the
operands:

    EMIT(PPC_RAW_SUB(bpf_to_ppc(TMP_REG_2),
                     bpf_to_ppc(TMP_REG_1), _R1));

The code still works because always creating a new frame from prev_sp
produces correct NVR offsets, but the optimization to skip re-framing
when stack depths match never fires.

> +             EMIT(PPC_RAW_ADDI(bpf_to_ppc(TMP_REG_2), bpf_to_ppc(TMP_REG_2),
> +                     -BPF_PPC_EXC_STACKFRAME));
> +             EMIT(PPC_RAW_CMPLDI(bpf_to_ppc(TMP_REG_2), ctx->stack_size));
> +             PPC_BCC_CONST_SHORT(COND_EQ, 12);
> +             EMIT(PPC_RAW_MR(_R1, bpf_to_ppc(TMP_REG_1)));
> +             EMIT(PPC_RAW_STDU(_R1, _R1, -(BPF_PPC_EXC_STACKFRAME + 
> ctx->stack_size)));
>       }


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22053464899

AI-authorship-score: low
AI-authorship-explanation: The commit demonstrates deep domain knowledge of 
powerpc64 BPF JIT internals and the operand swap bug is characteristic of human 
error with the non-intuitive PPC_RAW_SUB macro.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: The PPC_RAW_SUB operands are swapped making the 
stack size comparison dead code, but the fallthrough path always creates a 
correct frame so there is no crash or data corruption.

Reply via email to