On 23/01/26 2:48 am, [email protected] wrote:
From: Abhishek Dubey <[email protected]>

In the conventional stack frame, tail_call_cnt sits after the NVR
save area (BPF_PPC_STACK_SAVE), whereas in the trampoline frame its
offset falls after the stack alignment padding. The BPF JIT logic
becomes complex when it has to compute a frame-sensitive offset for
tail_call_cnt. Having the same offset in both frames is the desired
objective.

The trampoline frame does not have a BPF_PPC_STACK_SAVE area, and
introducing one would waste memory whose only purpose is aligning
the offset of tail_call_cnt. Another challenge is the variable
alignment padding at the bottom of the trampoline frame, which
requires extra handling when computing the tail_call_cnt offset.

This patch addresses the above issues by moving tail_call_cnt to
the bottom of the stack frame, at offset 0, for both types of
frames. This saves the additional bytes BPF_PPC_STACK_SAVE would
require in the trampoline frame, and a single offset computation
for tail_call_cnt then serves both frames.

The changes in this patch are required by the second patch in the
series, where the 'reference to tail_call_info' of the main frame
is copied into the trampoline frame from the previous frame.

The changelog needs to be simplified. Something like below:

To support tailcalls in subprogs, tail_call_cnt needs to be on the BPF
trampoline stack frame. In a regular BPF program or subprog stack
frame, the position of tail_call_cnt is after the NVR save area
(BPF_PPC_STACK_SAVE). To avoid complex logic in deducing offset for
tail_call_cnt, it has to be kept at the same offset on the trampoline
frame as well. But doing that wastes nearly all of BPF_PPC_STACK_SAVE
bytes on the BPF trampoline stack frame as the NVR save area is not
the same for BPF trampoline and regular BPF programs. Address this by
moving tail_call_cnt to the bottom of the frame.

This change avoids the need to account for BPF_PPC_STACK_SAVE bytes
in the BPF trampoline stack frame when support for tailcalls in BPF
subprogs is added later. Also, this change makes the offset
calculation for the tail_call_cnt field simpler throughout.


Signed-off-by: Abhishek Dubey <[email protected]>
---
  arch/powerpc/net/bpf_jit.h        |  1 +
  arch/powerpc/net/bpf_jit_comp.c   | 15 ++++++++++++---
  arch/powerpc/net/bpf_jit_comp64.c | 31 ++++++++++++++++++++-----------
  3 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 8334cd667bba..9f6ec00bd02e 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -24,6 +24,7 @@
 #define SZL sizeof(unsigned long)
 #define BPF_INSN_SAFETY		64
+#define BPF_PPC_TAILCALL	8
 #define PLANT_INSTR(d, idx, instr) \
 	do { if (d) { (d)[idx] = instr; } idx++; } while (0)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 5e976730b2f5..d51c696221d7 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -604,8 +604,8 @@ static void bpf_trampoline_setup_tail_call_cnt(u32 *image, struct codegen_contex
 					       int func_frame_offset, int r4_off)
 {
 	if (IS_ENABLED(CONFIG_PPC64)) {
-		/* See bpf_jit_stack_tailcallcnt() */
-		int tailcallcnt_offset = 7 * 8;
+		/* See Generated stack layout */
+		int tailcallcnt_offset = BPF_PPC_TAILCALL;
 		EMIT(PPC_RAW_LL(_R3, _R1, func_frame_offset - tailcallcnt_offset));
 		EMIT(PPC_RAW_STL(_R3, _R1, -tailcallcnt_offset));
@@ -620,7 +620,7 @@ static void bpf_trampoline_restore_tail_call_cnt(u32 *image, struct codegen_cont
 {
 	if (IS_ENABLED(CONFIG_PPC64)) {
 		/* See bpf_jit_stack_tailcallcnt() */
-		int tailcallcnt_offset = 7 * 8;
+		int tailcallcnt_offset = BPF_PPC_TAILCALL;
 		EMIT(PPC_RAW_LL(_R3, _R1, -tailcallcnt_offset));
 		EMIT(PPC_RAW_STL(_R3, _R1, func_frame_offset - tailcallcnt_offset));


@@ -714,6 +714,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	 * LR save area			[ r0 save (64-bit)  ]	| header
 	 *				[ r0 save (32-bit)  ]	|
 	 * dummy frame for unwind	[ back chain 1      ] --
+	 *				[ tail_call_cnt     ] optional - 64-bit powerpc
 	 *				[ padding           ] align stack frame
 	 *	r4_off			[ r4 (tailcallcnt)  ] optional - 32-bit powerpc
 	 *	alt_lr_off		[ real lr (ool stub)] optional - actual lr
@@ -795,6 +796,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 		}
 	}
+	/*
+	 * Save tailcall count pointer at the same offset on the
+	 * stack where subprogs expect it
+	 */
+	if ((flags & BPF_TRAMP_F_CALL_ORIG) &&
+	    (flags & BPF_TRAMP_F_TAIL_CALL_CTX))
+		bpf_frame_size += BPF_PPC_TAILCALL;
+

The above hunk is relevant to the next patch, where tailcall support
in subprogs is added. Drop it here and move it to patch#2.

- Hari
