Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU
On 06/12/2018 03:13, Alexei Starovoitov wrote:
On Wed, Dec 05, 2018 at 03:32:50PM +, Jiong Wang wrote:
On 05/12/2018 14:52, Edward Cree wrote:
On 05/12/18 09:46, Jiong Wang wrote:

   There is NO processed instruction number regression, either with or
   without -mattr=+alu32.

   Cilium bpf
   ===
   bpf_lb-DLB_L3.o    2110/2110    1730/1733

  That looks like a regression of 3 insns in the 32-bit case. May be worth
  investigating why.

Will look into this.

   + dst_reg = insn->dst_reg;
   + regs[dst_reg] = regs[src_reg];
   + if (BPF_CLASS(insn->code) == BPF_ALU) {
   +         /* Update type and range info. */
   +         regs[dst_reg].type = SCALAR_VALUE;
   +         coerce_reg_to_size(&regs[dst_reg], 4);

  Won't this break when handed a pointer (as root, so allowed to leak it)?
  E.g. (pointer + x) gets turned into scalar x, rather than unknown scalar
  in range [0, 0xffffffff].

Initially I was gating this to scalar_value only; later I was thinking it
could be extended to the ptr case if ptr leak is allowed. But your comment
reminds me that min/max value doesn't mean the real min/max value for ptr
type values, it means the offset only, if I am understanding the issue
correctly. So, it will break the pointer case.

correct. In case of is_pointer_value() && root -> mark_reg_unknown() has
to be called.

The explanation of the additional 3 steps from another email makes sense
to me. Can you take a look at why it helps the default (llvm alu64) case
too?

  bpf_overlay.o 3096/2898

It is embarrassing that I am not able to reproduce this number after
trying quite a few env configurations. I think the number must be wrong,
because an llvm alu64 binary doesn't contain alu32 moves, so it shouldn't
be impacted by this patch at all. Even though I double checked the raw
data I collected on llvm alu64 and re-calculated the number before this
patch, it is still 3096; I guess there must have been something wrong with
the binary I was loading.

I improved my benchmarking methodology to build all alu64 and alu32
binaries first, and never change them later. Then I used a script to load
them and collect the processed insn numbers.
(I borrowed the script from https://github.com/4ast/bpf_cilium_test/, only
my binaries are built from the latest Cilium repo and contain alu32
versions as well.)

I ran this new benchmarking env several times, and could get the following
new results consistently:

  bpf_lb-DLB_L3.o:      2085/2085     1685/1687
  bpf_lb-DLB_L4.o:      2287/2287     1986/1982
  bpf_lb-DUNKNOWN.o:     690/690       622/622
  bpf_lxc.o:           95033/95033    N/A
  bpf_netdev.o:         7245/7245     N/A
  bpf_overlay.o:        2898/2898     3085/2947

No change on alu64 binaries. For alu32, bpf_overlay.o still gets a smaller
processed instruction number; this is because of the following sequence
(and another similar one). Before this patch, r2 at insn 139 is unknown,
so the verifier always explores both the path-taken and path-fall-through
arms. After this patch, it explores the fall-through arm only, so some
insns are saved.

  129: (b4) (u32) r7 = (u32) -140
  ...
  136: (bc) (u32) r2 = (u32) r7
  137: (74) (u32) r2 >>= (u32) 31
  138: (4c) (u32) r2 |= (u32) r1
  139: (15) if r2 == 0x0 goto pc+342
  140: (b4) (u32) r1 = (u32) 2

And a more permissive register value for r2 hasn't unlocked more path
pruning for this test, so in all, after this patch there are fewer
processed insns.

I have sent out a v2, gated this change under SCALAR_VALUE, and also
updated the patch description. Thanks.

Regards,
Jiong
[PATCH v2 bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU
Currently, the destination register is marked as unknown for a 32-bit
sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
SCALAR_VALUE.

This is so conservative that some valid cases will be rejected. In
particular, this may turn a constant scalar value into an unknown value,
which could break some assumptions of the verifier. For example,
test_l4lb_noinline.c has the following C code:

    struct real_definition *dst

  1:  if (!get_packet_dst(, , vip_info, is_ipv6))
  2:    return TC_ACT_SHOT;
  3:
  4:  if (dst->flags & F_IPV6) {

get_packet_dst is responsible for initializing "dst" into a valid pointer
and returning true (1); otherwise it returns false (0). The compiled
instruction sequence using alu32 will be:

  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

insn 413, a BPF_MOV | BPF_ALU, will however turn r0 into an unknown value
even though r7 contains SCALAR_VALUE 1.

This causes trouble when the verifier is walking the code path that hasn't
initialized "dst" inside get_packet_dst, for which case 0 is returned. We
would then expect the verifier to conclude that line 1 in the above C code
passes the "if" check, and therefore to skip the fall-through path starting
at line 4. But because r0 returned from the callee has become an unknown
value, the verifier won't skip analyzing the path starting at line 4, and
"dst->flags" requires dereferencing the pointer "dst", which actually
hasn't been initialized for this path.

This patch relaxes the code marking the sub-register move destination. For
a SCALAR_VALUE, it is safe to just copy the value from the source and then
truncate it into 32 bits.

A unit test is also included to demonstrate this issue. The test will fail
before this patch.

This relaxation could let the verifier skip more paths for conditional
comparisons against immediates.
It also lets the verifier record a more accurate/strict value for one
register at one state. If this state ends up going through exit without
rejection and is used for state comparison later, then it is possible that
an inaccurate/permissive value would have been better. So the real impact
on the verifier's processed insn number is complex. But in all, without
this fix, valid programs could be rejected.

From real benchmarking on kernel selftests and Cilium bpf tests, there is
no impact on processed instruction number when tests are compiled with the
default compilation options. There are slight improvements when they are
compiled with -mattr=+alu32 after this patch. Also,
test_xdp_noinline/-mattr=+alu32 now passes verification. It was rejected
before this fix.

Insn processed before/after this patch:

                        default        -mattr=+alu32

  Kernel selftest
  ===
  test_xdp.o             371/371        369/369
  test_l4lb.o           6345/6345      5623/5623
  test_xdp_noinline.o   2971/2971      rejected/2727
  test_tcp_estates.o     429/429        430/430

  Cilium bpf
  ===
  bpf_lb-DLB_L3.o:      2085/2085      1685/1687
  bpf_lb-DLB_L4.o:      2287/2287      1986/1982
  bpf_lb-DUNKNOWN.o:     690/690        622/622
  bpf_lxc.o:           95033/95033     N/A
  bpf_netdev.o:         7245/7245      N/A
  bpf_overlay.o:        2898/2898      3085/2947

NOTE:
  - bpf_lxc.o and bpf_netdev.o compiled with -mattr=+alu32 are rejected by
    the verifier due to another issue inside the verifier with supporting
    alu32 binaries.
  - Each Cilium bpf program could generate several processed insn numbers;
    the numbers above are their sums.

v1->v2:
  - Restrict the change to SCALAR_VALUE.
  - Update benchmark numbers on Cilium bpf tests.
Signed-off-by: Jiong Wang
---
 kernel/bpf/verifier.c                       | 16 ++++++++++++----
 tools/testing/selftests/bpf/test_verifier.c | 13 +++++++++++++
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9584438..20a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3583,12 +3583,15 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			return err;
 
 	if (BPF_SRC(insn->code) == BPF_X) {
+		struct bpf_reg_state *src_reg = regs + insn->src_reg;
+		struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
+
 		if (BPF_CLASS(insn->code) == BPF_ALU64) {
 			/* case: R1 = R2
 			 * copy register state to dest reg
 			 */
-			regs[insn->dst_reg] = regs[insn->src_reg];
-			regs[insn->dst_reg].live |= REG_LIVE_WRITTEN;
+			*dst_reg = *src_reg;
+			dst_reg->live |= REG_LIVE_WRITTEN;
 		} else {
 			/* R1 = (u32) R2 */
 			if (is_pointer_value(env, insn->src_reg)) {
@@
[PATCH v2 bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH
"arsh32 on imm" and "arsh32 on reg" are now accepted. Also added two new
testcases to make sure arsh32 won't be treated as arsh64 during
interpretation or JIT code-gen, for which case the high bits would be
moved into the low half and the testcases could catch that.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 tools/testing/selftests/bpf/test_verifier.c | 29 +++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 18d0b7f..e7f1bc7 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -721,8 +721,18 @@ static struct bpf_test tests[] = {
 			BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
 			BPF_EXIT_INSN(),
 		},
-		.result = REJECT,
-		.errstr = "unknown opcode c4",
+		.result = ACCEPT,
+		.retval = 0,
+	},
+	{
+		"arsh32 on imm 2",
+		.insns = {
+			BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788),
+			BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.retval = -16069393,
 	},
 	{
 		"arsh32 on reg",
@@ -732,8 +742,19 @@ static struct bpf_test tests[] = {
 			BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
 			BPF_EXIT_INSN(),
 		},
-		.result = REJECT,
-		.errstr = "unknown opcode cc",
+		.result = ACCEPT,
+		.retval = 0,
+	},
+	{
+		"arsh32 on reg 2",
+		.insns = {
+			BPF_LD_IMM64(BPF_REG_0, 0x55667788),
+			BPF_MOV64_IMM(BPF_REG_1, 15),
+			BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.retval = 43724,
 	},
 	{
 		"arsh64 on imm",
-- 
2.7.4
[PATCH v2 bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit 7891a87efc71 ("bpf:
arsh is not supported in 32 bit alu thus reject it"). As explained in that
commit message, this was because there was no complete support for them in
the interpreter and the various JIT compilation back-ends.

This patch set is a follow-up which completes the missing bits. It also
paves the way for running bpf programs compiled with ALU32 instructions
enabled, by specifying -mattr=+alu32 to LLVM, in which case there are
likely to be more BPF_ALU | BPF_ARSH insns that would trigger the
rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP; I need help with review
and test of the arch changes (mips/ppc/s390).

Note, there might be a merge conflict on the mips change, which is better
applied on top of commit 20b880a05f06 ("mips: bpf: fix encoding bug for
mm_srlv32_op"), which is on the mips-fixes branch at the moment.

Thanks.

v1->v2:
  - Fix ppc implementation bug. Should zero high bits explicitly.

Cc: Paul Burton
Cc: Naveen N. Rao
Cc: Sandipan Das
Cc: Martin Schwidefsky
Cc: Heiko Carstens

Jiong Wang (7):
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: interpreter support BPF_ALU | BPF_ARSH
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h                 |  1 +
 arch/mips/include/uapi/asm/inst.h            |  1 +
 arch/mips/mm/uasm-micromips.c                |  1 +
 arch/mips/mm/uasm-mips.c                     |  1 +
 arch/mips/mm/uasm.c                          |  9 +++---
 arch/mips/net/ebpf_jit.c                     |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h        |  2 ++
 arch/powerpc/net/bpf_jit.h                   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c            |  6 ++++
 arch/s390/net/bpf_jit_comp.c                 | 12 ++++++++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 ++++++++++++++++++++++++
 kernel/bpf/core.c                            | 52 +++++++++++++++-----------
 kernel/bpf/verifier.c                        |  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 ++++++++++++---
 14 files changed, 137 insertions(+), 35 deletions(-)

-- 
2.7.4
[PATCH v2 bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
This patch removes the rejection of BPF_ALU | BPF_ARSH, as we now support
it in the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/verifier.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			return -EINVAL;
 		}
 
-		if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-			verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-			return -EINVAL;
-		}
-
 		if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 		     opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
 			int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
-- 
2.7.4
[PATCH v2 bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao
Cc: Sandipan Das
Signed-off-by: Jiong Wang
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h            | 4 ++++
 arch/powerpc/net/bpf_jit_comp64.c     | 6 ++++++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW			0x7c000030
 #define PPC_INST_SLD			0x7c000036
 #define PPC_INST_SRW			0x7c000430
+#define PPC_INST_SRAW			0x7c000630
+#define PPC_INST_SRAWI			0x7c000670
 #define PPC_INST_SRD			0x7c000436
 #define PPC_INST_SRAD			0x7c000634
 #define PPC_INST_SRADI			0x7c000674

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)	EMIT(PPC_INST_SRW | ___PPC_RA(d) |	\
 				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)	EMIT(PPC_INST_SRAW | ___PPC_RA(d) |	\
+				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i)	EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |	\
+				     ___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)	EMIT(PPC_INST_SRD | ___PPC_RA(d) |	\
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)	EMIT(PPC_INST_SRAD | ___PPC_RA(d) |	\

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..7dc8187 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 			if (imm != 0)
 				PPC_SRDI(dst_reg, dst_reg, imm);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+			PPC_SRAW(dst_reg, dst_reg, src_reg);
+			goto bpf_alu32_trunc;
 		case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
 			PPC_SRAD(dst_reg, dst_reg, src_reg);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+			PPC_SRAWI(dst_reg, dst_reg, imm);
+			goto bpf_alu32_trunc;
 		case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
 			if (imm != 0)
 				PPC_SRADI(dst_reg, dst_reg, imm);
-- 
2.7.4
[PATCH v2 bpf-next 4/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
BPF_X support needs indirect shift mode; please see the code comments for
details.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 ++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+	/* Set signedness bit (MSB of result). */
+	emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+	emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+		 SHF_SC_R_SHF, shift_amt);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 umin, umax;
+	u8 dst, src;
+
+	dst = insn->dst_reg * 2;
+	umin = meta->umin_src;
+	umax = meta->umax_src;
+	if (umin == umax)
+		return __ashr_imm(nfp_prog, dst, umin);
+
+	src = insn->src_reg * 2;
+	/* NOTE: the first insn will set both indirect shift amount (source A)
+	 * and signedness bit (MSB of result).
+	 */
+	emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+	emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+		       reg_b(dst), SHF_SC_R_SHF);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u8 dst = insn->dst_reg * 2;
+
+	return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
 	[BPF_ALU | BPF_DIV | BPF_K] =	div_imm,
 	[BPF_ALU | BPF_NEG] =		neg_reg,
 	[BPF_ALU | BPF_LSH | BPF_K] =	shl_imm,
+	[BPF_ALU | BPF_ARSH | BPF_X] =	ashr_reg,
+	[BPF_ALU | BPF_ARSH | BPF_K] =	ashr_imm,
 	[BPF_ALU | BPF_END | BPF_X] =	end_reg32,
 	[BPF_LD | BPF_IMM | BPF_DW] =	imm_ld8,
 	[BPF_LD | BPF_ABS | BPF_B] =	data_ld1,
-- 
2.7.4
[PATCH v2 bpf-next 3/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Jiong Wang
---
 arch/s390/net/bpf_jit_comp.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 	/*
 	 * BPF_ARSH
 	 */
+	case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+		/* sra %dst,%dst,0(%src) */
+		EMIT4_DISP(0x8a000000, dst_reg, src_reg, 0);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
 		/* srag %dst,%dst,0(%src) */
 		EMIT6_DISP_LH(0xeb000000, 0x000a, dst_reg, dst_reg, src_reg, 0);
 		break;
+	case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst) >>= imm */
+		if (imm == 0)
+			break;
+		/* sra %dst,imm(%r0) */
+		EMIT4_DISP(0x8a000000, dst_reg, REG_0, imm);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
 		if (imm == 0)
 			break;
-- 
2.7.4
[PATCH v2 bpf-next 5/7] bpf: interpreter support BPF_ALU | BPF_ARSH
This patch implements interpreting BPF_ALU | BPF_ARSH: do an arithmetic
right shift on the low 32-bit sub-register, and zero the high 32 bits.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/core.c | 52 ++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f93ed66..36e31d8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 #define BPF_INSN_MAP(INSN_2, INSN_3)		\
 	/* 32 bit ALU operations. */		\
 	/*   Register based. */			\
-	INSN_3(ALU, ADD, X),			\
-	INSN_3(ALU, SUB, X),			\
-	INSN_3(ALU, AND, X),			\
-	INSN_3(ALU, OR, X),			\
-	INSN_3(ALU, LSH, X),			\
-	INSN_3(ALU, RSH, X),			\
-	INSN_3(ALU, XOR, X),			\
-	INSN_3(ALU, MUL, X),			\
-	INSN_3(ALU, MOV, X),			\
-	INSN_3(ALU, DIV, X),			\
-	INSN_3(ALU, MOD, X),			\
+	INSN_3(ALU, ADD, X),			\
+	INSN_3(ALU, SUB, X),			\
+	INSN_3(ALU, AND, X),			\
+	INSN_3(ALU, OR, X),			\
+	INSN_3(ALU, LSH, X),			\
+	INSN_3(ALU, RSH, X),			\
+	INSN_3(ALU, XOR, X),			\
+	INSN_3(ALU, MUL, X),			\
+	INSN_3(ALU, MOV, X),			\
+	INSN_3(ALU, ARSH, X),			\
+	INSN_3(ALU, DIV, X),			\
+	INSN_3(ALU, MOD, X),			\
 	INSN_2(ALU, NEG),			\
 	INSN_3(ALU, END, TO_BE),		\
 	INSN_3(ALU, END, TO_LE),		\
 	/*   Immediate based. */		\
-	INSN_3(ALU, ADD, K),			\
-	INSN_3(ALU, SUB, K),			\
-	INSN_3(ALU, AND, K),			\
-	INSN_3(ALU, OR, K),			\
-	INSN_3(ALU, LSH, K),			\
-	INSN_3(ALU, RSH, K),			\
-	INSN_3(ALU, XOR, K),			\
-	INSN_3(ALU, MUL, K),			\
-	INSN_3(ALU, MOV, K),			\
-	INSN_3(ALU, DIV, K),			\
-	INSN_3(ALU, MOD, K),			\
+	INSN_3(ALU, ADD, K),			\
+	INSN_3(ALU, SUB, K),			\
+	INSN_3(ALU, AND, K),			\
+	INSN_3(ALU, OR, K),			\
+	INSN_3(ALU, LSH, K),			\
+	INSN_3(ALU, RSH, K),			\
+	INSN_3(ALU, XOR, K),			\
+	INSN_3(ALU, MUL, K),			\
+	INSN_3(ALU, MOV, K),			\
+	INSN_3(ALU, ARSH, K),			\
+	INSN_3(ALU, DIV, K),			\
+	INSN_3(ALU, MOD, K),			\
 	/* 64 bit ALU operations. */		\
 	/*   Register based. */			\
 	INSN_3(ALU64, ADD, X),			\
@@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
 		DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
 		insn++;
 		CONT;
+	ALU_ARSH_X:
+		DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
+		CONT;
+	ALU_ARSH_K:
+		DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
+		CONT;
 	ALU64_ARSH_X:
 		(*(s64 *) &DST) >>= SRC;
 		CONT;
-- 
2.7.4
[PATCH v2 bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
Jitting of BPF_K is supported already, but not BPF_X. This patch completes
the support for the latter on both MIPS and microMIPS.

Cc: Paul Burton
Cc: linux-m...@vger.kernel.org
Acked-by: Paul Burton
Signed-off-by: Jiong Wang
---
 arch/mips/include/asm/uasm.h      | 1 +
 arch/mips/include/uapi/asm/inst.h | 1 +
 arch/mips/mm/uasm-micromips.c     | 1 +
 arch/mips/mm/uasm-mips.c          | 1 +
 arch/mips/mm/uasm.c               | 9 +++++----
 arch/mips/net/ebpf_jit.c          | 4 ++++
 6 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h
index 59dae37..b1990dd 100644
--- a/arch/mips/include/asm/uasm.h
+++ b/arch/mips/include/asm/uasm.h
@@ -157,6 +157,7 @@ Ip_u2u1s3(_slti);
 Ip_u2u1s3(_sltiu);
 Ip_u3u1u2(_sltu);
 Ip_u2u1u3(_sra);
+Ip_u3u2u1(_srav);
 Ip_u2u1u3(_srl);
 Ip_u3u2u1(_srlv);
 Ip_u3u1u2(_subu);

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index 273ef58..40fbb5d 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -371,6 +371,7 @@ enum mm_32a_minor_op {
 	mm_srl32_op = 0x040,
 	mm_srlv32_op = 0x050,
 	mm_sra_op = 0x080,
+	mm_srav_op = 0x090,
 	mm_rotr_op = 0x0c0,
 	mm_lwxs_op = 0x118,
 	mm_addu32_op = 0x150,

diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c
index 24e5b0d..75ef904 100644
--- a/arch/mips/mm/uasm-micromips.c
+++ b/arch/mips/mm/uasm-micromips.c
@@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = {
 	[insn_sltiu]	= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM},
 	[insn_sltu]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | RD},
 	[insn_sra]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | RD},
+	[insn_srav]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | RD},
 	[insn_srl]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | RD},
 	[insn_srlv]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS | RD},
 	[insn_rotr]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | RD},

diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c
index 60ceb93..6abe40f 100644
--- a/arch/mips/mm/uasm-mips.c
+++ b/arch/mips/mm/uasm-mips.c
@@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = {
 	[insn_sltiu]	= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM},
 	[insn_sltu]	= {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD},
 	[insn_sra]	= {M(spec_op, 0, 0, 0, 0, sra_op), RT | RD | RE},
+	[insn_srav]	= {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD},
 	[insn_srl]	= {M(spec_op, 0, 0, 0, 0, srl_op), RT | RD | RE},
 	[insn_srlv]	= {M(spec_op, 0, 0, 0, 0, srlv_op), RS | RT | RD},
 	[insn_subu]	= {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD},

diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c
index 57570c0..45b6264 100644
--- a/arch/mips/mm/uasm.c
+++ b/arch/mips/mm/uasm.c
@@ -61,10 +61,10 @@ enum opcode {
 	insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor,
 	insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb, insn_sc,
 	insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv,
-	insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl,
-	insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp,
-	insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor,
-	insn_xori, insn_yield,
+	insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav,
+	insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall,
+	insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh,
+	insn_xor, insn_xori, insn_yield,
 	insn_invalid /* insn_invalid must be last */
 };

@@ -353,6 +353,7 @@ I_u2u1s3(_slti)
 I_u2u1s3(_sltiu)
 I_u3u1u2(_sltu)
 I_u2u1u3(_sra)
+I_u3u2u1(_srav)
 I_u2u1u3(_srl)
 I_u3u2u1(_srlv)
 I_u2u1u3(_rotr)

diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index aeb7b1b..b16710a 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */
 	case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */
 	case BPF_ALU | BPF_RSH | BPF_X: /* ALU_REG */
+	case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */
 		src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
 		dst = ebpf_to_mips_reg(ctx, insn, dst_reg);
 		if (src < 0 || dst < 0)
@@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 		case BPF_RSH:
 			emit_instr(ctx, srlv, dst, dst, src);
 			break;
+		case BPF_ARSH:
+			emit_instr(ctx, srav, dst, dst, src);
Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU
On 05/12/2018 15:32, Jiong Wang wrote:
On 05/12/2018 14:52, Edward Cree wrote:
On 05/12/18 09:46, Jiong Wang wrote:

   There is NO processed instruction number regression, either with or
   without -mattr=+alu32.

   Cilium bpf
   ===
   bpf_lb-DLB_L3.o    2110/2110    1730/1733

  That looks like a regression of 3 insns in the 32-bit case. May be worth
  investigating why.

Will look into this.

Got some analysis from the trace log: making a register contain an
accurate integer can affect both path skipping (conditional jump against
an immediate) and path pruning; details are described below. But I want to
emphasize that before this patch, turning a constant into unknown could
potentially cause the verifier to reject valid programs. For example, for
test_l4lb_noinline.c, there is the following code:

    struct real_definition *dst

  1:  if (!get_packet_dst(, , vip_info, is_ipv6))
  2:    return TC_ACT_SHOT;
  3:
  4:  if (dst->flags & F_IPV6) {

get_packet_dst is responsible for initializing dst into a valid pointer
and returning true (1); otherwise it returns false (0). As described in
the patch description, the return sequence using alu32 will be:

  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

which would turn r0 into an unknown value in all cases. This causes
trouble when the code path hasn't initialized dst inside get_packet_dst,
for which case 0 is returned, and we would expect the verifier to skip the
fall-through path at C code line 4; otherwise it will complain that dst is
not a pointer and will reject the program.

As for why there are 3 more processed insns under -mattr=+alu32 for
bpf_lb.o, there are two places where the verification logic diverges. One
is the following pattern:

  475: (bc) (u32) r1 = (u32) r7
  ...
  478: (55) if r1 != 0x7 goto pc+23
  ...

r1 gets its value from r7, which could be one of three different integers
(0, 7, -157) when reached here. Before this patch, insn 475 turns r1 into
unknown, while after this patch the integer value is preserved.
So, if r1 has a value other than 0x7, the fall-through path is skipped
after this patch. But before the patch, after insn 478 the verifier could
also fix the value of r1 to constant 0x7, and could prune another path
reaching insn 478 if the state comparison is safe. From the trace log,
preserving r1's integer value wins here: more paths are skipped than were
pruned, so after this patch we get fewer processed insns for this part of
the sequence.

However, there is another pattern:

  360: (bc) (u32) r7 = (u32) r1
  361: (05) goto pc+74
  ...
  436: (bc) (u32) r1 = (u32) r7
  437: (67) r1 <<= 32
  438: (c7) r1 s>>= 32

For this case, one more path prune happened when r7 was an unknown value,
and the pruned path is a long one, which caused the total processed insn
number to become 3 more.
Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU
On 05/12/2018 14:52, Edward Cree wrote:
On 05/12/18 09:46, Jiong Wang wrote:

   There is NO processed instruction number regression, either with or
   without -mattr=+alu32.

   Cilium bpf
   ===
   bpf_lb-DLB_L3.o    2110/2110    1730/1733

  That looks like a regression of 3 insns in the 32-bit case. May be worth
  investigating why.

 Will look into this.

   + dst_reg = insn->dst_reg;
   + regs[dst_reg] = regs[src_reg];
   + if (BPF_CLASS(insn->code) == BPF_ALU) {
   +         /* Update type and range info. */
   +         regs[dst_reg].type = SCALAR_VALUE;
   +         coerce_reg_to_size(&regs[dst_reg], 4);

  Won't this break when handed a pointer (as root, so allowed to leak it)?
  E.g. (pointer + x) gets turned into scalar x, rather than unknown scalar
  in range [0, 0xffffffff].

 Initially I was gating this to scalar_value only; later I was thinking it
 could be extended to the ptr case if ptr leak is allowed. But your
 comment reminds me that min/max value doesn't mean the real min/max value
 for ptr type values, it means the offset only, if I am understanding the
 issue correctly. So, it will break the pointer case.

 Regards,
 Jiong

The existing behaviour is correct for pointers: 32 unknown bits, because
the value of the pointer base is unknown. It's only for scalar_values that
you want to copy and truncate the var_off and min/max from the src_reg.

-Ed
Re: [PATCH bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
Sandipan Das writes:
> Hi Jiong,
>
> On 05/12/18 2:25 AM, Jiong Wang wrote:
>> This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.
>>
>> Cc: Naveen N. Rao
>> Cc: Sandipan Das
>> Signed-off-by: Jiong Wang
>> ---
> [...]
>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c
>> b/arch/powerpc/net/bpf_jit_comp64.c
>> index 17482f5..c685b4f 100644
>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>> @@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
>>  			if (imm != 0)
>>  				PPC_SRDI(dst_reg, dst_reg, imm);
>>  			break;
>> +		case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
>> +			PPC_SRAW(dst_reg, dst_reg, src_reg);
>> +			break;
>
> On ppc64, the sraw and srawi instructions also use sign extension. So,
> you will have to ensure that the upper 32 bits are cleared. We already
> have a label in our JIT code called bpf_alu32_trunc that takes care of
> this. Replacing the break statement with a goto bpf_alu32_trunc will
> fix this.

(resend the reply, got a delivery failure notification from
netdev@vger.kernel.org)

Indeed. Double checked the ISA doc: "Bit 32 of RS is replicated to fill
RA[0:31]." Will fix both places in v2. Thanks.

Regards,
Jiong

>> 		case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
>> 			PPC_SRAD(dst_reg, dst_reg, src_reg);
>> 			break;
>> +		case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
>> +			PPC_SRAWI(dst_reg, dst_reg, imm);
>> +			break;
>
> Same here.
>
>> 		case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
>> 			if (imm != 0)
>> 				PPC_SRADI(dst_reg, dst_reg, imm);
>>
>
> With Regards,
> Sandipan
Re: [PATCH bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
On 05/12/2018 00:02, Paul Burton wrote: Hi Jiong, On Tue, Dec 04, 2018 at 03:55:16PM -0500, Jiong Wang wrote: Jitting of BPF_K is supported already, but not BPF_X. This patch complete the support for the latter on both MIPS and microMIPS. Cc: Paul Burton Cc: linux-m...@vger.kernel.org Signed-off-by: Jiong Wang --- arch/mips/include/asm/uasm.h | 1 + arch/mips/include/uapi/asm/inst.h | 1 + arch/mips/mm/uasm-micromips.c | 1 + arch/mips/mm/uasm-mips.c | 1 + arch/mips/mm/uasm.c | 9 + arch/mips/net/ebpf_jit.c | 4 6 files changed, 13 insertions(+), 4 deletions(-) I don't seem to have been copied on the rest of the series, but this patch standalone looks good from a MIPS standpoint. If the series is going through the net tree (and again, I can't see whether that seems likely because I don't have the rest of the series) then: Acked-by: Paul Burton If you want me to take this patch through the MIPS tree instead then let me know. Hi Paul, Thanks very much for the review. I'd like this set to go through net tree that this feature could be enabled as a whole for all affected arches. For the Cc, I was thinking the practice is to Cc people on patches of their subsystems, but guess a better practice might be also Cc on the cover letter so the background and affected code scope will be more clear. I will do this in v2. Regards, Jiong
[PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU
Currently, the destination register is marked as unknown for 32-bit
sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
SCALAR_VALUE. This is too conservative: some valid cases are rejected. In
particular, it may turn a constant scalar value into an unknown value, so
that an optimization relying on the constant fails. For example,
test_l4lb_noinline.c compiled with -mattr=+alu32 contains the following
insn pattern:

  return from callee:
    411: (b4) (u32) r7 = (u32) 0
    412: (54) (u32) r7 &= (u32) 1
    413: (bc) (u32) r0 = (u32) r7
    414: (95) exit

  to caller:
    202: (bc) (u32) r1 = (u32) r0
    203: (b4) (u32) r0 = (u32) 2
    204: (bf) r7 = r9
    205: (15) if r1 == 0x0 goto pc+69
    206: (79) r9 = *(u64 *)(r10 -80)
    207: (71) r1 = *(u8 *)(r9 +16)

The ending sequence in the callee is a sub-register move insn which should
make r0 be SCALAR_VALUE 0. However, the verifier conservatively marks r0
as unknown, so r0 is no longer a constant; instead it is:

  R0=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))

After returning to the caller, r0 is copied back to r1, and r1 is compared
against constant 0. In the C code logic, the stack slot (r10 - 80) is only
stored a valid pointer when r1 is not zero, in which case insn 207 is a
valid load. Otherwise the stack slot is not initialized and insn 207 is
invalid. The code pattern above belongs to the path that initializes r1 to
zero, so this path does not initialize the stack slot; but the verifier
could skip analyzing the fall-through path starting at insn 206, because
r1 is known to be zero and the comparison at insn 205 is therefore known
true. However, because r0 was conservatively marked as unknown in the
callee's ending sequence in the first place, the information about r0/r1
is lost. As a consequence, the verifier fails to skip the fall-through
path and dereferences the stack slot at (r10 - 80), which does not hold a
valid spilled pointer on this path.
This patch relaxes the marking of the sub-register move destination. For a
SCALAR_VALUE, or a pointer value that is allowed to be leaked as a scalar
value, it is safe to just copy the value from the source, force the value
type into SCALAR_VALUE, and then truncate it into 32-bit.

A unit test is also included to demonstrate this issue. The test fails
before this patch.

As this relaxation potentially gives the verifier more accurate
information, bpf C programs of reasonable size have been benchmarked,
including those inside the kernel bpf selftests and those inside the
cilium repo. There is NO processed-instruction-number regression, either
with or without -mattr=+alu32. And there are some improvements:

  - test_xdp_noinline now passes verification under -mattr=+alu32.
  - a few cilium bpf programs got their processed insn number reduced.

Insn processed before/after this patch:

                        default        -mattr=+alu32

  Kernel selftest
  ===
  test_xdp.o            371/371        369/369
  test_l4lb.o           6345/6345      5623/5623
  test_xdp_noinline.o   2971/2971      rejected/2727
  test_tcp_estates.o    429/429        430/430

  Cilium bpf
  ===
  bpf_lb-DLB_L3.o       2110/2110      1730/1733
  bpf_lb-DLB_L4.o       2251/2251      2037/2029
  bpf_lb-DUNKNOWN.o     701/701        729/622
  bpf_lxc.o             96023/95033    N/A
  bpf_netdev.o          7456/7245      N/A
  bpf_overlay.o         3096/2898      3085/2947

bpf_lxc.o and bpf_netdev.o compiled by -mattr=+alu32 are rejected by the
verifier because of another issue inside the verifier on supporting alu32
binaries. One cilium bpf program can generate several processed-insn
counts; the numbers above are the sum of them.
Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/verifier.c                       | 32 ++---
 tools/testing/selftests/bpf/test_verifier.c | 13
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9584438..ce8a1c3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3583,23 +3583,23 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			return err;
 
 		if (BPF_SRC(insn->code) == BPF_X) {
-			if (BPF_CLASS(insn->code) == BPF_ALU64) {
-				/* case: R1 = R2
-				 * copy register state to dest reg
-				 */
-				regs[insn->dst_reg] = regs[insn->src_reg];
-				regs[insn->dst_reg].live |= REG_LIVE_WRITTEN;
-			} else {
-				/* R1 = (u32) R2 */
-				if (is_pointer_value(env, insn->src_reg)) {
-					verbose(env,
-						"R%d partial copy of pointer\n",
-
[PATCH bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH
"arsh32 on imm" and "arsh32 on reg" now are accepted. Also added two new testcases to make sure arsh32 won't be treated as arsh64 during interpretation or JIT code-gen for which case the high bits will be moved into low halve that the testcases could catch them. Reviewed-by: Jakub Kicinski Signed-off-by: Jiong Wang --- tools/testing/selftests/bpf/test_verifier.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 18d0b7f..e7f1bc7 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -721,8 +721,18 @@ static struct bpf_test tests[] = { BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5), BPF_EXIT_INSN(), }, - .result = REJECT, - .errstr = "unknown opcode c4", + .result = ACCEPT, + .retval = 0, + }, + { + "arsh32 on imm 2", + .insns = { + BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788), + BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + .retval = -16069393, }, { "arsh32 on reg", @@ -732,8 +742,19 @@ static struct bpf_test tests[] = { BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), BPF_EXIT_INSN(), }, - .result = REJECT, - .errstr = "unknown opcode cc", + .result = ACCEPT, + .retval = 0, + }, + { + "arsh32 on reg 2", + .insns = { + BPF_LD_IMM64(BPF_REG_0, 0x55667788), + BPF_MOV64_IMM(BPF_REG_1, 15), + BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + .retval = 43724, }, { "arsh64 on imm", -- 2.7.4
[PATCH bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao
Cc: Sandipan Das
Signed-off-by: Jiong Wang
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h            | 4
 arch/powerpc/net/bpf_jit_comp64.c     | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW			0x7c000030
 #define PPC_INST_SLD			0x7c000036
 #define PPC_INST_SRW			0x7c000430
+#define PPC_INST_SRAW			0x7c000630
+#define PPC_INST_SRAWI			0x7c000670
 #define PPC_INST_SRD			0x7c000436
 #define PPC_INST_SRAD			0x7c000634
 #define PPC_INST_SRADI			0x7c000674

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)	EMIT(PPC_INST_SRW | ___PPC_RA(d) |	      \
 				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)	EMIT(PPC_INST_SRAW | ___PPC_RA(d) |	      \
+				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i)	EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |	      \
+				     ___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)	EMIT(PPC_INST_SRD | ___PPC_RA(d) |	      \
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)	EMIT(PPC_INST_SRAD | ___PPC_RA(d) |	      \

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..c685b4f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 			if (imm != 0)
 				PPC_SRDI(dst_reg, dst_reg, imm);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+			PPC_SRAW(dst_reg, dst_reg, src_reg);
+			break;
 		case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
 			PPC_SRAD(dst_reg, dst_reg, src_reg);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+			PPC_SRAWI(dst_reg, dst_reg, imm);
+			break;
 		case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
 			if (imm != 0)
 				PPC_SRADI(dst_reg, dst_reg, imm);
--
2.7.4
[PATCH bpf-next 5/7] bpf: interpreter support BPF_ALU | BPF_ARSH
This patch implements interpreting BPF_ALU | BPF_ARSH. Do arithmetic right shift on low 32-bit sub-register, and zero the high 32 bits. Reviewed-by: Jakub Kicinski Signed-off-by: Jiong Wang --- kernel/bpf/core.c | 52 ++-- 1 file changed, 30 insertions(+), 22 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index f93ed66..36e31d8 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base); #define BPF_INSN_MAP(INSN_2, INSN_3) \ /* 32 bit ALU operations. */\ /* Register based. */ \ - INSN_3(ALU, ADD, X),\ - INSN_3(ALU, SUB, X),\ - INSN_3(ALU, AND, X),\ - INSN_3(ALU, OR, X),\ - INSN_3(ALU, LSH, X),\ - INSN_3(ALU, RSH, X),\ - INSN_3(ALU, XOR, X),\ - INSN_3(ALU, MUL, X),\ - INSN_3(ALU, MOV, X),\ - INSN_3(ALU, DIV, X),\ - INSN_3(ALU, MOD, X),\ + INSN_3(ALU, ADD, X), \ + INSN_3(ALU, SUB, X), \ + INSN_3(ALU, AND, X), \ + INSN_3(ALU, OR, X), \ + INSN_3(ALU, LSH, X), \ + INSN_3(ALU, RSH, X), \ + INSN_3(ALU, XOR, X), \ + INSN_3(ALU, MUL, X), \ + INSN_3(ALU, MOV, X), \ + INSN_3(ALU, ARSH, X), \ + INSN_3(ALU, DIV, X), \ + INSN_3(ALU, MOD, X), \ INSN_2(ALU, NEG), \ INSN_3(ALU, END, TO_BE),\ INSN_3(ALU, END, TO_LE),\ /* Immediate based. */\ - INSN_3(ALU, ADD, K),\ - INSN_3(ALU, SUB, K),\ - INSN_3(ALU, AND, K),\ - INSN_3(ALU, OR, K),\ - INSN_3(ALU, LSH, K),\ - INSN_3(ALU, RSH, K),\ - INSN_3(ALU, XOR, K),\ - INSN_3(ALU, MUL, K),\ - INSN_3(ALU, MOV, K),\ - INSN_3(ALU, DIV, K),\ - INSN_3(ALU, MOD, K),\ + INSN_3(ALU, ADD, K), \ + INSN_3(ALU, SUB, K), \ + INSN_3(ALU, AND, K), \ + INSN_3(ALU, OR, K), \ + INSN_3(ALU, LSH, K), \ + INSN_3(ALU, RSH, K), \ + INSN_3(ALU, XOR, K), \ + INSN_3(ALU, MUL, K), \ + INSN_3(ALU, MOV, K), \ + INSN_3(ALU, ARSH, K), \ + INSN_3(ALU, DIV, K), \ + INSN_3(ALU, MOD, K), \ /* 64 bit ALU operations. */\ /* Register based. 
*/ \ INSN_3(ALU64, ADD, X), \ @@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32; insn++; CONT; + ALU_ARSH_X: + DST = (u64) (u32) ((*(s32 *) ) >> SRC); + CONT; + ALU_ARSH_K: + DST = (u64) (u32) ((*(s32 *) ) >> IMM); + CONT; ALU64_ARSH_X: (*(s64 *) ) >>= SRC; CONT; -- 2.7.4
[PATCH bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
Jitting of BPF_K is supported already, but not BPF_X. This patch complete the support for the latter on both MIPS and microMIPS. Cc: Paul Burton Cc: linux-m...@vger.kernel.org Signed-off-by: Jiong Wang --- arch/mips/include/asm/uasm.h | 1 + arch/mips/include/uapi/asm/inst.h | 1 + arch/mips/mm/uasm-micromips.c | 1 + arch/mips/mm/uasm-mips.c | 1 + arch/mips/mm/uasm.c | 9 + arch/mips/net/ebpf_jit.c | 4 6 files changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h index 59dae37..b1990dd 100644 --- a/arch/mips/include/asm/uasm.h +++ b/arch/mips/include/asm/uasm.h @@ -157,6 +157,7 @@ Ip_u2u1s3(_slti); Ip_u2u1s3(_sltiu); Ip_u3u1u2(_sltu); Ip_u2u1u3(_sra); +Ip_u3u2u1(_srav); Ip_u2u1u3(_srl); Ip_u3u2u1(_srlv); Ip_u3u1u2(_subu); diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h index 273ef58..40fbb5d 100644 --- a/arch/mips/include/uapi/asm/inst.h +++ b/arch/mips/include/uapi/asm/inst.h @@ -371,6 +371,7 @@ enum mm_32a_minor_op { mm_srl32_op = 0x040, mm_srlv32_op = 0x050, mm_sra_op = 0x080, + mm_srav_op = 0x090, mm_rotr_op = 0x0c0, mm_lwxs_op = 0x118, mm_addu32_op = 0x150, diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c index 24e5b0d..75ef904 100644 --- a/arch/mips/mm/uasm-micromips.c +++ b/arch/mips/mm/uasm-micromips.c @@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = { [insn_sltiu]= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM}, [insn_sltu] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | RD}, [insn_sra] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | RD}, + [insn_srav] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | RD}, [insn_srl] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | RD}, [insn_srlv] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS | RD}, [insn_rotr] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | RD}, diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c index 
60ceb93..6abe40f 100644 --- a/arch/mips/mm/uasm-mips.c +++ b/arch/mips/mm/uasm-mips.c @@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = { [insn_sltiu]= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM}, [insn_sltu] = {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD}, [insn_sra] = {M(spec_op, 0, 0, 0, 0, sra_op), RT | RD | RE}, + [insn_srav] = {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD}, [insn_srl] = {M(spec_op, 0, 0, 0, 0, srl_op), RT | RD | RE}, [insn_srlv] = {M(spec_op, 0, 0, 0, 0, srlv_op), RS | RT | RD}, [insn_subu] = {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD}, diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c index 57570c0..45b6264 100644 --- a/arch/mips/mm/uasm.c +++ b/arch/mips/mm/uasm.c @@ -61,10 +61,10 @@ enum opcode { insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor, insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb, insn_sc, insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv, - insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl, - insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp, - insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor, - insn_xori, insn_yield, + insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav, + insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, + insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, + insn_xor, insn_xori, insn_yield, insn_invalid /* insn_invalid must be last */ }; @@ -353,6 +353,7 @@ I_u2u1s3(_slti) I_u2u1s3(_sltiu) I_u3u1u2(_sltu) I_u2u1u3(_sra) +I_u3u2u1(_srav) I_u2u1u3(_srl) I_u3u2u1(_srlv) I_u2u1u3(_rotr) diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c index aeb7b1b..b16710a 100644 --- a/arch/mips/net/ebpf_jit.c +++ b/arch/mips/net/ebpf_jit.c @@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */ case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */ case BPF_ALU | BPF_RSH | BPF_X: /* 
ALU_REG */ + case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */ src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp); dst = ebpf_to_mips_reg(ctx, insn, dst_reg); if (src < 0 || dst < 0) @@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, case BPF_RSH: emit_instr(ctx, srlv, dst, dst, src); break; + case BPF_ARSH: + emit_instr(ctx, srav, dst, dst, src); +
[PATCH bpf-next 4/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
BPF_X support needs indirect shift mode; please see the code comments for
details.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+	/* Set signedness bit (MSB of result). */
+	emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+	emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+		 SHF_SC_R_SHF, shift_amt);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 umin, umax;
+	u8 dst, src;
+
+	dst = insn->dst_reg * 2;
+	umin = meta->umin_src;
+	umax = meta->umax_src;
+	if (umin == umax)
+		return __ashr_imm(nfp_prog, dst, umin);
+
+	src = insn->src_reg * 2;
+	/* NOTE: the first insn will set both indirect shift amount (source A)
+	 * and signedness bit (MSB of result).
+	 */
+	emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+	emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+		       reg_b(dst), SHF_SC_R_SHF);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u8 dst = insn->dst_reg * 2;
+
+	return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
 	[BPF_ALU | BPF_DIV | BPF_K] =	div_imm,
 	[BPF_ALU | BPF_NEG] =		neg_reg,
 	[BPF_ALU | BPF_LSH | BPF_K] =	shl_imm,
+	[BPF_ALU | BPF_ARSH | BPF_X] =	ashr_reg,
+	[BPF_ALU | BPF_ARSH | BPF_K] =	ashr_imm,
 	[BPF_ALU | BPF_END | BPF_X] =	end_reg32,
 	[BPF_LD | BPF_IMM | BPF_DW] =	imm_ld8,
 	[BPF_LD | BPF_ABS | BPF_B] =	data_ld1,
--
2.7.4
[PATCH bpf-next 3/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Jiong Wang
---
 arch/s390/net/bpf_jit_comp.c | 12
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 	/*
 	 * BPF_ARSH
 	 */
+	case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+		/* sra %dst,%dst,0(%src) */
+		EMIT4_DISP(0x8a000000, dst_reg, src_reg, 0);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
 		/* srag %dst,%dst,0(%src) */
 		EMIT6_DISP_LH(0xeb000000, 0x000a, dst_reg, dst_reg, src_reg, 0);
 		break;
+	case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst) >>= imm */
+		if (imm == 0)
+			break;
+		/* sra %dst,imm(%r0) */
+		EMIT4_DISP(0x8a000000, dst_reg, REG_0, imm);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
 		if (imm == 0)
 			break;
--
2.7.4
[PATCH bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit:

  7891a87efc71 ("bpf: arsh is not supported in 32 bit alu thus reject it")

As explained in that commit message, this was because there was no
complete support for them in the interpreter and the various JIT
compilation back-ends. This patch set is a follow-up which completes the
missing bits.

This also paves the way for running bpf programs compiled with ALU32
instructions enabled (by specifying -mattr=+alu32 to LLVM), in which case
there are likely to be more BPF_ALU | BPF_ARSH insns that would trigger
the rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP; I need help with review
and test of the arch changes (mips/ppc/s390).

Note, there might be a merge conflict on the mips change, which is best
applied on top of:

  commit 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op")

which is on the mips-fixes branch at the moment.

Thanks.

Jiong Wang (7):
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: interpreter support BPF_ALU | BPF_ARSH
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h                 |  1 +
 arch/mips/include/uapi/asm/inst.h            |  1 +
 arch/mips/mm/uasm-micromips.c                |  1 +
 arch/mips/mm/uasm-mips.c                     |  1 +
 arch/mips/mm/uasm.c                          |  9 ++---
 arch/mips/net/ebpf_jit.c                     |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h        |  2 ++
 arch/powerpc/net/bpf_jit.h                   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c            |  6 ++++
 arch/s390/net/bpf_jit_comp.c                 | 12 ++++++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 ++++++++
 kernel/bpf/core.c                            | 52 ++++++++
 kernel/bpf/verifier.c                        |  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 +++---
 14 files changed, 137 insertions(+), 35 deletions(-)

--
2.7.4
[PATCH bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
This patch removes the rejection of BPF_ALU | BPF_ARSH, as it is now
supported by the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/verifier.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return -EINVAL;
 	}
 
-	if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-		verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-		return -EINVAL;
-	}
-
 	if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 	     opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
 		int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
--
2.7.4
Re: [RFC bpf-next 1/7] bpf: interpreter support BPF_ALU | BPF_ARSH
On 04/12/2018 20:10, David Miller wrote:
> From: Alexei Starovoitov
> Date: Tue, 4 Dec 2018 11:29:55 -0800
>> I guess sparc doesn't really have 32 subregisters. All registers are
>> considered 64-bit. It has 32-bit alu ops on 64-bit registers instead.
>
> Right.
>
> Anyways, sparc will require two instructions because of this, the 'sra'
> then a 'srl' by zero bits to clear the top 32-bits.
>
> I'll code up the sparc JIT part when this goes in.

Hmm, I have been going through all the JIT back-ends, and saw there is
do_alu32_trunc after jitting sra for BPF_ALU. Is that what's needed?

Regards,
Jiong
[RFC bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
This patch removes the rejection of BPF_ALU | BPF_ARSH, as it is now
supported by the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/verifier.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return -EINVAL;
 	}
 
-	if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-		verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-		return -EINVAL;
-	}
-
 	if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 	     opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
 		int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
--
2.7.4
[RFC bpf-next 3/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao
Cc: Sandipan Das
Signed-off-by: Jiong Wang
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h            | 4
 arch/powerpc/net/bpf_jit_comp64.c     | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW			0x7c000030
 #define PPC_INST_SLD			0x7c000036
 #define PPC_INST_SRW			0x7c000430
+#define PPC_INST_SRAW			0x7c000630
+#define PPC_INST_SRAWI			0x7c000670
 #define PPC_INST_SRD			0x7c000436
 #define PPC_INST_SRAD			0x7c000634
 #define PPC_INST_SRADI			0x7c000674

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)	EMIT(PPC_INST_SRW | ___PPC_RA(d) |	      \
 				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)	EMIT(PPC_INST_SRAW | ___PPC_RA(d) |	      \
+				     ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i)	EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |	      \
+				     ___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)	EMIT(PPC_INST_SRD | ___PPC_RA(d) |	      \
 				     ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)	EMIT(PPC_INST_SRAD | ___PPC_RA(d) |	      \

diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..c685b4f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
 			if (imm != 0)
 				PPC_SRDI(dst_reg, dst_reg, imm);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+			PPC_SRAW(dst_reg, dst_reg, src_reg);
+			break;
 		case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
 			PPC_SRAD(dst_reg, dst_reg, src_reg);
 			break;
+		case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+			PPC_SRAWI(dst_reg, dst_reg, imm);
+			break;
 		case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
 			if (imm != 0)
 				PPC_SRADI(dst_reg, dst_reg, imm);
--
2.7.4
[RFC bpf-next 2/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
Jitting of BPF_K is supported already, but not BPF_X. This patch complete the support for the latter on both MIPS and microMIPS. Cc: Paul Burton Cc: linux-m...@vger.kernel.org Signed-off-by: Jiong Wang --- arch/mips/include/asm/uasm.h | 1 + arch/mips/include/uapi/asm/inst.h | 1 + arch/mips/mm/uasm-micromips.c | 1 + arch/mips/mm/uasm-mips.c | 1 + arch/mips/mm/uasm.c | 9 + arch/mips/net/ebpf_jit.c | 4 6 files changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h index 59dae37..b1990dd 100644 --- a/arch/mips/include/asm/uasm.h +++ b/arch/mips/include/asm/uasm.h @@ -157,6 +157,7 @@ Ip_u2u1s3(_slti); Ip_u2u1s3(_sltiu); Ip_u3u1u2(_sltu); Ip_u2u1u3(_sra); +Ip_u3u2u1(_srav); Ip_u2u1u3(_srl); Ip_u3u2u1(_srlv); Ip_u3u1u2(_subu); diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h index 273ef58..40fbb5d 100644 --- a/arch/mips/include/uapi/asm/inst.h +++ b/arch/mips/include/uapi/asm/inst.h @@ -371,6 +371,7 @@ enum mm_32a_minor_op { mm_srl32_op = 0x040, mm_srlv32_op = 0x050, mm_sra_op = 0x080, + mm_srav_op = 0x090, mm_rotr_op = 0x0c0, mm_lwxs_op = 0x118, mm_addu32_op = 0x150, diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c index 24e5b0d..75ef904 100644 --- a/arch/mips/mm/uasm-micromips.c +++ b/arch/mips/mm/uasm-micromips.c @@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = { [insn_sltiu]= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM}, [insn_sltu] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | RD}, [insn_sra] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | RD}, + [insn_srav] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | RD}, [insn_srl] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | RD}, [insn_srlv] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS | RD}, [insn_rotr] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | RD}, diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c index 
60ceb93..6abe40f 100644 --- a/arch/mips/mm/uasm-mips.c +++ b/arch/mips/mm/uasm-mips.c @@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = { [insn_sltiu]= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM}, [insn_sltu] = {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD}, [insn_sra] = {M(spec_op, 0, 0, 0, 0, sra_op), RT | RD | RE}, + [insn_srav] = {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD}, [insn_srl] = {M(spec_op, 0, 0, 0, 0, srl_op), RT | RD | RE}, [insn_srlv] = {M(spec_op, 0, 0, 0, 0, srlv_op), RS | RT | RD}, [insn_subu] = {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD}, diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c index 57570c0..45b6264 100644 --- a/arch/mips/mm/uasm.c +++ b/arch/mips/mm/uasm.c @@ -61,10 +61,10 @@ enum opcode { insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor, insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb, insn_sc, insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv, - insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl, - insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp, - insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor, - insn_xori, insn_yield, + insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav, + insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, + insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, + insn_xor, insn_xori, insn_yield, insn_invalid /* insn_invalid must be last */ }; @@ -353,6 +353,7 @@ I_u2u1s3(_slti) I_u2u1s3(_sltiu) I_u3u1u2(_sltu) I_u2u1u3(_sra) +I_u3u2u1(_srav) I_u2u1u3(_srl) I_u3u2u1(_srlv) I_u2u1u3(_rotr) diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c index aeb7b1b..b16710a 100644 --- a/arch/mips/net/ebpf_jit.c +++ b/arch/mips/net/ebpf_jit.c @@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */ case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */ case BPF_ALU | BPF_RSH | BPF_X: /* 
ALU_REG */ + case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */ src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp); dst = ebpf_to_mips_reg(ctx, insn, dst_reg); if (src < 0 || dst < 0) @@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, case BPF_RSH: emit_instr(ctx, srlv, dst, dst, src); break; + case BPF_ARSH: + emit_instr(ctx, srav, dst, dst, src); +
[RFC bpf-next 5/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
BPF_X support needs indirect shift mode; please see the code comments for
details.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+	/* Set signedness bit (MSB of result). */
+	emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+	emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+		 SHF_SC_R_SHF, shift_amt);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u64 umin, umax;
+	u8 dst, src;
+
+	dst = insn->dst_reg * 2;
+	umin = meta->umin_src;
+	umax = meta->umax_src;
+	if (umin == umax)
+		return __ashr_imm(nfp_prog, dst, umin);
+
+	src = insn->src_reg * 2;
+	/* NOTE: the first insn will set both indirect shift amount (source A)
+	 * and signedness bit (MSB of result).
+	 */
+	emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+	emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+		       reg_b(dst), SHF_SC_R_SHF);
+	wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+	return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	const struct bpf_insn *insn = &meta->insn;
+	u8 dst = insn->dst_reg * 2;
+
+	return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
 	[BPF_ALU | BPF_DIV | BPF_K] =	div_imm,
 	[BPF_ALU | BPF_NEG] =		neg_reg,
 	[BPF_ALU | BPF_LSH | BPF_K] =	shl_imm,
+	[BPF_ALU | BPF_ARSH | BPF_X] =	ashr_reg,
+	[BPF_ALU | BPF_ARSH | BPF_K] =	ashr_imm,
 	[BPF_ALU | BPF_END | BPF_X] =	end_reg32,
 	[BPF_LD | BPF_IMM | BPF_DW] =	imm_ld8,
 	[BPF_LD | BPF_ABS | BPF_B] =	data_ld1,
--
2.7.4
[RFC bpf-next 1/7] bpf: interpreter support BPF_ALU | BPF_ARSH
This patch implements interpreting BPF_ALU | BPF_ARSH: do an arithmetic right shift on the low 32-bit sub-register, and zero the high 32 bits.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 kernel/bpf/core.c | 52 ++++++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f93ed66..36e31d8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 #define BPF_INSN_MAP(INSN_2, INSN_3)		\
 	/* 32 bit ALU operations. */		\
 	/*   Register based. */			\
-	INSN_3(ALU, ADD, X),			\
-	INSN_3(ALU, SUB, X),			\
-	INSN_3(ALU, AND, X),			\
-	INSN_3(ALU, OR, X),			\
-	INSN_3(ALU, LSH, X),			\
-	INSN_3(ALU, RSH, X),			\
-	INSN_3(ALU, XOR, X),			\
-	INSN_3(ALU, MUL, X),			\
-	INSN_3(ALU, MOV, X),			\
-	INSN_3(ALU, DIV, X),			\
-	INSN_3(ALU, MOD, X),			\
+	INSN_3(ALU, ADD, X),			 \
+	INSN_3(ALU, SUB, X),			 \
+	INSN_3(ALU, AND, X),			 \
+	INSN_3(ALU, OR, X),			 \
+	INSN_3(ALU, LSH, X),			 \
+	INSN_3(ALU, RSH, X),			 \
+	INSN_3(ALU, XOR, X),			 \
+	INSN_3(ALU, MUL, X),			 \
+	INSN_3(ALU, MOV, X),			 \
+	INSN_3(ALU, ARSH, X),			 \
+	INSN_3(ALU, DIV, X),			 \
+	INSN_3(ALU, MOD, X),			 \
 	INSN_2(ALU, NEG),			\
 	INSN_3(ALU, END, TO_BE),		\
 	INSN_3(ALU, END, TO_LE),		\
 	/*   Immediate based. */		\
-	INSN_3(ALU, ADD, K),			\
-	INSN_3(ALU, SUB, K),			\
-	INSN_3(ALU, AND, K),			\
-	INSN_3(ALU, OR, K),			\
-	INSN_3(ALU, LSH, K),			\
-	INSN_3(ALU, RSH, K),			\
-	INSN_3(ALU, XOR, K),			\
-	INSN_3(ALU, MUL, K),			\
-	INSN_3(ALU, MOV, K),			\
-	INSN_3(ALU, DIV, K),			\
-	INSN_3(ALU, MOD, K),			\
+	INSN_3(ALU, ADD, K),			 \
+	INSN_3(ALU, SUB, K),			 \
+	INSN_3(ALU, AND, K),			 \
+	INSN_3(ALU, OR, K),			 \
+	INSN_3(ALU, LSH, K),			 \
+	INSN_3(ALU, RSH, K),			 \
+	INSN_3(ALU, XOR, K),			 \
+	INSN_3(ALU, MUL, K),			 \
+	INSN_3(ALU, MOV, K),			 \
+	INSN_3(ALU, ARSH, K),			 \
+	INSN_3(ALU, DIV, K),			 \
+	INSN_3(ALU, MOD, K),			 \
 	/* 64 bit ALU operations. */		\
 	/*   Register based. */			\
 	INSN_3(ALU64, ADD, X),			\
@@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
 		DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
 		insn++;
 		CONT;
+	ALU_ARSH_X:
+		DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
+		CONT;
+	ALU_ARSH_K:
+		DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
+		CONT;
 	ALU64_ARSH_X:
 		(*(s64 *) &DST) >>= SRC;
 		CONT;
-- 
2.7.4
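The ALU32 ARSH semantics implemented by the interpreter above can be modeled in plain C. This is an illustrative sketch, not kernel code; the function name `alu32_arsh` is made up here. The expected values below match the selftest retvals added later in this series:

```c
#include <assert.h>
#include <stdint.h>

/* Model of ALU_ARSH_K / ALU_ARSH_X above: arithmetic right shift on the
 * low 32-bit sub-register, then zero the high 32 bits of the destination.
 */
static uint64_t alu32_arsh(uint64_t dst, uint32_t shift)
{
	int32_t lo = (int32_t)(uint32_t)dst;

	/* ">>" on a negative int32_t is implementation-defined in ISO C but
	 * arithmetic on all mainstream compilers, matching the kernel's
	 * (*(s32 *) &DST) >> SRC.  The mask to 31 keeps the C well-defined;
	 * the verifier rejects larger 32-bit shift amounts anyway.
	 */
	return (uint64_t)(uint32_t)(lo >> (shift & 31));
}
```

For example, `alu32_arsh(0x1122334485667788, 7)` shifts only the (negative) low word 0x85667788 and discards the high word entirely, which is exactly what the "arsh32 on imm 2" selftest checks.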
[RFC bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH
"arsh32 on imm" and "arsh32 on reg" are now accepted. Also added two new testcases to make sure arsh32 is not treated as arsh64 during interpretation or JIT code-gen; in that case the high bits would be shifted into the low half, which the testcases can catch.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 tools/testing/selftests/bpf/test_verifier.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index 18d0b7f..e7f1bc7 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -721,8 +721,18 @@ static struct bpf_test tests[] = {
 			BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
 			BPF_EXIT_INSN(),
 		},
-		.result = REJECT,
-		.errstr = "unknown opcode c4",
+		.result = ACCEPT,
+		.retval = 0,
+	},
+	{
+		"arsh32 on imm 2",
+		.insns = {
+			BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788),
+			BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.retval = -16069393,
 	},
 	{
 		"arsh32 on reg",
@@ -732,8 +742,19 @@ static struct bpf_test tests[] = {
 			BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
 			BPF_EXIT_INSN(),
 		},
-		.result = REJECT,
-		.errstr = "unknown opcode cc",
+		.result = ACCEPT,
+		.retval = 0,
+	},
+	{
+		"arsh32 on reg 2",
+		.insns = {
+			BPF_LD_IMM64(BPF_REG_0, 0x55667788),
+			BPF_MOV64_IMM(BPF_REG_1, 15),
+			BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+		.retval = 43724,
 	},
 	{
 		"arsh64 on imm",
-- 
2.7.4
[RFC bpf-next 4/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Jiong Wang
---
 arch/s390/net/bpf_jit_comp.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 	/*
 	 * BPF_ARSH
 	 */
+	case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+		/* sra %dst,%dst,0(%src) */
+		EMIT4_DISP(0x8a00, dst_reg, src_reg, 0);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
 		/* srag %dst,%dst,0(%src) */
 		EMIT6_DISP_LH(0xeb00, 0x000a, dst_reg, dst_reg, src_reg, 0);
 		break;
+	case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst) >>= imm */
+		if (imm == 0)
+			break;
+		/* sra %dst,imm(%r0) */
+		EMIT4_DISP(0x8a00, dst_reg, REG_0, imm);
+		EMIT_ZERO(dst_reg);
+		break;
 	case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
 		if (imm == 0)
 			break;
-- 
2.7.4
[RFC bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit 7891a87efc71 ("bpf: arsh is not supported in 32 bit alu thus reject it"). As explained in that commit message, this was because there was no complete support for them in the interpreter and the various JIT compilation back-ends. This patch set is a follow-up which completes the missing bits.

This also paves the way for running BPF programs compiled with ALU32 instructions enabled by specifying -mattr=+alu32 to LLVM, in which case there are likely to be more BPF_ALU | BPF_ARSH insns that would trigger the rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP; I would appreciate review and testing of the arch changes (mips/ppc/s390).

Note, to avoid merge conflict, the mips change needs to be applied on top of commit 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op"); it is on the mips-fixes branch at the moment.

Thanks.

Jiong Wang (7):
  bpf: interpreter support BPF_ALU | BPF_ARSH
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h                 |  1 +
 arch/mips/include/uapi/asm/inst.h            |  1 +
 arch/mips/mm/uasm-micromips.c                |  1 +
 arch/mips/mm/uasm-mips.c                     |  1 +
 arch/mips/mm/uasm.c                          |  9 ++---
 arch/mips/net/ebpf_jit.c                     |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h        |  2 ++
 arch/powerpc/net/bpf_jit.h                   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c            |  6 ++++
 arch/s390/net/bpf_jit_comp.c                 | 12 +++++++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 ++++++++++++++++++++++
 kernel/bpf/core.c                            | 52 ++++++++++++++------------
 kernel/bpf/verifier.c                        |  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 +++++++++++---
 14 files changed, 137 insertions(+), 35 deletions(-)
-- 
2.7.4
[PATCH v2 bpf] mips: bpf: fix encoding bug for mm_srlv32_op
For micro-mips, srlv inside the POOL32A encoding space should use the 0x50 sub-opcode, NOT 0x90. Some early versions of the ISA doc describe the encoding as 0x90 for both srlv and srav; this looks to me like a typo. I checked the binutils libopcodes implementation, which uses 0x50 for srlv and 0x90 for srav.

v1->v2:
 - Keep mm_srlv32_op sorted by value.

Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
Cc: Markos Chandras
Cc: Paul Burton
Cc: linux-m...@vger.kernel.org
Acked-by: Jakub Kicinski
Acked-by: Song Liu
Signed-off-by: Jiong Wang
---
 arch/mips/include/uapi/asm/inst.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index c05dcf5..273ef58 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -369,8 +369,8 @@ enum mm_32a_minor_op {
 	mm_ext_op = 0x02c,
 	mm_pool32axf_op = 0x03c,
 	mm_srl32_op = 0x040,
+	mm_srlv32_op = 0x050,
 	mm_sra_op = 0x080,
-	mm_srlv32_op = 0x090,
 	mm_rotr_op = 0x0c0,
 	mm_lwxs_op = 0x118,
 	mm_addu32_op = 0x150,
-- 
2.7.4
Re: [PATCH bpf] mips: bpf: fix encoding bug for mm_srlv32_op
On 03/12/2018 22:04, Paul Burton wrote:
> Hi Jiong,
>
> On Sat, Dec 01, 2018 at 04:10:01AM -0500, Jiong Wang wrote:
>> For micro-mips, srlv inside POOL32A encoding space should use 0x50
>> sub-opcode, NOT 0x90. Some early version ISA doc describes the encoding
>> as 0x90 for both srlv and srav, this looks to me was a typo. I checked
>> Binutils libopcode implementation which is using 0x50 for srlv and 0x90
>> for srav.
>
> Are you aware of documentation that gets this right?

No, I am not aware of any.

> Looking at the latest microMIPS spec I have available (6.05) it looks
> like this is still documented incorrectly. I'll pass this along to the
> architecture team.
>
>> Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
>> CC: Markos Chandras
>> CC: Paul Burton
>> Acked-by: Jakub Kicinski
>> Signed-off-by: Jiong Wang
>> ---
>>  arch/mips/include/uapi/asm/inst.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
>> index c05dcf5..80f35e7 100644
>> --- a/arch/mips/include/uapi/asm/inst.h
>> +++ b/arch/mips/include/uapi/asm/inst.h
>> @@ -370,7 +370,7 @@ enum mm_32a_minor_op {
>>  	mm_pool32axf_op = 0x03c,
>>  	mm_srl32_op = 0x040,
>>  	mm_sra_op = 0x080,
>> -	mm_srlv32_op = 0x090,
>> +	mm_srlv32_op = 0x050,
>>  	mm_rotr_op = 0x0c0,
>>  	mm_lwxs_op = 0x118,
>>  	mm_addu32_op = 0x150,
>
> Could you also move it above mm_sra_op to keep them sorted by value?
> When submitting v2 please also copy the linux-mips mailing list -
> linux-m...@vger.kernel.org.

Thanks, will do.

Regards,
Jiong
[PATCH bpf] mips: bpf: fix encoding bug for mm_srlv32_op
For micro-mips, srlv inside the POOL32A encoding space should use the 0x50 sub-opcode, NOT 0x90. Some early versions of the ISA doc describe the encoding as 0x90 for both srlv and srav; this looks to me like a typo. I checked the binutils libopcodes implementation, which uses 0x50 for srlv and 0x90 for srav.

Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
CC: Markos Chandras
CC: Paul Burton
Acked-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 arch/mips/include/uapi/asm/inst.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index c05dcf5..80f35e7 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -370,7 +370,7 @@ enum mm_32a_minor_op {
 	mm_pool32axf_op = 0x03c,
 	mm_srl32_op = 0x040,
 	mm_sra_op = 0x080,
-	mm_srlv32_op = 0x090,
+	mm_srlv32_op = 0x050,
 	mm_rotr_op = 0x0c0,
 	mm_lwxs_op = 0x118,
 	mm_addu32_op = 0x150,
-- 
2.7.4
[PATCH bpf-next 0/2] bpf: offer maximum packet offset info
The maximum packet offset accessed by one BPF program is useful information: sometimes there can be a packet split, and for some reasons (for example performance) we may want to reject a BPF program if its maximum packet access would trigger such a split. Normally, the MTU value is treated as the maximum packet size, but a BPF program does not always access the whole packet; it may only access the head portion of the data. We can let the verifier calculate the maximum packet offset ever used and record it inside the prog auxiliary information structure as a new field, "max_pkt_offset".

Jiong Wang (2):
  bpf: let verifier to calculate and record max_pkt_offset
  nfp: bpf: relax prog rejection through max_pkt_offset

 drivers/net/ethernet/netronome/nfp/bpf/offload.c |  9 +++++----
 include/linux/bpf.h                              |  1 +
 kernel/bpf/verifier.c                            | 12 ++++++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)
-- 
2.7.4
[PATCH bpf-next 2/2] nfp: bpf: relax prog rejection through max_pkt_offset
NFP is refusing to offload programs whenever the MTU is set to a value larger than the max packet bytes that fit in NFP Cluster Target Memory (CTM). However, an eBPF program doesn't always need to access the whole packet data. The verifier now calculates the maximum direct packet access (DPA) offset and keeps it in max_pkt_offset inside the prog auxiliary information. This patch relaxes prog rejection based on max_pkt_offset.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 drivers/net/ethernet/netronome/nfp/bpf/offload.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index ba8ceed..07bdc1f 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -489,14 +489,15 @@ nfp_net_bpf_load(struct nfp_net *nn, struct bpf_prog *prog,
 		 struct netlink_ext_ack *extack)
 {
 	struct nfp_prog *nfp_prog = prog->aux->offload->dev_priv;
-	unsigned int max_mtu, max_stack, max_prog_len;
+	unsigned int fw_mtu, pkt_off, max_stack, max_prog_len;
 	dma_addr_t dma_addr;
 	void *img;
 	int err;
 
-	max_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
-	if (max_mtu < nn->dp.netdev->mtu) {
-		NL_SET_ERR_MSG_MOD(extack, "BPF offload not supported with MTU larger than HW packet split boundary");
+	fw_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
+	pkt_off = min(prog->aux->max_pkt_offset, nn->dp.netdev->mtu);
+	if (fw_mtu < pkt_off) {
+		NL_SET_ERR_MSG_MOD(extack, "BPF offload not supported with potential packet access beyond HW packet split boundary");
 		return -EOPNOTSUPP;
 	}
-- 
2.7.4
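The relaxed offload check above can be summarized in a small standalone model. This is a toy sketch for illustration only (the function name `nfp_prog_fits` and its standalone form are made up here, not driver API); it mirrors the arithmetic in the patch: the firmware reports CTM budget in units of 64 bytes with 32 bytes reserved, and the program can never access past the smaller of its max DPA offset and the MTU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the relaxed check: accept iff the bytes the program can
 * actually touch fit within the firmware's packet split boundary.
 */
static bool nfp_prog_fits(uint32_t fw_mtu_reg, uint32_t netdev_mtu,
			  uint32_t max_pkt_offset)
{
	uint32_t fw_mtu = fw_mtu_reg * 64 - 32;	/* NFP_NET_CFG_BPF_INL_MTU units */
	uint32_t pkt_off = max_pkt_offset < netdev_mtu ? max_pkt_offset
						       : netdev_mtu;

	return fw_mtu >= pkt_off;
}
```

With the old check, any MTU above fw_mtu caused rejection; with the relaxed one, a header-only program is offloadable even on a jumbo-MTU netdev.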
[PATCH bpf-next 1/2] bpf: let verifier to calculate and record max_pkt_offset
In check_packet_access, update max_pkt_offset after the offset has passed __check_packet_access. It should be safe to use u32 for max_pkt_offset as explained in the code comment. Also, when there is a tail call, the max_pkt_offset of the called program is unknown, so max_pkt_offset is conservatively set to MAX_PACKET_OFF in that case.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 33014ae..b6a296e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -293,6 +293,7 @@ struct bpf_prog_aux {
 	atomic_t refcnt;
 	u32 used_map_cnt;
 	u32 max_ctx_offset;
+	u32 max_pkt_offset;
 	u32 stack_depth;
 	u32 id;
 	u32 func_cnt;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 98fa0be..6a248d8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1452,6 +1452,17 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
 		verbose(env, "R%d offset is outside of the packet\n", regno);
 		return err;
 	}
+
+	/* __check_packet_access has made sure "off + size - 1" is within u16.
+	 * reg->umax_value can't be bigger than MAX_PACKET_OFF which is 0xffff,
+	 * otherwise find_good_pkt_pointers would have refused to set range info
+	 * and __check_packet_access would have rejected this pkt access.
+	 * Therefore, "off + reg->umax_value + size - 1" won't overflow u32.
+	 */
+	env->prog->aux->max_pkt_offset =
+		max_t(u32, env->prog->aux->max_pkt_offset,
+		      off + reg->umax_value + size - 1);
+
 	return err;
 }
 
@@ -6128,6 +6139,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 			 */
 			prog->cb_access = 1;
 			env->prog->aux->stack_depth = MAX_BPF_STACK;
+			env->prog->aux->max_pkt_offset = MAX_PACKET_OFF;
 
 			/* mark bpf_tail_call as different opcode to avoid
 			 * conditional branch in the interpeter for every normal
-- 
2.7.4
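The bookkeeping and the overflow argument above can be sketched in isolation. This is an illustration under the same assumptions the comment states (off has passed the u16 check, umax_value is capped at MAX_PACKET_OFF = 0xffff), with `update_max_pkt_offset` as a made-up standalone name:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PACKET_OFF 0xffff	/* same bound the verifier uses */

/* Sketch of the max_pkt_offset update: both off and umax_value are
 * bounded by 0xffff, so off + umax_value + size - 1 comfortably fits
 * in u32; no overflow handling is needed.
 */
static uint32_t update_max_pkt_offset(uint32_t cur, uint32_t off,
				      uint64_t umax_value, uint32_t size)
{
	uint32_t end = off + (uint32_t)umax_value + size - 1;

	return end > cur ? end : cur;	/* max_t(u32, cur, end) */
}
```

Even the worst case, off = 0xffff with umax_value = 0xffff and an 8-byte access, yields 131077, far below the u32 limit.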
Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
On 03/10/2018 17:53, Jiong Wang wrote:
> On 03/10/2018 16:59, Alexei Starovoitov wrote:
>> On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:
>>> Now this hasn't happened. I am still debugging the root cause, but kind
>>> of feel "64-bit" attribute propagation is the issue, it seems to me it
>>> can't be nicely integrated into the existing register read/write
>>> propagation infrastructure.
>>
>> may be share your patches that modify the liveness propagation?
>
> OK, I will share it after some clean up.

Have done more experiments on the algo, and finally got something that passes my local tests. The bpf selftests passed as well, except for several test_verifier failures due to some corner cases I guess; will have a look later.

In these tests, x86-64 is using the 32-bit information. In the insn loop inside sanitize_dead_code, once an insn is not marked as 64-bit and it is ALU64, it is changed to ALU32. When enabling tracing using printk, I could see quite a few ALU64 instructions really are rewritten into ALU32, so the fact that tests like test_l4lb still run OK looks like a positive signal for correctness.

The algo separates the problem into two steps.

- First, assume there is no code path prune and all code paths will be walked through. In this case, if we can propagate 64-bit info backward along the use-def chains for all paths walked, an insn will be marked as 64-bit correctly. This step requires building the use-def chain, which is done in the following way:

  1. each insn could have two explicit uses, so add two chain fields in bpf_insn_aux_data.
  2. we need a finer enum to describe register use, so split SRC_OP into SRC_OP_0, SRC_OP64_0, SRC_OP_1, SRC_OP64_1 to indicate whether the source is the first or second source and whether it is a 64-bit source.
  3. create the chain at check_reg_arg, which exactly covers all register use sites. The function that creates the chain is link_reg_to_def.
  4. when creating the chain, if a source is a 64-bit source, also propagate the information backward along the use-def chain straight away. This is done in mark_reg_read, which further calls the new function "mark_insn_64bit" which does the real job. "mark_insn_64bit" fetches the def insn for the 64-bit source, and further marks the def insns of its sources as 64-bit. This is repeated until the whole use-def chain is consumed.
  5. given the use-def chain described above, if there is no code path prune, an insn must have been marked as 64-bit when its result has a 64-bit use.
  6. a helper call causes implicit reg use and must be conservatively treated as a 64-bit use; a bpf-to-bpf function call has its insns connected by use-def, so it doesn't need that conservative assumption.

- Second, to handle code path prune, define new liveness enums REG_LIVE_READ64 and REG_LIVE_UNIQUE_WRITTEN. The latter will only be set if reg_arg_type is the new U_DST_OP or U_DST_OP_NO_MARK, and REG_LIVE_READ64 will be set if one 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN.

  The thought is that 64-bit use info will only be screened off if the dst register is unique among all register operands, meaning not the same as any source. For example, insn 18 below will screen off r4 64-bit propagation.

    17: r3 += r7
    18: r4 = 1
    19: *(u64 *)(r10 - 32) = r3
    20: *(u64 *)(r10 - 40) = r4

  So, U_DST_OP/U_DST_OP_NO_MARK have been introduced to differentiate from DST_OP/DST_OP_NO_MARK. Inside check_reg_arg, checks are done on dst_reg, and U_DST_* is passed as reg_arg_type when it is unique. U_DST_* then sets liveness to REG_LIVE_UNIQUE_WRITTEN. Inside mark_reg_read, if one 64-bit read is not screened off by REG_LIVE_UNIQUE_WRITTEN, then REG_LIVE_READ64 is set in the reg_state.

  REG_LIVE_READ64 further triggers propagating downstream 64-bit uses from the pruned paths into the current path inside propagate_liveness when path prune happens. Compared with propagating REG_LIVE_READ, propagating REG_LIVE_READ64 needs more work, because one 64-bit read could indicate that more than one register is 64-bit. For example, insn 19 above shows r3 is a 64-bit source, so its define at insn 17 is 64-bit; then all sources of insn 17 must be 64-bit, meaning both r3 and r7 are 64-bit. Therefore, REG_LIVE_READ64 needs to be propagated on both r3 and r7 upward along the register parentage chain. While walking the def-use chain, we record all such affected reg_states, and propagate REG_LIVE_READ64 for all of them. This logic is done inside mark_insn_64bit as well.

- In short, the implementation treats propagation of the new 64-bit info (REG_LIVE_READ64) the same as REG_LIVE_READ propagation. REG_LIVE_READ triggers more path prune during state equality comparison, while REG_LIVE_READ64 triggers 64-bit insn marking.
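The backward "64-bit" marking over use-def chains described above can be modeled with a small toy. This is an illustration of the algorithm only, not verifier code; `toy_insn`, `def[2]`, and `full_ref_def` mirror the proposed bpf_insn_aux_data fields, and the instruction indices reuse the r3/r7 example from the thread (defs of r3 and r7 are placed at hypothetical insns 10 and 11):

```c
#include <stdbool.h>

#define MAX_INSN 32

struct toy_insn {
	int def[2];		/* use->def chain: defining insn of src 0/1, -1 if none */
	bool full_ref_def;	/* true once the insn is known to need full 64 bits */
};

/* Mark insn idx as 64-bit, then walk its use->def chain: all sources of
 * a 64-bit insn are 64-bit too, so their defining insns get marked as
 * well, recursively, until the chain is consumed.
 */
static void mark_insn_64bit(struct toy_insn *insns, int idx)
{
	if (idx < 0 || insns[idx].full_ref_def)
		return;
	insns[idx].full_ref_def = true;
	mark_insn_64bit(insns, insns[idx].def[0]);
	mark_insn_64bit(insns, insns[idx].def[1]);
}

/* Example from the thread:
 *   17: r3 += r7                  (r3 defined at 10, r7 at 11)
 *   18: r4 = 1
 *   19: *(u64 *)(r10 - 32) = r3   -> 64-bit use of insn 17's result
 *   20: *(u64 *)(r10 - 40) = r4   -> 64-bit use of insn 18's result
 */
static void toy_run(struct toy_insn *insns)
{
	for (int i = 0; i < MAX_INSN; i++) {
		insns[i].def[0] = insns[i].def[1] = -1;
		insns[i].full_ref_def = false;
	}
	insns[17].def[0] = 10;		/* r3's previous def */
	insns[17].def[1] = 11;		/* r7's def */
	mark_insn_64bit(insns, 17);	/* triggered by the 64-bit store at 19 */
	mark_insn_64bit(insns, 18);	/* triggered by the 64-bit store at 20 */
}
```

After the run, insns 17, 10, and 11 are all marked, which is exactly the "both r3 and r7 must propagate up" behavior discussed with Alexei; insn 18 is marked but has no sources, so propagation stops there.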
Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
On 03/10/2018 16:59, Alexei Starovoitov wrote:
> On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:
>> Now this hasn't happened. I am still debugging the root cause, but kind
>> of feel "64-bit" attribute propagation is the issue, it seems to me it
>> can't be nicely integrated into the existing register read/write
>> propagation infrastructure.
>
> may be share your patches that modify the liveness propagation?

OK, I will share it after some clean up.

>> For example, for a slightly more complex sequence which is composed of
>> three states:
>>
>>   State A
>>   ...
>>   10: r6 = *(u32 *)(r10 - 4)
>>   11: r7 = *(u32 *)(r10 - 8)
>>   12: *(u64 *)(r10 - 16) = r6
>>   13: *(u64 *)(r10 - 24) = r7
>>
>>   State B
>>   14: r6 += 1
>>   15: r7 += r6
>>   16: *(u32 *)(r10 - 28) = r7
>>
>>   State C
>>   ...
>>   17: r3 += r7
>>   18: r4 = 1
>>   19: *(u64 *)(r10 - 32) = r3
>>   20: *(u64 *)(r10 - 40) = r4
>>
>> State A is parent of state B which is parent of state C. Inside state C,
>> at insn 20, r4 is a 64-bit read/use, so its define at 18 is marked as
>> "64-bit". There is no register source at 18, so "64-bit" attribute
>> propagation is stopped. Then at insn 19, r3 is a 64-bit read/use, so its
>> define at 17 is marked as "64-bit" read/use. Insn 17 has two register
>> sources, r3 and r7; they become "64-bit" now, and their definitions
>> should be marked as "64-bit". Now if the definition of r3 or r7 comes
>> from the parent state, then the parent
>
> the definition of r3 _and_ r7 ... both need to propagate up with your
> algo, right?

Yes, all sources of insn 17, both r3 and r7.

>> state should receive a "REG_LIVE_READ64". This is necessary if later
>> another path reaches state C and triggers path prune, in which case that
>> path should know there is "64-bit" use inside state C on some registers,
>> and should use this information to mark "64-bit" insns. If the
>> definition of r3 or r7 is still inside state C, we need to keep walking
>> up the instruction sequence, and propagate the "64-bit" attribute upward
>> until it goes beyond state C.
>>
>> The above propagation logic is quite different from the existing
>> register read/write propagation. For the latter, a write just screens
>> off all following reads, and a read propagates directly to its parent if
>> there is no previous write; no instruction analysis is required.
>
> correct. with such algo REG_LIVE_WRITTEN shouldn't be screening the
> propagation. I think the patches will discuss the algo.
>
> Also I think the initial state of 'everything is 32-bit safe' and making
> marks to enforce 64-bit-ness is a dangerous algorithmic choice.

Indeed, and I am actually thinking the same thing ...

> Why not start at a safer state where everything is 64-bit and work
> backward to find out which ones can be 32-bit? That would be a safer
> algo in case there are issues with bits like you described above.

... but I failed to find an algo that works from an initial state of "everything is 64-bit". I haven't figured out a way to check that all uses of one definition are 32-bit on all possible code paths, which looks to me is a must for an insn to be marked as 32-bit safe.
Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
On 28/09/2018 14:36, Edward Cree wrote:
> On 26/09/18 23:16, Jiong Wang wrote:
>> On 22/08/2018 20:00, Edward Cree wrote:
>>> In the future this idea may be extended to form use-def chains.
>>
>> 1. instruction level use->def chain
>>
>>    - new use->def chains for each instruction. one eBPF insn could have
>>      two uses at maximum.
>
> I was thinking of something a lot weaker/simpler, just making
>   ld rX, rY
> copy rY.parent into rX.parent and not read-mark rY (whereas actual
> arithmetic, pointer deref etc. would still create read marks).

Thanks for the feedback Edward.

> But what you've described sounds interesting; perhaps it would also
> help later with loop-variable handling?

Haven't considered how to use this for loop-variable handling; I guess you mean applying what I have described to your previous loop detection RFC? I will look into your RFC later. At the moment the design of the use->def chain is mainly to optimize 32-bit code-gen.

I was about to be satisfied with a local implementation and to share it to the ML for further discussion. However, when manually checking the optimization result on a testcase of medium size (~1000 eBPF insns) and proper complexity (making sure path prunes etc. are triggered inside the verifier), I found the code-gen doesn't meet my expectation. For example, for the following sequence, insn 25 should operate on the full 64 bits, but I found it is marked as 32-bit safe.

  25: r7 = 1
  26: if r4 > r8 goto +1200
  27: r1 = *(u8 *)(r1 + 0)
  28: r1 &= 15
  29: r7 = 1
  ...
  L:
  1227: r0 = r7
  1228: exit

As described in the previous email, the algorithm assumes all insns are 32-bit safe first, then starts to mark insns back to "64-bit" whenever a 64-bit use is found for an insn. Insn 25 defines r7, which is used at 1227 where its value is propagated to r0, and then r0 is implicitly used at insn 1228 as it is an exit from the main function to external.

For the above example, as we don't know the external use of r0 at 1228 (exit from main to external), r0 is treated as a 64-bit implicit use. The define is at 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should propagate to source register operands through register moves and arithmetic, so r7 at insn 1227 is a "64-bit" use and should make its defining instruction, insn 25, marked as "64-bit". This is my thinking of how insn 25 should be marked. Now this hasn't happened. I am still debugging the root cause, but kind of feel "64-bit" attribute propagation is the issue; it seems to me it can't be nicely integrated into the existing register read/write propagation infrastructure.

For example, for a slightly more complex sequence which is composed of three states:

  State A
  ...
  10: r6 = *(u32 *)(r10 - 4)
  11: r7 = *(u32 *)(r10 - 8)
  12: *(u64 *)(r10 - 16) = r6
  13: *(u64 *)(r10 - 24) = r7

  State B
  14: r6 += 1
  15: r7 += r6
  16: *(u32 *)(r10 - 28) = r7

  State C
  ...
  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

State A is the parent of state B, which is the parent of state C. Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is marked as "64-bit". There is no register source at 18, so "64-bit" attribute propagation is stopped. Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as "64-bit" read/use. Insn 17 has two register sources, r3 and r7; they become "64-bit" now, and their definitions should be marked as "64-bit". Now if the definition of r3 or r7 comes from the parent state, then the parent state should receive a "REG_LIVE_READ64". This is necessary if later another path reaches state C and triggers path prune, in which case that path should know there is "64-bit" use inside state C on some registers, and should use this information to mark "64-bit" insns.
Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification
On 22/08/2018 20:00, Edward Cree wrote:
> The first patch is a simplification of register liveness tracking by
> using a separate parentage chain for each register and stack slot, thus
> avoiding the need for logic to handle callee-saved registers when
> applying read marks. In the future this idea may be extended to form
> use-def chains.

Interesting. This could potentially help efficient code-gen for 32-bit architectures. I have been thinking about some implementations along this line and would like to have a discussion.

Program description
===================

It would be good if we could know that one instruction is safe to operate on the low 32-bit sub-register only. If this is true:

  - 32-bit arches could save some code-gen.
  - 64-bit arches could utilize 32-bit sub-register instructions.

Algorithm
=========

Based on the current verifier code-path-walker, it looks to me we could get 32-bit safety information using the following algorithm:

  1. assume all instructions operate on 32-bit sub-registers initially.
  2. link each insn to the insns which define its uses.
  3. for a register use, if it is a 64-bit write back to memory, or a comparison-then-jump based on a 64-bit value, or a right shift, then consider the use a 64-bit use; all its def insns and their parent def insns should then be marked as needing the full 64-bit operation.
  4. currently, a register read (REG_LIVE_READ) is propagated to the parent state when path prune happens, and REG_LIVE_WRITTEN screens off such propagation. We need to propagate the 64-bit mark in a similar way, but REG_LIVE_WRITTEN shouldn't screen off backward 64-bit mark propagation if the written reg is also used as a source reg in the same instruction (this has raised REG_LIVE_READ propagation, but that does not necessarily indicate a 64-bit read).

Implementation
==============

And it seems there could be an implementation based on the current liveness tracking infrastructure without dramatic change.

1. instruction level use->def chain

   - a new active_defs array for each call frame to record which insn the active define of each register comes from. The callee copies active_defs from the caller for argument registers only, and copies the active_def of R0 back to the caller on exit.

       struct bpf_func_state {
               ...
               s16 active_defs[__MAX_BPF_REG];

   - new use->def chains for each instruction; one eBPF insn could have two uses at maximum. Also one new boolean, "full_ref_def", is added to keep the final 32-bit safety information; it will be true if the instruction needs to operate on the full 64 bits.

       struct bpf_insn_aux_data {
               ...
               struct {
                       s16 def[2];
                       bool full_ref_def;
               };

2. Inside mark_reg_read, split SRC_OP into SRC0_OP/SRC1_OP/SRC0_OP_64/SRC1_OP_64 to indicate whether a register read is from the first or second use slot of the instruction, and whether the read is of the full 64 bits (which will only be true if the read is for a 64-bit write back to memory, a 64-bit comparison-then-jump, or a right shift implying 64 bits). Build the use->def chain for any read, but do 64-bit backward propagation for 64-bit reads only. The propagation sets full_ref_def to true for the def and the parent defs of the 64-bit read.

3. Need new bpf_reg_liveness enums: REG_LIVE_READ64 to indicate there is 64-bit access to a register in the pruned path, and REG_LIVE_WRITTEN64 to indicate a write that is REG_LIVE_WRITTEN but should not screen off backward propagation of 64-bit read info.

       #define REG_LIVE_NONE       0
       #define REG_LIVE_READ       BIT(0)
       #define REG_LIVE_READ64     BIT(1)
       #define REG_LIVE_WRITTEN    BIT(2)
       #define REG_LIVE_WRITTEN64  BIT(3)

I might have missed something in the above thoughts; I would appreciate reviews and suggestions. I will also send out some concrete patches later.

Regards,
Jiong
Re: [PATCH bpf-next 7/7] nfp: bpf: migrate to advanced reciprocal divide in reciprocal_div.h
On 26/06/2018 21:59, Jakub Kicinski wrote: On Sun, 24 Jun 2018 20:54:21 -0700, Jakub Kicinski wrote: +* NOTE: because we are using "reciprocal_value_adv" which doesn't +* support dividend with MSB set, so we need to JIT separate NFP +* sequence to handle such case. It could be a simple sequence if there +* is conditional move, however there isn't for NFP. So, we don't bother +* generating compare-if-set-branch sequence by rejecting the program +* straight away when the u32 dividend has MSB set. Divide by such a +* large constant would be rare in practice. Also, the programmer could +* simply rewrite it as "result = divisor >= the_const". Thinking about this again, can we just use carry bit? Good catch, yes we can. The code may end up shorter than the explanation why we don't support that case :P immed[c, 0] alu[--, a, -, b] alu[c, c, +carry, 0] eBPF input will be "a = a / b", given "immed" doesn't affect carry bit, I'd reorder the sequence so we only need one tmp register for holding "b" who is constant. alu[--, a, -, b] immed[b, 0] alu[a, b, +carry, 0] Thanks. Regards, Jiong Should be equivalent to: c = a >= b (Thanks to Edwin for double-checking the carry semantics.)
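The carry trick discussed above (subtract, then add-with-carry into a zeroed register to compute `c = a >= b`) can be checked with a small C model. This is an illustrative sketch, not NFP microcode; `nfp_ge_via_carry` is a name made up here, and the carry is modeled the standard way, as the carry out of `a + ~b + 1`:

```c
#include <assert.h>
#include <stdint.h>

/* Model of:
 *   alu[--, a, -, b]        ; sets carry iff no borrow, i.e. a >= b
 *   immed[c, 0]
 *   alu[c, c, +carry, 0]    ; c = 0 + 0 + carry
 */
static uint32_t nfp_ge_via_carry(uint32_t a, uint32_t b)
{
	/* a - b computed as a + ~b + 1; bit 32 of the widened sum is the
	 * carry out, set exactly when a >= b.
	 */
	uint64_t sum = (uint64_t)a + (uint64_t)(uint32_t)~b + 1;
	uint32_t carry = (uint32_t)(sum >> 32);

	return 0 + carry;
}
```

So the three-instruction NFP sequence really does compute `a >= b`, replacing the rejected MSB-set reciprocal-divide case.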
Re: [PATCH bpf-next 2/7] lib: reciprocal_div: implement the improved algorithm on the paper mentioned
On Tue, Jun 26, 2018 at 7:21 AM, Song Liu wrote:
> On Sun, Jun 24, 2018 at 8:54 PM, Jakub Kicinski wrote:
>> From: Jiong Wang
>> +
>> +struct reciprocal_value_adv reciprocal_value_adv(u32 d, u8 prec)
>> +{
>> +       struct reciprocal_value_adv R;
>> +       u32 l, post_shift;
>> +       u64 mhigh, mlow;
>> +
>> +       l = fls(d - 1);
>> +       post_shift = l;
>> +       /* NOTE: mlow/mhigh could overflow u64 when l == 32 which means d has
>> +        * MSB set. This case needs to be handled before calling
>> +        * "reciprocal_value_adv", please see the comment at
>> +        * include/linux/reciprocal_div.h.
>> +        */
>
> Shall we handle the l == 32 case better? I guess the concern here is
> that the extra handling may slow down the fast path?

The implementation of "reciprocal_value_adv" doesn't handle l == 32, as
doing so would make the code more complex. As described in the pseudo
code about how to call "reciprocal_value_adv" in
include/linux/reciprocal_div.h, l == 32 means the MSB of the dividend is
set, so the result of the unsigned division could only be 0 or 1, and
the divide result can easily be obtained by a comparison, then
conditionally moving 0 or 1 into the result.

> If that's the case, we should at least add a WARNING on the slow path.

OK, I will add a pr_warn inside "reciprocal_value_adv" when l == 32 is
triggered.

Thanks,
Jiong
Re: [RFC bpf-next 04/10] bpf: cfg: detect loop use domination information
On 10/05/2018 19:17, John Fastabend wrote: On 05/07/2018 03:22 AM, Jiong Wang wrote: If one bb is dominating its predecessor, then there is loop. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 22 ++ kernel/bpf/cfg.h | 1 + kernel/bpf/verifier.c | 8 3 files changed, 31 insertions(+) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index b50937a..90692e4 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -568,6 +568,28 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog) return ret; } +bool subprog_has_loop(struct bpf_subprog_info *subprog) +{ + int lane_len = BITS_TO_LONGS(subprog->bb_num - 2); + struct list_head *bb_list = >bbs; + struct bb_node *bb, *entry_bb; + struct edge_node *e; + + entry_bb = entry_bb(bb_list); + bb = bb_next(entry_bb); + list_for_each_entry_from(bb, _bb(bb_list)->l, l) + list_for_each_entry(e, >e_prevs, l) { + struct bb_node *latch = e->src; + + if (latch != entry_bb && + test_bit(bb->idx, +subprog->dtree + latch->idx * lane_len)) + return true; + } + + return false; +} + Because we are using this to guard against loops we need to detect all loops not just reducible loops. And because (assuming my understanding is correct) Tarjan's algorithm will only detect all loops when the graph is reducible we need additional tests. Hi John, Yes, the current DOM based loop detection can't detect irreducible loop. And I feel both the first and second approaches you listed below are good, might worth implementing both and measure which one is with less overhead. I could give a try on approach 2. The alg given by Eric Stoltz might be a good choice (https://compilers.iecc.com/comparch/article/94-01-053). We have dom info now, so could do another DFS, and see if the head of one back-edge doesn't dom the tail that there is multiple entries to the loop. I guess the first approach is with less overhead as there is no existing DFS pass to reuse after dom build that we need a new pass for approach 2. 
Regards,
Jiong

> There are a couple options to fix this with varying levels of
> complexity (I'm using this to build loop info structures to find
> induction variables and show termination).
>
> After the loop structures are built we could search for any back-edges
> not in valid loops. This would be similar to the existing back-edge
> detection code but with an extra check to allow edges that have been
> validated. I would need to check that this doesn't have any escapes
> before actually proposing it though.
>
> The other method would be to properly test for reducibility using one
> of the algorithms for this. I think the most intuitive is to remove
> back-edges and test that the graph is acyclic. This would be run
> before the dom tree is built. This is IMO what we should do; it seems
> the most "correct" way to do this.
>
> The most complex would be to handle irreducible programs using some of
> the more complex methods. I really don't think this is necessary, but
> in theory at least we could use something like the Havlak-Tarjan
> algorithm and allow some programs with irreducible loops. This is
> likely overkill, especially in a first iteration.
>
> Here is a sample that fails without this series, using the original
> back-edge detection, but is allowed with this patch:
>
> SEC("classifier_tc_mark")
> int _tc_mark(struct __sk_buff *ctx)
> {
>         void *data      = (void *)(unsigned long)ctx->data;
>         void *data_end  = (void *)(unsigned long)ctx->data_end;
>         void *data_meta = (void *)(unsigned long)ctx->data_meta;
>         struct meta_info *meta = data_meta;
>         volatile int mark = ctx->mark;
>
>         mark += 1;
>
>         if (meta + 1 > data) {
> B:
>                 mark += 2;
>                 if (mark < ctx->mark * 3)
>                         goto C;
>         } else if (meta < data) {
> C:
>                 mark += 1;
>                 if (mark < 1000)
>                         goto B;
>         }
>
>         return TC_ACT_OK;
> }
>
> A more concise example could be made but I just hacked on one of the
> sample programs.
This generates the CFG as follows (I have a patch on top of your stack
to print the CFG and DOM tables):

CFG:
 65535[-1,-1] -> 0[0,9]
 0[0,9]   -> 3[20,20]
 0[0,9]   -> 1[10,18]
 1[10,18] -> 4[21,28]
 1[10,18] -> 2[19,19]
 2[19,19] -> 5[29,30]
 3[20,20] -> 5[29,30]
 3[20,20] -> 4[21,28]
 4[21,28] -> 1[10,18]
 4[21,28] -> 5[29,30]
 5[29,30] -> 65534[31,65534]

DOM:
 1 0 0 0 0 0
 1 1 0 0 0 0
 1 1 1 0 0 0
 1 0 0 1 0 0
 1 0 0 0 1 0
 1 0 0 0 0 1

Here we have the loop 1[10,18]->4[21,28] and the back-edge
4[21,28]->1[10,18]. The notation is #idx[head_
Re: [RFC bpf-next 00/10] initial control flow support for eBPF verifier
On 07/05/2018 11:22, Jiong Wang wrote: execution time === test_l4lb_noinline: existing check_subprog/check_cfg: ~55000 ns new infrastructure: ~135000 ns test_xdp_noinline: existing check_subprog/check_cfg: ~52000 ns new infrastructure: ~12 ns Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz Regards, Jiong
[RFC bpf-next 04/10] bpf: cfg: detect loop use domination information
If one bb is dominating its predecessor, then there is loop. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 22 ++ kernel/bpf/cfg.h | 1 + kernel/bpf/verifier.c | 8 3 files changed, 31 insertions(+) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index b50937a..90692e4 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -568,6 +568,28 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog) return ret; } +bool subprog_has_loop(struct bpf_subprog_info *subprog) +{ + int lane_len = BITS_TO_LONGS(subprog->bb_num - 2); + struct list_head *bb_list = >bbs; + struct bb_node *bb, *entry_bb; + struct edge_node *e; + + entry_bb = entry_bb(bb_list); + bb = bb_next(entry_bb); + list_for_each_entry_from(bb, _bb(bb_list)->l, l) + list_for_each_entry(e, >e_prevs, l) { + struct bb_node *latch = e->src; + + if (latch != entry_bb && + test_bit(bb->idx, +subprog->dtree + latch->idx * lane_len)) + return true; + } + + return false; +} + static void subprog_free_edge(struct bb_node *bb) { struct list_head *succs = >e_succs; diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h index cbb44f2..c02c4cf 100644 --- a/kernel/bpf/cfg.h +++ b/kernel/bpf/cfg.h @@ -12,6 +12,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list); int subprog_append_bb(struct list_head *bb_list, int head); int subprog_build_dom_info(struct bpf_subprog_info *subprog); int subprog_fini_bb(struct list_head *bb_list, int subprog_end); +bool subprog_has_loop(struct bpf_subprog_info *subprog); int subprog_init_bb(struct list_head *bb_list, int subprog_start); void subprog_free(struct bpf_subprog_info *subprog, int end_idx); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 6543470..a93aa43 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -879,6 +879,14 @@ static int check_subprogs(struct bpf_verifier_env *env) if (ret < 0) goto free_nodes; subprog[cur_subprog].bb_num = ret; + ret = subprog_build_dom_info([cur_subprog]); + if (ret 
< 0) + goto free_nodes; + if (subprog_has_loop([cur_subprog])) { + verbose(env, "cfg - loop detected"); + ret = -EINVAL; + goto free_nodes; + } subprog_start = subprog_end; cur_subprog++; if (cur_subprog < env->subprog_cnt) { -- 2.7.4
[RFC bpf-next 09/10] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes
During building control flow graph, we need to build basic block nodes and edge nodes etc. These nodes needs to allocated dynamically as we don't have pre-scan pass to know how many nodes we need accurately. It is better to manage their allocation using memory pool which could reduce calling of *alloc/kfree functions and also could simplify freeing nodes. This patch: - implements a simple memory pool based node allocator. nodes need dynamic allocation migrated to this allocator. - estimate node numbers inside subprog detection pass, asking allocator to do an initial allocation of estimated size (aligned to 2K). The pool will grow later if space are not enough. - There is no support on returning memory back to the pool. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 164 +- kernel/bpf/cfg.h | 21 +-- kernel/bpf/verifier.c | 39 +--- 3 files changed, 170 insertions(+), 54 deletions(-) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index 174e3f0..1aeac49 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -82,7 +82,113 @@ struct dom_info { u16 dfsnum; }; -int subprog_append_bb(struct list_head *bb_list, int head) +struct node_pool { + struct list_head l; + void *data; + u32 size; + u32 used; +}; + +#define first_node_pool(pool_list) \ + list_first_entry(pool_list, struct node_pool, l) + +#define MEM_CHUNK_SIZE (1024) + +static int cfg_node_allocator_grow(struct cfg_node_allocator *allocator, + int min_grow_size) +{ + int s = min_grow_size; + struct node_pool *pool; + void *data; + + s += sizeof(struct node_pool); + s = ALIGN(s, MEM_CHUNK_SIZE); + data = kzalloc(s, GFP_KERNEL); + if (!data) + return -ENOMEM; + + pool = (struct node_pool *)data; + pool->data = pool + 1; + pool->size = s - sizeof(struct node_pool); + pool->used = 0; + allocator->cur_free_pool = pool; + list_add_tail(>l, >pools); + + return 0; +} + +static void *cfg_node_alloc(struct cfg_node_allocator *allocator, int size) +{ + struct node_pool *pool = 
allocator->cur_free_pool; + void *p; + + if (pool->used + size > pool->size) { + int ret = cfg_node_allocator_grow(allocator, size); + + if (ret < 0) + return NULL; + + pool = allocator->cur_free_pool; + } + + p = pool->data + pool->used; + pool->used += size; + + return p; +} + +static struct bb_node *get_single_bb_nodes(struct cfg_node_allocator *allocator) +{ + int size = sizeof(struct bb_node); + + return (struct bb_node *)cfg_node_alloc(allocator, size); +} + +static struct edge_node *get_edge_nodes(struct cfg_node_allocator *allocator, + int num) +{ + int size = num * sizeof(struct edge_node); + + return (struct edge_node *)cfg_node_alloc(allocator, size); +} + +static struct cedge_node * +get_single_cedge_node(struct cfg_node_allocator *allocator) +{ + int size = sizeof(struct cedge_node); + + return (struct cedge_node *)cfg_node_alloc(allocator, size); +} + +int cfg_node_allocator_init(struct cfg_node_allocator *allocator, + int bb_num_esti, int cedge_num_esti) +{ + int s = bb_num_esti * sizeof(struct bb_node), ret; + + s += 2 * bb_num_esti * sizeof(struct edge_node); + s += cedge_num_esti * sizeof(struct cedge_node); + INIT_LIST_HEAD(>pools); + ret = cfg_node_allocator_grow(allocator, s); + if (ret < 0) + return ret; + + return 0; +} + +void cfg_node_allocator_free(struct cfg_node_allocator *allocator) +{ + struct list_head *pools = >pools; + struct node_pool *pool, *tmp; + + pool = first_node_pool(pools); + list_for_each_entry_safe_from(pool, tmp, pools, l) { + list_del(>l); + kfree(pool); + } +} + +int subprog_append_bb(struct cfg_node_allocator *allocator, + struct list_head *bb_list, int head) { struct bb_node *new_bb, *bb; @@ -94,7 +200,7 @@ int subprog_append_bb(struct list_head *bb_list, int head) } bb = bb_prev(bb); - new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL); + new_bb = get_single_bb_nodes(allocator); if (!new_bb) return -ENOMEM; @@ -106,9 +212,10 @@ int subprog_append_bb(struct list_head *bb_list, int head) return 0; } -int 
subprog_fini_bb(struct list_head *bb_list, int subprog_end) +int subprog_fini_bb(struct cfg_node_allocator *allocator, + struct list_head *bb_list, int subprog_end) { - struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL); + struct
[RFC bpf-next 08/10] bpf: cfg: remove push_insn and check_cfg
As we have detected loop and unreachable insns based on domination information and call graph, there is no need of check_cfg. This patch removes check_cfg and it's associated push_insn. state prune heuristic marking as moved to check_subprog. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/verifier.c | 228 -- 1 file changed, 16 insertions(+), 212 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 12f5399..54f2fe3 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -735,6 +735,8 @@ enum reg_arg_type { DST_OP_NO_MARK /* same as above, check only, don't mark */ }; +#define STATE_LIST_MARK ((struct bpf_verifier_state_list *)-1L) + static int check_subprogs(struct bpf_verifier_env *env) { int i, ret, subprog_start, subprog_end, off, cur_subprog = 0; @@ -815,8 +817,14 @@ static int check_subprogs(struct bpf_verifier_env *env) target); if (ret < 0) return ret; + + env->explored_states[target] = STATE_LIST_MARK; + env->explored_states[i] = STATE_LIST_MARK; } + if (i + 1 < insn_cnt) + env->explored_states[i + 1] = STATE_LIST_MARK; + goto next; } @@ -836,6 +844,13 @@ static int check_subprogs(struct bpf_verifier_env *env) if (ret < 0) goto free_nodes; } + + if (BPF_OP(code) != BPF_JA) { + env->explored_states[i] = STATE_LIST_MARK; + env->explored_states[off] = STATE_LIST_MARK; + } else if (i + 1 < insn_cnt) { + env->explored_states[i + 1] = STATE_LIST_MARK; + } next: if (i == subprog_end - 1) { /* to avoid fall-through from one subprog into another @@ -3951,217 +3966,6 @@ static int check_return_code(struct bpf_verifier_env *env) return 0; } -/* non-recursive DFS pseudo code - * 1 procedure DFS-iterative(G,v): - * 2 label v as discovered - * 3 let S be a stack - * 4 S.push(v) - * 5 while S is not empty - * 6t <- S.pop() - * 7if t is what we're looking for: - * 8return t - * 9for all edges e in G.adjacentEdges(t) do - * 10 if edge e is already labelled - * 11 continue with the next edge - * 12 w <- 
G.adjacentVertex(t,e) - * 13 if vertex w is not discovered and not explored - * 14 label e as tree-edge - * 15 label w as discovered - * 16 S.push(w) - * 17 continue at 5 - * 18 else if vertex w is discovered - * 19 label e as back-edge - * 20 else - * 21 // vertex w is explored - * 22 label e as forward- or cross-edge - * 23 label t as explored - * 24 S.pop() - * - * convention: - * 0x10 - discovered - * 0x11 - discovered and fall-through edge labelled - * 0x12 - discovered and fall-through and branch edges labelled - * 0x20 - explored - */ - -enum { - DISCOVERED = 0x10, - EXPLORED = 0x20, - FALLTHROUGH = 1, - BRANCH = 2, -}; - -#define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L) - -static int *insn_stack;/* stack of insns to process */ -static int cur_stack; /* current stack index */ -static int *insn_state; - -/* t, w, e - match pseudo-code above: - * t - index of current instruction - * w - next instruction - * e - edge - */ -static int push_insn(int t, int w, int e, struct bpf_verifier_env *env) -{ - if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH)) - return 0; - - if (e == BRANCH && insn_state[t] >= (DISCOVERED | BRANCH)) - return 0; - - if (w < 0 || w >= env->prog->len) { - verbose(env, "jump out of range from insn %d to %d\n", t, w); - return -EINVAL; - } - - if (e == BRANCH) - /* mark branch target for state pruning */ - env->explored_states[w] = STATE_LIST_MARK; - - if (insn_state[w] == 0) { - /* tree-edge */ - insn_state[t] = DISCOVERED | e; - insn_state[w] = DISCOVERED; - if (cur_stack >= env->prog->len) - return -E2BIG; - insn_stack[cur_stack++] = w; - return 1; - } else if ((insn_state[w] & 0xF0) == DISCOVERED) { - verbose(env, &q
[RFC bpf-next 03/10] bpf: cfg: build domination tree using Tarjan algorithm
- When building domination information, we need to some array data structure, and need to index into them using basic block index. Calculate the basic block index during adding edges. - The Step-1 in Tarjan algorithm is to assign dfs order number for each bb in the cfg. We need to iterate on all bbs in reverse dfs order. - Step-2 and Step-3 in Tarjan algorithm to calculate semi-dominators and immediate-dominators implicitly and explicitly. - Step-4, we add a new bitmap field inside subprog_info to keep the dominator tree there. - Post-domination information is built but not tested. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 3 + kernel/bpf/cfg.c | 386 ++- kernel/bpf/cfg.h | 3 +- kernel/bpf/verifier.c| 5 +- 4 files changed, 393 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 404a5d8..9c0a808 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -177,6 +177,9 @@ struct bpf_subprog_info { u32 start; /* insn idx of function entry point */ u16 stack_depth; /* max. stack depth used by this function */ struct list_head bbs; /* basic blocks list. */ + u32 bb_num; /* total basic block num. 
*/ + unsigned long *dtree; + bool dtree_avail; }; /* single container for all structs diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index 208aac7..b50937a 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -17,11 +17,33 @@ struct edge_node { struct bb_node *dst; }; +struct edge_iter { + struct list_head *list_head; + struct edge_node *edge; +}; + +#define first_edge(e_list) list_first_entry(e_list, struct edge_node, l) +#define last_edge(e_list) list_last_entry(e_list, struct edge_node, l) +#define next_edge(e) list_next_entry(e, l) + +static bool ei_end_p(struct edge_iter *ei) +{ + return >edge->l == ei->list_head; +} + +static void ei_next(struct edge_iter *ei) +{ + struct edge_node *e = ei->edge; + + ei->edge = next_edge(e); +} + struct bb_node { struct list_head l; struct list_head e_prevs; struct list_head e_succs; u16 head; + u16 idx; }; #define bb_prev(bb)list_prev_entry(bb, l) @@ -31,6 +53,27 @@ struct bb_node { #define entry_bb(bb_list) bb_first(bb_list) #define exit_bb(bb_list) bb_last(bb_list) +struct dom_info { + u16 *dfs_parent; + u16 *dfs_order; + struct bb_node **dfs_to_bb; + /* immediate-dominator */ + u16 *idom; + /* semi-dominator */ + u16 *sdom; + u16 *bucket; + u16 *next_bucket; + /* best node during path compression. */ + u16 *best; + /* ancestor along tree edge. */ + u16 *ancestor; + /* size and child are used for tree balancing. 
*/ + u16 *size; + u16 *child; + + u16 dfsnum; +}; + int subprog_append_bb(struct list_head *bb_list, int head) { struct bb_node *new_bb, *bb; @@ -107,6 +150,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list) { struct bb_node *bb, *exit_bb; struct edge_node *edge; + int bb_num; bb = entry_bb(bb_list); edge = kcalloc(2, sizeof(*edge), GFP_KERNEL); @@ -118,9 +162,12 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list) edge[1].src = edge->src; edge[1].dst = edge->dst; list_add_tail([1].l, [1].dst->e_prevs); + bb->idx = -1; exit_bb = exit_bb(bb_list); + exit_bb->idx = -2; bb = bb_next(bb); + bb_num = 0; list_for_each_entry_from(bb, _bb->l, l) { bool has_fallthrough, only_has_fallthrough; bool has_branch, only_has_branch; @@ -129,6 +176,8 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list) struct bpf_insn insn; u8 code; + bb->idx = bb_num++; + edge = kcalloc(2, sizeof(*edge), GFP_KERNEL); if (!edge) return -ENOMEM; @@ -184,9 +233,341 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list) } } + return bb_num + 2; +} + +static int init_dom_info(struct bpf_subprog_info *subprog, struct dom_info *di) +{ + u16 *p, bb_num, i; + + di->dfs_parent = NULL; + di->dfs_to_bb = NULL; + + bb_num = subprog->bb_num; + p = kcalloc(10 * bb_num, sizeof(*p), GFP_KERNEL); + if (!p) + return -ENOMEM; + + di->dfs_parent = p; + di->dfs_order = di->dfs_parent + bb_num; + di->idom = di->dfs_order + bb_num; + di->sdom = di->idom + bb_num; + di-&g
[RFC bpf-next 02/10] bpf: cfg: add edges between basic blocks to form CFG
This patch add edges between basic blocks. Both edges for predecessors and successors are added. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 129 +- kernel/bpf/cfg.h | 1 + kernel/bpf/verifier.c | 3 ++ 3 files changed, 131 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index b1af714..208aac7 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -11,8 +11,16 @@ #include "cfg.h" +struct edge_node { + struct list_head l; + struct bb_node *src; + struct bb_node *dst; +}; + struct bb_node { struct list_head l; + struct list_head e_prevs; + struct list_head e_succs; u16 head; }; @@ -39,6 +47,8 @@ int subprog_append_bb(struct list_head *bb_list, int head) if (!new_bb) return -ENOMEM; + INIT_LIST_HEAD(_bb->e_prevs); + INIT_LIST_HEAD(_bb->e_succs); new_bb->head = head; list_add(_bb->l, >l); @@ -53,6 +63,8 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end) return -ENOMEM; /* entry bb. */ bb->head = -1; + INIT_LIST_HEAD(>e_prevs); + INIT_LIST_HEAD(>e_succs); list_add(>l, bb_list); bb = kzalloc(sizeof(*bb), GFP_KERNEL); @@ -60,6 +72,8 @@ int subprog_fini_bb(struct list_head *bb_list, int subprog_end) return -ENOMEM; /* exit bb. 
*/ bb->head = subprog_end; + INIT_LIST_HEAD(>e_prevs); + INIT_LIST_HEAD(>e_succs); list_add_tail(>l, bb_list); return 0; @@ -77,15 +91,126 @@ int subprog_init_bb(struct list_head *bb_list, int subprog_start) return 0; } +static struct bb_node *search_bb_with_head(struct list_head *bb_list, int head) +{ + struct bb_node *bb; + + list_for_each_entry(bb, bb_list, l) { + if (bb->head == head) + return bb; + } + + return NULL; +} + +int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list) +{ + struct bb_node *bb, *exit_bb; + struct edge_node *edge; + + bb = entry_bb(bb_list); + edge = kcalloc(2, sizeof(*edge), GFP_KERNEL); + if (!edge) + return -ENOMEM; + edge->src = bb; + edge->dst = bb_next(bb); + list_add_tail(>l, >e_succs); + edge[1].src = edge->src; + edge[1].dst = edge->dst; + list_add_tail([1].l, [1].dst->e_prevs); + + exit_bb = exit_bb(bb_list); + bb = bb_next(bb); + list_for_each_entry_from(bb, _bb->l, l) { + bool has_fallthrough, only_has_fallthrough; + bool has_branch, only_has_branch; + struct bb_node *next_bb = bb_next(bb); + int tail = next_bb->head - 1; + struct bpf_insn insn; + u8 code; + + edge = kcalloc(2, sizeof(*edge), GFP_KERNEL); + if (!edge) + return -ENOMEM; + edge->src = bb; + edge[1].src = bb; + + insn = insns[tail]; + code = insn.code; + only_has_fallthrough = BPF_CLASS(code) != BPF_JMP || + BPF_OP(code) == BPF_EXIT; + has_fallthrough = only_has_fallthrough || + (BPF_CLASS(code) == BPF_JMP && + BPF_OP(code) != BPF_CALL && + BPF_OP(code) != BPF_JA); + only_has_branch = BPF_CLASS(code) == BPF_JMP && + BPF_OP(code) == BPF_JA; + has_branch = only_has_branch || +(BPF_CLASS(code) == BPF_JMP && + BPF_OP(code) != BPF_EXIT && + BPF_OP(code) != BPF_CALL); + + if (has_fallthrough) { + if (BPF_CLASS(code) == BPF_JMP && + BPF_OP(code) == BPF_EXIT) + next_bb = exit_bb; + edge->dst = next_bb; + edge[1].dst = next_bb; + list_add_tail(>l, >e_succs); + list_add_tail([1].l, [1].dst->e_prevs); + edge = NULL; + } + + if (has_branch) { + struct 
bb_node *tgt; + + if (!edge) { + edge = kcalloc(2, sizeof(*edge), GFP_KERNEL); + if (!edge) + return -ENOMEM; + edge->src = bb; + edge[1].src = bb; + } + + tgt = search_bb_with_head(bb
[RFC bpf-next 10/10] bpf: cfg: reduce memory usage by using singly list + compat pointer
Because there are going to be quite a few nodes (bb, edge etc.) for a
reasonably sized program, the size of cfg nodes matters. The cfg
traversal used at the moment is mostly via edges, which contain pointers
to both the src and dst basic block. The traversal is not via the links
between basic blocks and edge nodes themselves, so linking them with a
doubly linked list_head is unnecessary, and is too heavy on a 64-bit
host where each pointer takes 8 bytes.

This patch reduces memory usage in two ways:

 1. use a singly linked list to link basic block and edge nodes.
 2. introduce a compact pointer representation for pointers generated
    from the memory pool. If a pointer is from the pool, we can
    represent it as pool_base + pool_offset:

      struct cfg_node_link {
              u16 offset;
              s8 idx;
      };

    the compact pointer "cfg_node_link" only takes 3 bytes. Delinking
    the pointer becomes:

      static void *cfg_node_delink(struct cfg_node_allocator *allocator,
                                   struct cfg_node_link *link)
      {
              s8 idx = link->idx;

              if (idx == -1)
                      return NULL;

              return allocator->base[idx] + link->offset;
      }

For example, basic block nodes changed from:

  struct bb_node {
          struct list_head l;
          struct list_head e_prevs;
          struct list_head e_succs;
          s16 head;
          u16 idx;
  };

into:

  struct bb_node {
          struct cfg_node_link link;
          struct cfg_node_link e_prevs;
          struct cfg_node_link e_succs;
          s16 head;
          u16 idx;
  };

We reduced the node size from 52 to 16 bytes. We could further reduce it
to 14 bytes by reshuffling the link/e_prevs/e_succs fields to avoid
alignment padding, but that would come at a cost in code readability.

From benchmarks like test_xdp_noinline, this patch reduces peak memory
usage of the new cfg infrastructure by more than 50%.
Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 7 +- kernel/bpf/cfg.c | 503 --- kernel/bpf/cfg.h | 23 +- kernel/bpf/verifier.c| 33 ++- 4 files changed, 312 insertions(+), 254 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 96337ba..c291461 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -176,8 +176,11 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) struct bpf_subprog_info { u32 start; /* insn idx of function entry point */ u16 stack_depth; /* max. stack depth used by this function */ - struct list_head callees; /* callees list. */ - struct list_head bbs; /* basic blocks list. */ + void *callees; /* callees list. */ + struct { + void *entry_bb; /* basic blocks list head. */ + void *exit_bb; /* basic blocks list tail. */ + } bbs; u32 bb_num; /* total basic block num. */ unsigned long *dtree; bool dtree_avail; diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index 1aeac49..bda63ab 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -13,54 +13,55 @@ #include "cfg.h" +/* The size of cfg nodes matters, therefore we try to avoid using doubly linked + * list and we try to use base + offset to address node, by this we only need + * to keep offset. 
+ */ +struct cfg_node_link { + u16 offset; + s8 idx; +}; + +static void *cfg_node_delink(struct cfg_node_allocator *allocator, +struct cfg_node_link *link) +{ + s8 idx = link->idx; + + if (idx == -1) + return NULL; + + return allocator->base[idx] + link->offset; +} + struct cedge_node { - struct list_head l; + struct cfg_node_link link; u8 caller_idx; u8 callee_idx; }; struct edge_node { - struct list_head l; + struct cfg_node_link link; struct bb_node *src; struct bb_node *dst; }; -struct edge_iter { - struct list_head *list_head; - struct edge_node *edge; +struct bb_node { + struct cfg_node_link link; + struct cfg_node_link e_prevs; + struct cfg_node_link e_succs; + s16 head; + u16 idx; }; -#define first_edge(e_list) list_first_entry(e_list, struct edge_node, l) -#define last_edge(e_list) list_last_entry(e_list, struct edge_node, l) -#define next_edge(e) list_next_entry(e, l) +#define entry_bb(bb_list) (struct bb_node *)(*bb_list) +#define exit_bb(bb_list) (struct bb_node *)(*(bb_list + 1)) -static bool ei_end_p(struct edge_iter *ei) +static struct bb_node *bb_next(struct cfg_node_allocator *allocator, + struct bb_node *bb) { - return >edge->l == ei->list_head; + return (struct bb_node *)cfg_node_delink(allocator, >link); } -static void ei_next(stru
[RFC bpf-next 05/10] bpf: cfg: detect unreachable basic blocks
Do unreachable basic blocks detection as a side-product of the dfs walk when building domination information. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 31 ++- kernel/bpf/cfg.h | 3 ++- kernel/bpf/verifier.c | 3 ++- 3 files changed, 30 insertions(+), 7 deletions(-) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index 90692e4..7ce1472 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -407,15 +407,17 @@ static void calc_idoms(struct bpf_subprog_info *subprog, struct dom_info *di, di->idom[idx] = di->idom[di->idom[idx]]; } -static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di, -bool reverse) +static int +calc_dfs_tree(struct bpf_verifier_env *env, struct bpf_subprog_info *subprog, + struct dom_info *di, bool reverse) { - u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx; + u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx, i; struct list_head *bb_list = >bbs; u16 entry_bb_fake_idx = bb_num - 2; struct bb_node *entry_bb, *exit_bb; struct edge_iter ei, *stack; struct edge_node *e; + bool *visited; di->dfs_order[entry_bb_fake_idx] = di->dfsnum; @@ -423,6 +425,10 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di, if (!stack) return -ENOMEM; + visited = kcalloc(bb_num - 2, sizeof(bool), GFP_KERNEL); + if (!visited) + return -ENOMEM; + if (reverse) { entry_bb = exit_bb(bb_list); exit_bb = entry_bb(bb_list); @@ -445,6 +451,8 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di, if (reverse) { bb_dst = e->src; + if (bb_dst != exit_bb) + visited[bb_dst->idx] = true; if (bb_dst == exit_bb || di->dfs_order[bb_dst->idx]) { ei_next(); @@ -453,6 +461,8 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di, bb_src = e->dst; } else { bb_dst = e->dst; + if (bb_dst != exit_bb) + visited[bb_dst->idx] = true; if (bb_dst == exit_bb || di->dfs_order[bb_dst->idx]) { ei_next(); @@ -490,6 +500,16 @@ static int calc_dfs_tree(struct 
bpf_subprog_info *subprog, struct dom_info *di, kfree(stack); + for (i = 0; i < bb_num - 2; i++) { + if (!visited[i]) { + bpf_verifier_log_write(env, "cfg - unreachable insn\n"); + kfree(visited); + return -EINVAL; + } + } + + kfree(visited); + return 0; } @@ -541,7 +561,8 @@ static int idoms_to_doms(struct bpf_subprog_info *subprog, struct dom_info *di) * The implementation also referenced GNU GCC 3.0. */ -int subprog_build_dom_info(struct bpf_subprog_info *subprog) +int subprog_build_dom_info(struct bpf_verifier_env *env, + struct bpf_subprog_info *subprog) { struct dom_info di; int ret; @@ -550,7 +571,7 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog) if (ret < 0) goto free_dominfo; - ret = calc_dfs_tree(subprog, , false); + ret = calc_dfs_tree(env, subprog, , false); if (ret < 0) goto free_dominfo; diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h index c02c4cf..02729a9 100644 --- a/kernel/bpf/cfg.h +++ b/kernel/bpf/cfg.h @@ -10,7 +10,8 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list); int subprog_append_bb(struct list_head *bb_list, int head); -int subprog_build_dom_info(struct bpf_subprog_info *subprog); +int subprog_build_dom_info(struct bpf_verifier_env *env, + struct bpf_subprog_info *subprog); int subprog_fini_bb(struct list_head *bb_list, int subprog_end); bool subprog_has_loop(struct bpf_subprog_info *subprog); int subprog_init_bb(struct list_head *bb_list, int subprog_start); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a93aa43..46d5eae 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -879,7 +879,8 @@ static int check_subprogs(struct bpf_verifier_env *env) if (ret < 0) goto free_nodes; subprog[cur_subprog].bb_num = ret; - ret = subprog_buil
[RFC bpf-next 00/10] initial control flow support for eBPF verifier
This patch set introduces initial control flow infrastructure to the eBPF verifier. Control flow information is built in an incremental way, so it can be easily maintained at various levels. After this patch set, information collection on eBPF insns is centred at check_subprogs; check_cfg and related code are replaced by the new infrastructure. The checks/analyses offered by check_cfg are still offered by the new infrastructure, for example loop detection, recursive function call detection, unreachable insn detection and prune heuristic marking.

This infrastructure comes with some memory usage and execution time cost. Memory pool based allocation is used to avoid frequent calls to kmalloc/kfree. Singly linked lists and compact pointers are used to reduce the size of the various cfg nodes.

data structure size
===
  basic block:     16 bytes
  edge:            24 bytes
  call-graph edge:  6 bytes

memory usage (test_l4lb_noinline, test_xdp_noinline) (counted over all subprogs):

test_l4lb_noinline:
  cfg info: 597 insns, 12 subprogs, 107 basic-blocks, 278 edges, 12 call edges.
  mem usage: 9216 (9K) bytes for the memory pool (basic-blocks/edges/call-edges),
             644 bytes for dominator trees.

test_xdp_noinline:
  cfg info: 1029 insns, 21 subprogs, 191 basic-blocks, 502 edges, 23 call edges.
  mem usage: 15360 (15K) bytes for the memory pool,
             1192 bytes for dominator trees.

The existing check_cfg uses two auxiliary arrays (insn_state and insn_stack), insn_cnt * sizeof(int) bytes each, so it would use ~5K (test_l4lb_noinline) and ~8K (test_xdp_noinline) respectively. (It looks to me that insn_state could use 1-byte elements and insn_stack 2-byte ones, so this could probably be reduced to ~2K and ~3K.) In other words, the existing check_cfg uses 8 bytes per insn during cfg check, while the new infrastructure uses roughly 2x that, 16 bytes per insn.
execution time
===
test_l4lb_noinline:
  existing check_subprog/check_cfg: ~55000 ns
  new infrastructure: ~135000 ns
test_xdp_noinline:
  existing check_subprog/check_cfg: ~52000 ns
  new infrastructure: ~12 ns

The control flow infrastructure could lay the groundwork for further static analysis inside the eBPF verifier, for example bounded loop detection, path-sensitive data-flow analysis, etc.

Jiong Wang (10):
  bpf: cfg: partition basic blocks for each subprog
  bpf: cfg: add edges between basic blocks to form CFG
  bpf: cfg: build domination tree using Tarjan algorithm
  bpf: cfg: detect loop use domination information
  bpf: cfg: detect unreachable basic blocks
  bpf: cfg: move find_subprog/add_subprog to cfg.c
  bpf: cfg: build call graph and detect unreachable/recursive call
  bpf: cfg: remove push_insn and check_cfg
  bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes
  bpf: cfg: reduce memory usage by using singly list + compat pointer

 include/linux/bpf_verifier.h |   8 +
 kernel/bpf/Makefile          |   2 +-
 kernel/bpf/cfg.c             | 956 +++
 kernel/bpf/cfg.h             |  42 ++
 kernel/bpf/verifier.c        | 386 ++---
 5 files changed, 1131 insertions(+), 263 deletions(-)
 create mode 100644 kernel/bpf/cfg.c
 create mode 100644 kernel/bpf/cfg.h

-- 
2.7.4
[RFC bpf-next 01/10] bpf: cfg: partition basic blocks for each subprog
"check_subprogs" has partitioned subprogs and is doing insn scan for each subprog already, this patch extends it to also partition basic blocks (BB) for each subprog. - The first insn for each subprog start a BB. - Branch (both cond and absolute) target start a BB. - Insn immediately follows branch insn start a BB. Insn immediately follows exit and within subprog start a BB. BBs for each subprog are organized as a list in ascending head. Two special BBs, entry and exit are added as well. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 1 + kernel/bpf/Makefile | 2 +- kernel/bpf/cfg.c | 93 kernel/bpf/cfg.h | 16 kernel/bpf/verifier.c| 57 --- 5 files changed, 162 insertions(+), 7 deletions(-) create mode 100644 kernel/bpf/cfg.c create mode 100644 kernel/bpf/cfg.h diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 8f70dc1..404a5d8 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -176,6 +176,7 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) struct bpf_subprog_info { u32 start; /* insn idx of function entry point */ u16 stack_depth; /* max. stack depth used by this function */ + struct list_head bbs; /* basic blocks list. 
*/ }; /* single container for all structs diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 35c485f..b4542d8 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := core.o -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cfg.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o obj-$(CONFIG_BPF_SYSCALL) += btf.o diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c new file mode 100644 index 000..b1af714 --- /dev/null +++ b/kernel/bpf/cfg.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (C) 2018 Netronome Systems, Inc. */ +/* This program is free software; you can redistribute it and/or + * modify it under the terms of version 2 of the GNU General Public + * License as published by the Free Software Foundation. 
+ */ + +#include +#include +#include + +#include "cfg.h" + +struct bb_node { + struct list_head l; + u16 head; +}; + +#define bb_prev(bb)list_prev_entry(bb, l) +#define bb_next(bb)list_next_entry(bb, l) +#define bb_first(bb_list) list_first_entry(bb_list, struct bb_node, l) +#define bb_last(bb_list) list_last_entry(bb_list, struct bb_node, l) +#define entry_bb(bb_list) bb_first(bb_list) +#define exit_bb(bb_list) bb_last(bb_list) + +int subprog_append_bb(struct list_head *bb_list, int head) +{ + struct bb_node *new_bb, *bb; + + list_for_each_entry(bb, bb_list, l) { + if (bb->head == head) + return 0; + else if (bb->head > head) + break; + } + + bb = bb_prev(bb); + new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL); + if (!new_bb) + return -ENOMEM; + + new_bb->head = head; + list_add(_bb->l, >l); + + return 0; +} + +int subprog_fini_bb(struct list_head *bb_list, int subprog_end) +{ + struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL); + + if (!bb) + return -ENOMEM; + /* entry bb. */ + bb->head = -1; + list_add(>l, bb_list); + + bb = kzalloc(sizeof(*bb), GFP_KERNEL); + if (!bb) + return -ENOMEM; + /* exit bb. */ + bb->head = subprog_end; + list_add_tail(>l, bb_list); + + return 0; +} + +int subprog_init_bb(struct list_head *bb_list, int subprog_start) +{ + int ret; + + INIT_LIST_HEAD(bb_list); + ret = subprog_append_bb(bb_list, subprog_start); + if (ret < 0) + return ret; + + return 0; +} + +void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx) +{ + int i = 0; + + for (; i <= end_idx; i++) { + struct list_head *bbs = [i].bbs; + struct bb_node *bb, *tmp; + + list_for_each_entry_safe(bb, tmp, bbs, l) { + list_del(>l); + kfree(bb); + } + } +} diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h new file mode 100644 index 000..4092145 --- /dev/null +++ b/kernel/bpf/cfg.h @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (C) 2018 Netronome Systems, Inc. 
*/ +/* This program is free software; you can redistribute it and/or + * modify it under the term
[RFC bpf-next 06/10] bpf: cfg: move find_subprog/add_subprog to cfg.c
This patch centre find_subprog and add_subprog to cfg.c. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/cfg.c | 41 + kernel/bpf/cfg.h | 2 ++ kernel/bpf/verifier.c | 42 -- 3 files changed, 43 insertions(+), 42 deletions(-) diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c index 7ce1472..a34a95c 100644 --- a/kernel/bpf/cfg.c +++ b/kernel/bpf/cfg.c @@ -6,8 +6,10 @@ */ #include +#include #include #include +#include #include "cfg.h" @@ -611,6 +613,45 @@ bool subprog_has_loop(struct bpf_subprog_info *subprog) return false; } +static int cmp_subprogs(const void *a, const void *b) +{ + return ((struct bpf_subprog_info *)a)->start - + ((struct bpf_subprog_info *)b)->start; +} + +int find_subprog(struct bpf_verifier_env *env, int off) +{ + struct bpf_subprog_info *p; + + p = bsearch(, env->subprog_info, env->subprog_cnt, + sizeof(env->subprog_info[0]), cmp_subprogs); + if (!p) + return -ENOENT; + return p - env->subprog_info; +} + +int add_subprog(struct bpf_verifier_env *env, int off) +{ + int insn_cnt = env->prog->len; + int ret; + + if (off >= insn_cnt || off < 0) { + bpf_verifier_log_write(env, "call to invalid destination\n"); + return -EINVAL; + } + ret = find_subprog(env, off); + if (ret >= 0) + return 0; + if (env->subprog_cnt >= BPF_MAX_SUBPROGS) { + bpf_verifier_log_write(env, "too many subprograms\n"); + return -E2BIG; + } + env->subprog_info[env->subprog_cnt++].start = off; + sort(env->subprog_info, env->subprog_cnt, +sizeof(env->subprog_info[0]), cmp_subprogs, NULL); + return 0; +} + static void subprog_free_edge(struct bb_node *bb) { struct list_head *succs = >e_succs; diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h index 02729a9..57eab9b 100644 --- a/kernel/bpf/cfg.h +++ b/kernel/bpf/cfg.h @@ -8,6 +8,8 @@ #ifndef __BPF_CFG_H__ #define __BPF_CFG_H__ +int add_subprog(struct bpf_verifier_env *env, int off); +int find_subprog(struct bpf_verifier_env *env, int off); int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list); 
int subprog_append_bb(struct list_head *bb_list, int head); int subprog_build_dom_info(struct bpf_verifier_env *env, diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 46d5eae..72bda84 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -20,8 +20,6 @@ #include #include #include -#include -#include #include "cfg.h" #include "disasm.h" @@ -737,46 +735,6 @@ enum reg_arg_type { DST_OP_NO_MARK /* same as above, check only, don't mark */ }; -static int cmp_subprogs(const void *a, const void *b) -{ - return ((struct bpf_subprog_info *)a)->start - - ((struct bpf_subprog_info *)b)->start; -} - -static int find_subprog(struct bpf_verifier_env *env, int off) -{ - struct bpf_subprog_info *p; - - p = bsearch(, env->subprog_info, env->subprog_cnt, - sizeof(env->subprog_info[0]), cmp_subprogs); - if (!p) - return -ENOENT; - return p - env->subprog_info; - -} - -static int add_subprog(struct bpf_verifier_env *env, int off) -{ - int insn_cnt = env->prog->len; - int ret; - - if (off >= insn_cnt || off < 0) { - verbose(env, "call to invalid destination\n"); - return -EINVAL; - } - ret = find_subprog(env, off); - if (ret >= 0) - return 0; - if (env->subprog_cnt >= BPF_MAX_SUBPROGS) { - verbose(env, "too many subprograms\n"); - return -E2BIG; - } - env->subprog_info[env->subprog_cnt++].start = off; - sort(env->subprog_info, env->subprog_cnt, -sizeof(env->subprog_info[0]), cmp_subprogs, NULL); - return 0; -} - static int check_subprogs(struct bpf_verifier_env *env) { int i, ret, subprog_start, subprog_end, off, cur_subprog = 0; -- 2.7.4
[PATCH v2 bpf-next 0/3] bpf: cleanups on managing subprog information
This patch set cleans up some code logic related to managing subprog information. Part of the set is inspired by Edward's code in his RFC: "bpf/verifier: subprog/func_call simplifications", but with clearer separation so it could be easier to review.

- Patch 1 unifies main prog and subprogs. All of them are registered in env->subprog_starts.

- After patch 1, it is clear that subprog_starts and subprog_stack_depth could be merged, as both of them now have main prog and subprogs unified. Patch 2 therefore does the merge; all subprog information is centred in bpf_subprog_info.

- Patch 3 goes further to introduce a new fake "exit" subprog which serves as an ending marker for the subprog list. We can then turn the following code snippet, repeated across the verifier:

    if (env->subprog_cnt == cur_subprog + 1)
            subprog_end = insn_cnt;
    else
            subprog_end = env->subprog_info[cur_subprog + 1].start;

  into:

    subprog_end = env->subprog_info[cur_subprog + 1].start;

There is no functional change in this patch set. No bpf selftest (both non-jit and jit) regression found after this set.

v2:
  - fixed adjust_subprog_starts to also update the fake "exit" subprog start.
  - for John's suggestion on renaming subprog to prog, I could work on a
    follow-up patch if it is recognized as worth the change.

Jiong Wang (3):
  bpf: unify main prog and subprog
  bpf: centre subprog information fields
  bpf: add faked "ending" subprog

 include/linux/bpf_verifier.h |   9 ++--
 kernel/bpf/verifier.c        | 121 ++-
 2 files changed, 67 insertions(+), 63 deletions(-)

-- 
2.7.4
[PATCH v2 bpf-next 2/3] bpf: centre subprog information fields
It is better to centre all subprog information fields into one structure. This structure could later serve as function node in call graph. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 9 --- kernel/bpf/verifier.c| 62 +++- 2 files changed, 38 insertions(+), 33 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index f655b92..8f70dc1 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -173,6 +173,11 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) #define BPF_MAX_SUBPROGS 256 +struct bpf_subprog_info { + u32 start; /* insn idx of function entry point */ + u16 stack_depth; /* max. stack depth used by this function */ +}; + /* single container for all structs * one verifier_env per bpf_check() call */ @@ -191,9 +196,7 @@ struct bpf_verifier_env { bool seen_direct_write; struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ struct bpf_verifier_log log; - u32 subprog_starts[BPF_MAX_SUBPROGS + 1]; - /* computes the stack depth of each bpf function */ - u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1]; + struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1]; u32 subprog_cnt; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 16ec977..9764b9b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -738,18 +738,19 @@ enum reg_arg_type { static int cmp_subprogs(const void *a, const void *b) { - return *(int *)a - *(int *)b; + return ((struct bpf_subprog_info *)a)->start - + ((struct bpf_subprog_info *)b)->start; } static int find_subprog(struct bpf_verifier_env *env, int off) { - u32 *p; + struct bpf_subprog_info *p; - p = bsearch(, env->subprog_starts, env->subprog_cnt, - sizeof(env->subprog_starts[0]), cmp_subprogs); + p = bsearch(, env->subprog_info, env->subprog_cnt, + sizeof(env->subprog_info[0]), cmp_subprogs); if (!p) return -ENOENT; - return p - env->subprog_starts; + return p - 
env->subprog_info; } @@ -769,15 +770,16 @@ static int add_subprog(struct bpf_verifier_env *env, int off) verbose(env, "too many subprograms\n"); return -E2BIG; } - env->subprog_starts[env->subprog_cnt++] = off; - sort(env->subprog_starts, env->subprog_cnt, -sizeof(env->subprog_starts[0]), cmp_subprogs, NULL); + env->subprog_info[env->subprog_cnt++].start = off; + sort(env->subprog_info, env->subprog_cnt, +sizeof(env->subprog_info[0]), cmp_subprogs, NULL); return 0; } static int check_subprogs(struct bpf_verifier_env *env) { int i, ret, subprog_start, subprog_end, off, cur_subprog = 0; + struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; int insn_cnt = env->prog->len; @@ -807,14 +809,14 @@ static int check_subprogs(struct bpf_verifier_env *env) if (env->log.level > 1) for (i = 0; i < env->subprog_cnt; i++) - verbose(env, "func#%d @%d\n", i, env->subprog_starts[i]); + verbose(env, "func#%d @%d\n", i, subprog[i].start); /* now check that all jumps are within the same subprog */ subprog_start = 0; if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = env->subprog_starts[cur_subprog + 1]; + subprog_end = subprog[cur_subprog + 1].start; for (i = 0; i < insn_cnt; i++) { u8 code = insn[i].code; @@ -843,8 +845,7 @@ static int check_subprogs(struct bpf_verifier_env *env) if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = - env->subprog_starts[cur_subprog + 1]; + subprog_end = subprog[cur_subprog + 1].start; } } return 0; @@ -1477,13 +1478,13 @@ static int update_stack_depth(struct bpf_verifier_env *env, const struct bpf_func_state *func, int off) { - u16 stack = env->subprog_stack_depth[func->subprogno]; + u16 stack = env->subprog_info[func->subprogno].stack_depth; if (stack >= -off) return 0; /* update known max for given subprogram */ - env->subprog_stack_depth[func->subprogno] = -off; +
[PATCH v2 bpf-next 1/3] bpf: unify main prog and subprog
Currently, verifier treat main prog and subprog differently. All subprogs detected are kept in env->subprog_starts while main prog is not kept there. Instead, main prog is implicitly defined as the prog start at 0. There is actually no difference between main prog and subprog, it is better to unify them, and register all progs detected into env->subprog_starts. This could also help simplifying some code logic. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 2 +- kernel/bpf/verifier.c| 57 2 files changed, 32 insertions(+), 27 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 7e61c39..f655b92 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -191,7 +191,7 @@ struct bpf_verifier_env { bool seen_direct_write; struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ struct bpf_verifier_log log; - u32 subprog_starts[BPF_MAX_SUBPROGS]; + u32 subprog_starts[BPF_MAX_SUBPROGS + 1]; /* computes the stack depth of each bpf function */ u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1]; u32 subprog_cnt; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index eb1a596..16ec977 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -765,7 +765,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off) ret = find_subprog(env, off); if (ret >= 0) return 0; - if (env->subprog_cnt >= BPF_MAX_SUBPROGS) { + if (env->subprog_cnt > BPF_MAX_SUBPROGS) { verbose(env, "too many subprograms\n"); return -E2BIG; } @@ -781,6 +781,11 @@ static int check_subprogs(struct bpf_verifier_env *env) struct bpf_insn *insn = env->prog->insnsi; int insn_cnt = env->prog->len; + /* Add entry function. */ + ret = add_subprog(env, 0); + if (ret < 0) + return ret; + /* determine subprog starts. 
The end is one before the next starts */ for (i = 0; i < insn_cnt; i++) { if (insn[i].code != (BPF_JMP | BPF_CALL)) @@ -806,10 +811,10 @@ static int check_subprogs(struct bpf_verifier_env *env) /* now check that all jumps are within the same subprog */ subprog_start = 0; - if (env->subprog_cnt == cur_subprog) + if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = env->subprog_starts[cur_subprog++]; + subprog_end = env->subprog_starts[cur_subprog + 1]; for (i = 0; i < insn_cnt; i++) { u8 code = insn[i].code; @@ -833,11 +838,13 @@ static int check_subprogs(struct bpf_verifier_env *env) verbose(env, "last insn is not an exit or jmp\n"); return -EINVAL; } + cur_subprog++; subprog_start = subprog_end; - if (env->subprog_cnt == cur_subprog) + if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = env->subprog_starts[cur_subprog++]; + subprog_end = + env->subprog_starts[cur_subprog + 1]; } } return 0; @@ -1505,10 +1512,10 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) return -EACCES; } continue_func: - if (env->subprog_cnt == subprog) + if (env->subprog_cnt == subprog + 1) subprog_end = insn_cnt; else - subprog_end = env->subprog_starts[subprog]; + subprog_end = env->subprog_starts[subprog + 1]; for (; i < subprog_end; i++) { if (insn[i].code != (BPF_JMP | BPF_CALL)) continue; @@ -1526,7 +1533,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) i); return -EFAULT; } - subprog++; frame++; if (frame >= MAX_CALL_FRAMES) { WARN_ONCE(1, "verifier bug. Call stack is too deep\n"); @@ -1558,7 +1564,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env, start); return -EFAULT; } - subprog++; return env->subprog_stack_depth[subprog]; } #endif @@ -2087,7 +2092,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, case BPF_FUNC_tail_call: if (map->map_type !
[PATCH v2 bpf-next 3/3] bpf: add faked "ending" subprog
There are quite a few code snippet like the following in verifier: subprog_start = 0; if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else subprog_end = env->subprog_info[cur_subprog + 1].start; The reason is there is no marker in subprog_info array to tell the end of it. We could resolve this issue by introducing a faked "ending" subprog. The special "ending" subprog is with "insn_cnt" as start offset, so it is serving as the end mark whenever we iterate over all subprogs. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- kernel/bpf/verifier.c | 34 ++ 1 file changed, 14 insertions(+), 20 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9764b9b..65a6e2e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -766,7 +766,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off) ret = find_subprog(env, off); if (ret >= 0) return 0; - if (env->subprog_cnt > BPF_MAX_SUBPROGS) { + if (env->subprog_cnt >= BPF_MAX_SUBPROGS) { verbose(env, "too many subprograms\n"); return -E2BIG; } @@ -807,16 +807,18 @@ static int check_subprogs(struct bpf_verifier_env *env) return ret; } + /* Add a fake 'exit' subprog which could simplify subprog iteration +* logic. 'subprog_cnt' should not be increased. 
+*/ + subprog[env->subprog_cnt].start = insn_cnt; + if (env->log.level > 1) for (i = 0; i < env->subprog_cnt; i++) verbose(env, "func#%d @%d\n", i, subprog[i].start); /* now check that all jumps are within the same subprog */ - subprog_start = 0; - if (env->subprog_cnt == cur_subprog + 1) - subprog_end = insn_cnt; - else - subprog_end = subprog[cur_subprog + 1].start; + subprog_start = subprog[cur_subprog].start; + subprog_end = subprog[cur_subprog + 1].start; for (i = 0; i < insn_cnt; i++) { u8 code = insn[i].code; @@ -840,11 +842,9 @@ static int check_subprogs(struct bpf_verifier_env *env) verbose(env, "last insn is not an exit or jmp\n"); return -EINVAL; } - cur_subprog++; subprog_start = subprog_end; - if (env->subprog_cnt == cur_subprog + 1) - subprog_end = insn_cnt; - else + cur_subprog++; + if (cur_subprog < env->subprog_cnt) subprog_end = subprog[cur_subprog + 1].start; } } @@ -1499,7 +1499,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) int depth = 0, frame = 0, idx = 0, i = 0, subprog_end; struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; - int insn_cnt = env->prog->len; int ret_insn[MAX_CALL_FRAMES]; int ret_prog[MAX_CALL_FRAMES]; @@ -1514,10 +1513,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) return -EACCES; } continue_func: - if (env->subprog_cnt == idx + 1) - subprog_end = insn_cnt; - else - subprog_end = subprog[idx + 1].start; + subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { if (insn[i].code != (BPF_JMP | BPF_CALL)) continue; @@ -5070,7 +5066,8 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len if (len == 1) return; - for (i = 0; i < env->subprog_cnt; i++) { + /* NOTE: fake 'exit' subprog should be updated as well. 
*/ + for (i = 0; i <= env->subprog_cnt; i++) { if (env->subprog_info[i].start < off) continue; env->subprog_info[i].start += len - 1; @@ -5268,10 +5265,7 @@ static int jit_subprogs(struct bpf_verifier_env *env) for (i = 0; i < env->subprog_cnt; i++) { subprog_start = subprog_end; - if (env->subprog_cnt == i + 1) - subprog_end = prog->len; - else - subprog_end = env->subprog_info[i + 1].start; + subprog_end = env->subprog_info[i + 1].start; len = subprog_end - subprog_start; func[i] = bpf_prog_alloc(bpf_prog_size(len), GFP_USER); -- 2.7.4
Re: [PATCH bpf-next 0/3] bpf: cleanups on managing subprog information
On 02/05/2018 18:24, John Fastabend wrote:
> On 05/02/2018 09:59 AM, Jiong Wang wrote:
>> On 01/05/2018 23:22, Alexei Starovoitov wrote:
>>> ...
>>> [ 27.784931] ? bpf_int_jit_compile+0x7ac/0xab0
>>> [ 27.785475] bpf_int_jit_compile+0x2b6/0xab0
>>> [ 27.786001] ? do_jit+0x6020/0x6020
>>> [ 27.786428] ? kasan_kmalloc+0xa0/0xd0
>>> [ 27.786885] bpf_check+0x2c05/0x4c40
>>> [ 27.787346] ? fixup_bpf_calls+0x1140/0x1140
>>> [ 27.787865] ? kasan_unpoison_shadow+0x30/0x40
>>> [ 27.788406] ? kasan_kmalloc+0xa0/0xd0
>>> [ 27.788865] ? memset+0x1f/0x40
>>> [ 27.789255] ? bpf_obj_name_cpy+0x2d/0x200
>>> [ 27.789750] bpf_prog_load+0xb07/0xeb0
>>>
>>> simply running test_verifier with JIT and kasan on.
>>
>> Ah, sorry, I should add "sysctl net/core/bpf_jit_enable=1" to my test
>> script; error reproduced.
>>
>> convert_ctx_accesses and fixup_bpf_calls might insert eBPF insns so
>> that prog->len changes. The new fake "exit" subprog, whose .start
>> offset is prog->len, should be updated as well. The "for" condition in
>> adjust_subprog_starts:
>>
>>     for (i = 0; i < env->subprog_cnt; i++) {
>>
>> needs to be changed into:
>>
>>     for (i = 0; i <= env->subprog_cnt; i++) {
>>
>> Will respin the patch set. Thanks.
>>
>> Regards,
>> Jiong
>
> Also a bit of a nit, but if you are doing a respin: how about renaming
> BPF_MAX_SUBPROGS -> BPF_MAX_PROGS? It will make the naming more accurate
> and also avoid some diffs below where changing '>=' to '>' is required.

I have been pondering renaming BPF_MAX_SUBPROGS to another name like what you suggested, but failed to convince myself, mostly because there are quite a few other variables etc. that use the "subprog" naming convention. I am also thinking "subprog" is fine, since the traditional main prog/func is a sub prog/func as well, just the entry one. So it might not be worth renaming everything related, and I tend to just keep it as is. Thanks.
Regards, Jiong @@ -191,7 +191,7 @@ struct bpf_verifier_env { bool seen_direct_write; struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ struct bpf_verifier_log log; - u32 subprog_starts[BPF_MAX_SUBPROGS]; + u32 subprog_starts[BPF_MAX_SUBPROGS + 1]; /* computes the stack depth of each bpf function */ u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1]; u32 subprog_cnt;
Re: [PATCH bpf-next 0/3] bpf: cleanups on managing subprog information
On 01/05/2018 23:22, Alexei Starovoitov wrote:
> ...
> [ 27.784931] ? bpf_int_jit_compile+0x7ac/0xab0
> [ 27.785475] bpf_int_jit_compile+0x2b6/0xab0
> [ 27.786001] ? do_jit+0x6020/0x6020
> [ 27.786428] ? kasan_kmalloc+0xa0/0xd0
> [ 27.786885] bpf_check+0x2c05/0x4c40
> [ 27.787346] ? fixup_bpf_calls+0x1140/0x1140
> [ 27.787865] ? kasan_unpoison_shadow+0x30/0x40
> [ 27.788406] ? kasan_kmalloc+0xa0/0xd0
> [ 27.788865] ? memset+0x1f/0x40
> [ 27.789255] ? bpf_obj_name_cpy+0x2d/0x200
> [ 27.789750] bpf_prog_load+0xb07/0xeb0
>
> simply running test_verifier with JIT and kasan on.

Ah, sorry, I should add "sysctl net/core/bpf_jit_enable=1" to my test script; error reproduced.

convert_ctx_accesses and fixup_bpf_calls might insert eBPF insns so that prog->len changes. The new fake "exit" subprog, whose .start offset is prog->len, should be updated as well. The "for" condition in adjust_subprog_starts:

    for (i = 0; i < env->subprog_cnt; i++) {

needs to be changed into:

    for (i = 0; i <= env->subprog_cnt; i++) {

Will respin the patch set. Thanks.

Regards,
Jiong
[PATCH bpf-next 2/3] bpf: centre subprog information fields
It is better to centre all subprog information fields into one structure. This structure could later serve as function node in call graph. Signed-off-by: Jiong Wang <jiong.w...@netronome.com> --- include/linux/bpf_verifier.h | 9 --- kernel/bpf/verifier.c| 62 +++- 2 files changed, 38 insertions(+), 33 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index f655b92..8f70dc1 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -173,6 +173,11 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log) #define BPF_MAX_SUBPROGS 256 +struct bpf_subprog_info { + u32 start; /* insn idx of function entry point */ + u16 stack_depth; /* max. stack depth used by this function */ +}; + /* single container for all structs * one verifier_env per bpf_check() call */ @@ -191,9 +196,7 @@ struct bpf_verifier_env { bool seen_direct_write; struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */ struct bpf_verifier_log log; - u32 subprog_starts[BPF_MAX_SUBPROGS + 1]; - /* computes the stack depth of each bpf function */ - u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1]; + struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1]; u32 subprog_cnt; }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 16ec977..9764b9b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -738,18 +738,19 @@ enum reg_arg_type { static int cmp_subprogs(const void *a, const void *b) { - return *(int *)a - *(int *)b; + return ((struct bpf_subprog_info *)a)->start - + ((struct bpf_subprog_info *)b)->start; } static int find_subprog(struct bpf_verifier_env *env, int off) { - u32 *p; + struct bpf_subprog_info *p; - p = bsearch(, env->subprog_starts, env->subprog_cnt, - sizeof(env->subprog_starts[0]), cmp_subprogs); + p = bsearch(, env->subprog_info, env->subprog_cnt, + sizeof(env->subprog_info[0]), cmp_subprogs); if (!p) return -ENOENT; - return p - env->subprog_starts; + return p - 
env->subprog_info; } @@ -769,15 +770,16 @@ static int add_subprog(struct bpf_verifier_env *env, int off) verbose(env, "too many subprograms\n"); return -E2BIG; } - env->subprog_starts[env->subprog_cnt++] = off; - sort(env->subprog_starts, env->subprog_cnt, -sizeof(env->subprog_starts[0]), cmp_subprogs, NULL); + env->subprog_info[env->subprog_cnt++].start = off; + sort(env->subprog_info, env->subprog_cnt, +sizeof(env->subprog_info[0]), cmp_subprogs, NULL); return 0; } static int check_subprogs(struct bpf_verifier_env *env) { int i, ret, subprog_start, subprog_end, off, cur_subprog = 0; + struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; int insn_cnt = env->prog->len; @@ -807,14 +809,14 @@ static int check_subprogs(struct bpf_verifier_env *env) if (env->log.level > 1) for (i = 0; i < env->subprog_cnt; i++) - verbose(env, "func#%d @%d\n", i, env->subprog_starts[i]); + verbose(env, "func#%d @%d\n", i, subprog[i].start); /* now check that all jumps are within the same subprog */ subprog_start = 0; if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = env->subprog_starts[cur_subprog + 1]; + subprog_end = subprog[cur_subprog + 1].start; for (i = 0; i < insn_cnt; i++) { u8 code = insn[i].code; @@ -843,8 +845,7 @@ static int check_subprogs(struct bpf_verifier_env *env) if (env->subprog_cnt == cur_subprog + 1) subprog_end = insn_cnt; else - subprog_end = - env->subprog_starts[cur_subprog + 1]; + subprog_end = subprog[cur_subprog + 1].start; } } return 0; @@ -1477,13 +1478,13 @@ static int update_stack_depth(struct bpf_verifier_env *env, const struct bpf_func_state *func, int off) { - u16 stack = env->subprog_stack_depth[func->subprogno]; + u16 stack = env->subprog_info[func->subprogno].stack_depth; if (stack >= -off) return 0; /* update known max for given subprogram */ - env->subprog_stack_depth[func->subprogno] = -off; +
[PATCH bpf-next 0/3] bpf: cleanups on managing subprog information
This patch set cleans up some code logic related to managing subprog information. Part of the set is inspired by Edward's code in his RFC: "bpf/verifier: subprog/func_call simplifications", but with clearer separation so it could be easier to review.

- Patch 1 unifies main prog and subprogs. All of them are registered in env->subprog_starts.

- After patch 1, it is clear that subprog_starts and subprog_stack_depth could be merged, as both of them now have main prog and subprogs unified. Patch 2 therefore does the merge; all subprog information is centred in bpf_subprog_info.

- Patch 3 goes further to introduce a new fake "exit" subprog which serves as an ending marker for the subprog list. We can then turn the following code snippet, repeated across the verifier:

    if (env->subprog_cnt == cur_subprog + 1)
            subprog_end = insn_cnt;
    else
            subprog_end = env->subprog_info[cur_subprog + 1].start;

  into:

    subprog_end = env->subprog_info[cur_subprog + 1].start;

There is no functional change in this patch set. No bpf selftest regression found after this patch set.

Jiong Wang (3):
  bpf: unify main prog and subprog
  bpf: centre subprog information fields
  bpf: add faked "ending" subprog

 include/linux/bpf_verifier.h |   9 ++--
 kernel/bpf/verifier.c        | 118 +--
 2 files changed, 65 insertions(+), 62 deletions(-)

-- 
2.7.4
[PATCH bpf-next 3/3] bpf: add faked "ending" subprog
There are quite a few code snippets like the following in the verifier:

	subprog_start = 0;
	if (env->subprog_cnt == cur_subprog + 1)
		subprog_end = insn_cnt;
	else
		subprog_end = env->subprog_info[cur_subprog + 1].start;

The reason is there is no marker in the subprog_info array to tell where it ends. We could resolve this by introducing a fake "ending" subprog. This special subprog has "insn_cnt" as its start offset, so it serves as the end marker whenever we iterate over all subprogs.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/verifier.c | 31 ++++++++++++-------------------
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9764b9b..4a081e0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -766,7 +766,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off)
 	ret = find_subprog(env, off);
 	if (ret >= 0)
 		return 0;
-	if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
+	if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
 		verbose(env, "too many subprograms\n");
 		return -E2BIG;
 	}
@@ -807,16 +807,18 @@ static int check_subprogs(struct bpf_verifier_env *env)
 		return ret;
 	}
 
+	/* Add a fake 'exit' subprog which could simplify subprog iteration
+	 * logic. 'subprog_cnt' should not be increased.
+	 */
+	subprog[env->subprog_cnt].start = insn_cnt;
+
 	if (env->log.level > 1)
 		for (i = 0; i < env->subprog_cnt; i++)
 			verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
 	/* now check that all jumps are within the same subprog */
-	subprog_start = 0;
-	if (env->subprog_cnt == cur_subprog + 1)
-		subprog_end = insn_cnt;
-	else
-		subprog_end = subprog[cur_subprog + 1].start;
+	subprog_start = subprog[cur_subprog].start;
+	subprog_end = subprog[cur_subprog + 1].start;
 	for (i = 0; i < insn_cnt; i++) {
 		u8 code = insn[i].code;
 
@@ -840,11 +842,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			verbose(env, "last insn is not an exit or jmp\n");
 			return -EINVAL;
 		}
-		cur_subprog++;
 		subprog_start = subprog_end;
-		if (env->subprog_cnt == cur_subprog + 1)
-			subprog_end = insn_cnt;
-		else
+		cur_subprog++;
+		if (cur_subprog < env->subprog_cnt)
 			subprog_end = subprog[cur_subprog + 1].start;
 	}
 }
@@ -1499,7 +1499,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
 	int depth = 0, frame = 0, idx = 0, i = 0, subprog_end;
 	struct bpf_subprog_info *subprog = env->subprog_info;
 	struct bpf_insn *insn = env->prog->insnsi;
-	int insn_cnt = env->prog->len;
 	int ret_insn[MAX_CALL_FRAMES];
 	int ret_prog[MAX_CALL_FRAMES];
 
@@ -1514,10 +1513,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
 		return -EACCES;
 	}
 continue_func:
-	if (env->subprog_cnt == idx + 1)
-		subprog_end = insn_cnt;
-	else
-		subprog_end = subprog[idx + 1].start;
+	subprog_end = subprog[idx + 1].start;
 	for (; i < subprog_end; i++) {
 		if (insn[i].code != (BPF_JMP | BPF_CALL))
 			continue;
@@ -5268,10 +5264,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 
 	for (i = 0; i < env->subprog_cnt; i++) {
 		subprog_start = subprog_end;
-		if (env->subprog_cnt == i + 1)
-			subprog_end = prog->len;
-		else
-			subprog_end = env->subprog_info[i + 1].start;
+		subprog_end = env->subprog_info[i + 1].start;
 		len = subprog_end - subprog_start;
 
 		func[i] = bpf_prog_alloc(bpf_prog_size(len), GFP_USER);
-- 
2.7.4
[PATCH bpf-next 1/3] bpf: unify main prog and subprog
Currently, the verifier treats the main prog and subprogs differently. All detected subprogs are kept in env->subprog_starts while the main prog is not; instead, the main prog is implicitly defined as the prog starting at insn 0.

There is actually no difference between the main prog and a subprog, so it is better to unify them and register all progs detected into env->subprog_starts. This also helps simplify some code logic.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  2 +-
 kernel/bpf/verifier.c        | 57 +++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7e61c39..f655b92 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -191,7 +191,7 @@ struct bpf_verifier_env {
 	bool seen_direct_write;
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	struct bpf_verifier_log log;
-	u32 subprog_starts[BPF_MAX_SUBPROGS];
+	u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
 	/* computes the stack depth of each bpf function */
 	u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
 	u32 subprog_cnt;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eb1a596..16ec977 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -765,7 +765,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off)
 	ret = find_subprog(env, off);
 	if (ret >= 0)
 		return 0;
-	if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
+	if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
 		verbose(env, "too many subprograms\n");
 		return -E2BIG;
 	}
@@ -781,6 +781,11 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	struct bpf_insn *insn = env->prog->insnsi;
 	int insn_cnt = env->prog->len;
 
+	/* Add entry function. */
+	ret = add_subprog(env, 0);
+	if (ret < 0)
+		return ret;
+
 	/* determine subprog starts.
	   The end is one before the next starts */
 	for (i = 0; i < insn_cnt; i++) {
 		if (insn[i].code != (BPF_JMP | BPF_CALL))
@@ -806,10 +811,10 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
 	/* now check that all jumps are within the same subprog */
 	subprog_start = 0;
-	if (env->subprog_cnt == cur_subprog)
+	if (env->subprog_cnt == cur_subprog + 1)
 		subprog_end = insn_cnt;
 	else
-		subprog_end = env->subprog_starts[cur_subprog++];
+		subprog_end = env->subprog_starts[cur_subprog + 1];
 	for (i = 0; i < insn_cnt; i++) {
 		u8 code = insn[i].code;
 
@@ -833,11 +838,13 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			verbose(env, "last insn is not an exit or jmp\n");
 			return -EINVAL;
 		}
+		cur_subprog++;
 		subprog_start = subprog_end;
-		if (env->subprog_cnt == cur_subprog)
+		if (env->subprog_cnt == cur_subprog + 1)
 			subprog_end = insn_cnt;
 		else
-			subprog_end = env->subprog_starts[cur_subprog++];
+			subprog_end =
+				env->subprog_starts[cur_subprog + 1];
 	}
 }
 return 0;
@@ -1505,10 +1512,10 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
 		return -EACCES;
 	}
 continue_func:
-	if (env->subprog_cnt == subprog)
+	if (env->subprog_cnt == subprog + 1)
 		subprog_end = insn_cnt;
 	else
-		subprog_end = env->subprog_starts[subprog];
+		subprog_end = env->subprog_starts[subprog + 1];
 	for (; i < subprog_end; i++) {
 		if (insn[i].code != (BPF_JMP | BPF_CALL))
 			continue;
@@ -1526,7 +1533,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
 			  i);
 		return -EFAULT;
 	}
-	subprog++;
 	frame++;
 	if (frame >= MAX_CALL_FRAMES) {
 		WARN_ONCE(1, "verifier bug. Call stack is too deep\n");
@@ -1558,7 +1564,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
 			  start);
 		return -EFAULT;
 	}
-	subprog++;
 	return env->subprog_stack_depth[subprog];
 }
 #endif
@@ -2087,7 +2092,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 	case BPF_FUNC_tail_call:
 		if (map->map_type !
Re: [RFC PATCH v3 bpf-next 4/5] bpf/verifier: use call graph to efficiently check max stack depth
On 06/04/2018 18:14, Edward Cree wrote:

> + * Since that algorithm selects
> + * nodes in the order of the sorted output, we can do our processing in the loop
> + * that does the tsort, rather than storing the sorted list and then having a
> + * second loop to iterate over it and compute the total_stack_depth values. */

IMO, separating the sort from the depth calculation is better than combining them. The tsort of the call graph could potentially be used in other places.
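The separation being argued for can be sketched in userspace C: a Kahn's-algorithm tsort of the call graph as one reusable pass, and the total-stack-depth computation as a second loop over its output. All names (`calls[][]`, `tsort`, `compute_depth`) are illustrative, not from either patch:

```c
#include <assert.h>

#define NSUB 3

int calls[NSUB][NSUB];	/* calls[i][j]: subprog i calls subprog j */
int stack_depth[NSUB];	/* per-subprog frame size */
int total_depth[NSUB];	/* worst-case depth of a chain ending at subprog */

/* Kahn's algorithm: emit callers before callees; -1 means a cycle
 * (i.e. recursion), which the verifier would reject. */
int tsort(int order[NSUB])
{
	int indeg[NSUB] = {0}, head = 0, tail = 0;

	for (int i = 0; i < NSUB; i++)
		for (int j = 0; j < NSUB; j++)
			if (calls[i][j])
				indeg[j]++;
	for (int i = 0; i < NSUB; i++)
		if (!indeg[i])
			order[tail++] = i;
	while (head < tail) {
		int u = order[head++];

		for (int j = 0; j < NSUB; j++)
			if (calls[u][j] && --indeg[j] == 0)
				order[tail++] = j;
	}
	return tail == NSUB ? 0 : -1;
}

/* separate pass: consume the sorted order to accumulate depths */
void compute_depth(const int order[NSUB])
{
	for (int k = 0; k < NSUB; k++) {
		int u = order[k];

		total_depth[u] = stack_depth[u];
		for (int i = 0; i < NSUB; i++)	/* all callers of u */
			if (calls[i][u] &&
			    total_depth[i] + stack_depth[u] > total_depth[u])
				total_depth[u] = total_depth[i] + stack_depth[u];
	}
}

/* demo: main(16B) -> sub1(32B) -> sub2(8B); deepest chain is 56 bytes */
int demo_depth(void)
{
	int order[NSUB];

	calls[0][1] = 1;
	calls[1][2] = 1;
	stack_depth[0] = 16;
	stack_depth[1] = 32;
	stack_depth[2] = 8;
	if (tsort(order))
		return -1;
	compute_depth(order);
	return total_depth[2];
}
```

Keeping `tsort()` free of depth logic is exactly what makes it reusable for other call-graph consumers later.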
Re: [RFC PATCH v3 bpf-next 0/5] bpf/verifier: subprog/func_call simplifications
On 06/04/2018 18:11, Edward Cree wrote:

> By storing subprog boundaries as a subprogno mark on each insn, rather
> than a start (and implicit end) for each subprog, the code handling
> subprogs can in various places be made simpler, more explicit, and more
> efficient (i.e. better asymptotic complexity).

IMHO, the subprog number might be much smaller than the insn number, so just keeping sorted subprog information in env->subprog_info and doing a bsearch looks reasonable to me. The newly added "subprogno" field is only used around find_subprog, so I feel it is a little bit heavy to allocate a dedicated per-insn field that wouldn't be used broadly.

> This also lays the ground for possible future work in which an extension
> of the check_subprogs() code to cover basic blocks could replace
> check_cfg(), which currently largely duplicates the insn-walking part.

I was thinking of removing the insn-walk duplication the other way around, by centralizing it in the DFS insn-walk in check_cfg, as we can't remove that one before we migrate to another loop detection method. This approach is questionable though.

> Some other changes were also included to support this:
> * Per-subprog info is stored in env->subprog_info, an array of structs,
>   rather than several arrays with a common index.
> * Subprog numbers and corresponding subprog_info index are now 0-based;
>   subprog 0 is the main()-function of the program, starting at insn 0.
> * Call graph is now stored in the new bpf_subprog_info struct; used here
>   for check_max_stack_depth() but may have other uses too.

Ack to the changes above, which centralize all subprog related info into env->subprog_info, and to the new addition of the callee field.
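The bsearch-based lookup being defended here is small enough to sketch in full (userspace analogue of the verifier's find_subprog; the kernel version returns -ENOENT and uses its own sort/bsearch helpers):

```c
#include <assert.h>
#include <stdlib.h>

int subprog_starts[] = {0, 5, 12};	/* kept sorted by add_subprog */
int subprog_cnt = 3;

int cmp_subprogs(const void *a, const void *b)
{
	return *(const int *)a - *(const int *)b;
}

/* return the index of the subprog starting at off, or -1 if none does */
int find_subprog(int off)
{
	int *p = bsearch(&off, subprog_starts, subprog_cnt,
			 sizeof(subprog_starts[0]), cmp_subprogs);

	return p ? (int)(p - subprog_starts) : -1;
}
```

The trade-off in the thread: this is O(log n) per lookup with n bounded by BPF_MAX_SUBPROGS, versus O(1) lookup but one extra field on every instruction with the per-insn subprogno mark.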
Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications
On 03/04/2018 02:08, Alexei Starovoitov wrote:

> Combining subprog pass with do_check is going into opposite direction of
> this long term work. Divide and conquer. Combining more things into
> do_check is the opposite of this programming principle.

Agree. And for the redundant insn traversal issue in check_subprogs that Edward is trying to fix, I am thinking we could fix it by utilizing the existing DFS traversal in check_cfg. The current DFS could probably be improved into a generic instruction information collection pass. This won't make the existing DFS more complex, as it only does information collection as a side job during traversal. The collected information could then be used to build any other information to be consumed later, for example subprogs, basic blocks etc.

For the redundant insn traversal issue during subprog detection: like how we mark STATE_LIST_MARK in the DFS, we could just call add_subprog for BPF_PSEUDO_CALL insns during the DFS, i.e. we change the code logic of check_cfg into:

  check_cfg {
    * DFS traversal:
      - detect back-edge.
      - collect STATE_LIST_MARK.
      - collect subprog destination.
    * check all insns are reachable.
    * check_subprogs (insn traversal removed).
  }

I prototyped a patch for the above change; the change looks easier to review to me. There are 8 regressions on selftests after this change, however, because check_subprogs is moved after some checks in the DFS, so some errors are detected before the expected errors; this needs to be cleaned up.

Does this direction sound good?

And if we want to build basic blocks later, we could just call a new add_bb (similar to add_subprog) for jump targets etc. (some of those places are actually STATE_LIST_MARK_JMPTARGET, if we separate STATE_LIST_MARK into STATE_LIST_MARK_JMPTARGET and STATE_LIST_MARK_HEURISTIC).

Regards,
Jiong
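The "collect subprog destinations as a side job of the DFS" idea can be sketched in userspace C with a heavily simplified instruction encoding (`struct insn` with boolean flags and an absolute target is an illustration only; real eBPF insns are nothing like this):

```c
#include <assert.h>
#include <string.h>

#define MAXI 16

struct insn {
	int is_call;	/* pseudo-call: tgt is the callee entry insn */
	int is_jmp;	/* unconditional jump to tgt */
	int is_exit;
	int tgt;
};

int subprog_starts[MAXI];
int subprog_cnt;

/* insert off into the sorted start list, ignoring duplicates */
void add_subprog(int off)
{
	int i = 0;

	while (i < subprog_cnt && subprog_starts[i] < off)
		i++;
	if (i < subprog_cnt && subprog_starts[i] == off)
		return;
	memmove(&subprog_starts[i + 1], &subprog_starts[i],
		(subprog_cnt - i) * sizeof(int));
	subprog_starts[i] = off;
	subprog_cnt++;
}

/* DFS insn walk that records call destinations while it goes, the way
 * check_cfg could record them alongside STATE_LIST_MARK */
void walk(const struct insn *prog, int len)
{
	int stack[MAXI], top = 0, seen[MAXI] = {0};

	add_subprog(0);			/* main prog, per patch 1 */
	stack[top++] = 0;
	while (top) {
		int i = stack[--top];

		if (i >= len || seen[i])
			continue;
		seen[i] = 1;
		if (prog[i].is_call) {
			add_subprog(prog[i].tgt);	/* side job */
			stack[top++] = prog[i].tgt;
			stack[top++] = i + 1;	/* fall through after call */
		} else if (prog[i].is_jmp) {
			stack[top++] = prog[i].tgt;
		} else if (!prog[i].is_exit) {
			stack[top++] = i + 1;
		}
	}
}

/* demo: insn 0 calls insn 3; expect two subprogs starting at 0 and 3 */
int demo_walk(void)
{
	struct insn prog[5] = {
		{1, 0, 0, 3},	/* 0: call -> 3 */
		{0, 0, 0, 0},	/* 1: nop */
		{0, 0, 1, 0},	/* 2: exit */
		{0, 0, 0, 0},	/* 3: nop */
		{0, 0, 1, 0},	/* 4: exit */
	};

	walk(prog, 5);
	return subprog_cnt;
}
```

The point of the sketch is only the shape: one traversal, with subprog collection riding along instead of check_subprogs walking all insns a second time.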
Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking
On 13/02/2018 05:09, John Fastabend wrote:

> To find these we do the following:
>
> 1.a: Build the CFG of all the instructions so we can walk around the
>      program easily.
> 1.b: Find the basic blocks and build the basic block CFG. After this,
>      with a few helper calls we can move around the CFG both at block
>      level and insn level.
> 1.c: Build the dom tree and post dom tree. This is a data structure that
>      allows us to actually find the natural loops. I can write more about
>      this later.
> 1.d: Using the post dom tree, verify all the CFG loops are in fact
>      natural loops. I made some simplifications here and threw out loops
>      that have multiple jumps to the header node, and nested loops for
>      now. Both can be handled with some additional complexity.
>
> Versions of the above can be found in gcc and llvm so it appears to be
> fairly regular stuff from the compiler side. For LLVM the passes are
> --loops, --postdomtree. I dumped the various CFG into dot format and
> graphed them for fun.

This part of the work is very interesting. It would be very helpful if this CFG information were kept even after the verification pass finishes, or, if the storage overhead is a concern, made into reusable functions. I plan to implement a similar graph dump in bpftool, i.e. be able to dump an eBPF CFG into a .dot graph and color the graph according to some simple static analysis; these need exactly such CFG information.

Regards,
Jiong
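The dot dump being planned is mechanically simple: emit one `a -> b` line per CFG edge inside a `digraph`. A toy sketch (the block/edge data is invented, not derived from a real eBPF program, and `cfg_to_dot` is not a bpftool function):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NBB 4

int edges[NBB][NBB];	/* edges[i][j]: block i falls through / jumps to j */

/* write the CFG as Graphviz dot into buf; returns bytes written */
int cfg_to_dot(char *buf, size_t len)
{
	int n = snprintf(buf, len, "digraph cfg {\n");

	for (int i = 0; i < NBB; i++)
		for (int j = 0; j < NBB; j++)
			if (edges[i][j])
				n += snprintf(buf + n, len - n,
					      "  bb%d -> bb%d;\n", i, j);
	n += snprintf(buf + n, len - n, "}\n");
	return n;
}

/* demo: a diamond CFG, bb0 branching to bb1/bb2 which rejoin at bb3 */
const char *demo_dot(void)
{
	static char buf[256];

	edges[0][1] = edges[0][2] = edges[1][3] = edges[2][3] = 1;
	cfg_to_dot(buf, sizeof(buf));
	return buf;
}
```

Coloring by static analysis results then reduces to appending node attribute lines (e.g. `bb1 [color=red];`) before the closing brace.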
Re: [RFC bpf-next] bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info
On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:

What kind of string would sit in jited_arch_name for nfp?

The name is purely to let libbfd know which bfd backend to choose for the disassembler.

Given you know the {ifindex, netns_dev, netns_ino} 3-tuple, isn't it possible to infer the driver info from ethtool (e.g. nfp_net_get_drvinfo()) already and do the mapping for binutils?

Right, the inference can certainly work. Probably by matching the PCI ID of the device? Or we can just assume it's the NFP since there is only one upstream BPF offload :)

I checked the ethtool source code and I am thinking (and have tried) we could just call the existing "ifindex_to_name_ns" in bpftool, which returns the device name, then use ioctl/SIOCETHTOOL to get the "struct ethtool_drvinfo" associated with the device, then map the driver name to the bfd name. Will send out new patches following this approach shortly; please let me know if this is not good.

Given we know that when ifindex is 0 it is always host JITed, the current approach with bfd_openr() works fine; and if ifindex is non-0 it is /always/ device offloaded to the one bound in the ifindex, so the JIT dump will be specific to that device only and never the host one. Not at all opposed to extending uapi, just a question on my side to get a better understanding first wrt the arch string (maybe a rough but complete patch with nfp + bpftool bits might help too).

I was advocating for a full separate set of fields, Jiong was trying for a reuse, and we sort of met in the middle :)

Depends on how one looks at the definition of the jited_prog_insn field; old user space is not guaranteed to pay attention to ifindex. We will drop the arch and reuse the host fields if that's the direction you're leaning in. I also want to make sure adding the new "jited_image" and "jited_len" is the correct approach.
I was thinking of reusing existing fields for the host jit, i.e. bpf_prog->jited_len and bpf_prog->aux->jited_data, but different offload targets might have different structures for "jited_data", so we would need a new callback to parse it. Also, I feel keeping the host fields clean for the host JIT only might help prevent code logic for host and offload interleaving with each other, so I went with adding new fields.

-- 
Regards, 
Jiong
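The last step of the approach described above (ifindex -> netdev name -> SIOCETHTOOL drvinfo -> disassembler arch) ends in a pure string mapping, which can be sketched on its own. The function name and the "NFP-6xxx" arch string are illustrative placeholders, not confirmed bpftool/binutils identifiers:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map the driver name reported in struct ethtool_drvinfo.driver to the
 * bfd arch string to hand to the disassembler. NULL means no offload
 * mapping is known and the host-JIT path should be used instead. */
const char *driver_to_bfd_arch(const char *driver)
{
	if (!strcmp(driver, "nfp"))
		return "NFP-6xxx";	/* placeholder arch string */
	return NULL;
}
```

Fetching the driver name itself is a standard `ioctl(fd, SIOCETHTOOL, &ifr)` with `ethtool_drvinfo.cmd = ETHTOOL_GDRVINFO`, which needs a live netdev and so is not shown here.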