Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU

2018-12-07 Thread Jiong Wang

On 06/12/2018 03:13, Alexei Starovoitov wrote:

On Wed, Dec 05, 2018 at 03:32:50PM +, Jiong Wang wrote:

On 05/12/2018 14:52, Edward Cree wrote:

On 05/12/18 09:46, Jiong Wang wrote:

There is NO processed instruction number regression, either with or without
-mattr=+alu32.



Cilium bpf
===
bpf_lb-DLB_L3.o 2110/2110    1730/1733

That looks like a regression of 3 insns in the 32-bit case.
May be worth investigating why.


Will look into this.




+  dst_reg = insn->dst_reg;
+  regs[dst_reg] = regs[src_reg];
+  if (BPF_CLASS(insn->code) == BPF_ALU) {
+  /* Update type and range info. */
+  regs[dst_reg].type = SCALAR_VALUE;
+  coerce_reg_to_size(&regs[dst_reg], 4);

Won't this break when handed a pointer (as root, so allowed to leak
  it)?  E.g. (pointer + x) gets turned into scalar x, rather than
  unknown scalar in range [0, 0xffffffff].


Initially I was gating this to scalar_value only; later I was thinking it
could be extended to the ptr case if ptr leak is allowed.

But your comment reminds me that min/max doesn't mean the real min/max
value for ptr-typed values, it only tracks the offset, if I am
understanding the issue correctly. So, it will break the pointer case.


correct. In case of is_pointer_value() && root -> mark_reg_unknown() has to be 
called

The explanation of additional 3 steps from another email makes sense to me.

Can you take a look at why it helps the default (llvm alu64) case too ?
bpf_overlay.o   3096/2898


It is embarrassing that I am not able to reproduce this number after trying
quite a few env configurations. I think the number must be wrong because the
llvm alu64 binary doesn't contain alu32 moves, so it shouldn't be impacted by
this patch. Even though I double-checked the raw data I collected on llvm
alu64 and re-calculated the number before this patch, it is still 3096. I
guess there must be something wrong with the binary I was loading.

I improved my benchmarking methodology to build all alu64 and alu32
binaries first, and never change them later. Then I used a script to load
them and collect the processed numbers. (I borrowed the script from
https://github.com/4ast/bpf_cilium_test/, only my binaries are built from
the latest Cilium repo and contain alu32 versions as well.)

I ran this new benchmarking env several times and got the following
results consistently:

bpf_lb-DLB_L3.o:2085/2085 1685/1687
bpf_lb-DLB_L4.o:2287/2287 1986/1982
bpf_lb-DUNKNOWN.o:  690/690   622/622
bpf_lxc.o:  95033/95033   N/A
bpf_netdev.o:   7245/7245 N/A
bpf_overlay.o:  2898/2898 3085/2947

No change on alu64 binary.

For alu32, bpf_overlay.o still gets a lower processed instruction number.
This is because of the following sequence (and another similar one).
Before this patch, r2 at insn 139 is unknown, so the verifier always
explores both the path-taken and path-fall-through branches. After this
patch, it explores the fall-through path only, which saves some insns.

  129: (b4) (u32) r7 = (u32) -140
  ...
  136: (bc) (u32) r2 = (u32) r7
  137: (74) (u32) r2 >>= (u32) 31
  138: (4c) (u32) r2 |= (u32) r1
  139: (15) if r2 == 0x0 goto pc+342
  140: (b4) (u32) r1 = (u32) 2

With r7 being (u32) -140, insn 137 makes r2 equal to 1, and OR-ing r1 in at
insn 138 keeps bit 0 set, so r2 can never be 0 and the branch at insn 139 is
never taken. A more permissive value for r2 hasn't enabled any extra path
pruning for this test, so in all, after this patch, there are fewer
processed insns.

I have sent out a v2, gated this change under SCALAR_VALUE, and also
updated the patch description.

Thanks.

Regards,
Jiong



[PATCH v2 bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU

2018-12-07 Thread Jiong Wang
Currently, the destination register is marked as unknown for 32-bit
sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
SCALAR_VALUE.

This is too conservative: some valid cases will be rejected. In particular,
this may turn a constant scalar value into an unknown value, which can
break assumptions the verifier relies on.

For example, test_l4lb_noinline.c has the following C code:

struct real_definition *dst

1:  if (!get_packet_dst(, , vip_info, is_ipv6))
2:return TC_ACT_SHOT;
3:
4:  if (dst->flags & F_IPV6) {

get_packet_dst is responsible for initializing "dst" into a valid pointer
and returning true (1), otherwise returning false (0). The compiled
instruction sequence using alu32 will be:

  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

insn 413, a BPF_MOV | BPF_ALU, however, will turn r0 into an unknown value
even though r7 contains SCALAR_VALUE 1.

This causes trouble when the verifier is walking the code path that hasn't
initialized "dst" inside get_packet_dst, in which case 0 is returned and we
would expect the verifier to conclude that line 1 in the above C code takes
the "if" branch and therefore to skip the fall-through path starting at
line 4. Now, because r0 returned from the callee has become an unknown
value, the verifier won't skip analyzing the path starting at line 4, and
"dst->flags" requires dereferencing the pointer "dst", which actually
hasn't been initialized on this path.

This patch relaxes the code that marks the sub-register move destination.
For a SCALAR_VALUE, it is safe to just copy the register state from the
source and then truncate it to 32 bits.
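
To make the "copy then truncate" semantics concrete, here is a small
stand-alone sketch (this is NOT the verifier code: the real implementation
works on struct bpf_reg_state and uses coerce_reg_to_size(), which also
resets the bounds when they do not fit into 32 bits):

  #include <stdint.h>
  #include <stdio.h>

  /* Toy scalar state: only a [umin, umax] range is tracked. */
  struct toy_scalar {
          uint64_t umin;
          uint64_t umax;
  };

  /* 32-bit sub-register move: copy the source state, then truncate it to
   * the low 32 bits.  Bounds that do not fit in 32 bits are reset to the
   * full 32-bit range, i.e. the value becomes an unknown 32-bit scalar.
   */
  static struct toy_scalar mov32(struct toy_scalar src)
  {
          struct toy_scalar dst = src;            /* copy */

          if (dst.umax <= UINT32_MAX) {
                  dst.umin = (uint32_t)dst.umin;  /* truncate */
                  dst.umax = (uint32_t)dst.umax;
          } else {
                  dst.umin = 0;
                  dst.umax = UINT32_MAX;
          }
          return dst;
  }

  int main(void)
  {
          struct toy_scalar r7 = { 1, 1 };        /* known constant 1 */
          struct toy_scalar r0 = mov32(r7);

          /* r0 stays the known constant 1 instead of becoming unknown */
          printf("r0 in [%llu, %llu]\n",
                 (unsigned long long)r0.umin, (unsigned long long)r0.umax);
          return 0;
  }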

A unit test is also included to demonstrate this issue. The test fails
before this patch.

This relaxation could let the verifier skip more paths for conditional
comparisons against immediates. It also lets the verifier record a more
accurate/strict value for a register in a given state; if that state ends
up reaching exit without rejection and is later used for state comparison,
it is possible that an inaccurate/permissive value would have pruned more.
So the real impact on the verifier's processed insn count is complex. But
in all, without this fix, valid programs could be rejected.

From real benchmarking on kernel selftests and Cilium bpf tests, there is
no impact on the processed instruction number when tests are compiled with
default compilation options. There are slight improvements when they are
compiled with -mattr=+alu32 after this patch.

Also, test_xdp_noinline/-mattr=+alu32 now passes verification. It was
rejected before this fix.

Insn processed before/after this patch:

default -mattr=+alu32

Kernel selftest
===
test_xdp.o  371/371  369/369
test_l4lb.o 6345/6345    5623/5623
test_xdp_noinline.o 2971/2971    rejected/2727
test_tcp_estates.o  429/429  430/430

Cilium bpf
===
bpf_lb-DLB_L3.o:2085/2085 1685/1687
bpf_lb-DLB_L4.o:2287/2287 1986/1982
bpf_lb-DUNKNOWN.o:  690/690   622/622
bpf_lxc.o:  95033/95033   N/A
bpf_netdev.o:   7245/7245 N/A
bpf_overlay.o:  2898/2898 3085/2947

NOTE:
  - bpf_lxc.o and bpf_netdev.o compiled with -mattr=+alu32 are rejected by
    the verifier due to another issue inside the verifier with supporting
    alu32 binaries.
  - Each Cilium bpf object can generate several processed insn numbers (one
    per program); the number above is the sum of them.

v1->v2:
 - Restrict the change on SCALAR_VALUE.
 - Update benchmark numbers on Cilium bpf tests.

Signed-off-by: Jiong Wang 
---
 kernel/bpf/verifier.c   | 16 
 tools/testing/selftests/bpf/test_verifier.c | 13 +
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9584438..20a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3583,12 +3583,15 @@ static int check_alu_op(struct bpf_verifier_env *env, 
struct bpf_insn *insn)
return err;
 
if (BPF_SRC(insn->code) == BPF_X) {
+   struct bpf_reg_state *src_reg = regs + insn->src_reg;
+   struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
+
if (BPF_CLASS(insn->code) == BPF_ALU64) {
/* case: R1 = R2
 * copy register state to dest reg
 */
-   regs[insn->dst_reg] = regs[insn->src_reg];
-   regs[insn->dst_reg].live |= REG_LIVE_WRITTEN;
+   *dst_reg = *src_reg;
+   dst_reg->live |= REG_LIVE_WRITTEN;
} else {
/* R1 = (u32) R2 */
if (is_pointer_value(env, insn->src_reg)) {
@@

[PATCH v2 bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

2018-12-05 Thread Jiong Wang
"arsh32 on imm" and "arsh32 on reg" now are accepted. Also added two new
testcases to make sure arsh32 won't be treated as arsh64 during
interpretation or JIT code-gen for which case the high bits will be moved
into low halve that the testcases could catch them.
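
For reference, a quick host-side check of what the new "arsh32 on imm 2"
testcase expects (assumed semantics: arsh32 shifts only the low 32 bits and
zeroes the upper half, the test retval is the low 32 bits of r0, and
gcc/clang behaviour for signed conversions and shifts is assumed; a buggy
arsh64 treatment would pull high bits into the low half and give a
different value):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t r0 = 0x1122334485667788ULL;

          /* arsh32: arithmetic shift of the low 32 bits, upper half zeroed */
          uint64_t arsh32 = (uint64_t)(uint32_t)((int32_t)r0 >> 7);
          /* buggy arsh64 treatment: high bits leak into the low half */
          uint64_t arsh64 = (uint64_t)((int64_t)r0 >> 7);

          printf("arsh32 retval = %d\n", (int)(int32_t)arsh32);  /* -16069393 */
          printf("arsh64 low 32 = %d\n", (int)(int32_t)arsh64);  /* differs */
          return 0;
  }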

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 tools/testing/selftests/bpf/test_verifier.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c 
b/tools/testing/selftests/bpf/test_verifier.c
index 18d0b7f..e7f1bc7 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -721,8 +721,18 @@ static struct bpf_test tests[] = {
BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode c4",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on imm 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788),
+   BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = -16069393,
},
{
"arsh32 on reg",
@@ -732,8 +742,19 @@ static struct bpf_test tests[] = {
BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode cc",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on reg 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x55667788),
+   BPF_MOV64_IMM(BPF_REG_1, 15),
+   BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = 43724,
},
{
"arsh64 on imm",
-- 
2.7.4



[PATCH v2 bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH

2018-12-05 Thread Jiong Wang
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit 7891a87efc71
("bpf: arsh is not supported in 32 bit alu thus reject it"). As explained
in that commit message, this is because there was no complete support for
them in the interpreter and the various JIT compilation back-ends.

This patch set is a follow-up which completes the missing bits. It also
paves the way for running bpf programs compiled with ALU32 instructions
enabled by specifying -mattr=+alu32 to LLVM, in which case there are
likely to be more BPF_ALU | BPF_ARSH insns that would trigger the
rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP. I need help with review
and testing of the arch changes (mips/ppc/s390).

Note, there might be a merge conflict on the mips change, which is better
applied on top of:

  commit: 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op"),

which is on mips-fixes branch at the moment.

Thanks.

v1->v2:
 - Fix ppc implementation bug. Should zero high bits explicitly.

Cc: Paul Burton 
Cc: Naveen N. Rao 
Cc: Sandipan Das 
Cc: Martin Schwidefsky 
Cc: Heiko Carstens 

Jiong Wang (7):
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: interpreter support BPF_ALU | BPF_ARSH
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h |  1 +
 arch/mips/include/uapi/asm/inst.h|  1 +
 arch/mips/mm/uasm-micromips.c|  1 +
 arch/mips/mm/uasm-mips.c |  1 +
 arch/mips/mm/uasm.c  |  9 ++---
 arch/mips/net/ebpf_jit.c |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h|  2 ++
 arch/powerpc/net/bpf_jit.h   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c|  6 
 arch/s390/net/bpf_jit_comp.c | 12 +++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 kernel/bpf/core.c| 52 
 kernel/bpf/verifier.c|  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 +---
 14 files changed, 137 insertions(+), 35 deletions(-)

-- 
2.7.4



[PATCH v2 bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH

2018-12-05 Thread Jiong Wang
This patch removes the rejection of BPF_ALU | BPF_ARSH as we now support
them in the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/verifier.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, 
struct bpf_insn *insn)
return -EINVAL;
}
 
-   if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-   verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-   return -EINVAL;
-   }
-
if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
-- 
2.7.4



[PATCH v2 bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-05 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao 
Cc: Sandipan Das 
Signed-off-by: Jiong Wang 
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h| 4 
 arch/powerpc/net/bpf_jit_comp64.c | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW   0x7c000030
 #define PPC_INST_SLD   0x7c000036
 #define PPC_INST_SRW   0x7c000430
+#define PPC_INST_SRAW  0x7c000630
+#define PPC_INST_SRAWI 0x7c000670
 #define PPC_INST_SRD   0x7c000436
 #define PPC_INST_SRAD  0x7c000634
 #define PPC_INST_SRADI 0x7c000674
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)  EMIT(PPC_INST_SRAW | ___PPC_RA(d) |   \
+___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i) EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |  \
+___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)   EMIT(PPC_INST_SRD | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)  EMIT(PPC_INST_SRAD | ___PPC_RA(d) |   \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..7dc8187 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
if (imm != 0)
PPC_SRDI(dst_reg, dst_reg, imm);
break;
+   case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+   PPC_SRAW(dst_reg, dst_reg, src_reg);
+   goto bpf_alu32_trunc;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
PPC_SRAD(dst_reg, dst_reg, src_reg);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+   PPC_SRAWI(dst_reg, dst_reg, imm);
+   goto bpf_alu32_trunc;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
if (imm != 0)
PPC_SRADI(dst_reg, dst_reg, imm);
-- 
2.7.4



[PATCH v2 bpf-next 4/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-05 Thread Jiong Wang
BPF_X support needs indirect shift mode, please see code comments for
details.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct 
nfp_insn_meta *meta)
return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+   /* Set signedness bit (MSB of result). */
+   emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+   emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+SHF_SC_R_SHF, shift_amt);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = &meta->insn;
+   u64 umin, umax;
+   u8 dst, src;
+
+   dst = insn->dst_reg * 2;
+   umin = meta->umin_src;
+   umax = meta->umax_src;
+   if (umin == umax)
+   return __ashr_imm(nfp_prog, dst, umin);
+
+   src = insn->src_reg * 2;
+   /* NOTE: the first insn will set both indirect shift amount (source A)
+* and signedness bit (MSB of result).
+*/
+   emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+   emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+  reg_b(dst), SHF_SC_R_SHF);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = &meta->insn;
+   u8 dst = insn->dst_reg * 2;
+
+   return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
const struct bpf_insn *insn = &meta->insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
[BPF_ALU | BPF_DIV | BPF_K] =   div_imm,
[BPF_ALU | BPF_NEG] =   neg_reg,
[BPF_ALU | BPF_LSH | BPF_K] =   shl_imm,
+   [BPF_ALU | BPF_ARSH | BPF_X] =  ashr_reg,
+   [BPF_ALU | BPF_ARSH | BPF_K] =  ashr_imm,
[BPF_ALU | BPF_END | BPF_X] =   end_reg32,
[BPF_LD | BPF_IMM | BPF_DW] =   imm_ld8,
[BPF_LD | BPF_ABS | BPF_B] =data_ld1,
-- 
2.7.4



[PATCH v2 bpf-next 3/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-05 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Signed-off-by: Jiong Wang 
---
 arch/s390/net/bpf_jit_comp.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, 
struct bpf_prog *fp, int i
/*
 * BPF_ARSH
 */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+   /* sra %dst,%dst,0(%src) */
+   EMIT4_DISP(0x8a000000, dst_reg, src_reg, 0);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
/* srag %dst,%dst,0(%src) */
EMIT6_DISP_LH(0xeb000000, 0x000a, dst_reg, dst_reg, src_reg, 0);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst) >>= imm */
+   if (imm == 0)
+   break;
+   /* sra %dst,imm(%r0) */
+   EMIT4_DISP(0x8a000000, dst_reg, REG_0, imm);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
if (imm == 0)
break;
-- 
2.7.4



[PATCH v2 bpf-next 5/7] bpf: interpreter support BPF_ALU | BPF_ARSH

2018-12-05 Thread Jiong Wang
This patch implements interpreting BPF_ALU | BPF_ARSH. Do arithmetic right
shift on low 32-bit sub-register, and zero the high 32 bits.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/core.c | 52 ++--
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f93ed66..36e31d8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 #define BPF_INSN_MAP(INSN_2, INSN_3)   \
/* 32 bit ALU operations. */\
/*   Register based. */ \
-   INSN_3(ALU, ADD, X),\
-   INSN_3(ALU, SUB, X),\
-   INSN_3(ALU, AND, X),\
-   INSN_3(ALU, OR,  X),\
-   INSN_3(ALU, LSH, X),\
-   INSN_3(ALU, RSH, X),\
-   INSN_3(ALU, XOR, X),\
-   INSN_3(ALU, MUL, X),\
-   INSN_3(ALU, MOV, X),\
-   INSN_3(ALU, DIV, X),\
-   INSN_3(ALU, MOD, X),\
+   INSN_3(ALU, ADD,  X),   \
+   INSN_3(ALU, SUB,  X),   \
+   INSN_3(ALU, AND,  X),   \
+   INSN_3(ALU, OR,   X),   \
+   INSN_3(ALU, LSH,  X),   \
+   INSN_3(ALU, RSH,  X),   \
+   INSN_3(ALU, XOR,  X),   \
+   INSN_3(ALU, MUL,  X),   \
+   INSN_3(ALU, MOV,  X),   \
+   INSN_3(ALU, ARSH, X),   \
+   INSN_3(ALU, DIV,  X),   \
+   INSN_3(ALU, MOD,  X),   \
INSN_2(ALU, NEG),   \
INSN_3(ALU, END, TO_BE),\
INSN_3(ALU, END, TO_LE),\
/*   Immediate based. */\
-   INSN_3(ALU, ADD, K),\
-   INSN_3(ALU, SUB, K),\
-   INSN_3(ALU, AND, K),\
-   INSN_3(ALU, OR,  K),\
-   INSN_3(ALU, LSH, K),\
-   INSN_3(ALU, RSH, K),\
-   INSN_3(ALU, XOR, K),\
-   INSN_3(ALU, MUL, K),\
-   INSN_3(ALU, MOV, K),\
-   INSN_3(ALU, DIV, K),\
-   INSN_3(ALU, MOD, K),\
+   INSN_3(ALU, ADD,  K),   \
+   INSN_3(ALU, SUB,  K),   \
+   INSN_3(ALU, AND,  K),   \
+   INSN_3(ALU, OR,   K),   \
+   INSN_3(ALU, LSH,  K),   \
+   INSN_3(ALU, RSH,  K),   \
+   INSN_3(ALU, XOR,  K),   \
+   INSN_3(ALU, MUL,  K),   \
+   INSN_3(ALU, MOV,  K),   \
+   INSN_3(ALU, ARSH, K),   \
+   INSN_3(ALU, DIV,  K),   \
+   INSN_3(ALU, MOD,  K),   \
/* 64 bit ALU operations. */\
/*   Register based. */ \
INSN_3(ALU64, ADD,  X), \
@@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct 
bpf_insn *insn, u64 *stack)
DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
insn++;
CONT;
+   ALU_ARSH_X:
+   DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
+   CONT;
+   ALU_ARSH_K:
+   DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
+   CONT;
ALU64_ARSH_X:
(*(s64 *) &DST) >>= SRC;
CONT;
-- 
2.7.4



[PATCH v2 bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X

2018-12-05 Thread Jiong Wang
Jitting of BPF_K is supported already, but not BPF_X. This patch completes
the support for the latter on both MIPS and microMIPS.

Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Acked-by: Paul Burton 
Signed-off-by: Jiong Wang 
---
 arch/mips/include/asm/uasm.h  | 1 +
 arch/mips/include/uapi/asm/inst.h | 1 +
 arch/mips/mm/uasm-micromips.c | 1 +
 arch/mips/mm/uasm-mips.c  | 1 +
 arch/mips/mm/uasm.c   | 9 +
 arch/mips/net/ebpf_jit.c  | 4 
 6 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h
index 59dae37..b1990dd 100644
--- a/arch/mips/include/asm/uasm.h
+++ b/arch/mips/include/asm/uasm.h
@@ -157,6 +157,7 @@ Ip_u2u1s3(_slti);
 Ip_u2u1s3(_sltiu);
 Ip_u3u1u2(_sltu);
 Ip_u2u1u3(_sra);
+Ip_u3u2u1(_srav);
 Ip_u2u1u3(_srl);
 Ip_u3u2u1(_srlv);
 Ip_u3u1u2(_subu);
diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index 273ef58..40fbb5d 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -371,6 +371,7 @@ enum mm_32a_minor_op {
mm_srl32_op = 0x040,
mm_srlv32_op = 0x050,
mm_sra_op = 0x080,
+   mm_srav_op = 0x090,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,
diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c
index 24e5b0d..75ef904 100644
--- a/arch/mips/mm/uasm-micromips.c
+++ b/arch/mips/mm/uasm-micromips.c
@@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = {
[insn_sltiu]= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM},
[insn_sltu] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | 
RD},
[insn_sra]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | 
RD},
+   [insn_srav] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | 
RD},
[insn_srl]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | 
RD},
[insn_srlv] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS 
| RD},
[insn_rotr] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | 
RD},
diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c
index 60ceb93..6abe40f 100644
--- a/arch/mips/mm/uasm-mips.c
+++ b/arch/mips/mm/uasm-mips.c
@@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = {
[insn_sltiu]= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM},
[insn_sltu] = {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD},
[insn_sra]  = {M(spec_op, 0, 0, 0, 0, sra_op),  RT | RD | RE},
+   [insn_srav] = {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD},
[insn_srl]  = {M(spec_op, 0, 0, 0, 0, srl_op),  RT | RD | RE},
[insn_srlv] = {M(spec_op, 0, 0, 0, 0, srlv_op),  RS | RT | RD},
[insn_subu] = {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD},
diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c
index 57570c0..45b6264 100644
--- a/arch/mips/mm/uasm.c
+++ b/arch/mips/mm/uasm.c
@@ -61,10 +61,10 @@ enum opcode {
insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor,
insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb,
insn_sc, insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv,
-   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl,
-   insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp,
-   insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor,
-   insn_xori, insn_yield,
+   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav,
+   insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall,
+   insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh,
+   insn_xor, insn_xori, insn_yield,
insn_invalid /* insn_invalid must be last */
 };
 
@@ -353,6 +353,7 @@ I_u2u1s3(_slti)
 I_u2u1s3(_sltiu)
 I_u3u1u2(_sltu)
 I_u2u1u3(_sra)
+I_u3u2u1(_srav)
 I_u2u1u3(_srl)
 I_u3u2u1(_srlv)
 I_u2u1u3(_rotr)
diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index aeb7b1b..b16710a 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_RSH | BPF_X: /* ALU_REG */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
dst = ebpf_to_mips_reg(ctx, insn, dst_reg);
if (src < 0 || dst < 0)
@@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_RSH:
emit_instr(ctx, srlv, dst, dst, src);
break;
+   case BPF_ARSH:
+   emit_instr(ctx, srav, dst, dst, src);

Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU

2018-12-05 Thread Jiong Wang

On 05/12/2018 15:32, Jiong Wang wrote:

On 05/12/2018 14:52, Edward Cree wrote:

On 05/12/18 09:46, Jiong Wang wrote:
There is NO processed instruction number regression, either with or
without -mattr=+alu32.



Cilium bpf
===
bpf_lb-DLB_L3.o 2110/2110    1730/1733

That looks like a regression of 3 insns in the 32-bit case.
May be worth investigating why.


Will look into this.


Got some analysis from the trace log. Making a register contain an
accurate integer could affect both path skip (conditional jump against an
immediate) and path prune; details are described below.

But I want to emphasize that before this patch, turning a constant into
unknown could potentially cause the verifier to reject valid programs.

For example, for test_l4lb_noinline.c, there is the following code:

struct real_definition *dst

1:  if (!get_packet_dst(, , vip_info, is_ipv6))
2:return TC_ACT_SHOT;
3:
4:  if (dst->flags & F_IPV6) {

get_packet_dst is responsible for initializing dst into a valid pointer
and returning true (1), otherwise it returns false (0).

As described in the patch description, the return sequence using alu32
will be:

  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

which would turn r0 into an unknown value in all cases.

This is causing trouble when the code path hasn't initialized dst inside
get_packet_dst, in which case 0 is returned, and we would expect the
verifier to skip the fall-through path at C code line 4; otherwise it will
complain that dst is not a pointer and reject the program.

As for why there are 3 more processed insns under -mattr=+alu32 for
bpf_lb.o, there are two places where the verification logic diverges:

One is the following pattern:

 475: (bc) (u32) r1 = (u32) r7
 ...
 478: (55) if r1 != 0x7 goto pc+23
 ...

r1 gets its value from r7, which could be one of three different integers
(0, 7, -157) when reaching here.

Before this patch, insn 475 turns r1 into unknown, while the integer value
is preserved after this patch.

So, if r1 has a value other than 0x7, the fall-through path is skipped
after this patch.

But before the patch, after insn 478, the verifier could also refine the
value of r1 into constant 0x7, and could prune another path reaching insn
478 if the state comparison is safe.

From the trace log, preserving r1's integer value wins here: more paths
are skipped than pruned, so after this patch we get fewer processed insns
for this part of the sequence.

However, there is another pattern:

 360: (bc) (u32) r7 = (u32) r1
 361: (05) goto pc+74
 436: (bc) (u32) r1 = (u32) r7
 437: (67) r1 <<= 32
 438: (c7) r1 s>>= 32

For this case, one more path prune happened when r7 was an unknown value,
and the pruned path is a long one, which caused the total processed insn
number to become 3 more.



Re: [PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU

2018-12-05 Thread Jiong Wang

On 05/12/2018 14:52, Edward Cree wrote:

On 05/12/18 09:46, Jiong Wang wrote:

There is NO processed instruction number regression, either with or without
-mattr=+alu32.



Cilium bpf
===
bpf_lb-DLB_L3.o 2110/2110    1730/1733

That looks like a regression of 3 insns in the 32-bit case.
May be worth investigating why.


Will look into this.




+   dst_reg = insn->dst_reg;
+   regs[dst_reg] = regs[src_reg];
+   if (BPF_CLASS(insn->code) == BPF_ALU) {
+   /* Update type and range info. */
+   regs[dst_reg].type = SCALAR_VALUE;
+   coerce_reg_to_size(&regs[dst_reg], 4);

Won't this break when handed a pointer (as root, so allowed to leak
  it)?  E.g. (pointer + x) gets turned into scalar x, rather than
  unknown scalar in range [0, 0xffffffff].


Initially I was gating this to scalar_value only; later I was thinking it
could be extended to the ptr case if ptr leak is allowed.

But your comment reminds me that min/max doesn't mean the real min/max
value for ptr-typed values, it only tracks the offset, if I am
understanding the issue correctly. So, it will break the pointer case.

Regards,
Jiong


The existing behaviour is correct for pointers: 32 unknown bits,
  because the value of the pointer base is unknown.
It's only for scalar_values that you want to copy and truncate the
  var_off and min/max from the src_reg.

-Ed




Re: [PATCH bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-05 Thread Jiong Wang


Sandipan Das writes:

> Hi Jiong,
>
> On 05/12/18 2:25 AM, Jiong Wang wrote:
>> This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.
>> 
>> Cc: Naveen N. Rao 
>> Cc: Sandipan Das 
>> Signed-off-by: Jiong Wang 
>> ---
> [...]
>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
>> b/arch/powerpc/net/bpf_jit_comp64.c
>> index 17482f5..c685b4f 100644
>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>> @@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
>> *image,
>>  if (imm != 0)
>>  PPC_SRDI(dst_reg, dst_reg, imm);
>>  break;
>> +case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
>> +PPC_SRAW(dst_reg, dst_reg, src_reg);
>> +break;
>
> On ppc64, the sraw and srawi instructions also use sign extension. So, you 
> will have
> to ensure that upper 32 bits are cleared. We already have a label in our JIT 
> code
> called bpf_alu32_trunc that takes care of this. Replacing the break statement 
> with
> a goto bpf_alu32_trunc will fix this.

(resend the reply, got a delivery failure notification from
netdev@vger.kernel.org)

Indeed. Double-checked the ISA doc: "Bit 32 of RS is replicated to fill
RA 0:31".

Will fix both places in v2.

Thanks

Regards,
Jiong

>
>>  case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
>>  PPC_SRAD(dst_reg, dst_reg, src_reg);
>>  break;
>> +case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
>> +PPC_SRAWI(dst_reg, dst_reg, imm);
>> +break;
>
> Same here.
>
>>  case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
>>  if (imm != 0)
>>  PPC_SRADI(dst_reg, dst_reg, imm);
>> 
>
> With Regards,
> Sandipan



Re: [PATCH bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X

2018-12-05 Thread Jiong Wang

On 05/12/2018 00:02, Paul Burton wrote:

Hi Jiong,

On Tue, Dec 04, 2018 at 03:55:16PM -0500, Jiong Wang wrote:

Jitting of BPF_K is supported already, but not BPF_X. This patch completes
the support for the latter on both MIPS and microMIPS.

Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Jiong Wang 
---
  arch/mips/include/asm/uasm.h  | 1 +
  arch/mips/include/uapi/asm/inst.h | 1 +
  arch/mips/mm/uasm-micromips.c | 1 +
  arch/mips/mm/uasm-mips.c  | 1 +
  arch/mips/mm/uasm.c   | 9 +
  arch/mips/net/ebpf_jit.c  | 4 
  6 files changed, 13 insertions(+), 4 deletions(-)

I don't seem to have been copied on the rest of the series, but this
patch standalone looks good from a MIPS standpoint. If the series is
going through the net tree (and again, I can't see whether that seems
likely because I don't have the rest of the series) then:

 Acked-by: Paul Burton 

If you want me to take this patch through the MIPS tree instead then let
me know.


Hi Paul,

Thanks very much for the review. I'd like this set to go through the net
tree so that this feature can be enabled as a whole for all affected arches.

For the Cc, I was thinking the practice is to Cc people on the patches
touching their subsystems, but I guess a better practice might be to also
Cc them on the cover letter so the background and affected code scope are
clearer.

I will do this in v2.

Regards,
Jiong



[PATCH bpf-next] bpf: relax verifier restriction on BPF_MOV | BPF_ALU

2018-12-05 Thread Jiong Wang
Currently, the destination register is marked as unknown for 32-bit
sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
SCALAR_VALUE.

This is too conservative: some valid cases will be rejected. In particular,
this may turn a constant scalar value into an unknown value, on which some
runtime optimization could then fail. For example, there is the following
insn pattern for test_l4lb_noinline.c when it is compiled with -mattr=+alu32:

  return from callee:
  411: (b4) (u32) r7 = (u32) 0
  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

  to caller:
  202: (bc) (u32) r1 = (u32) r0
  203: (b4) (u32) r0 = (u32) 2
  204: (bf) r7 = r9
  205: (15) if r1 == 0x0 goto pc+69
  206: (79) r9 = *(u64 *)(r10 -80)
  207: (71) r1 = *(u8 *)(r9 +16)

The ending sequence in the callee is a sub-register move insn which should
make r0 be SCALAR_VALUE 0; however, the verifier conservatively marks r0 as
unknown, so r0 is not a constant anymore, instead it is:

   R0=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))

Then, after returning to the caller, r0 is copied back to r1 inside the
caller, where r1 is compared against constant 0. In the C code logic, the
stack slot (r10 - 80) will only hold a valid pointer when r1 is not zero,
in which case insn 207 will be a valid load. Otherwise the stack slot is
not initialized and insn 207 is invalid.

Now the code pattern above belongs to the path that initializes r1 to
zero, therefore this path won't initialize the stack slot, but the
verifier will smartly skip analyzing the fall-through path starting at
insn 206 because r1 is expected to be zero, so the comparison at insn 205
will be true. However, because r0 was conservatively marked as unknown in
the first place in the callee's ending sequence, we have lost the
information about r0/r1. The consequence is that the verifier will fail to
skip the fall-through path, and will dereference the stack slot at
(r10 - 80), which does not hold a valid spilled pointer on this path.

This patch relaxes the code that marks the sub-register move destination.
For a SCALAR_VALUE, or a pointer value which is allowed to be leaked as a
scalar value, it is safe to just copy the value from the source, force the
value type into SCALAR_VALUE and then truncate it to 32 bits.

A unit test is also included to demonstrate this issue. The test fails
before this patch.

As this relaxation potentially gives the verifier more accurate
information, bpf C programs of reasonable size have been benchmarked,
including those inside the kernel bpf selftests and those inside the
cilium repo. There is NO processed instruction number regression, either
with or without -mattr=+alu32. And there are some improvements:

  - test_xdp_noinline now passes verification under -mattr=+alu32.
  - a few cilium bpf programs got their processed insn number reduced.

Insn processed before/after this patch:

default -mattr=+alu32

Kernel selftest
===
test_xdp.o  371/371  369/369
test_l4lb.o 6345/6345    5623/5623
test_xdp_noinline.o 2971/2971    rejected/2727
test_tcp_estates.o  429/429  430/430

Cilium bpf
===
bpf_lb-DLB_L3.o     2110/2110    1730/1733
bpf_lb-DLB_L4.o     2251/2251    2037/2029
bpf_lb-DUNKNOWN.o   701/701      729/622
bpf_lxc.o           96023/95033  N/A
bpf_netdev.o        7456/7245    N/A
bpf_overlay.o       3096/2898    3085/2947

bpf_lxc.o and bpf_netdev.o compiled with -mattr=+alu32 are rejected by the
verifier because of another issue inside the verifier with supporting
alu32 binaries.

One Cilium bpf object can generate several processed insn numbers; the
number above is the sum of them.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/verifier.c   | 32 ++---
 tools/testing/selftests/bpf/test_verifier.c | 13 
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9584438..ce8a1c3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3583,23 +3583,23 @@ static int check_alu_op(struct bpf_verifier_env *env, 
struct bpf_insn *insn)
return err;
 
if (BPF_SRC(insn->code) == BPF_X) {
-   if (BPF_CLASS(insn->code) == BPF_ALU64) {
-   /* case: R1 = R2
-* copy register state to dest reg
-*/
-   regs[insn->dst_reg] = regs[insn->src_reg];
-   regs[insn->dst_reg].live |= REG_LIVE_WRITTEN;
-   } else {
-   /* R1 = (u32) R2 */
-   if (is_pointer_value(env, insn->src_reg)) {
-   verbose(env,
-   "R%d partial copy of pointer\n",
-

[PATCH bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
"arsh32 on imm" and "arsh32 on reg" now are accepted. Also added two new
testcases to make sure arsh32 won't be treated as arsh64 during
interpretation or JIT code-gen for which case the high bits will be moved
into low halve that the testcases could catch them.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 tools/testing/selftests/bpf/test_verifier.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c 
b/tools/testing/selftests/bpf/test_verifier.c
index 18d0b7f..e7f1bc7 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -721,8 +721,18 @@ static struct bpf_test tests[] = {
BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode c4",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on imm 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788),
+   BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = -16069393,
},
{
"arsh32 on reg",
@@ -732,8 +742,19 @@ static struct bpf_test tests[] = {
BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode cc",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on reg 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x55667788),
+   BPF_MOV64_IMM(BPF_REG_1, 15),
+   BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = 43724,
},
{
"arsh64 on imm",
-- 
2.7.4



[PATCH bpf-next 2/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao 
Cc: Sandipan Das 
Signed-off-by: Jiong Wang 
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h| 4 
 arch/powerpc/net/bpf_jit_comp64.c | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW   0x7c000030
 #define PPC_INST_SLD   0x7c000036
 #define PPC_INST_SRW   0x7c000430
+#define PPC_INST_SRAW  0x7c000630
+#define PPC_INST_SRAWI 0x7c000670
 #define PPC_INST_SRD   0x7c000436
 #define PPC_INST_SRAD  0x7c000634
 #define PPC_INST_SRADI 0x7c000674
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)  EMIT(PPC_INST_SRAW | ___PPC_RA(d) |   \
+___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i) EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |  \
+___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)   EMIT(PPC_INST_SRD | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)  EMIT(PPC_INST_SRAD | ___PPC_RA(d) |   \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..c685b4f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
if (imm != 0)
PPC_SRDI(dst_reg, dst_reg, imm);
break;
+   case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+   PPC_SRAW(dst_reg, dst_reg, src_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
PPC_SRAD(dst_reg, dst_reg, src_reg);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+   PPC_SRAWI(dst_reg, dst_reg, imm);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
if (imm != 0)
PPC_SRADI(dst_reg, dst_reg, imm);
-- 
2.7.4



[PATCH bpf-next 5/7] bpf: interpreter support BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
This patch implements interpreting BPF_ALU | BPF_ARSH. Do arithmetic right
shift on low 32-bit sub-register, and zero the high 32 bits.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/core.c | 52 ++--
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f93ed66..36e31d8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 #define BPF_INSN_MAP(INSN_2, INSN_3)   \
/* 32 bit ALU operations. */\
/*   Register based. */ \
-   INSN_3(ALU, ADD, X),\
-   INSN_3(ALU, SUB, X),\
-   INSN_3(ALU, AND, X),\
-   INSN_3(ALU, OR,  X),\
-   INSN_3(ALU, LSH, X),\
-   INSN_3(ALU, RSH, X),\
-   INSN_3(ALU, XOR, X),\
-   INSN_3(ALU, MUL, X),\
-   INSN_3(ALU, MOV, X),\
-   INSN_3(ALU, DIV, X),\
-   INSN_3(ALU, MOD, X),\
+   INSN_3(ALU, ADD,  X),   \
+   INSN_3(ALU, SUB,  X),   \
+   INSN_3(ALU, AND,  X),   \
+   INSN_3(ALU, OR,   X),   \
+   INSN_3(ALU, LSH,  X),   \
+   INSN_3(ALU, RSH,  X),   \
+   INSN_3(ALU, XOR,  X),   \
+   INSN_3(ALU, MUL,  X),   \
+   INSN_3(ALU, MOV,  X),   \
+   INSN_3(ALU, ARSH, X),   \
+   INSN_3(ALU, DIV,  X),   \
+   INSN_3(ALU, MOD,  X),   \
INSN_2(ALU, NEG),   \
INSN_3(ALU, END, TO_BE),\
INSN_3(ALU, END, TO_LE),\
/*   Immediate based. */\
-   INSN_3(ALU, ADD, K),\
-   INSN_3(ALU, SUB, K),\
-   INSN_3(ALU, AND, K),\
-   INSN_3(ALU, OR,  K),\
-   INSN_3(ALU, LSH, K),\
-   INSN_3(ALU, RSH, K),\
-   INSN_3(ALU, XOR, K),\
-   INSN_3(ALU, MUL, K),\
-   INSN_3(ALU, MOV, K),\
-   INSN_3(ALU, DIV, K),\
-   INSN_3(ALU, MOD, K),\
+   INSN_3(ALU, ADD,  K),   \
+   INSN_3(ALU, SUB,  K),   \
+   INSN_3(ALU, AND,  K),   \
+   INSN_3(ALU, OR,   K),   \
+   INSN_3(ALU, LSH,  K),   \
+   INSN_3(ALU, RSH,  K),   \
+   INSN_3(ALU, XOR,  K),   \
+   INSN_3(ALU, MUL,  K),   \
+   INSN_3(ALU, MOV,  K),   \
+   INSN_3(ALU, ARSH, K),   \
+   INSN_3(ALU, DIV,  K),   \
+   INSN_3(ALU, MOD,  K),   \
/* 64 bit ALU operations. */\
/*   Register based. */ \
INSN_3(ALU64, ADD,  X), \
@@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct 
bpf_insn *insn, u64 *stack)
DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
insn++;
CONT;
+   ALU_ARSH_X:
+   DST = (u64) (u32) ((*(s32 *) &DST) >> SRC);
+   CONT;
+   ALU_ARSH_K:
+   DST = (u64) (u32) ((*(s32 *) &DST) >> IMM);
+   CONT;
ALU64_ARSH_X:
(*(s64 *) &DST) >>= SRC;
CONT;
-- 
2.7.4



[PATCH bpf-next 1/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X

2018-12-04 Thread Jiong Wang
Jitting of BPF_K is supported already, but not BPF_X. This patch completes
the support for the latter on both MIPS and microMIPS.

Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Jiong Wang 
---
 arch/mips/include/asm/uasm.h  | 1 +
 arch/mips/include/uapi/asm/inst.h | 1 +
 arch/mips/mm/uasm-micromips.c | 1 +
 arch/mips/mm/uasm-mips.c  | 1 +
 arch/mips/mm/uasm.c   | 9 +
 arch/mips/net/ebpf_jit.c  | 4 
 6 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h
index 59dae37..b1990dd 100644
--- a/arch/mips/include/asm/uasm.h
+++ b/arch/mips/include/asm/uasm.h
@@ -157,6 +157,7 @@ Ip_u2u1s3(_slti);
 Ip_u2u1s3(_sltiu);
 Ip_u3u1u2(_sltu);
 Ip_u2u1u3(_sra);
+Ip_u3u2u1(_srav);
 Ip_u2u1u3(_srl);
 Ip_u3u2u1(_srlv);
 Ip_u3u1u2(_subu);
diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index 273ef58..40fbb5d 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -371,6 +371,7 @@ enum mm_32a_minor_op {
mm_srl32_op = 0x040,
mm_srlv32_op = 0x050,
mm_sra_op = 0x080,
+   mm_srav_op = 0x090,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,
diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c
index 24e5b0d..75ef904 100644
--- a/arch/mips/mm/uasm-micromips.c
+++ b/arch/mips/mm/uasm-micromips.c
@@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = {
[insn_sltiu]= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM},
[insn_sltu] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | 
RD},
[insn_sra]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | 
RD},
+   [insn_srav] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | 
RD},
[insn_srl]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | 
RD},
[insn_srlv] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS 
| RD},
[insn_rotr] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | 
RD},
diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c
index 60ceb93..6abe40f 100644
--- a/arch/mips/mm/uasm-mips.c
+++ b/arch/mips/mm/uasm-mips.c
@@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = {
[insn_sltiu]= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM},
[insn_sltu] = {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD},
[insn_sra]  = {M(spec_op, 0, 0, 0, 0, sra_op),  RT | RD | RE},
+   [insn_srav] = {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD},
[insn_srl]  = {M(spec_op, 0, 0, 0, 0, srl_op),  RT | RD | RE},
[insn_srlv] = {M(spec_op, 0, 0, 0, 0, srlv_op),  RS | RT | RD},
[insn_subu] = {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD},
diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c
index 57570c0..45b6264 100644
--- a/arch/mips/mm/uasm.c
+++ b/arch/mips/mm/uasm.c
@@ -61,10 +61,10 @@ enum opcode {
insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor,
insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb,
insn_sc, insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv,
-   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl,
-   insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp,
-   insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor,
-   insn_xori, insn_yield,
+   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav,
+   insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall,
+   insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh,
+   insn_xor, insn_xori, insn_yield,
insn_invalid /* insn_invalid must be last */
 };
 
@@ -353,6 +353,7 @@ I_u2u1s3(_slti)
 I_u2u1s3(_sltiu)
 I_u3u1u2(_sltu)
 I_u2u1u3(_sra)
+I_u3u2u1(_srav)
 I_u2u1u3(_srl)
 I_u3u2u1(_srlv)
 I_u2u1u3(_rotr)
diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index aeb7b1b..b16710a 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_RSH | BPF_X: /* ALU_REG */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
dst = ebpf_to_mips_reg(ctx, insn, dst_reg);
if (src < 0 || dst < 0)
@@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_RSH:
emit_instr(ctx, srlv, dst, dst, src);
break;
+   case BPF_ARSH:
+   emit_instr(ctx, srav, dst, dst, src);
+  

[PATCH bpf-next 4/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
BPF_X support needs indirect shift mode, please see code comments for
details.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct 
nfp_insn_meta *meta)
return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+   /* Set signedness bit (MSB of result). */
+   emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+   emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+SHF_SC_R_SHF, shift_amt);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = &meta->insn;
+   u64 umin, umax;
+   u8 dst, src;
+
+   dst = insn->dst_reg * 2;
+   umin = meta->umin_src;
+   umax = meta->umax_src;
+   if (umin == umax)
+   return __ashr_imm(nfp_prog, dst, umin);
+
+   src = insn->src_reg * 2;
+   /* NOTE: the first insn will set both indirect shift amount (source A)
+* and signedness bit (MSB of result).
+*/
+   emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+   emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+  reg_b(dst), SHF_SC_R_SHF);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = &meta->insn;
+   u8 dst = insn->dst_reg * 2;
+
+   return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
const struct bpf_insn *insn = &meta->insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
[BPF_ALU | BPF_DIV | BPF_K] =   div_imm,
[BPF_ALU | BPF_NEG] =   neg_reg,
[BPF_ALU | BPF_LSH | BPF_K] =   shl_imm,
+   [BPF_ALU | BPF_ARSH | BPF_X] =  ashr_reg,
+   [BPF_ALU | BPF_ARSH | BPF_K] =  ashr_imm,
[BPF_ALU | BPF_END | BPF_X] =   end_reg32,
[BPF_LD | BPF_IMM | BPF_DW] =   imm_ld8,
[BPF_LD | BPF_ABS | BPF_B] =data_ld1,
-- 
2.7.4



[PATCH bpf-next 3/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Signed-off-by: Jiong Wang 
---
 arch/s390/net/bpf_jit_comp.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, 
struct bpf_prog *fp, int i
/*
 * BPF_ARSH
 */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+   /* sra %dst,%dst,0(%src) */
+   EMIT4_DISP(0x8a000000, dst_reg, src_reg, 0);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
/* srag %dst,%dst,0(%src) */
EMIT6_DISP_LH(0xeb000000, 0x000a, dst_reg, dst_reg, src_reg, 0);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst) >>= imm */
+   if (imm == 0)
+   break;
+   /* sra %dst,imm(%r0) */
+   EMIT4_DISP(0x8a000000, dst_reg, REG_0, imm);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
if (imm == 0)
break;
-- 
2.7.4



[PATCH bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit 7891a87efc71
("bpf: arsh is not supported in 32 bit alu thus reject it"). As explained
in that commit message, this is because there was no complete support for
them in the interpreter and the various JIT compilation back-ends.

This patch set is a follow-up which completes the missing bits. It also
paves the way for running bpf programs compiled with ALU32 instructions
enabled by specifying -mattr=+alu32 to LLVM, in which case there are
likely to be more BPF_ALU | BPF_ARSH insns that would trigger the
rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP. I need help with review
and testing of the arch changes (mips/ppc/s390).

Note, there might be a merge conflict on the mips change, which is better
applied on top of:

  commit: 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op"),

which is on mips-fixes branch at the moment.

Thanks.

Jiong Wang (7):
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: interpreter support BPF_ALU | BPF_ARSH
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h |  1 +
 arch/mips/include/uapi/asm/inst.h|  1 +
 arch/mips/mm/uasm-micromips.c|  1 +
 arch/mips/mm/uasm-mips.c |  1 +
 arch/mips/mm/uasm.c  |  9 ++---
 arch/mips/net/ebpf_jit.c |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h|  2 ++
 arch/powerpc/net/bpf_jit.h   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c|  6 
 arch/s390/net/bpf_jit_comp.c | 12 +++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 kernel/bpf/core.c| 52 
 kernel/bpf/verifier.c|  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 +---
 14 files changed, 137 insertions(+), 35 deletions(-)

-- 
2.7.4



[PATCH bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
This patch removes the rejection of BPF_ALU | BPF_ARSH as we now support
them in the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/verifier.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, 
struct bpf_insn *insn)
return -EINVAL;
}
 
-   if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-   verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-   return -EINVAL;
-   }
-
if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
-- 
2.7.4



Re: [RFC bpf-next 1/7] bpf: interpreter support BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang

On 04/12/2018 20:10, David Miller wrote:

From: Alexei Starovoitov 
Date: Tue, 4 Dec 2018 11:29:55 -0800


I guess sparc doesn't really have 32-bit subregisters.  All registers
are considered 64-bit. It has 32-bit alu ops on 64-bit registers
instead.

Right.

Anyways, sparc will require two instructions because of this, the
'sra' then a 'srl' by zero bits to clear the top 32-bits.

I'll code up the sparc JIT part when this goes in.


Hmm, I have been going through all the JIT backends, and saw there is
do_alu32_trunc after jitting sra for BPF_ALU. Is that what's needed?

Regards,
Jiong




[RFC bpf-next 6/7] bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
This patch removes the rejection of BPF_ALU | BPF_ARSH as we now support
them in the interpreter and all JIT back-ends.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/verifier.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ce8a1c3..8ed868e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3649,11 +3649,6 @@ static int check_alu_op(struct bpf_verifier_env *env, 
struct bpf_insn *insn)
return -EINVAL;
}
 
-   if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
-   verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
-   return -EINVAL;
-   }
-
if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
-- 
2.7.4



[RFC bpf-next 3/7] ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Naveen N. Rao 
Cc: Sandipan Das 
Signed-off-by: Jiong Wang 
---
 arch/powerpc/include/asm/ppc-opcode.h | 2 ++
 arch/powerpc/net/bpf_jit.h| 4 
 arch/powerpc/net/bpf_jit_comp64.c | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e9e31..9014592 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -342,6 +342,8 @@
 #define PPC_INST_SLW   0x7c30
 #define PPC_INST_SLD   0x7c36
 #define PPC_INST_SRW   0x7c000430
+#define PPC_INST_SRAW  0x7c000630
+#define PPC_INST_SRAWI 0x7c000670
 #define PPC_INST_SRD   0x7c000436
 #define PPC_INST_SRAD  0x7c000634
 #define PPC_INST_SRADI 0x7c000674
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 47fc666..c2d5192 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -152,6 +152,10 @@
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRW(d, a, s)   EMIT(PPC_INST_SRW | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAW(d, a, s)  EMIT(PPC_INST_SRAW | ___PPC_RA(d) |   \
+___PPC_RS(a) | ___PPC_RB(s))
+#define PPC_SRAWI(d, a, i) EMIT(PPC_INST_SRAWI | ___PPC_RA(d) |  \
+___PPC_RS(a) | __PPC_SH(i))
 #define PPC_SRD(d, a, s)   EMIT(PPC_INST_SRD | ___PPC_RA(d) |\
 ___PPC_RS(a) | ___PPC_RB(s))
 #define PPC_SRAD(d, a, s)  EMIT(PPC_INST_SRAD | ___PPC_RA(d) |   \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index 17482f5..c685b4f 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -529,9 +529,15 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 
*image,
if (imm != 0)
PPC_SRDI(dst_reg, dst_reg, imm);
break;
+   case BPF_ALU | BPF_ARSH | BPF_X: /* (s32) dst >>= src */
+   PPC_SRAW(dst_reg, dst_reg, src_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* (s64) dst >>= src */
PPC_SRAD(dst_reg, dst_reg, src_reg);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* (s32) dst >>= imm */
+   PPC_SRAWI(dst_reg, dst_reg, imm);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* (s64) dst >>= imm */
if (imm != 0)
PPC_SRADI(dst_reg, dst_reg, imm);
-- 
2.7.4



[RFC bpf-next 2/7] mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X

2018-12-04 Thread Jiong Wang
Jitting of BPF_K is supported already, but not BPF_X. This patch completes
the support for the latter on both MIPS and microMIPS.

Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Jiong Wang 
---
 arch/mips/include/asm/uasm.h  | 1 +
 arch/mips/include/uapi/asm/inst.h | 1 +
 arch/mips/mm/uasm-micromips.c | 1 +
 arch/mips/mm/uasm-mips.c  | 1 +
 arch/mips/mm/uasm.c   | 9 +
 arch/mips/net/ebpf_jit.c  | 4 
 6 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/mips/include/asm/uasm.h b/arch/mips/include/asm/uasm.h
index 59dae37..b1990dd 100644
--- a/arch/mips/include/asm/uasm.h
+++ b/arch/mips/include/asm/uasm.h
@@ -157,6 +157,7 @@ Ip_u2u1s3(_slti);
 Ip_u2u1s3(_sltiu);
 Ip_u3u1u2(_sltu);
 Ip_u2u1u3(_sra);
+Ip_u3u2u1(_srav);
 Ip_u2u1u3(_srl);
 Ip_u3u2u1(_srlv);
 Ip_u3u1u2(_subu);
diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index 273ef58..40fbb5d 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -371,6 +371,7 @@ enum mm_32a_minor_op {
mm_srl32_op = 0x040,
mm_srlv32_op = 0x050,
mm_sra_op = 0x080,
+   mm_srav_op = 0x090,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,
diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c
index 24e5b0d..75ef904 100644
--- a/arch/mips/mm/uasm-micromips.c
+++ b/arch/mips/mm/uasm-micromips.c
@@ -104,6 +104,7 @@ static const struct insn insn_table_MM[insn_invalid] = {
[insn_sltiu]= {M(mm_sltiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM},
[insn_sltu] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sltu_op), RT | RS | 
RD},
[insn_sra]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_sra_op), RT | RS | 
RD},
+   [insn_srav] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srav_op), RT | RS | 
RD},
[insn_srl]  = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srl32_op), RT | RS | 
RD},
[insn_srlv] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_srlv32_op), RT | RS 
| RD},
[insn_rotr] = {M(mm_pool32a_op, 0, 0, 0, 0, mm_rotr_op), RT | RS | 
RD},
diff --git a/arch/mips/mm/uasm-mips.c b/arch/mips/mm/uasm-mips.c
index 60ceb93..6abe40f 100644
--- a/arch/mips/mm/uasm-mips.c
+++ b/arch/mips/mm/uasm-mips.c
@@ -171,6 +171,7 @@ static const struct insn insn_table[insn_invalid] = {
[insn_sltiu]= {M(sltiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM},
[insn_sltu] = {M(spec_op, 0, 0, 0, 0, sltu_op), RS | RT | RD},
[insn_sra]  = {M(spec_op, 0, 0, 0, 0, sra_op),  RT | RD | RE},
+   [insn_srav] = {M(spec_op, 0, 0, 0, 0, srav_op), RS | RT | RD},
[insn_srl]  = {M(spec_op, 0, 0, 0, 0, srl_op),  RT | RD | RE},
[insn_srlv] = {M(spec_op, 0, 0, 0, 0, srlv_op),  RS | RT | RD},
[insn_subu] = {M(spec_op, 0, 0, 0, 0, subu_op), RS | RT | RD},
diff --git a/arch/mips/mm/uasm.c b/arch/mips/mm/uasm.c
index 57570c0..45b6264 100644
--- a/arch/mips/mm/uasm.c
+++ b/arch/mips/mm/uasm.c
@@ -61,10 +61,10 @@ enum opcode {
insn_mthc0, insn_mthi, insn_mtlo, insn_mul, insn_multu, insn_nor,
insn_or, insn_ori, insn_pref, insn_rfe, insn_rotr, insn_sb,
insn_sc, insn_scd, insn_sd, insn_sh, insn_sll, insn_sllv,
-   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srl,
-   insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall, insn_tlbp,
-   insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh, insn_xor,
-   insn_xori, insn_yield,
+   insn_slt, insn_slti, insn_sltiu, insn_sltu, insn_sra, insn_srav,
+   insn_srl, insn_srlv, insn_subu, insn_sw, insn_sync, insn_syscall,
+   insn_tlbp, insn_tlbr, insn_tlbwi, insn_tlbwr, insn_wait, insn_wsbh,
+   insn_xor, insn_xori, insn_yield,
insn_invalid /* insn_invalid must be last */
 };
 
@@ -353,6 +353,7 @@ I_u2u1s3(_slti)
 I_u2u1s3(_sltiu)
 I_u3u1u2(_sltu)
 I_u2u1u3(_sra)
+I_u3u2u1(_srav)
 I_u2u1u3(_srl)
 I_u3u2u1(_srlv)
 I_u2u1u3(_rotr)
diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index aeb7b1b..b16710a 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -854,6 +854,7 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_ALU | BPF_MOD | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_LSH | BPF_X: /* ALU_REG */
case BPF_ALU | BPF_RSH | BPF_X: /* ALU_REG */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ALU_REG */
src = ebpf_to_mips_reg(ctx, insn, src_reg_no_fp);
dst = ebpf_to_mips_reg(ctx, insn, dst_reg);
if (src < 0 || dst < 0)
@@ -913,6 +914,9 @@ static int build_one_insn(const struct bpf_insn *insn, 
struct jit_ctx *ctx,
case BPF_RSH:
emit_instr(ctx, srlv, dst, dst, src);
break;
+   case BPF_ARSH:
+   emit_instr(ctx, srav, dst, dst, src);
+  

[RFC bpf-next 5/7] nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
BPF_X support needs indirect shift mode; please see the code comments for
details.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 1 file changed, 45 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 97d33bb..662cbc2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -2382,6 +2382,49 @@ static int neg_reg(struct nfp_prog *nfp_prog, struct 
nfp_insn_meta *meta)
return 0;
 }
 
+static int __ashr_imm(struct nfp_prog *nfp_prog, u8 dst, u8 shift_amt)
+{
+   /* Set signedness bit (MSB of result). */
+   emit_alu(nfp_prog, reg_none(), reg_a(dst), ALU_OP_OR, reg_imm(0));
+   emit_shf(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR, reg_b(dst),
+SHF_SC_R_SHF, shift_amt);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = >insn;
+   u64 umin, umax;
+   u8 dst, src;
+
+   dst = insn->dst_reg * 2;
+   umin = meta->umin_src;
+   umax = meta->umax_src;
+   if (umin == umax)
+   return __ashr_imm(nfp_prog, dst, umin);
+
+   src = insn->src_reg * 2;
+   /* NOTE: the first insn will set both indirect shift amount (source A)
+* and signedness bit (MSB of result).
+*/
+   emit_alu(nfp_prog, reg_none(), reg_a(src), ALU_OP_OR, reg_b(dst));
+   emit_shf_indir(nfp_prog, reg_both(dst), reg_none(), SHF_OP_ASHR,
+  reg_b(dst), SHF_SC_R_SHF);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+
+   return 0;
+}
+
+static int ashr_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+   const struct bpf_insn *insn = >insn;
+   u8 dst = insn->dst_reg * 2;
+
+   return __ashr_imm(nfp_prog, dst, insn->imm);
+}
+
 static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
const struct bpf_insn *insn = >insn;
@@ -3286,6 +3329,8 @@ static const instr_cb_t instr_cb[256] = {
[BPF_ALU | BPF_DIV | BPF_K] =   div_imm,
[BPF_ALU | BPF_NEG] =   neg_reg,
[BPF_ALU | BPF_LSH | BPF_K] =   shl_imm,
+   [BPF_ALU | BPF_ARSH | BPF_X] =  ashr_reg,
+   [BPF_ALU | BPF_ARSH | BPF_K] =  ashr_imm,
[BPF_ALU | BPF_END | BPF_X] =   end_reg32,
[BPF_LD | BPF_IMM | BPF_DW] =   imm_ld8,
[BPF_LD | BPF_ABS | BPF_B] =data_ld1,
-- 
2.7.4



[RFC bpf-next 1/7] bpf: interpreter support BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
This patch implements interpreting BPF_ALU | BPF_ARSH: do an arithmetic right
shift on the low 32-bit sub-register, and zero the high 32 bits.
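
For clarity, here is a minimal user-space model of the intended semantics (my
own illustration, not part of the patch; it assumes the shift amount is already
below 32 and relies on gcc/clang behaviour of >> on signed values, the same as
the interpreter code below):

    /* 32-bit BPF_ARSH model: arithmetic shift on the low 32-bit
     * sub-register, then zero-extend into the 64-bit destination.
     */
    static inline unsigned long long alu32_arsh(unsigned long long dst,
                                                unsigned int shift)
    {
            int lo = (int)(unsigned int)dst;    /* take the low 32 bits */

            return (unsigned long long)(unsigned int)(lo >> shift);
    }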

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 kernel/bpf/core.c | 52 ++--
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f93ed66..36e31d8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -923,32 +923,34 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
 #define BPF_INSN_MAP(INSN_2, INSN_3)   \
/* 32 bit ALU operations. */\
/*   Register based. */ \
-   INSN_3(ALU, ADD, X),\
-   INSN_3(ALU, SUB, X),\
-   INSN_3(ALU, AND, X),\
-   INSN_3(ALU, OR,  X),\
-   INSN_3(ALU, LSH, X),\
-   INSN_3(ALU, RSH, X),\
-   INSN_3(ALU, XOR, X),\
-   INSN_3(ALU, MUL, X),\
-   INSN_3(ALU, MOV, X),\
-   INSN_3(ALU, DIV, X),\
-   INSN_3(ALU, MOD, X),\
+   INSN_3(ALU, ADD,  X),   \
+   INSN_3(ALU, SUB,  X),   \
+   INSN_3(ALU, AND,  X),   \
+   INSN_3(ALU, OR,   X),   \
+   INSN_3(ALU, LSH,  X),   \
+   INSN_3(ALU, RSH,  X),   \
+   INSN_3(ALU, XOR,  X),   \
+   INSN_3(ALU, MUL,  X),   \
+   INSN_3(ALU, MOV,  X),   \
+   INSN_3(ALU, ARSH, X),   \
+   INSN_3(ALU, DIV,  X),   \
+   INSN_3(ALU, MOD,  X),   \
INSN_2(ALU, NEG),   \
INSN_3(ALU, END, TO_BE),\
INSN_3(ALU, END, TO_LE),\
/*   Immediate based. */\
-   INSN_3(ALU, ADD, K),\
-   INSN_3(ALU, SUB, K),\
-   INSN_3(ALU, AND, K),\
-   INSN_3(ALU, OR,  K),\
-   INSN_3(ALU, LSH, K),\
-   INSN_3(ALU, RSH, K),\
-   INSN_3(ALU, XOR, K),\
-   INSN_3(ALU, MUL, K),\
-   INSN_3(ALU, MOV, K),\
-   INSN_3(ALU, DIV, K),\
-   INSN_3(ALU, MOD, K),\
+   INSN_3(ALU, ADD,  K),   \
+   INSN_3(ALU, SUB,  K),   \
+   INSN_3(ALU, AND,  K),   \
+   INSN_3(ALU, OR,   K),   \
+   INSN_3(ALU, LSH,  K),   \
+   INSN_3(ALU, RSH,  K),   \
+   INSN_3(ALU, XOR,  K),   \
+   INSN_3(ALU, MUL,  K),   \
+   INSN_3(ALU, MOV,  K),   \
+   INSN_3(ALU, ARSH, K),   \
+   INSN_3(ALU, DIV,  K),   \
+   INSN_3(ALU, MOD,  K),   \
/* 64 bit ALU operations. */\
/*   Register based. */ \
INSN_3(ALU64, ADD,  X), \
@@ -1127,6 +1129,12 @@ static u64 ___bpf_prog_run(u64 *regs, const struct 
bpf_insn *insn, u64 *stack)
DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
insn++;
CONT;
+   ALU_ARSH_X:
+   DST = (u64) (u32) ((*(s32 *) ) >> SRC);
+   CONT;
+   ALU_ARSH_K:
+   DST = (u64) (u32) ((*(s32 *) ) >> IMM);
+   CONT;
ALU64_ARSH_X:
(*(s64 *) ) >>= SRC;
CONT;
-- 
2.7.4



[RFC bpf-next 7/7] selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
"arsh32 on imm" and "arsh32 on reg" now are accepted. Also added two new
testcases to make sure arsh32 won't be treated as arsh64 during
interpretation or JIT code-gen for which case the high bits will be moved
into low halve that the testcases could catch them.
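
For reference, the expected return values can be reproduced with a quick
user-space check (my own illustration, not part of the selftest):

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
            /* "arsh32 on imm 2": the low 32 bits of 0x1122334485667788 are
             * 0x85667788, negative as s32; arithmetic >> 7 gives -16069393.
             */
            assert(((int32_t)0x85667788 >> 7) == -16069393);

            /* "arsh32 on reg 2": 0x55667788 is positive; >> 15 gives 43724. */
            assert(((int32_t)0x55667788 >> 15) == 43724);

            return 0;
    }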

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 tools/testing/selftests/bpf/test_verifier.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_verifier.c 
b/tools/testing/selftests/bpf/test_verifier.c
index 18d0b7f..e7f1bc7 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -721,8 +721,18 @@ static struct bpf_test tests[] = {
BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode c4",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on imm 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x1122334485667788),
+   BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 7),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = -16069393,
},
{
"arsh32 on reg",
@@ -732,8 +742,19 @@ static struct bpf_test tests[] = {
BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
-   .result = REJECT,
-   .errstr = "unknown opcode cc",
+   .result = ACCEPT,
+   .retval = 0,
+   },
+   {
+   "arsh32 on reg 2",
+   .insns = {
+   BPF_LD_IMM64(BPF_REG_0, 0x55667788),
+   BPF_MOV64_IMM(BPF_REG_1, 15),
+   BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+   BPF_EXIT_INSN(),
+   },
+   .result = ACCEPT,
+   .retval = 43724,
},
{
"arsh64 on imm",
-- 
2.7.4



[RFC bpf-next 4/7] s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*

2018-12-04 Thread Jiong Wang
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Signed-off-by: Jiong Wang 
---
 arch/s390/net/bpf_jit_comp.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d7052cb..3ff758e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -821,10 +821,22 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, 
struct bpf_prog *fp, int i
/*
 * BPF_ARSH
 */
+   case BPF_ALU | BPF_ARSH | BPF_X: /* ((s32) dst) >>= src */
+   /* sra %dst,%dst,0(%src) */
+   EMIT4_DISP(0x8a00, dst_reg, src_reg, 0);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_X: /* ((s64) dst) >>= src */
/* srag %dst,%dst,0(%src) */
EMIT6_DISP_LH(0xeb00, 0x000a, dst_reg, dst_reg, src_reg, 0);
break;
+   case BPF_ALU | BPF_ARSH | BPF_K: /* ((s32) dst >> imm */
+   if (imm == 0)
+   break;
+   /* sra %dst,imm(%r0) */
+   EMIT4_DISP(0x8a00, dst_reg, REG_0, imm);
+   EMIT_ZERO(dst_reg);
+   break;
case BPF_ALU64 | BPF_ARSH | BPF_K: /* ((s64) dst) >>= imm */
if (imm == 0)
break;
-- 
2.7.4



[RFC bpf-next 0/7] bpf: support BPF_ALU | BPF_ARSH

2018-12-04 Thread Jiong Wang
BPF_ALU | BPF_ARSH | BPF_* were rejected by commit: 7891a87efc71
("bpf: arsh is not supported in 32 bit alu thus reject it"). As explained
in the commit message, this is because there was no complete support for them
in the interpreter and the various JIT compilation back-ends.

This patch set is a follow-up which completes the missing bits. This also
paves the way for running bpf programs compiled with ALU32 instructions
enabled by specifying -mattr=+alu32 to LLVM, in which case there are likely
to be more BPF_ALU | BPF_ARSH insns that would trigger the rejection code.

test_verifier.c is updated accordingly.

I have tested this patch set on x86-64 and NFP. I need help with review and
testing of the arch changes (mips/ppc/s390).

Note, to avoid a merge conflict, the mips change needs to be applied on top of
commit: 20b880a05f06 ("mips: bpf: fix encoding bug for mm_srlv32_op"), which
is on the mips-fixes branch at the moment.

Thanks.

Jiong Wang (7):
  bpf: interpreter support BPF_ALU | BPF_ARSH
  mips: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_X
  ppc: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  nfp: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
  bpf: verifier remove the rejection on BPF_ALU | BPF_ARSH
  selftests: bpf: update testcases for BPF_ALU | BPF_ARSH

 arch/mips/include/asm/uasm.h |  1 +
 arch/mips/include/uapi/asm/inst.h|  1 +
 arch/mips/mm/uasm-micromips.c|  1 +
 arch/mips/mm/uasm-mips.c |  1 +
 arch/mips/mm/uasm.c  |  9 ++---
 arch/mips/net/ebpf_jit.c |  4 +++
 arch/powerpc/include/asm/ppc-opcode.h|  2 ++
 arch/powerpc/net/bpf_jit.h   |  4 +++
 arch/powerpc/net/bpf_jit_comp64.c|  6 
 arch/s390/net/bpf_jit_comp.c | 12 +++
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 45 
 kernel/bpf/core.c| 52 
 kernel/bpf/verifier.c|  5 ---
 tools/testing/selftests/bpf/test_verifier.c  | 29 +---
 14 files changed, 137 insertions(+), 35 deletions(-)

-- 
2.7.4



[PATCH v2 bpf] mips: bpf: fix encoding bug for mm_srlv32_op

2018-12-03 Thread Jiong Wang
For micro-mips, srlv inside POOL32A encoding space should use 0x50
sub-opcode, NOT 0x90.

Some early versions of the ISA doc describe the encoding as 0x90 for both srlv
and srav, which looks to me like a typo. I checked the Binutils libopcodes
implementation, which uses 0x50 for srlv and 0x90 for srav.

v1->v2:
  - Keep mm_srlv32_op sorted by value.

Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
Cc: Markos Chandras 
Cc: Paul Burton 
Cc: linux-m...@vger.kernel.org
Acked-by: Jakub Kicinski 
Acked-by: Song Liu 
Signed-off-by: Jiong Wang 
---
 arch/mips/include/uapi/asm/inst.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index c05dcf5..273ef58 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -369,8 +369,8 @@ enum mm_32a_minor_op {
mm_ext_op = 0x02c,
mm_pool32axf_op = 0x03c,
mm_srl32_op = 0x040,
+   mm_srlv32_op = 0x050,
mm_sra_op = 0x080,
-   mm_srlv32_op = 0x090,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,
-- 
2.7.4



Re: [PATCH bpf] mips: bpf: fix encoding bug for mm_srlv32_op

2018-12-03 Thread Jiong Wang

On 03/12/2018 22:04, Paul Burton wrote:

Hi Jiong,

On Sat, Dec 01, 2018 at 04:10:01AM -0500, Jiong Wang wrote:

For micro-mips, srlv inside POOL32A encoding space should use 0x50
sub-opcode, NOT 0x90.

Some early version ISA doc describes the encoding as 0x90 for both srlv and
srav, this looks to me was a typo. I checked Binutils libopcode
implementation which is using 0x50 for srlv and 0x90 for srav.

Are you aware of documentation that gets this right?


No, I am not aware of any.


Looking at the
latest microMIPS spec I have available (6.05) it looks like this is
still documented incorrectly. I'll pass this along to the architecture
team.


Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
CC: Markos Chandras 
CC: Paul Burton 
Acked-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
  arch/mips/include/uapi/asm/inst.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index c05dcf5..80f35e7 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -370,7 +370,7 @@ enum mm_32a_minor_op {
mm_pool32axf_op = 0x03c,
mm_srl32_op = 0x040,
mm_sra_op = 0x080,
-   mm_srlv32_op = 0x090,
+   mm_srlv32_op = 0x050,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,

Could you also move it above mm_sra_op to keep them sorted by value?

When submitting v2 please also copy the linux-mips mailing list -
linux-m...@vger.kernel.org.


Thanks, will do.

Regards,
Jiong



[PATCH bpf] mips: bpf: fix encoding bug for mm_srlv32_op

2018-12-01 Thread Jiong Wang
For micro-mips, srlv inside POOL32A encoding space should use 0x50
sub-opcode, NOT 0x90.

Some early versions of the ISA doc describe the encoding as 0x90 for both srlv
and srav, which looks to me like a typo. I checked the Binutils libopcodes
implementation, which uses 0x50 for srlv and 0x90 for srav.

Fixes: f31318fdf324 ("MIPS: uasm: Add srlv uasm instruction")
CC: Markos Chandras 
CC: Paul Burton 
Acked-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 arch/mips/include/uapi/asm/inst.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/uapi/asm/inst.h 
b/arch/mips/include/uapi/asm/inst.h
index c05dcf5..80f35e7 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -370,7 +370,7 @@ enum mm_32a_minor_op {
mm_pool32axf_op = 0x03c,
mm_srl32_op = 0x040,
mm_sra_op = 0x080,
-   mm_srlv32_op = 0x090,
+   mm_srlv32_op = 0x050,
mm_rotr_op = 0x0c0,
mm_lwxs_op = 0x118,
mm_addu32_op = 0x150,
-- 
2.7.4



[PATCH bpf-next 0/2] bpf: offer maximum packet offset info

2018-11-08 Thread Jiong Wang
The maximum packet offset accessed by one BPF program is useful
information.

This is because sometimes there can be a packet split, and for some reasons
(for example performance) we may want to reject a BPF program if the maximum
packet size would trigger such a split. Normally, the MTU value is treated as
the maximum packet size, but a BPF program does not always access the whole
packet; it may only access the head portion of the data.

We could let the verifier calculate the maximum packet offset ever used and
record it inside the prog auxiliary information structure as a new field
"max_pkt_offset".

Jiong Wang (2):
  bpf: let verifier to calculate and record max_pkt_offset
  nfp: bpf: relax prog rejection through max_pkt_offset

 drivers/net/ethernet/netronome/nfp/bpf/offload.c |  9 +
 include/linux/bpf.h  |  1 +
 kernel/bpf/verifier.c| 12 
 3 files changed, 18 insertions(+), 4 deletions(-)

-- 
2.7.4



[PATCH bpf-next 2/2] nfp: bpf: relax prog rejection through max_pkt_offset

2018-11-08 Thread Jiong Wang
NFP refuses to offload programs whenever the MTU is set to a value larger
than the maximum packet bytes that fit in NFP Cluster Target Memory (CTM).
However, an eBPF program doesn't always need to access the whole packet data.

The verifier calculates the maximum direct packet access (DPA) offset and
keeps it in max_pkt_offset inside the prog auxiliary information. This patch
relaxes prog rejection based on max_pkt_offset.
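
As a purely illustrative example: with an fw_mtu of 1792 bytes and a netdev
MTU of 9000, a program whose max_pkt_offset is 96 is now accepted, because
min(96, 9000) = 96 <= 1792, whereas previously the MTU comparison alone would
have rejected the offload.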

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 drivers/net/ethernet/netronome/nfp/bpf/offload.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c 
b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index ba8ceed..07bdc1f 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -489,14 +489,15 @@ nfp_net_bpf_load(struct nfp_net *nn, struct bpf_prog 
*prog,
 struct netlink_ext_ack *extack)
 {
struct nfp_prog *nfp_prog = prog->aux->offload->dev_priv;
-   unsigned int max_mtu, max_stack, max_prog_len;
+   unsigned int fw_mtu, pkt_off, max_stack, max_prog_len;
dma_addr_t dma_addr;
void *img;
int err;
 
-   max_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
-   if (max_mtu < nn->dp.netdev->mtu) {
-   NL_SET_ERR_MSG_MOD(extack, "BPF offload not supported with MTU 
larger than HW packet split boundary");
+   fw_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
+   pkt_off = min(prog->aux->max_pkt_offset, nn->dp.netdev->mtu);
+   if (fw_mtu < pkt_off) {
+   NL_SET_ERR_MSG_MOD(extack, "BPF offload not supported with 
potential packet access beyond HW packet split boundary");
return -EOPNOTSUPP;
}
 
-- 
2.7.4



[PATCH bpf-next 1/2] bpf: let verifier to calculate and record max_pkt_offset

2018-11-08 Thread Jiong Wang
In check_packet_access, update max_pkt_offset after the offset has passed
__check_packet_access.

It should be safe to use u32 for max_pkt_offset as explained in code
comment.

Also, when there is a tail call, the max_pkt_offset of the called program is
unknown, so conservatively set max_pkt_offset to MAX_PACKET_OFF in that
case.

Reviewed-by: Jakub Kicinski 
Signed-off-by: Jiong Wang 
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 12 
 2 files changed, 13 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 33014ae..b6a296e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -293,6 +293,7 @@ struct bpf_prog_aux {
atomic_t refcnt;
u32 used_map_cnt;
u32 max_ctx_offset;
+   u32 max_pkt_offset;
u32 stack_depth;
u32 id;
u32 func_cnt;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 98fa0be..6a248d8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1452,6 +1452,17 @@ static int check_packet_access(struct bpf_verifier_env 
*env, u32 regno, int off,
verbose(env, "R%d offset is outside of the packet\n", regno);
return err;
}
+
+   /* __check_packet_access has made sure "off + size - 1" is within u16.
+* reg->umax_value can't be bigger than MAX_PACKET_OFF which is 0x,
+* otherwise find_good_pkt_pointers would have refused to set range info
+* that __check_packet_access would have rejected this pkt access.
+* Therefore, "off + reg->umax_value + size - 1" won't overflow u32.
+*/
+   env->prog->aux->max_pkt_offset =
+   max_t(u32, env->prog->aux->max_pkt_offset,
+ off + reg->umax_value + size - 1);
+
return err;
 }
 
@@ -6128,6 +6139,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 */
prog->cb_access = 1;
env->prog->aux->stack_depth = MAX_BPF_STACK;
+   env->prog->aux->max_pkt_offset = MAX_PACKET_OFF;
 
/* mark bpf_tail_call as different opcode to avoid
 * conditional branch in the interpeter for every normal
-- 
2.7.4



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-08 Thread Jiong Wang

On 03/10/2018 17:53, Jiong Wang wrote:

On 03/10/2018 16:59, Alexei Starovoitov wrote:

On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:



Now this hasn't happened. I am still debugging the root cause, but kind of
feel
"64-bit" attribute propagation is the issue, it seems to me it can't be
nicely
integrated into the existing register read/write propagation infrastructure.


may be share your patches that modify the liveness propagation?


OK, I will share it after some clean up.


I have done more experiments on the algo, and finally got something that
passes my local tests. The bpf selftests passed as well, except for several
test_verifier failures due to some corner cases I guess; I will have a look
later.

In these tests, x86-64 is using the 32-bit information. In the insn loop inside
sanitize_dead_code, once an insn is not marked as 64-bit, if it is ALU64 it
will then be changed to ALU32. When enabling tracking using printk, I could see
quite a few ALU64 instructions really being rewritten into ALU32, so the fact
that tests like test_l4lb run OK looks like a positive signal for correctness.

The algo separates the problem into two steps.

  - First, assume there is no code path prune and all code paths will be walked
    through. In this case, if we propagate 64-bit info backward along the
    use-def chains for all paths walked, an insn will be marked as 64-bit
    correctly. Finishing this step requires building the use-def chain, which is
    done in the following way:

     1. each insn could have two explicit uses, so add two chain fields in
        bpf_insn_aux_data.
     2. we need a finer enum to describe register use, so split SRC_OP into
        SRC_OP_0, SRC_OP64_0, SRC_OP_1, SRC_OP64_1 to indicate whether the
        source is the first/second source and whether it is a 64-bit source.
     3. create the chain at check_reg_arg, which covers exactly all register
        use sites. The function to create the chain is link_reg_to_def.
     4. when creating the chain, if a source is a 64-bit source, also propagate
        the information backward along the use-def chain straight away. This is
        done in mark_reg_read, which further calls the new function
        "mark_insn_64bit" that does the real job. "mark_insn_64bit" fetches the
        def insn for the 64-bit source, and further marks the def insns of its
        sources as 64-bit. This is repeated until the whole use-def chain is
        consumed.
     5. by the use-def chain described above, if there is no code path prune,
        an insn must have been marked as 64-bit when its result has a 64-bit
        use.
     6. a helper call causes implicit reg use and must be conservatively
        treated as a 64-bit use; a bpf-to-bpf function call has insns connected
        by use-def, so it doesn't need that conservative assumption.

  - Second, to handle code path prune, define new liveness enums
    REG_LIVE_READ64 and REG_LIVE_UNIQUE_WRITTEN. The latter will only be
    set if reg_arg_type is the new U_DST_OP or U_DST_OP_NO_MARK, and
    REG_LIVE_READ64 will be set if a 64-bit read is not screened off by
    REG_LIVE_UNIQUE_WRITTEN.

    The thought is that 64-bit use info will only be screened off if the dst
    register is unique among all register operands, meaning not the same as
    any source. For example, insn 18 below will screen off r4's 64-bit
    propagation.

  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

    So, U_DST_OP/U_DST_OP_NO_MARK have been introduced to differentiate from
    DST_OP/DST_OP_NO_MARK. Inside check_reg_arg, checks are done on dst_reg,
    and U_DST_* is passed as reg_arg_type when it is unique. U_DST_* then sets
    liveness to REG_LIVE_UNIQUE_WRITTEN. Inside mark_reg_read, if a 64-bit
    read is not screened off by REG_LIVE_UNIQUE_WRITTEN, then REG_LIVE_READ64
    will be set in the reg_state. REG_LIVE_READ64 further triggers propagating
    downstream 64-bit uses from the pruned paths into the current path inside
    propagate_liveness when a path prune happens.

    Compared with propagating REG_LIVE_READ, propagating REG_LIVE_READ64 needs
    more work, because one 64-bit read could indicate that more than one
    register is 64-bit. For example, insn 19 above shows r3 is a 64-bit
    source, so its define at insn 17 is 64-bit; then all sources of insn 17
    must be 64-bit, meaning both r3 and r7 are 64-bit. Therefore,
    REG_LIVE_READ64 needs to be propagated for both r3 and r7 upward along the
    register parentage chain. While walking the def-use chain, we record all
    such affected reg_state and propagate REG_LIVE_READ64 for all of them.
    This logic is also done inside mark_insn_64bit.

  - In short, the implementation treats the new 64-bit info (REG_LIVE_READ64)
    propagation the same as REG_LIVE_READ propagation.

REG_LIVE_READ triggers more path prune during state equal comparison, while
REG_LIVE_READ64 triggers 64-b

Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-03 Thread Jiong Wang

On 03/10/2018 16:59, Alexei Starovoitov wrote:

On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:



Now this hasn't happened. I am still debugging the root cause, but kind of
feel
"64-bit" attribute propagation is the issue, it seems to me it can't be
nicely
integrated into the existing register read/write propagation infrastructure.


may be share your patches that modify the liveness propagation?


OK, I will share it after some clean up.


For
example, for a slightly more complex sequence which is composed of three
states:

State A
  ...
  10: r6 = *(u32 *)(r10 - 4)
  11: r7 = *(u32 *)(r10 - 8)
  12: *(u64 *)(r10 - 16) = r6
  13: *(u64 *)(r10 - 24) = r7

State B
  14: r6 += 1
  15: r7 += r6
  16: *(u32 *)(r10 - 28) = r7

State C
  ...
  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

State A is parent of state B which is parent of state C.

Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
marked as "64-bit". There is no register source at 18, so "64-bit" attribute
propagation is stopped.

Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
"64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
"64-bit" now, and their definition should be marked as "64-bit".

Now if the definition of r3 or r7 comes from parent state, then the parent


... the definition of r3 _and_ r7 ...
both need to propagate up with your algo, right?


Yes, all sources of insn 17, both r3 and r7.


state
should receive a "REG_LIVE_READ64", this is necessary if later another path
reaches state C and triggers prune path, for which case that path should
know
there is "64-bit" use inside state C on some registers, and should use this
information to mark "64-bit" insn.

If the definition of r3 or r7 is still inside state C, we need to keep
walking
up the instruction sequences, and propagate "64-bit" attribute upward until
it
goes beyond the state C.

The above propagation logic is quite different from existing register
read/write propagation.
For the latter, a write just screen up all following read, and a
read would propagate directly to its parent is there is not previous write,
no instruction analysis is required.


correct.
with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation.

I think the patches will discuss the algo.
Also I think the initial state of 'everything is 32-bit safe'
and make marks to enforce 64-bit-ness is a dangerous algorithmic choice.


Indeed, and I am actually thinking the same thing ...


Why not to start at a safer state where everything is 64-bit
and work backward to find out which ones can be 32-bit?
That will be safer algo in case there are issues with bit like
you described in above.


... but I failed to find an algo that works with an initial state of
"everything is 64-bit". I haven't figured out a way to check that all uses of
one definition are 32-bit on all possible code paths, which looks to me like a
must for an insn to be marked as 32-bit safe.



Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-10-03 Thread Jiong Wang

On 28/09/2018 14:36, Edward Cree wrote:
> On 26/09/18 23:16, Jiong Wang wrote:
>> On 22/08/2018 20:00, Edward Cree wrote:
>>> In the future this idea may be extended to form use-def chains.
>>
>>   1. instruction level use->def chain
>>
>>  - new use->def chains for each instruction. one eBPF insn could 
have two

>>    uses at maximum.
> I was thinking of something a lot weaker/simpler, just making
> ld rX, rY
>  copy rY.parent into rX.parent and not read-mark rY (whereas actual
>  arithmetic, pointer deref etc. would still create read marks).

Thanks for the feedback Edward.

> But what you've described sounds interesting; perhaps it would also
>  help later with loop-variable handling?

Haven't considered how to use this for loop-variable handling, guess you 
mean
applying what I have described to your previous loop detection RFC? I 
will look

into your RFC later.

At the moment the design of the use->def chain is mainly to optimize 32-bit
code-gen. I was about to satisfied with a local implementation and to 
share it

to ML for further discussion. However, when manually check the optimization
result on testcase with medium size (~1000 eBPF insns) and proper complexity
(make sure path prunes etc are triggered inside verifier), I found the 
code-gen

doesn't meet my expectation.

For example, for the following sequence, insn at 25 should operate on 
full-64

bit but I found it is marked as 32-bit safe.

  25:    r7 = 1
  26:    if r4 > r8 goto +1200 
  27:    r1 = *(u8 *)(r1 + 0)
  28:    r1 &= 15
  29:    r7 = 1
  ...

L:
  1227:  r0 = r7
  1228:  exit

As described at previous email, the algorithm assume all insns are 
32-bit safe

first, then start to insns back to "64-bit" if there is any 64-bit use found
for a insn.

Insn 25 is defining r7 which is used at the 1227 where its value 
propagated to

r0 and then r0 is implicitly used at insn 1228 as it is a exit from main
function to external.

For above example, as we don't know the external use of r0 at 1228 (exit 
from

main to external), so r0 is treated as 64-bit implicit use. The define is at
1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should
propagate to source register operand through register move and 
arithmetic, so
r7 at insn 1227 is a "64-bit" use and should make its definition 
instruction,

insn 25, marked as "64-bit". This is my thinking of how insn 25 should be
marked.

Now this hasn't happened. I am still debugging the root cause, but kind 
of feel
"64-bit" attribute propagation is the issue, it seems to me it can't be 
nicely
integrated into the existing register read/write propagation 
infrastructure. For
example, for a slightly more complex sequence which is composed of three 
states:


State A
  ...
  10: r6 = *(u32 *)(r10 - 4)
  11: r7 = *(u32 *)(r10 - 8)
  12: *(u64 *)(r10 - 16) = r6
  13: *(u64 *)(r10 - 24) = r7

State B
  14: r6 += 1
  15: r7 += r6
  16: *(u32 *)(r10 - 28) = r7

State C
  ...
  17: r3 += r7
  18: r4 = 1
  19: *(u64 *)(r10 - 32) = r3
  20: *(u64 *)(r10 - 40) = r4

State A is parent of state B which is parent of state C.

Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
marked as "64-bit". There is no register source at 18, so "64-bit" attribute
propagation is stopped.

Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
"64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
"64-bit" now, and their definition should be marked as "64-bit".

Now if the definition of r3 or r7 comes from parent state, then the 
parent state

should receive a "REG_LIVE_READ64", this is necessary if later another path
reaches state C and triggers prune path, for which case that path should 
know

there is "64-bit" use inside state C on some registers, and should use this
information to mark "64-bit" insn.

If the definition of r3 or r7 is still inside state C, we need to keep 
walking
up the instruction sequences, and propagate "64-bit" attribute upward 
until it

goes beyond the state C.

The above propagation logic is quite different from existing register 
read/write
propagation. For the latter, a write just screen up all following read, 
and a
read would propagate directly to its parent is there is not previous 
write, no

instruction analysis is required.

I am just describing what I have run into and trying to resolve, any 
thoughts

and suggestions are appreciated.

Regards,
Jiong


Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

2018-09-26 Thread Jiong Wang

On 22/08/2018 20:00, Edward Cree wrote:

The first patch is a simplification of register liveness tracking by using
 a separate parentage chain for each register and stack slot, thus avoiding
 the need for logic to handle callee-saved registers when applying read
 marks.  In the future this idea may be extended to form use-def chains.


Interesting.

This could potentially help generate efficient code for 32-bit architectures.
I have been thinking about some implementations along this line and would like
to have a discussion.

Program description
===
It would be good if we could know that an instruction is safe to operate on
the low 32-bit sub-register only. If this is true:
  - 32-bit arches could save some code-gen.
  - 64-bit arches could utilize 32-bit sub-register instructions.

Algorithm
===
Based on the current verifier code-path-walker, it looks to me we could get
32-bit safety information using the following algorithm:

  1. assume all instructions operate on 32-bit sub-register initially.
  2. link each insn to insns which define its use.
  3. for a register use, if it is a 64-bit write back to memory, or a
     comparison-then-jump based on a 64-bit value, or a right shift, then
     consider the use a 64-bit use; all of its define insns and their parent
     define insns should then be marked as needing the full 64-bit operation
     (see the small shift example after this list).
  4. currently, register read (REG_LIVE_READ) will be propagated to parent
 state when path prune happened, and REG_LIVE_WRITTEN would screen off such
 propagation. We need to propagate 64-bit mark in similar way, but
 REG_LIVE_WRITTEN shouldn't screen off backward 64-bit mark propagation if
 the written reg also used as source reg (this has raised REG_LIVE_READ
 propagation but it is not necessarily indicating 64-bit read) in the same
 instruction.
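
To illustrate why a right shift counts as a 64-bit use (hypothetical insns,
purely for illustration):

  1: r1 = r2             ; looks 32-bit safe when seen in isolation
  2: r1 >>= 32            ; the 64-bit right shift pulls r1's high bits into
                           ; the low result, so insn 1 must keep the full
                           ; 64-bit value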

Implementation
===
And it seems there could be an implementation based on the current liveness
tracking infrastructure without dramatic changes.

  1. instruction level use->def chain

 - new active_defs array for each call frame to record which insn the active
   define of one register comes from. callee copies active_defs from caller
   for argument registers only, and copies active_def of R0 to caller when
   exit.

   struct bpf_func_state {
 ...
 s16 active_defs[__MAX_BPF_REG];

 - new use->def chains for each instruction. one eBPF insn could have two
   uses at maximum. also one new boolean "full_ref_def" added to keep the
   final 32-bit safety information. it will be true if this instruction
   needs to operate on full 64-bit.

   bpf_insn_aux_data {
   ...
 struct {
   s16 def[2];
   bool full_ref_def;
 };

  2. Inside mark_reg_read, split SRC_OP to SRC0_OP/SRC1_OP/SRC_OP_64/SRC1_OP_64
 to indicate one register read if from the first or second use slot of this
 instruction, and to indicate whether the read is on full 64-bit which will
 only be true if the read is for 64-bit write back to memory, or 64-bit
 comparison-then-jump, or bit right shift means 64-bit.

 Build use->def chain for any read, but do 64-bit backward propagation
 for 64-bit read only. The propagation is to set full_ref_def to true for
 def and parent defs of the 64-bit read.

   3. Need new bpf_reg_liveness enum, REG_LIVE_READ64 to indicate there is
  64-bit access to one register in the pruned path, and need
  REG_LIVE_WRITTEN64 to indicate a write that is REG_LIVE_WRITTEN but should
  not screen backward propagate 64-bit read info.

   #define REG_LIVE_NONE  0
   #define REG_LIVE_READ  BIT(0)
   #define REG_LIVE_READ64    BIT(1)
   #define REG_LIVE_WRITTEN   BIT(2)
   #define REG_LIVE_WRITTEN64 BIT(3)

I might have missed something in above thoughts, would appreciate reviews and
suggestions. I will also send out some concrete patches later.

Regards,
Jiong



Re: [PATCH bpf-next 7/7] nfp: bpf: migrate to advanced reciprocal divide in reciprocal_div.h

2018-07-05 Thread Jiong Wang

On 26/06/2018 21:59, Jakub Kicinski wrote:

On Sun, 24 Jun 2018 20:54:21 -0700, Jakub Kicinski wrote:

+* NOTE: because we are using "reciprocal_value_adv" which doesn't
+* support dividend with MSB set, so we need to JIT separate NFP
+* sequence to handle such case. It could be a simple sequence if there
+* is conditional move, however there isn't for NFP. So, we don't bother
+* generating compare-if-set-branch sequence by rejecting the program
+* straight away when the u32 dividend has MSB set. Divide by such a
+* large constant would be rare in practice. Also, the programmer could
+* simply rewrite it as "result = divisor >= the_const".

Thinking about this again, can we just use carry bit?


Good catch, yes we can.


The code may end
up shorter than the explanation why we don't support that case :P

immed[c, 0]
alu[--, a, -, b]
alu[c, c, +carry, 0]


eBPF input will be "a = a / b"; given "immed" doesn't affect the carry bit,
I'd reorder the sequence so we only need one tmp register for holding
"b", which is a constant.

  alu[--, a, -, b]
  immed[b, 0]
  alu[a, b, +carry, 0]
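
For reference, a plain C model of what this three-instruction sequence
computes (my own illustration; it assumes the carry-out of the subtraction is
set exactly when there is no borrow, i.e. when a >= b):

    static unsigned int div_by_msb_set_const(unsigned int a, unsigned int b)
    {
            /* only valid when b has its MSB set, so a / b is either 0 or 1 */
            return a >= b ? 1 : 0;
    }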
 
Thanks.

Regards,
Jiong



Should be equivalent to:

c = a >= b

(Thanks to Edwin for double-checking the carry semantics.)




Re: [PATCH bpf-next 2/7] lib: reciprocal_div: implement the improved algorithm on the paper mentioned

2018-06-28 Thread Jiong Wang
On Tue, Jun 26, 2018 at 7:21 AM, Song Liu  wrote:
> On Sun, Jun 24, 2018 at 8:54 PM, Jakub Kicinski
>  wrote:
>> From: Jiong Wang 



>> +
>> +struct reciprocal_value_adv reciprocal_value_adv(u32 d, u8 prec)
>> +{
>> +   struct reciprocal_value_adv R;
>> +   u32 l, post_shift;
>> +   u64 mhigh, mlow;
>> +
>> +   l = fls(d - 1);
>> +   post_shift = l;
>> +   /* NOTE: mlow/mhigh could overflow u64 when l == 32 which means d has
>> +* MSB set. This case needs to be handled before calling
>> +* "reciprocal_value_adv", please see the comment at
>> +* include/linux/reciprocal_div.h.
>> +*/
>
> Shall we handle l == 32 case better? I guess the concern here is extra
> handling may
> slow down the fast path?

The implementation of "reciprocal_value_adv" hasn't considered l  ==
32 which will make the code more complex.

As described at the pseudo code about how to call
"reciprocal_value_adv" in include/linux/reciprocal_div.h, l == 32
means the MSB of dividend is set, so the result of unsigned
divisor/dividend could only be 0 or 1, so the divide result could be
easily get by a comparison then conditional move 0 or 1 to the result.
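
A minimal sketch of that special case on the caller side (illustrative only,
using kernel types; fls(divisor - 1) == 32 means the divisor is larger than
0x80000000, so the quotient of any u32 dividend can only be 0 or 1):

    static inline u32 div_large_divisor(u32 dividend, u32 divisor)
    {
            /* caller guarantees fls(divisor - 1) == 32 */
            return dividend >= divisor ? 1 : 0;
    }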

> If that's the case, we should at least add a WARNING on the slow path.

OK, I will add a pr_warn inside "reciprocal_value_adv" when l == 32 is
triggered.

Thanks,
Jiong


Re: [RFC bpf-next 04/10] bpf: cfg: detect loop use domination information

2018-05-11 Thread Jiong Wang

On 10/05/2018 19:17, John Fastabend wrote:

On 05/07/2018 03:22 AM, Jiong Wang wrote:

If one bb is dominating its predecessor, then there is loop.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
  kernel/bpf/cfg.c  | 22 ++
  kernel/bpf/cfg.h  |  1 +
  kernel/bpf/verifier.c |  8 
  3 files changed, 31 insertions(+)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index b50937a..90692e4 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -568,6 +568,28 @@ int subprog_build_dom_info(struct bpf_subprog_info 
*subprog)
return ret;
  }
  
+bool subprog_has_loop(struct bpf_subprog_info *subprog)

+{
+   int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);
+   struct list_head *bb_list = >bbs;
+   struct bb_node *bb, *entry_bb;
+   struct edge_node *e;
+
+   entry_bb = entry_bb(bb_list);
+   bb = bb_next(entry_bb);
+   list_for_each_entry_from(bb, _bb(bb_list)->l, l)
+   list_for_each_entry(e, >e_prevs, l) {
+   struct bb_node *latch = e->src;
+
+   if (latch != entry_bb &&
+   test_bit(bb->idx,
+subprog->dtree + latch->idx * lane_len))
+   return true;
+   }
+
+   return false;
+}
+

Because we are using this to guard against loops we need to detect
all loops not just reducible loops. And because (assuming my understanding
is correct) Tarjan's algorithm will only detect all loops when the
graph is reducible we need additional tests.


Hi John,

Yes, the current DOM based loop detection can't detect irreducible loops.

And I feel both the first and second approaches you listed below are good;
it might be worth implementing both and measuring which one has less overhead.

I could give approach 2 a try.  The alg given by Eric Stoltz might be a
good choice (https://compilers.iecc.com/comparch/article/94-01-053). We
have dom info now, so we could do another DFS and check whether the head of a
back-edge fails to dominate the tail, in which case there are multiple entries
to the loop.

I guess the first approach has less overhead, as there is no existing DFS pass
to reuse after the dom build, so we would need a new pass for approach 2.

Regards,
Jiong



There are a couple options to fix this with varying levels of complexity.
Because I'm using this to build loop info structures to find induction
variables and show termination. After the loop structures are built we
could search for any back-edges not in valid loops. This would be similar
to the existing back-edge detection code but with an extra check to
allow edges that have been validated. I would need to check that this
doesn't have any escapes before actually proposing it though.

The other method would be to properly test for reducibility using one of
the algorithms for this. I think the most intuitive is to remove back-edges
and test the graph is acyclic. This would be run before the dom tree is
built. This is IMO what we should do, it seems the most "correct" way to
do this.

The most complex would be to handle irreducible programs using some of the
more complex methods. I really don't think this is necessary but in theory
at least we could use something like Havlak-Tarjan algorithm and allow
some programs with irreducible loops. This is likely overkill especially
in a first iteration.

Here is a sample that fails without this series, using original back-edge
detection, but is allowed with this patch,

SEC("classifier_tc_mark")
int _tc_mark(struct __sk_buff *ctx)
{
 void *data  = (void *)(unsigned long)ctx->data;
 void *data_end  = (void *)(unsigned long)ctx->data_end;
 void *data_meta = (void *)(unsigned long)ctx->data_meta;
 struct meta_info *meta = data_meta;
 volatile int mark = ctx->mark;

 mark += 1;

 if (meta + 1 > data) {
B:
 mark += 2;

 if (mark < ctx->mark * 3)
 goto C;
 } else if (meta < data) {
C:
 mark += 1;
 if (mark < 1000)
 goto  B;
 }

 return TC_ACT_OK;
}

A more concise example could be made but I just hacked on one of the
sample programs. This generates the CFG as follows (I have a patch
on top of your stack to print the CFG and DOM tables)

CFG:  65535[-1,-1] -> 0[0,9]  0[0,9] -> 3[20,20]  0[0,9] -> 1[10,18]  1[10,18] -> 4[21,28]  1[10,18] 
-> 2[19,19]  2[19,19] -> 5[29,30]  3[20,20] -> 5[29,30]  3[20,20] -> 4[21,28]  4[21,28] -> 
1[10,18]  4[21,28] -> 5[29,30]  5[29,30] -> 65534[31,65534]
DOM:
  1  0  0  0  0  0
  1  1  0  0  0  0
  1  1  1  0  0  0
  1  0  0  1  0  0
  1  0  0  0  1  0
  1  0  0  0  0  1


Here we have the loop 1[10,18]->4[21,28] and the back-edge 4[21,28]->1[10,18].
The notation is #idx[head_

Re: [RFC bpf-next 00/10] initial control flow support for eBPF verifier

2018-05-07 Thread Jiong Wang

On 07/05/2018 11:22, Jiong Wang wrote:

execution time
===
test_l4lb_noinline:
   existing check_subprog/check_cfg: ~55000 ns
   new infrastructure: ~135000 ns

test_xdp_noinline:
   existing check_subprog/check_cfg: ~52000 ns
   new infrastructure: ~12 ns


Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

Regards,
Jiong




[RFC bpf-next 04/10] bpf: cfg: detect loop use domination information

2018-05-07 Thread Jiong Wang
If one bb is dominating its predecessor, then there is a loop.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/cfg.c  | 22 ++
 kernel/bpf/cfg.h  |  1 +
 kernel/bpf/verifier.c |  8 
 3 files changed, 31 insertions(+)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index b50937a..90692e4 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -568,6 +568,28 @@ int subprog_build_dom_info(struct bpf_subprog_info 
*subprog)
return ret;
 }
 
+bool subprog_has_loop(struct bpf_subprog_info *subprog)
+{
+   int lane_len = BITS_TO_LONGS(subprog->bb_num - 2);
+   struct list_head *bb_list = >bbs;
+   struct bb_node *bb, *entry_bb;
+   struct edge_node *e;
+
+   entry_bb = entry_bb(bb_list);
+   bb = bb_next(entry_bb);
+   list_for_each_entry_from(bb, _bb(bb_list)->l, l)
+   list_for_each_entry(e, >e_prevs, l) {
+   struct bb_node *latch = e->src;
+
+   if (latch != entry_bb &&
+   test_bit(bb->idx,
+subprog->dtree + latch->idx * lane_len))
+   return true;
+   }
+
+   return false;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
struct list_head *succs = >e_succs;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index cbb44f2..c02c4cf 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -12,6 +12,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct 
list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
 int subprog_build_dom_info(struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
+bool subprog_has_loop(struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
 void subprog_free(struct bpf_subprog_info *subprog, int end_idx);
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6543470..a93aa43 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -879,6 +879,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
if (ret < 0)
goto free_nodes;
subprog[cur_subprog].bb_num = ret;
+   ret = subprog_build_dom_info([cur_subprog]);
+   if (ret < 0)
+   goto free_nodes;
+   if (subprog_has_loop([cur_subprog])) {
+   verbose(env, "cfg - loop detected");
+   ret = -EINVAL;
+   goto free_nodes;
+   }
subprog_start = subprog_end;
cur_subprog++;
if (cur_subprog < env->subprog_cnt) {
-- 
2.7.4



[RFC bpf-next 09/10] bpf: cfg: reduce k*alloc/free call by using memory pool for allocating nodes

2018-05-07 Thread Jiong Wang
While building the control flow graph, we need to build basic block nodes,
edge nodes etc. These nodes need to be allocated dynamically as we don't have
a pre-scan pass to know accurately how many nodes we need. It is better to
manage their allocation using a memory pool, which could reduce the number of
k*alloc/kfree calls and also simplify freeing the nodes.

This patch:
  - implements a simple memory pool based node allocator.
    Nodes needing dynamic allocation are migrated to this allocator.
  - estimates node numbers inside the subprog detection pass, asking the
    allocator to do an initial allocation of the estimated size (aligned to
    2K). The pool will grow later if there is not enough space.
  - There is no support for returning memory back to the pool.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/cfg.c  | 164 +-
 kernel/bpf/cfg.h  |  21 +--
 kernel/bpf/verifier.c |  39 +---
 3 files changed, 170 insertions(+), 54 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 174e3f0..1aeac49 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -82,7 +82,113 @@ struct dom_info {
u16 dfsnum;
 };
 
-int subprog_append_bb(struct list_head *bb_list, int head)
+struct node_pool {
+   struct list_head l;
+   void *data;
+   u32 size;
+   u32 used;
+};
+
+#define first_node_pool(pool_list) \
+   list_first_entry(pool_list, struct node_pool, l)
+
+#define MEM_CHUNK_SIZE (1024)
+
+static int cfg_node_allocator_grow(struct cfg_node_allocator *allocator,
+  int min_grow_size)
+{
+   int s = min_grow_size;
+   struct node_pool *pool;
+   void *data;
+
+   s += sizeof(struct node_pool);
+   s = ALIGN(s, MEM_CHUNK_SIZE);
+   data = kzalloc(s, GFP_KERNEL);
+   if (!data)
+   return -ENOMEM;
+
+   pool = (struct node_pool *)data;
+   pool->data = pool + 1;
+   pool->size = s - sizeof(struct node_pool);
+   pool->used = 0;
+   allocator->cur_free_pool = pool;
+   list_add_tail(>l, >pools);
+
+   return 0;
+}
+
+static void *cfg_node_alloc(struct cfg_node_allocator *allocator, int size)
+{
+   struct node_pool *pool = allocator->cur_free_pool;
+   void *p;
+
+   if (pool->used + size > pool->size) {
+   int ret = cfg_node_allocator_grow(allocator, size);
+
+   if (ret < 0)
+   return NULL;
+
+   pool = allocator->cur_free_pool;
+   }
+
+   p = pool->data + pool->used;
+   pool->used += size;
+
+   return p;
+}
+
+static struct bb_node *get_single_bb_nodes(struct cfg_node_allocator 
*allocator)
+{
+   int size = sizeof(struct bb_node);
+
+   return (struct bb_node *)cfg_node_alloc(allocator, size);
+}
+
+static struct edge_node *get_edge_nodes(struct cfg_node_allocator *allocator,
+   int num)
+{
+   int size = num * sizeof(struct edge_node);
+
+   return (struct edge_node *)cfg_node_alloc(allocator, size);
+}
+
+static struct cedge_node *
+get_single_cedge_node(struct cfg_node_allocator *allocator)
+{
+   int size = sizeof(struct cedge_node);
+
+   return (struct cedge_node *)cfg_node_alloc(allocator, size);
+}
+
+int cfg_node_allocator_init(struct cfg_node_allocator *allocator,
+   int bb_num_esti, int cedge_num_esti)
+{
+   int s = bb_num_esti * sizeof(struct bb_node), ret;
+
+   s += 2 * bb_num_esti * sizeof(struct edge_node);
+   s += cedge_num_esti * sizeof(struct cedge_node);
+   INIT_LIST_HEAD(>pools);
+   ret = cfg_node_allocator_grow(allocator, s);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
+void cfg_node_allocator_free(struct cfg_node_allocator *allocator)
+{
+   struct list_head *pools = >pools;
+   struct node_pool *pool, *tmp;
+
+   pool = first_node_pool(pools);
+   list_for_each_entry_safe_from(pool, tmp, pools, l) {
+   list_del(>l);
+   kfree(pool);
+   }
+}
+
+int subprog_append_bb(struct cfg_node_allocator *allocator,
+ struct list_head *bb_list, int head)
 {
struct bb_node *new_bb, *bb;
 
@@ -94,7 +200,7 @@ int subprog_append_bb(struct list_head *bb_list, int head)
}
 
bb = bb_prev(bb);
-   new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL);
+   new_bb = get_single_bb_nodes(allocator);
if (!new_bb)
return -ENOMEM;
 
@@ -106,9 +212,10 @@ int subprog_append_bb(struct list_head *bb_list, int head)
return 0;
 }
 
-int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
+int subprog_fini_bb(struct cfg_node_allocator *allocator,
+   struct list_head *bb_list, int subprog_end)
 {
-   struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+   struct 

[RFC bpf-next 08/10] bpf: cfg: remove push_insn and check_cfg

2018-05-07 Thread Jiong Wang
As we have detected loop and unreachable insns based on domination
information and call graph, there is no need of check_cfg.

This patch removes check_cfg and it's associated push_insn.

State prune heuristic marking is moved to check_subprogs.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/verifier.c | 228 --
 1 file changed, 16 insertions(+), 212 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 12f5399..54f2fe3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -735,6 +735,8 @@ enum reg_arg_type {
DST_OP_NO_MARK  /* same as above, check only, don't mark */
 };
 
+#define STATE_LIST_MARK ((struct bpf_verifier_state_list *)-1L)
+
 static int check_subprogs(struct bpf_verifier_env *env)
 {
int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
@@ -815,8 +817,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
target);
if (ret < 0)
return ret;
+
+   env->explored_states[target] = STATE_LIST_MARK;
+   env->explored_states[i] = STATE_LIST_MARK;
}
 
+   if (i + 1 < insn_cnt)
+   env->explored_states[i + 1] = STATE_LIST_MARK;
+
goto next;
}
 
@@ -836,6 +844,13 @@ static int check_subprogs(struct bpf_verifier_env *env)
if (ret < 0)
goto free_nodes;
}
+
+   if (BPF_OP(code) != BPF_JA) {
+   env->explored_states[i] = STATE_LIST_MARK;
+   env->explored_states[off] = STATE_LIST_MARK;
+   } else if (i + 1 < insn_cnt) {
+   env->explored_states[i + 1] = STATE_LIST_MARK;
+   }
 next:
if (i == subprog_end - 1) {
/* to avoid fall-through from one subprog into another
@@ -3951,217 +3966,6 @@ static int check_return_code(struct bpf_verifier_env 
*env)
return 0;
 }
 
-/* non-recursive DFS pseudo code
- * 1  procedure DFS-iterative(G,v):
- * 2  label v as discovered
- * 3  let S be a stack
- * 4  S.push(v)
- * 5  while S is not empty
- * 6t <- S.pop()
- * 7if t is what we're looking for:
- * 8return t
- * 9for all edges e in G.adjacentEdges(t) do
- * 10   if edge e is already labelled
- * 11   continue with the next edge
- * 12   w <- G.adjacentVertex(t,e)
- * 13   if vertex w is not discovered and not explored
- * 14   label e as tree-edge
- * 15   label w as discovered
- * 16   S.push(w)
- * 17   continue at 5
- * 18   else if vertex w is discovered
- * 19   label e as back-edge
- * 20   else
- * 21   // vertex w is explored
- * 22   label e as forward- or cross-edge
- * 23   label t as explored
- * 24   S.pop()
- *
- * convention:
- * 0x10 - discovered
- * 0x11 - discovered and fall-through edge labelled
- * 0x12 - discovered and fall-through and branch edges labelled
- * 0x20 - explored
- */
-
-enum {
-   DISCOVERED = 0x10,
-   EXPLORED = 0x20,
-   FALLTHROUGH = 1,
-   BRANCH = 2,
-};
-
-#define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L)
-
-static int *insn_stack;/* stack of insns to process */
-static int cur_stack;  /* current stack index */
-static int *insn_state;
-
-/* t, w, e - match pseudo-code above:
- * t - index of current instruction
- * w - next instruction
- * e - edge
- */
-static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
-{
-   if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
-   return 0;
-
-   if (e == BRANCH && insn_state[t] >= (DISCOVERED | BRANCH))
-   return 0;
-
-   if (w < 0 || w >= env->prog->len) {
-   verbose(env, "jump out of range from insn %d to %d\n", t, w);
-   return -EINVAL;
-   }
-
-   if (e == BRANCH)
-   /* mark branch target for state pruning */
-   env->explored_states[w] = STATE_LIST_MARK;
-
-   if (insn_state[w] == 0) {
-   /* tree-edge */
-   insn_state[t] = DISCOVERED | e;
-   insn_state[w] = DISCOVERED;
-   if (cur_stack >= env->prog->len)
-   return -E2BIG;
-   insn_stack[cur_stack++] = w;
-   return 1;
-   } else if ((insn_state[w] & 0xF0) == DISCOVERED) {
-   verbose(env, &q

[RFC bpf-next 03/10] bpf: cfg: build domination tree using Tarjan algorithm

2018-05-07 Thread Jiong Wang
  - When building domination information, we need some array data
    structures, and we need to index into them using the basic block index.
    Calculate the basic block index while adding edges.

  - Step-1 in the Tarjan algorithm assigns a dfs order number to each bb in
    the cfg. We need to iterate over all bbs in reverse dfs order.

  - Step-2 and Step-3 in the Tarjan algorithm calculate semi-dominators and
    immediate-dominators, implicitly and explicitly respectively.

  - Step-4, we add a new bitmap field inside subprog_info to keep the
dominator tree there.

  - Post-domination information is built but not tested.
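
As a consumer-side illustration, a dominance query on the resulting bitmap
could look like the sketch below. This assumes idoms_to_doms lays the
bitmap out as one bb_num-bit row per basic block; the exact layout is
whatever this patch defines, and the helper name here is made up:

        /* does basic block 'a' dominate basic block 'b'? (sketch) */
        static bool bb_dominates(struct bpf_subprog_info *subprog,
                                 struct bb_node *a, struct bb_node *b)
        {
                unsigned long *row;

                if (!subprog->dtree_avail)
                        return false;

                /* one row of bb_num bits per basic block */
                row = subprog->dtree +
                      BITS_TO_LONGS(subprog->bb_num) * b->idx;

                return test_bit(a->idx, row);
        }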

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |   3 +
 kernel/bpf/cfg.c | 386 ++-
 kernel/bpf/cfg.h |   3 +-
 kernel/bpf/verifier.c|   5 +-
 4 files changed, 393 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 404a5d8..9c0a808 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -177,6 +177,9 @@ struct bpf_subprog_info {
u32 start; /* insn idx of function entry point */
u16 stack_depth; /* max. stack depth used by this function */
struct list_head bbs; /* basic blocks list. */
+   u32 bb_num; /* total basic block num. */
+   unsigned long *dtree;
+   bool dtree_avail;
 };
 
 /* single container for all structs
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 208aac7..b50937a 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -17,11 +17,33 @@ struct edge_node {
struct bb_node *dst;
 };
 
+struct edge_iter {
+   struct list_head *list_head;
+   struct edge_node *edge;
+};
+
+#define first_edge(e_list) list_first_entry(e_list, struct edge_node, l)
+#define last_edge(e_list)  list_last_entry(e_list, struct edge_node, l)
+#define next_edge(e)   list_next_entry(e, l)
+
+static bool ei_end_p(struct edge_iter *ei)
+{
+   return &ei->edge->l == ei->list_head;
+}
+
+static void ei_next(struct edge_iter *ei)
+{
+   struct edge_node *e = ei->edge;
+
+   ei->edge = next_edge(e);
+}
+
 struct bb_node {
struct list_head l;
struct list_head e_prevs;
struct list_head e_succs;
u16 head;
+   u16 idx;
 };
 
 #define bb_prev(bb)list_prev_entry(bb, l)
@@ -31,6 +53,27 @@ struct bb_node {
 #define entry_bb(bb_list)  bb_first(bb_list)
 #define exit_bb(bb_list)   bb_last(bb_list)
 
+struct dom_info {
+   u16 *dfs_parent;
+   u16 *dfs_order;
+   struct bb_node **dfs_to_bb;
+   /* immediate-dominator */
+   u16 *idom;
+   /* semi-dominator */
+   u16 *sdom;
+   u16 *bucket;
+   u16 *next_bucket;
+   /* best node during path compression. */
+   u16 *best;
+   /* ancestor along tree edge. */
+   u16 *ancestor;
+   /* size and child are used for tree balancing. */
+   u16 *size;
+   u16 *child;
+
+   u16 dfsnum;
+};
+
 int subprog_append_bb(struct list_head *bb_list, int head)
 {
struct bb_node *new_bb, *bb;
@@ -107,6 +150,7 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct 
list_head *bb_list)
 {
struct bb_node *bb, *exit_bb;
struct edge_node *edge;
+   int bb_num;
 
bb = entry_bb(bb_list);
edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
@@ -118,9 +162,12 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct 
list_head *bb_list)
edge[1].src = edge->src;
edge[1].dst = edge->dst;
list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+   bb->idx = -1;
 
exit_bb = exit_bb(bb_list);
+   exit_bb->idx = -2;
bb = bb_next(bb);
+   bb_num = 0;
list_for_each_entry_from(bb, &exit_bb->l, l) {
bool has_fallthrough, only_has_fallthrough;
bool has_branch, only_has_branch;
@@ -129,6 +176,8 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct 
list_head *bb_list)
struct bpf_insn insn;
u8 code;
 
+   bb->idx = bb_num++;
+
edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
if (!edge)
return -ENOMEM;
@@ -184,9 +233,341 @@ int subprog_add_bb_edges(struct bpf_insn *insns, struct 
list_head *bb_list)
}
}
 
+   return bb_num + 2;
+}
+
+static int init_dom_info(struct bpf_subprog_info *subprog, struct dom_info *di)
+{
+   u16 *p, bb_num, i;
+
+   di->dfs_parent = NULL;
+   di->dfs_to_bb = NULL;
+
+   bb_num = subprog->bb_num;
+   p = kcalloc(10 * bb_num, sizeof(*p), GFP_KERNEL);
+   if (!p)
+   return -ENOMEM;
+
+   di->dfs_parent = p;
+   di->dfs_order = di->dfs_parent + bb_num;
+   di->idom = di->dfs_order + bb_num;
+   di->sdom = di->idom + bb_num;
+   di-&g

[RFC bpf-next 02/10] bpf: cfg: add edges between basic blocks to form CFG

2018-05-07 Thread Jiong Wang
This patch add edges between basic blocks. Both edges for predecessors and
successors are added.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/cfg.c  | 129 +-
 kernel/bpf/cfg.h  |   1 +
 kernel/bpf/verifier.c |   3 ++
 3 files changed, 131 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index b1af714..208aac7 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -11,8 +11,16 @@
 
 #include "cfg.h"
 
+struct edge_node {
+   struct list_head l;
+   struct bb_node *src;
+   struct bb_node *dst;
+};
+
 struct bb_node {
struct list_head l;
+   struct list_head e_prevs;
+   struct list_head e_succs;
u16 head;
 };
 
@@ -39,6 +47,8 @@ int subprog_append_bb(struct list_head *bb_list, int head)
if (!new_bb)
return -ENOMEM;
 
+   INIT_LIST_HEAD(&new_bb->e_prevs);
+   INIT_LIST_HEAD(&new_bb->e_succs);
new_bb->head = head;
list_add(&new_bb->l, &bb->l);
 
@@ -53,6 +63,8 @@ int subprog_fini_bb(struct list_head *bb_list, int 
subprog_end)
return -ENOMEM;
/* entry bb. */
bb->head = -1;
+   INIT_LIST_HEAD(&bb->e_prevs);
+   INIT_LIST_HEAD(&bb->e_succs);
list_add(&bb->l, bb_list);
 
bb = kzalloc(sizeof(*bb), GFP_KERNEL);
@@ -60,6 +72,8 @@ int subprog_fini_bb(struct list_head *bb_list, int 
subprog_end)
return -ENOMEM;
/* exit bb. */
bb->head = subprog_end;
+   INIT_LIST_HEAD(&bb->e_prevs);
+   INIT_LIST_HEAD(&bb->e_succs);
list_add_tail(&bb->l, bb_list);
 
return 0;
@@ -77,15 +91,126 @@ int subprog_init_bb(struct list_head *bb_list, int 
subprog_start)
return 0;
 }
 
+static struct bb_node *search_bb_with_head(struct list_head *bb_list, int head)
+{
+   struct bb_node *bb;
+
+   list_for_each_entry(bb, bb_list, l) {
+   if (bb->head == head)
+   return bb;
+   }
+
+   return NULL;
+}
+
+int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list)
+{
+   struct bb_node *bb, *exit_bb;
+   struct edge_node *edge;
+
+   bb = entry_bb(bb_list);
+   edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+   if (!edge)
+   return -ENOMEM;
+   edge->src = bb;
+   edge->dst = bb_next(bb);
+   list_add_tail(&edge->l, &bb->e_succs);
+   edge[1].src = edge->src;
+   edge[1].dst = edge->dst;
+   list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+
+   exit_bb = exit_bb(bb_list);
+   bb = bb_next(bb);
+   list_for_each_entry_from(bb, &exit_bb->l, l) {
+   bool has_fallthrough, only_has_fallthrough;
+   bool has_branch, only_has_branch;
+   struct bb_node *next_bb = bb_next(bb);
+   int tail = next_bb->head - 1;
+   struct bpf_insn insn;
+   u8 code;
+
+   edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+   if (!edge)
+   return -ENOMEM;
+   edge->src = bb;
+   edge[1].src = bb;
+
+   insn = insns[tail];
+   code = insn.code;
+   only_has_fallthrough = BPF_CLASS(code) != BPF_JMP ||
+  BPF_OP(code) == BPF_EXIT;
+   has_fallthrough = only_has_fallthrough ||
+ (BPF_CLASS(code) == BPF_JMP &&
+  BPF_OP(code) != BPF_CALL &&
+  BPF_OP(code) != BPF_JA);
+   only_has_branch = BPF_CLASS(code) == BPF_JMP &&
+ BPF_OP(code) == BPF_JA;
+   has_branch = only_has_branch ||
+(BPF_CLASS(code) == BPF_JMP &&
+ BPF_OP(code) != BPF_EXIT &&
+ BPF_OP(code) != BPF_CALL);
+
+   if (has_fallthrough) {
+   if (BPF_CLASS(code) == BPF_JMP &&
+   BPF_OP(code) == BPF_EXIT)
+   next_bb = exit_bb;
+   edge->dst = next_bb;
+   edge[1].dst = next_bb;
+   list_add_tail(&edge->l, &bb->e_succs);
+   list_add_tail(&edge[1].l, &edge[1].dst->e_prevs);
+   edge = NULL;
+   }
+
+   if (has_branch) {
+   struct bb_node *tgt;
+
+   if (!edge) {
+   edge = kcalloc(2, sizeof(*edge), GFP_KERNEL);
+   if (!edge)
+   return -ENOMEM;
+   edge->src = bb;
+   edge[1].src = bb;
+   }
+
+   tgt = search_bb_with_head(bb

[RFC bpf-next 10/10] bpf: cfg: reduce memory usage by using singly list + compat pointer

2018-05-07 Thread Jiong Wang
Because there are going to be quite a few nodes (bb, edge etc) for a
reasonably sized program, the size of cfg nodes matters.

The cfg traversals used at the moment are mostly via edges, which contain
pointers to both src and dst basic blocks. The traversal is not via links
between basic block and edge nodes themselves, so linking them with a doubly
linked list_head is unnecessary, and it is too heavy on 64-bit hosts where
each pointer takes 8 bytes.

This patch reduces memory usage in two ways:
  1. use a singly linked list to link basic block and edge nodes.
  2. introduce a compact pointer representation for pointers generated from
     the memory pool. If a pointer is from the pool, we could represent it
     using pool_base + pool_offset,

   struct cfg_node_link {
 u16 offset;
 s8 idx;
   };

     The compact pointer "cfg_node_link" will only take 3 bytes.
     Delinking (dereferencing) the pointer then becomes:

 static void *cfg_node_delink(struct cfg_node_allocator *allocator,
struct cfg_node_link *link)
 {
   s8 idx = link->idx;

   if (idx == -1)
   return NULL;

   return allocator->base[idx] + link->offset;
 }

For example, basic blocks nodes changed from:

struct bb_node {
   struct list_head l;
   struct list_head e_prevs;
   struct list_head e_succs;
   s16 head;
   u16 idx;
};

into:

struct bb_node {
struct cfg_node_link link;
struct cfg_node_link e_prevs;
struct cfg_node_link e_succs;
s16 head;
u16 idx;
};

We reduced node size from 52 to 16. We could further reduce node size to 14
if we reshuffled the fields of link/e_prevs/e_succs to avoid alignment
padding, but that would come at a cost in code readability.

From benchmarks like test_xdp_noinline, this patch reduces peak memory usage
of the new cfg infrastructure by more than 50%.
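
The encoding side is the mirror image of cfg_node_delink above. A minimal
sketch, under the assumption that the allocator records the index of the
pool it is currently carving nodes from (cur_base_idx below is a made-up
field name) and that links are filled in right at allocation time:

 static void cfg_node_link_set(struct cfg_node_allocator *allocator,
                               struct cfg_node_link *link, void *node)
 {
        if (!node) {
                /* idx == -1 encodes the NULL pointer */
                link->idx = -1;
                return;
        }

        link->idx = allocator->cur_base_idx;
        link->offset = node - allocator->base[link->idx];
 }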

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |   7 +-
 kernel/bpf/cfg.c | 503 ---
 kernel/bpf/cfg.h |  23 +-
 kernel/bpf/verifier.c|  33 ++-
 4 files changed, 312 insertions(+), 254 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 96337ba..c291461 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -176,8 +176,11 @@ static inline bool bpf_verifier_log_needed(const struct 
bpf_verifier_log *log)
 struct bpf_subprog_info {
u32 start; /* insn idx of function entry point */
u16 stack_depth; /* max. stack depth used by this function */
-   struct list_head callees; /* callees list. */
-   struct list_head bbs; /* basic blocks list. */
+   void *callees; /* callees list. */
+   struct {
+   void *entry_bb; /* basic blocks list head. */
+   void *exit_bb; /* basic blocks list tail. */
+   } bbs;
u32 bb_num; /* total basic block num. */
unsigned long *dtree;
bool dtree_avail;
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 1aeac49..bda63ab 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -13,54 +13,55 @@
 
 #include "cfg.h"
 
+/* The size of cfg nodes matters, therefore we try to avoid using doubly linked
+ * list and we try to use base + offset to address node, by this we only need
+ * to keep offset.
+ */
+struct cfg_node_link {
+   u16 offset;
+   s8 idx;
+};
+
+static void *cfg_node_delink(struct cfg_node_allocator *allocator,
+struct cfg_node_link *link)
+{
+   s8 idx = link->idx;
+
+   if (idx == -1)
+   return NULL;
+
+   return allocator->base[idx] + link->offset;
+}
+
 struct cedge_node {
-   struct list_head l;
+   struct cfg_node_link link;
u8 caller_idx;
u8 callee_idx;
 };
 
 struct edge_node {
-   struct list_head l;
+   struct cfg_node_link link;
struct bb_node *src;
struct bb_node *dst;
 };
 
-struct edge_iter {
-   struct list_head *list_head;
-   struct edge_node *edge;
+struct bb_node {
+   struct cfg_node_link link;
+   struct cfg_node_link e_prevs;
+   struct cfg_node_link e_succs;
+   s16 head;
+   u16 idx;
 };
 
-#define first_edge(e_list) list_first_entry(e_list, struct edge_node, l)
-#define last_edge(e_list)  list_last_entry(e_list, struct edge_node, l)
-#define next_edge(e)   list_next_entry(e, l)
+#define entry_bb(bb_list)  (struct bb_node *)(*bb_list)
+#define exit_bb(bb_list)   (struct bb_node *)(*(bb_list + 1))
 
-static bool ei_end_p(struct edge_iter *ei)
+static struct bb_node *bb_next(struct cfg_node_allocator *allocator,
+  struct bb_node *bb)
 {
-   return &ei->edge->l == ei->list_head;
+   return (struct bb_node *)cfg_node_delink(allocator, &bb->link);
 }
 
-static void ei_next(stru

[RFC bpf-next 05/10] bpf: cfg: detect unreachable basic blocks

2018-05-07 Thread Jiong Wang
Detect unreachable basic blocks as a side-product of the dfs walk when
building domination information.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/cfg.c  | 31 ++-
 kernel/bpf/cfg.h  |  3 ++-
 kernel/bpf/verifier.c |  3 ++-
 3 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 90692e4..7ce1472 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -407,15 +407,17 @@ static void calc_idoms(struct bpf_subprog_info *subprog, 
struct dom_info *di,
di->idom[idx] = di->idom[di->idom[idx]];
 }
 
-static int calc_dfs_tree(struct bpf_subprog_info *subprog, struct dom_info *di,
-bool reverse)
+static int
+calc_dfs_tree(struct bpf_verifier_env *env, struct bpf_subprog_info *subprog,
+ struct dom_info *di, bool reverse)
 {
-   u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx;
+   u16 bb_num = subprog->bb_num, sp = 0, idx, parent_idx, i;
struct list_head *bb_list = >bbs;
u16 entry_bb_fake_idx = bb_num - 2;
struct bb_node *entry_bb, *exit_bb;
struct edge_iter ei, *stack;
struct edge_node *e;
+   bool *visited;
 
di->dfs_order[entry_bb_fake_idx] = di->dfsnum;
 
@@ -423,6 +425,10 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, 
struct dom_info *di,
if (!stack)
return -ENOMEM;
 
+   visited = kcalloc(bb_num - 2, sizeof(bool), GFP_KERNEL);
+   if (!visited)
+   return -ENOMEM;
+
if (reverse) {
entry_bb = exit_bb(bb_list);
exit_bb = entry_bb(bb_list);
@@ -445,6 +451,8 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, 
struct dom_info *di,
 
if (reverse) {
bb_dst = e->src;
+   if (bb_dst != exit_bb)
+   visited[bb_dst->idx] = true;
if (bb_dst == exit_bb ||
di->dfs_order[bb_dst->idx]) {
ei_next();
@@ -453,6 +461,8 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, 
struct dom_info *di,
bb_src = e->dst;
} else {
bb_dst = e->dst;
+   if (bb_dst != exit_bb)
+   visited[bb_dst->idx] = true;
if (bb_dst == exit_bb ||
di->dfs_order[bb_dst->idx]) {
ei_next();
@@ -490,6 +500,16 @@ static int calc_dfs_tree(struct bpf_subprog_info *subprog, 
struct dom_info *di,
 
kfree(stack);
 
+   for (i = 0; i < bb_num - 2; i++) {
+   if (!visited[i]) {
+   bpf_verifier_log_write(env, "cfg - unreachable insn\n");
+   kfree(visited);
+   return -EINVAL;
+   }
+   }
+
+   kfree(visited);
+
return 0;
 }
 
@@ -541,7 +561,8 @@ static int idoms_to_doms(struct bpf_subprog_info *subprog, 
struct dom_info *di)
  * The implementation also referenced GNU GCC 3.0.
  */
 
-int subprog_build_dom_info(struct bpf_subprog_info *subprog)
+int subprog_build_dom_info(struct bpf_verifier_env *env,
+  struct bpf_subprog_info *subprog)
 {
struct dom_info di;
int ret;
@@ -550,7 +571,7 @@ int subprog_build_dom_info(struct bpf_subprog_info *subprog)
if (ret < 0)
goto free_dominfo;
 
-   ret = calc_dfs_tree(subprog, , false);
+   ret = calc_dfs_tree(env, subprog, , false);
if (ret < 0)
goto free_dominfo;
 
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index c02c4cf..02729a9 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -10,7 +10,8 @@
 
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
-int subprog_build_dom_info(struct bpf_subprog_info *subprog);
+int subprog_build_dom_info(struct bpf_verifier_env *env,
+  struct bpf_subprog_info *subprog);
 int subprog_fini_bb(struct list_head *bb_list, int subprog_end);
 bool subprog_has_loop(struct bpf_subprog_info *subprog);
 int subprog_init_bb(struct list_head *bb_list, int subprog_start);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a93aa43..46d5eae 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -879,7 +879,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
if (ret < 0)
goto free_nodes;
subprog[cur_subprog].bb_num = ret;
-   ret = subprog_buil

[RFC bpf-next 00/10] initial control flow support for eBPF verifier

2018-05-07 Thread Jiong Wang
This patch set introduces initial control flow infrastructure to the eBPF
verifier. Various control flow information is built in an incremental way,
so it can be easily maintained at various levels.

After this patch set, information collection on eBPF insns is centred at
check_subprogs; check_cfg and related code are then replaced by the new
infrastructure.

Checks/analyses offered by check_cfg are still offered by the new
infrastructure, for example loop detection, recursive function call
detection, unreachable insn detection and prune heuristic marking.

This infrastructure comes with some memory usage and execution time cost.
Memory pool based allocation is used to avoid frequent calls to
kmalloc/free. Singly linked lists and compact pointers are used to reduce
the size of the various cfg nodes.

data structure size
===
  basic blocks     edge         call-graph edge
  16-bytes         24-bytes     6-bytes

memory usage (test_l4lb_noinline, test_xdp_noinline)

(counted all subprogs):

test_l4lb_noinline:

  cfg info:  597 insns, 12 subprogs
 107 basic-blocks, 278 edges, 12 call edges.

  mem usage: 9216(9K)-bytes for memory pool (basic-blocks/edges/call-edges)
 644-bytes for dominator trees.

test_xdp_noinline:

  cfg info:  1029 insns, 21 subprogs
 191 basic-blocks, 502 edges, 23 call edges.

  mem usage: 15360(15K)-bytes for memory pool.
 1192-bytes for dominator trees.

The existing check_cfg uses two auxiliary arrays (insn_state and
insn_stack), insn_cnt * sizeof(int) bytes each, so it would use ~5K and
~8K for the two tests respectively.  (It looks to me like insn_state could
use a 1-byte element and insn_stack a 2-byte element, so this could
probably be reduced to ~2K and ~3K.)

The existing check_cfg uses 8 bytes for each insn during the cfg check.
The new infrastructure basically uses 2x the memory, 16 bytes for each insn.

execution time
===
test_l4lb_noinline:
  existing check_subprog/check_cfg: ~55000 ns
  new infrastructure: ~135000 ns

test_xdp_noinline:
  existing check_subprog/check_cfg: ~52000 ns
  new infrastructure: ~12 ns

The control flow infrastructure could possibly lay the groundwork for further
static analysis inside the eBPF verifier, for example bounded loop detection,
path-sensitive data-flow analysis etc.

Jiong Wang (10):
  bpf: cfg: partition basic blocks for each subprog
  bpf: cfg: add edges between basic blocks to form CFG
  bpf: cfg: build domination tree using Tarjan algorithm
  bpf: cfg: detect loop use domination information
  bpf: cfg: detect unreachable basic blocks
  bpf: cfg: move find_subprog/add_subprog to cfg.c
  bpf: cfg: build call graph and detect unreachable/recursive call
  bpf: cfg: remove push_insn and check_cfg
  bpf: cfg: reduce k*alloc/free call by using memory pool for allocating
nodes
  bpf: cfg: reduce memory usage by using singly list + compat pointer

 include/linux/bpf_verifier.h |   8 +
 kernel/bpf/Makefile  |   2 +-
 kernel/bpf/cfg.c | 956 +++
 kernel/bpf/cfg.h |  42 ++
 kernel/bpf/verifier.c| 386 ++---
 5 files changed, 1131 insertions(+), 263 deletions(-)
 create mode 100644 kernel/bpf/cfg.c
 create mode 100644 kernel/bpf/cfg.h

-- 
2.7.4



[RFC bpf-next 01/10] bpf: cfg: partition basic blocks for each subprog

2018-05-07 Thread Jiong Wang
"check_subprogs" has partitioned subprogs and is doing insn scan for each
subprog already, this patch extends it to also partition basic blocks (BB)
for each subprog.

  - The first insn for each subprog start a BB.
  - Branch (both cond and absolute) target start a BB.
  - Insn immediately follows branch insn start a BB.
Insn immediately follows exit and within subprog start a BB.

BBs for each subprog are organized as a list in ascending head.

Two special BBs, entry and exit are added as well.
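
As a made-up illustration of these rules, a small subprog like the
following (insn index on the left):

  0: r0 = 1
  1: if r1 > 0x5 goto pc+2
  2: r0 = 2
  3: exit
  4: r0 = 3
  5: exit

would get BB heads {0, 2, 4}: 0 from the subprog-start rule, 2 as the insn
following the branch, and 4 both as the branch target and as the insn
following an exit, plus the special entry (head -1) and exit
(head == subprog end) BBs.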

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/Makefile  |  2 +-
 kernel/bpf/cfg.c | 93 
 kernel/bpf/cfg.h | 16 
 kernel/bpf/verifier.c| 57 ---
 5 files changed, 162 insertions(+), 7 deletions(-)
 create mode 100644 kernel/bpf/cfg.c
 create mode 100644 kernel/bpf/cfg.h

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 8f70dc1..404a5d8 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -176,6 +176,7 @@ static inline bool bpf_verifier_log_needed(const struct 
bpf_verifier_log *log)
 struct bpf_subprog_info {
u32 start; /* insn idx of function entry point */
u16 stack_depth; /* max. stack depth used by this function */
+   struct list_head bbs; /* basic blocks list. */
 };
 
 /* single container for all structs
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 35c485f..b4542d8 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-y := core.o
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o 
cfg.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o 
bpf_lru_list.o lpm_trie.o map_in_map.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_SYSCALL) += btf.o
diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
new file mode 100644
index 000..b1af714
--- /dev/null
+++ b/kernel/bpf/cfg.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2018 Netronome Systems, Inc. */
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+
+#include "cfg.h"
+
+struct bb_node {
+   struct list_head l;
+   u16 head;
+};
+
+#define bb_prev(bb)list_prev_entry(bb, l)
+#define bb_next(bb)list_next_entry(bb, l)
+#define bb_first(bb_list)  list_first_entry(bb_list, struct bb_node, l)
+#define bb_last(bb_list)   list_last_entry(bb_list, struct bb_node, l)
+#define entry_bb(bb_list)  bb_first(bb_list)
+#define exit_bb(bb_list)   bb_last(bb_list)
+
+int subprog_append_bb(struct list_head *bb_list, int head)
+{
+   struct bb_node *new_bb, *bb;
+
+   list_for_each_entry(bb, bb_list, l) {
+   if (bb->head == head)
+   return 0;
+   else if (bb->head > head)
+   break;
+   }
+
+   bb = bb_prev(bb);
+   new_bb = kzalloc(sizeof(*new_bb), GFP_KERNEL);
+   if (!new_bb)
+   return -ENOMEM;
+
+   new_bb->head = head;
+   list_add(&new_bb->l, &bb->l);
+
+   return 0;
+}
+
+int subprog_fini_bb(struct list_head *bb_list, int subprog_end)
+{
+   struct bb_node *bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+
+   if (!bb)
+   return -ENOMEM;
+   /* entry bb. */
+   bb->head = -1;
+   list_add(&bb->l, bb_list);
+
+   bb = kzalloc(sizeof(*bb), GFP_KERNEL);
+   if (!bb)
+   return -ENOMEM;
+   /* exit bb. */
+   bb->head = subprog_end;
+   list_add_tail(&bb->l, bb_list);
+
+   return 0;
+}
+
+int subprog_init_bb(struct list_head *bb_list, int subprog_start)
+{
+   int ret;
+
+   INIT_LIST_HEAD(bb_list);
+   ret = subprog_append_bb(bb_list, subprog_start);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
+void subprog_free_bb(struct bpf_subprog_info *subprog, int end_idx)
+{
+   int i = 0;
+
+   for (; i <= end_idx; i++) {
+   struct list_head *bbs = &subprog[i].bbs;
+   struct bb_node *bb, *tmp;
+
+   list_for_each_entry_safe(bb, tmp, bbs, l) {
+   list_del(&bb->l);
+   kfree(bb);
+   }
+   }
+}
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
new file mode 100644
index 000..4092145
--- /dev/null
+++ b/kernel/bpf/cfg.h
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2018 Netronome Systems, Inc. */
+/* This program is free software; you can redistribute it and/or
+ * modify it under the term

[RFC bpf-next 06/10] bpf: cfg: move find_subprog/add_subprog to cfg.c

2018-05-07 Thread Jiong Wang
This patch moves find_subprog and add_subprog to cfg.c.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/cfg.c  | 41 +
 kernel/bpf/cfg.h  |  2 ++
 kernel/bpf/verifier.c | 42 --
 3 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/kernel/bpf/cfg.c b/kernel/bpf/cfg.c
index 7ce1472..a34a95c 100644
--- a/kernel/bpf/cfg.c
+++ b/kernel/bpf/cfg.c
@@ -6,8 +6,10 @@
  */
 
 #include 
+#include 
 #include 
 #include 
+#include 
 
 #include "cfg.h"
 
@@ -611,6 +613,45 @@ bool subprog_has_loop(struct bpf_subprog_info *subprog)
return false;
 }
 
+static int cmp_subprogs(const void *a, const void *b)
+{
+   return ((struct bpf_subprog_info *)a)->start -
+  ((struct bpf_subprog_info *)b)->start;
+}
+
+int find_subprog(struct bpf_verifier_env *env, int off)
+{
+   struct bpf_subprog_info *p;
+
+   p = bsearch(&off, env->subprog_info, env->subprog_cnt,
+   sizeof(env->subprog_info[0]), cmp_subprogs);
+   if (!p)
+   return -ENOENT;
+   return p - env->subprog_info;
+}
+
+int add_subprog(struct bpf_verifier_env *env, int off)
+{
+   int insn_cnt = env->prog->len;
+   int ret;
+
+   if (off >= insn_cnt || off < 0) {
+   bpf_verifier_log_write(env, "call to invalid destination\n");
+   return -EINVAL;
+   }
+   ret = find_subprog(env, off);
+   if (ret >= 0)
+   return 0;
+   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
+   bpf_verifier_log_write(env, "too many subprograms\n");
+   return -E2BIG;
+   }
+   env->subprog_info[env->subprog_cnt++].start = off;
+   sort(env->subprog_info, env->subprog_cnt,
+sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
+   return 0;
+}
+
 static void subprog_free_edge(struct bb_node *bb)
 {
struct list_head *succs = >e_succs;
diff --git a/kernel/bpf/cfg.h b/kernel/bpf/cfg.h
index 02729a9..57eab9b 100644
--- a/kernel/bpf/cfg.h
+++ b/kernel/bpf/cfg.h
@@ -8,6 +8,8 @@
 #ifndef __BPF_CFG_H__
 #define __BPF_CFG_H__
 
+int add_subprog(struct bpf_verifier_env *env, int off);
+int find_subprog(struct bpf_verifier_env *env, int off);
 int subprog_add_bb_edges(struct bpf_insn *insns, struct list_head *bb_list);
 int subprog_append_bb(struct list_head *bb_list, int head);
 int subprog_build_dom_info(struct bpf_verifier_env *env,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 46d5eae..72bda84 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20,8 +20,6 @@
 #include 
 #include 
 #include 
-#include <linux/bsearch.h>
-#include <linux/sort.h>
 
 #include "cfg.h"
 #include "disasm.h"
@@ -737,46 +735,6 @@ enum reg_arg_type {
DST_OP_NO_MARK  /* same as above, check only, don't mark */
 };
 
-static int cmp_subprogs(const void *a, const void *b)
-{
-   return ((struct bpf_subprog_info *)a)->start -
-  ((struct bpf_subprog_info *)b)->start;
-}
-
-static int find_subprog(struct bpf_verifier_env *env, int off)
-{
-   struct bpf_subprog_info *p;
-
-   p = bsearch(&off, env->subprog_info, env->subprog_cnt,
-   sizeof(env->subprog_info[0]), cmp_subprogs);
-   if (!p)
-   return -ENOENT;
-   return p - env->subprog_info;
-
-}
-
-static int add_subprog(struct bpf_verifier_env *env, int off)
-{
-   int insn_cnt = env->prog->len;
-   int ret;
-
-   if (off >= insn_cnt || off < 0) {
-   verbose(env, "call to invalid destination\n");
-   return -EINVAL;
-   }
-   ret = find_subprog(env, off);
-   if (ret >= 0)
-   return 0;
-   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
-   verbose(env, "too many subprograms\n");
-   return -E2BIG;
-   }
-   env->subprog_info[env->subprog_cnt++].start = off;
-   sort(env->subprog_info, env->subprog_cnt,
-sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
-   return 0;
-}
-
 static int check_subprogs(struct bpf_verifier_env *env)
 {
int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
-- 
2.7.4



[PATCH v2 bpf-next 0/3] bpf: cleanups on managing subprog information

2018-05-02 Thread Jiong Wang
This patch set cleans up some code logic related to managing subprog
information.

Part of the set is inspired by Edwin's code in his RFC:

  "bpf/verifier: subprog/func_call simplifications"

but with clearer separation so it could be easier to review.

  - Patch 1 unifies main prog and subprogs. All of them are registered in
env->subprog_starts.

  - After patch 1, it is clear that subprog_starts and subprog_stack_depth
could be merged as both of them now have main and subprog unified.
Patch 2 therefore does the merge; all subprog information is centred
at bpf_subprog_info.

  - Patch 3 goes further to introduce a new fake "exit" subprog which
serves as an ending marker to the subprog list. We could then turn the
following code snippets across verifier:

   if (env->subprog_cnt == cur_subprog + 1)
   subprog_end = insn_cnt;
   else
   subprog_end = env->subprog_info[cur_subprog + 1].start;

into:
   subprog_end = env->subprog_info[cur_subprog + 1].start;

There is no functional change by this patch set.
No bpf selftest (both non-jit and jit) regression found after this set.

v2:
  - fixed adjust_subprog_starts to also update fake "exit" subprog start.
  - for John's suggestion on renaming subprog to prog, I could work on
a follow-up patch if it is recognized as worth the change.

Jiong Wang (3):
  bpf: unify main prog and subprog
  bpf: centre subprog information fields
  bpf: add faked "ending" subprog

 include/linux/bpf_verifier.h |   9 ++--
 kernel/bpf/verifier.c| 121 ++-
 2 files changed, 67 insertions(+), 63 deletions(-)

-- 
2.7.4



[PATCH v2 bpf-next 2/3] bpf: centre subprog information fields

2018-05-02 Thread Jiong Wang
It is better to centre all subprog information fields into one structure.
This structure could later serve as function node in call graph.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  9 ---
 kernel/bpf/verifier.c| 62 +++-
 2 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index f655b92..8f70dc1 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -173,6 +173,11 @@ static inline bool bpf_verifier_log_needed(const struct 
bpf_verifier_log *log)
 
 #define BPF_MAX_SUBPROGS 256
 
+struct bpf_subprog_info {
+   u32 start; /* insn idx of function entry point */
+   u16 stack_depth; /* max. stack depth used by this function */
+};
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -191,9 +196,7 @@ struct bpf_verifier_env {
bool seen_direct_write;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifier_log log;
-   u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
-   /* computes the stack depth of each bpf function */
-   u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
+   struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1];
u32 subprog_cnt;
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 16ec977..9764b9b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -738,18 +738,19 @@ enum reg_arg_type {
 
 static int cmp_subprogs(const void *a, const void *b)
 {
-   return *(int *)a - *(int *)b;
+   return ((struct bpf_subprog_info *)a)->start -
+  ((struct bpf_subprog_info *)b)->start;
 }
 
 static int find_subprog(struct bpf_verifier_env *env, int off)
 {
-   u32 *p;
+   struct bpf_subprog_info *p;
 
-   p = bsearch(&off, env->subprog_starts, env->subprog_cnt,
-   sizeof(env->subprog_starts[0]), cmp_subprogs);
+   p = bsearch(&off, env->subprog_info, env->subprog_cnt,
+   sizeof(env->subprog_info[0]), cmp_subprogs);
if (!p)
return -ENOENT;
-   return p - env->subprog_starts;
+   return p - env->subprog_info;
 
 }
 
@@ -769,15 +770,16 @@ static int add_subprog(struct bpf_verifier_env *env, int 
off)
verbose(env, "too many subprograms\n");
return -E2BIG;
}
-   env->subprog_starts[env->subprog_cnt++] = off;
-   sort(env->subprog_starts, env->subprog_cnt,
-sizeof(env->subprog_starts[0]), cmp_subprogs, NULL);
+   env->subprog_info[env->subprog_cnt++].start = off;
+   sort(env->subprog_info, env->subprog_cnt,
+sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
return 0;
 }
 
 static int check_subprogs(struct bpf_verifier_env *env)
 {
int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
+   struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
 
@@ -807,14 +809,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
if (env->log.level > 1)
for (i = 0; i < env->subprog_cnt; i++)
-   verbose(env, "func#%d @%d\n", i, 
env->subprog_starts[i]);
+   verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
/* now check that all jumps are within the same subprog */
subprog_start = 0;
if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[cur_subprog + 1];
+   subprog_end = subprog[cur_subprog + 1].start;
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -843,8 +845,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end =
-   env->subprog_starts[cur_subprog + 1];
+   subprog_end = subprog[cur_subprog + 1].start;
}
}
return 0;
@@ -1477,13 +1478,13 @@ static int update_stack_depth(struct bpf_verifier_env 
*env,
  const struct bpf_func_state *func,
  int off)
 {
-   u16 stack = env->subprog_stack_depth[func->subprogno];
+   u16 stack = env->subprog_info[func->subprogno].stack_depth;
 
if (stack >= -off)
return 0;
 
/* update known max for given subprogram */
-   env->subprog_stack_depth[func->subprogno] = -off;
+

[PATCH v2 bpf-next 1/3] bpf: unify main prog and subprog

2018-05-02 Thread Jiong Wang
Currently, the verifier treats main prog and subprogs differently. All
subprogs detected are kept in env->subprog_starts while the main prog is
not kept there. Instead, the main prog is implicitly defined as the prog
starting at 0.

There is actually no difference between main prog and subprog, so it is
better to unify them and register all progs detected in env->subprog_starts.

This could also help simplify some code logic.
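
As a made-up illustration, for a prog whose two callees start at insn 8 and
insn 15, env->subprog_starts used to hold only

   { 8, 15 }

with the main prog implied at insn 0; after this patch it holds

   { 0, 8, 15 }

and subprog_cnt counts the main prog as well.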

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  2 +-
 kernel/bpf/verifier.c| 57 
 2 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7e61c39..f655b92 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -191,7 +191,7 @@ struct bpf_verifier_env {
bool seen_direct_write;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifier_log log;
-   u32 subprog_starts[BPF_MAX_SUBPROGS];
+   u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
/* computes the stack depth of each bpf function */
u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
u32 subprog_cnt;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eb1a596..16ec977 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -765,7 +765,7 @@ static int add_subprog(struct bpf_verifier_env *env, int 
off)
ret = find_subprog(env, off);
if (ret >= 0)
return 0;
-   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
+   if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
verbose(env, "too many subprograms\n");
return -E2BIG;
}
@@ -781,6 +781,11 @@ static int check_subprogs(struct bpf_verifier_env *env)
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
 
+   /* Add entry function. */
+   ret = add_subprog(env, 0);
+   if (ret < 0)
+   return ret;
+
/* determine subprog starts. The end is one before the next starts */
for (i = 0; i < insn_cnt; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
@@ -806,10 +811,10 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
/* now check that all jumps are within the same subprog */
subprog_start = 0;
-   if (env->subprog_cnt == cur_subprog)
+   if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[cur_subprog++];
+   subprog_end = env->subprog_starts[cur_subprog + 1];
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -833,11 +838,13 @@ static int check_subprogs(struct bpf_verifier_env *env)
verbose(env, "last insn is not an exit or 
jmp\n");
return -EINVAL;
}
+   cur_subprog++;
subprog_start = subprog_end;
-   if (env->subprog_cnt == cur_subprog)
+   if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = 
env->subprog_starts[cur_subprog++];
+   subprog_end =
+   env->subprog_starts[cur_subprog + 1];
}
}
return 0;
@@ -1505,10 +1512,10 @@ static int check_max_stack_depth(struct 
bpf_verifier_env *env)
return -EACCES;
}
 continue_func:
-   if (env->subprog_cnt == subprog)
+   if (env->subprog_cnt == subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[subprog];
+   subprog_end = env->subprog_starts[subprog + 1];
for (; i < subprog_end; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
continue;
@@ -1526,7 +1533,6 @@ static int check_max_stack_depth(struct bpf_verifier_env 
*env)
  i);
return -EFAULT;
}
-   subprog++;
frame++;
if (frame >= MAX_CALL_FRAMES) {
WARN_ONCE(1, "verifier bug. Call stack is too deep\n");
@@ -1558,7 +1564,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env 
*env,
  start);
return -EFAULT;
}
-   subprog++;
return env->subprog_stack_depth[subprog];
 }
 #endif
@@ -2087,7 +2092,7 @@ static int check_map_func_compatibility(struct 
bpf_verifier_env *env,
case BPF_FUNC_tail_call:
if (map->map_type !

[PATCH v2 bpf-next 3/3] bpf: add faked "ending" subprog

2018-05-02 Thread Jiong Wang
There are quite a few code snippets like the following in the verifier:

   subprog_start = 0;
   if (env->subprog_cnt == cur_subprog + 1)
           subprog_end = insn_cnt;
   else
           subprog_end = env->subprog_info[cur_subprog + 1].start;

The reason is that there is no marker in the subprog_info array to tell
where it ends.

We could resolve this issue by introducing a faked "ending" subprog.
The special "ending" subprog has "insn_cnt" as its start offset, so it
serves as an end marker whenever we iterate over all subprogs.
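
With the fake ending subprog in place, every per-subprog walk can then
follow one uniform pattern (a minimal sketch of the shape the loops take
after this patch):

   for (cur_subprog = 0; cur_subprog < env->subprog_cnt; cur_subprog++) {
           subprog_start = subprog[cur_subprog].start;
           /* for the last real subprog this reads the fake "ending"
            * subprog, whose start is insn_cnt
            */
           subprog_end = subprog[cur_subprog + 1].start;

           /* ... process insns in [subprog_start, subprog_end) ... */
   }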

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/verifier.c | 34 ++
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9764b9b..65a6e2e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -766,7 +766,7 @@ static int add_subprog(struct bpf_verifier_env *env, int 
off)
ret = find_subprog(env, off);
if (ret >= 0)
return 0;
-   if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
+   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
verbose(env, "too many subprograms\n");
return -E2BIG;
}
@@ -807,16 +807,18 @@ static int check_subprogs(struct bpf_verifier_env *env)
return ret;
}
 
+   /* Add a fake 'exit' subprog which could simplify subprog iteration
+* logic. 'subprog_cnt' should not be increased.
+*/
+   subprog[env->subprog_cnt].start = insn_cnt;
+
if (env->log.level > 1)
for (i = 0; i < env->subprog_cnt; i++)
verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
/* now check that all jumps are within the same subprog */
-   subprog_start = 0;
-   if (env->subprog_cnt == cur_subprog + 1)
-   subprog_end = insn_cnt;
-   else
-   subprog_end = subprog[cur_subprog + 1].start;
+   subprog_start = subprog[cur_subprog].start;
+   subprog_end = subprog[cur_subprog + 1].start;
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -840,11 +842,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
verbose(env, "last insn is not an exit or 
jmp\n");
return -EINVAL;
}
-   cur_subprog++;
subprog_start = subprog_end;
-   if (env->subprog_cnt == cur_subprog + 1)
-   subprog_end = insn_cnt;
-   else
+   cur_subprog++;
+   if (cur_subprog < env->subprog_cnt)
subprog_end = subprog[cur_subprog + 1].start;
}
}
@@ -1499,7 +1499,6 @@ static int check_max_stack_depth(struct bpf_verifier_env 
*env)
int depth = 0, frame = 0, idx = 0, i = 0, subprog_end;
struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
-   int insn_cnt = env->prog->len;
int ret_insn[MAX_CALL_FRAMES];
int ret_prog[MAX_CALL_FRAMES];
 
@@ -1514,10 +1513,7 @@ static int check_max_stack_depth(struct bpf_verifier_env 
*env)
return -EACCES;
}
 continue_func:
-   if (env->subprog_cnt == idx + 1)
-   subprog_end = insn_cnt;
-   else
-   subprog_end = subprog[idx + 1].start;
+   subprog_end = subprog[idx + 1].start;
for (; i < subprog_end; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
continue;
@@ -5070,7 +5066,8 @@ static void adjust_subprog_starts(struct bpf_verifier_env 
*env, u32 off, u32 len
 
if (len == 1)
return;
-   for (i = 0; i < env->subprog_cnt; i++) {
+   /* NOTE: fake 'exit' subprog should be updated as well. */
+   for (i = 0; i <= env->subprog_cnt; i++) {
if (env->subprog_info[i].start < off)
continue;
env->subprog_info[i].start += len - 1;
@@ -5268,10 +5265,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 
for (i = 0; i < env->subprog_cnt; i++) {
subprog_start = subprog_end;
-   if (env->subprog_cnt == i + 1)
-   subprog_end = prog->len;
-   else
-   subprog_end = env->subprog_info[i + 1].start;
+   subprog_end = env->subprog_info[i + 1].start;
 
len = subprog_end - subprog_start;
func[i] = bpf_prog_alloc(bpf_prog_size(len), GFP_USER);
-- 
2.7.4



Re: [PATCH bpf-next 0/3] bpf: cleanups on managing subprog information

2018-05-02 Thread Jiong Wang

On 02/05/2018 18:24, John Fastabend wrote:

On 05/02/2018 09:59 AM, Jiong Wang wrote:

On 01/05/2018 23:22, Alexei Starovoitov wrote:
...

[   27.784931]  ? bpf_int_jit_compile+0x7ac/0xab0
[   27.785475]  bpf_int_jit_compile+0x2b6/0xab0
[   27.786001]  ? do_jit+0x6020/0x6020
[   27.786428]  ? kasan_kmalloc+0xa0/0xd0
[   27.786885]  bpf_check+0x2c05/0x4c40
[   27.787346]  ? fixup_bpf_calls+0x1140/0x1140
[   27.787865]  ? kasan_unpoison_shadow+0x30/0x40
[   27.788406]  ? kasan_kmalloc+0xa0/0xd0
[   27.788865]  ? memset+0x1f/0x40
[   27.789255]  ? bpf_obj_name_cpy+0x2d/0x200
[   27.789750]  bpf_prog_load+0xb07/0xeb0

simply running test_verifier with JIT and kasan on.

Ah, sorry, I should add "sysctl net/core/bpf_jit_enable=1" to my test
script, error reproduced.

convert_ctx_accesses and fixup_bpf_calls might insert eBPF insns so that
prog->len would change.

The new fake "exit" subprog whose .start offset is prog->len should be
updated as well.

The "for" condition in adjust_subprog_starts:

   for (i = 0; i < env->subprog_cnt; i++) {

need to be changed into:

   for (i = 0; i <= env->subprog_cnt; i++) {

Will respin the patch set.

Thanks.

Regards,
Jiong


Also a bit of a nit, but if you are doing a respin. How about
consider renaming BPF_MAX_SUBPROGS -> BPF_MAX_PROGS. It will
make the naming more accurate and also avoid some diffs below
where changing '>=' to '>' is required.


I have been pondering renaming BPF_MAX_SUBPROGS to another name like the one
you suggested, but failed to convince myself, mostly because there are quite
a few other variables etc. that use the "subprog" naming convention. So I am
thinking using "subprog" is also fine, as the traditional main prog/func is
also a sub prog/func; it is just the entry one.

So I am thinking it might not be worth renaming everything related, and I
tend to just keep it as is.

Thanks.

Regards,
Jiong



@@ -191,7 +191,7 @@ struct bpf_verifier_env {
bool seen_direct_write;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifier_log log;
-   u32 subprog_starts[BPF_MAX_SUBPROGS];
+   u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
/* computes the stack depth of each bpf function */
u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
u32 subprog_cnt;




Re: [PATCH bpf-next 0/3] bpf: cleanups on managing subprog information

2018-05-02 Thread Jiong Wang

On 01/05/2018 23:22, Alexei Starovoitov wrote:
...

[   27.784931]  ? bpf_int_jit_compile+0x7ac/0xab0
[   27.785475]  bpf_int_jit_compile+0x2b6/0xab0
[   27.786001]  ? do_jit+0x6020/0x6020
[   27.786428]  ? kasan_kmalloc+0xa0/0xd0
[   27.786885]  bpf_check+0x2c05/0x4c40
[   27.787346]  ? fixup_bpf_calls+0x1140/0x1140
[   27.787865]  ? kasan_unpoison_shadow+0x30/0x40
[   27.788406]  ? kasan_kmalloc+0xa0/0xd0
[   27.788865]  ? memset+0x1f/0x40
[   27.789255]  ? bpf_obj_name_cpy+0x2d/0x200
[   27.789750]  bpf_prog_load+0xb07/0xeb0

simply running test_verifier with JIT and kasan on.


Ah, sorry, I should add "sysctl net/core/bpf_jit_enable=1" to my test
script, error reproduced.

convert_ctx_accesses and fixup_bpf_calls might insert eBPF insns so that
prog->len would change.

The new fake "exit" subprog whose .start offset is prog->len should be
updated as well.

The "for" condition in adjust_subprog_starts:

  for (i = 0; i < env->subprog_cnt; i++) {

need to be changed into:

  for (i = 0; i <= env->subprog_cnt; i++) {

Will respin the patch set.

Thanks.

Regards,
Jiong



[PATCH bpf-next 2/3] bpf: centre subprog information fields

2018-04-30 Thread Jiong Wang
It is better to centre all subprog information fields into one structure.
This structure could later serve as function node in call graph.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  9 ---
 kernel/bpf/verifier.c| 62 +++-
 2 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index f655b92..8f70dc1 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -173,6 +173,11 @@ static inline bool bpf_verifier_log_needed(const struct 
bpf_verifier_log *log)
 
 #define BPF_MAX_SUBPROGS 256
 
+struct bpf_subprog_info {
+   u32 start; /* insn idx of function entry point */
+   u16 stack_depth; /* max. stack depth used by this function */
+};
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -191,9 +196,7 @@ struct bpf_verifier_env {
bool seen_direct_write;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifier_log log;
-   u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
-   /* computes the stack depth of each bpf function */
-   u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
+   struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1];
u32 subprog_cnt;
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 16ec977..9764b9b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -738,18 +738,19 @@ enum reg_arg_type {
 
 static int cmp_subprogs(const void *a, const void *b)
 {
-   return *(int *)a - *(int *)b;
+   return ((struct bpf_subprog_info *)a)->start -
+  ((struct bpf_subprog_info *)b)->start;
 }
 
 static int find_subprog(struct bpf_verifier_env *env, int off)
 {
-   u32 *p;
+   struct bpf_subprog_info *p;
 
-   p = bsearch(&off, env->subprog_starts, env->subprog_cnt,
-   sizeof(env->subprog_starts[0]), cmp_subprogs);
+   p = bsearch(&off, env->subprog_info, env->subprog_cnt,
+   sizeof(env->subprog_info[0]), cmp_subprogs);
if (!p)
return -ENOENT;
-   return p - env->subprog_starts;
+   return p - env->subprog_info;
 
 }
 
@@ -769,15 +770,16 @@ static int add_subprog(struct bpf_verifier_env *env, int 
off)
verbose(env, "too many subprograms\n");
return -E2BIG;
}
-   env->subprog_starts[env->subprog_cnt++] = off;
-   sort(env->subprog_starts, env->subprog_cnt,
-sizeof(env->subprog_starts[0]), cmp_subprogs, NULL);
+   env->subprog_info[env->subprog_cnt++].start = off;
+   sort(env->subprog_info, env->subprog_cnt,
+sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
return 0;
 }
 
 static int check_subprogs(struct bpf_verifier_env *env)
 {
int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
+   struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
 
@@ -807,14 +809,14 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
if (env->log.level > 1)
for (i = 0; i < env->subprog_cnt; i++)
-   verbose(env, "func#%d @%d\n", i, 
env->subprog_starts[i]);
+   verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
/* now check that all jumps are within the same subprog */
subprog_start = 0;
if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[cur_subprog + 1];
+   subprog_end = subprog[cur_subprog + 1].start;
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -843,8 +845,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end =
-   env->subprog_starts[cur_subprog + 1];
+   subprog_end = subprog[cur_subprog + 1].start;
}
}
return 0;
@@ -1477,13 +1478,13 @@ static int update_stack_depth(struct bpf_verifier_env 
*env,
  const struct bpf_func_state *func,
  int off)
 {
-   u16 stack = env->subprog_stack_depth[func->subprogno];
+   u16 stack = env->subprog_info[func->subprogno].stack_depth;
 
if (stack >= -off)
return 0;
 
/* update known max for given subprogram */
-   env->subprog_stack_depth[func->subprogno] = -off;
+

[PATCH bpf-next 0/3] bpf: cleanups on managing subprog information

2018-04-30 Thread Jiong Wang
This patch set cleans up some code logic related to managing subprog
information.

Part of the set is inspired by Edwin's code in his RFC:

  "bpf/verifier: subprog/func_call simplifications"

but with clearer separation so it could be easier to review.

  - Patch 1 unifies main prog and subprogs. All of them are registered in
env->subprog_starts.

  - After patch 1, it is clear that subprog_starts and subprog_stack_depth
could be merged as both of them now have main and subprog unified.
Patch 2 therefore does the merge; all subprog information is centred
at bpf_subprog_info.

  - Patch 3 goes further to introduce a new fake "exit" subprog which
serves as an ending marker to the subprog list. We could then turn the
following code snippets across verifier:

   if (env->subprog_cnt == cur_subprog + 1)
   subprog_end = insn_cnt;
   else
   subprog_end = env->subprog_info[cur_subprog + 1].start;

into:
   subprog_end = env->subprog_info[cur_subprog + 1].start;

There is no functional change by this patch set.
No bpf selftest regression found after this patch set.

Jiong Wang (3):
  bpf: unify main prog and subprog
  bpf: centre subprog information fields
  bpf: add faked "ending" subprog

 include/linux/bpf_verifier.h |   9 ++--
 kernel/bpf/verifier.c| 118 +--
 2 files changed, 65 insertions(+), 62 deletions(-)

-- 
2.7.4



[PATCH bpf-next 3/3] bpf: add faked "ending" subprog

2018-04-30 Thread Jiong Wang
There are quite a few code snippets like the following in the verifier:

   subprog_start = 0;
   if (env->subprog_cnt == cur_subprog + 1)
           subprog_end = insn_cnt;
   else
           subprog_end = env->subprog_info[cur_subprog + 1].start;

The reason is that there is no marker in the subprog_info array to tell
where it ends.

We could resolve this issue by introducing a faked "ending" subprog.
The special "ending" subprog has "insn_cnt" as its start offset, so it
serves as an end marker whenever we iterate over all subprogs.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 kernel/bpf/verifier.c | 31 ---
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9764b9b..4a081e0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -766,7 +766,7 @@ static int add_subprog(struct bpf_verifier_env *env, int 
off)
ret = find_subprog(env, off);
if (ret >= 0)
return 0;
-   if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
+   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
verbose(env, "too many subprograms\n");
return -E2BIG;
}
@@ -807,16 +807,18 @@ static int check_subprogs(struct bpf_verifier_env *env)
return ret;
}
 
+   /* Add a fake 'exit' subprog which could simplify subprog iteration
+* logic. 'subprog_cnt' should not be increased.
+*/
+   subprog[env->subprog_cnt].start = insn_cnt;
+
if (env->log.level > 1)
for (i = 0; i < env->subprog_cnt; i++)
verbose(env, "func#%d @%d\n", i, subprog[i].start);
 
/* now check that all jumps are within the same subprog */
-   subprog_start = 0;
-   if (env->subprog_cnt == cur_subprog + 1)
-   subprog_end = insn_cnt;
-   else
-   subprog_end = subprog[cur_subprog + 1].start;
+   subprog_start = subprog[cur_subprog].start;
+   subprog_end = subprog[cur_subprog + 1].start;
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -840,11 +842,9 @@ static int check_subprogs(struct bpf_verifier_env *env)
verbose(env, "last insn is not an exit or 
jmp\n");
return -EINVAL;
}
-   cur_subprog++;
subprog_start = subprog_end;
-   if (env->subprog_cnt == cur_subprog + 1)
-   subprog_end = insn_cnt;
-   else
+   cur_subprog++;
+   if (cur_subprog < env->subprog_cnt)
subprog_end = subprog[cur_subprog + 1].start;
}
}
@@ -1499,7 +1499,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
int depth = 0, frame = 0, idx = 0, i = 0, subprog_end;
struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
-   int insn_cnt = env->prog->len;
int ret_insn[MAX_CALL_FRAMES];
int ret_prog[MAX_CALL_FRAMES];
 
@@ -1514,10 +1513,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
return -EACCES;
}
 continue_func:
-   if (env->subprog_cnt == idx + 1)
-   subprog_end = insn_cnt;
-   else
-   subprog_end = subprog[idx + 1].start;
+   subprog_end = subprog[idx + 1].start;
for (; i < subprog_end; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
continue;
@@ -5268,10 +5264,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 
for (i = 0; i < env->subprog_cnt; i++) {
subprog_start = subprog_end;
-   if (env->subprog_cnt == i + 1)
-   subprog_end = prog->len;
-   else
-   subprog_end = env->subprog_info[i + 1].start;
+   subprog_end = env->subprog_info[i + 1].start;
 
len = subprog_end - subprog_start;
func[i] = bpf_prog_alloc(bpf_prog_size(len), GFP_USER);
-- 
2.7.4



[PATCH bpf-next 1/3] bpf: unify main prog and subprog

2018-04-30 Thread Jiong Wang
Currently, the verifier treats the main prog and subprogs differently. All
detected subprogs are kept in env->subprog_starts while the main prog is not
kept there. Instead, the main prog is implicitly defined as the prog starting
at insn 0.

There is actually no difference between the main prog and a subprog, so it is
better to unify them and register all detected progs in env->subprog_starts.

This also helps simplify some code logic.

Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
---
 include/linux/bpf_verifier.h |  2 +-
 kernel/bpf/verifier.c| 57 
 2 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7e61c39..f655b92 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -191,7 +191,7 @@ struct bpf_verifier_env {
bool seen_direct_write;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
struct bpf_verifier_log log;
-   u32 subprog_starts[BPF_MAX_SUBPROGS];
+   u32 subprog_starts[BPF_MAX_SUBPROGS + 1];
/* computes the stack depth of each bpf function */
u16 subprog_stack_depth[BPF_MAX_SUBPROGS + 1];
u32 subprog_cnt;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index eb1a596..16ec977 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -765,7 +765,7 @@ static int add_subprog(struct bpf_verifier_env *env, int off)
ret = find_subprog(env, off);
if (ret >= 0)
return 0;
-   if (env->subprog_cnt >= BPF_MAX_SUBPROGS) {
+   if (env->subprog_cnt > BPF_MAX_SUBPROGS) {
verbose(env, "too many subprograms\n");
return -E2BIG;
}
@@ -781,6 +781,11 @@ static int check_subprogs(struct bpf_verifier_env *env)
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
 
+   /* Add entry function. */
+   ret = add_subprog(env, 0);
+   if (ret < 0)
+   return ret;
+
/* determine subprog starts. The end is one before the next starts */
for (i = 0; i < insn_cnt; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
@@ -806,10 +811,10 @@ static int check_subprogs(struct bpf_verifier_env *env)
 
/* now check that all jumps are within the same subprog */
subprog_start = 0;
-   if (env->subprog_cnt == cur_subprog)
+   if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[cur_subprog++];
+   subprog_end = env->subprog_starts[cur_subprog + 1];
for (i = 0; i < insn_cnt; i++) {
u8 code = insn[i].code;
 
@@ -833,11 +838,13 @@ static int check_subprogs(struct bpf_verifier_env *env)
verbose(env, "last insn is not an exit or jmp\n");
return -EINVAL;
}
+   cur_subprog++;
subprog_start = subprog_end;
-   if (env->subprog_cnt == cur_subprog)
+   if (env->subprog_cnt == cur_subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[cur_subprog++];
+   subprog_end =
+   env->subprog_starts[cur_subprog + 1];
}
}
return 0;
@@ -1505,10 +1512,10 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
return -EACCES;
}
 continue_func:
-   if (env->subprog_cnt == subprog)
+   if (env->subprog_cnt == subprog + 1)
subprog_end = insn_cnt;
else
-   subprog_end = env->subprog_starts[subprog];
+   subprog_end = env->subprog_starts[subprog + 1];
for (; i < subprog_end; i++) {
if (insn[i].code != (BPF_JMP | BPF_CALL))
continue;
@@ -1526,7 +1533,6 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
  i);
return -EFAULT;
}
-   subprog++;
frame++;
if (frame >= MAX_CALL_FRAMES) {
WARN_ONCE(1, "verifier bug. Call stack is too deep\n");
@@ -1558,7 +1564,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
  start);
return -EFAULT;
}
-   subprog++;
return env->subprog_stack_depth[subprog];
 }
 #endif
@@ -2087,7 +2092,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
case BPF_FUNC_tail_call:
if (map->map_type !

Re: [RFC PATCH v3 bpf-next 4/5] bpf/verifier: use call graph to efficiently check max stack depth

2018-04-10 Thread Jiong Wang

On 06/04/2018 18:14, Edward Cree wrote:


Since that algorithm selects
+ * nodes in the order of the sorted output, we can do our processing in the 
loop
+ * that does the tsort, rather than storing the sorted list and then having a
+ * second loop to iterate over it and compute the total_stack_depth values.
  */


IMO, separating the sort from the depth calculation is better than combining
them. The tsort of the call graph could potentially be used in other places.
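
To make the suggestion concrete, a rough sketch of what the separated depth
pass could look like, assuming a precomputed topological order and the RFC's
call-graph bookkeeping in bpf_subprog_info (callees, n_callees and
total_stack_depth are illustrative names, not applied code):

	static void compute_total_stack_depth(struct bpf_verifier_env *env,
					      const int *tsorted)
	{
		int i, j;

		/* walk the topological order backwards so every callee's total
		 * is final before any of its callers is visited
		 */
		for (i = env->subprog_cnt - 1; i >= 0; i--) {
			struct bpf_subprog_info *sp = &env->subprog_info[tsorted[i]];
			u16 worst_callee = 0;

			for (j = 0; j < sp->n_callees; j++) {
				u16 d = env->subprog_info[sp->callees[j]].total_stack_depth;

				if (d > worst_callee)
					worst_callee = d;
			}
			sp->total_stack_depth = sp->stack_depth + worst_callee;
		}
	}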



Re: [RFC PATCH v3 bpf-next 0/5] bpf/verifier: subprog/func_call simplifications

2018-04-10 Thread Jiong Wang

On 06/04/2018 18:11, Edward Cree wrote:

By storing subprog boundaries as a subprogno mark on each insn, rather than
 a start (and implicit end) for each subprog, the code handling subprogs can
 in various places be made simpler, more explicit, and more efficient (i.e.
 better asymptotic complexity).


IMHO, the number of subprogs is likely much smaller than the number of insns,
so just keeping sorted subprog information in env->subprog_info and doing a
bsearch looks reasonable to me.
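
For reference, the sorted-array lookup is roughly the existing helper (sketch
from memory, shown only to illustrate why no per-insn field is needed):

	static int cmp_subprogs(const void *a, const void *b)
	{
		return *(int *)a - *(int *)b;
	}

	static int find_subprog(struct bpf_verifier_env *env, int off)
	{
		u32 *p;

		p = bsearch(&off, env->subprog_starts, env->subprog_cnt,
			    sizeof(env->subprog_starts[0]), cmp_subprogs);
		if (!p)
			return -ENOENT;
		return p - env->subprog_starts;
	}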

Also, the newly added "subprogno" field is only used around find_subprog, so I
feel it is a little bit heavy to allocate a dedicated per-instruction field
that wouldn't be used more broadly.


This also lays the ground for possible
 future work in which an extension of the check_subprogs() code to cover
 basic blocks could replace check_cfg(), which currently largely duplicates
 the insn-walking part.


I was thinking of removing the insn-walk duplication the other way around, by
centring it in the DFS insn walk in check_cfg, as we can't remove that one
before we migrate to another loop detection method. This approach is
questionable though.


Some other changes were also included to support this:
* Per-subprog info is stored in env->subprog_info, an array of structs,
  rather than several arrays with a common index.
* Subprog numbers and corresponding subprog_info index are now 0-based;
  subprog 0 is the main()-function of the program, starting at insn 0.
* Call graph is now stored in the new bpf_subprog_info struct; used here
  for check_max_stack_depth() but may have other uses too.


Ack on the above changes, which centre all subprog-related info in
env->subprog_info, and on the new callee field.
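
For reference, a rough sketch of the consolidated record being acked; the
field layout is pieced together from the descriptions in this thread (the
call-graph fields in particular are illustrative names, not applied code):

	struct bpf_subprog_info {
		u32 start;		/* insn idx of subprog entry point */
		u16 stack_depth;	/* max. stack depth used by this function */
		/* call graph + result of the max-stack-depth pass (RFC additions) */
		u16 total_stack_depth;
		u32 *callees;
		u32 n_callees;
	};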



Re: [PATCH v2 bpf-next 0/3] bpf/verifier: subprog/func_call simplifications

2018-04-05 Thread Jiong Wang

On 03/04/2018 02:08, Alexei Starovoitov wrote:

Combining subprog pass with do_check is going into opposite direction
of this long term work. Divide and conquer. Combining more things into
do_check is the opposite of this programming principle.


Agree. And for the redundant insn traversal issue in check_subprogs that
Edward is trying to fix, I am thinking we could address it by utilizing the
existing DFS traversal in check_cfg.

The current DFS could probably be turned into a generic instruction
information collection pass.

This won't make the existing DFS more complex, as it only does information
collection as a side job during traversal. The collected information could
then be used to build any other information to be consumed later, for
example subprogs, basic blocks, etc.

For the redundant insn traversal issue during subprog detection: like how we
mark STATE_LIST_MARK in the DFS, we could just call add_subprog for each
BPF_PSEUDO_CALL insn during the DFS.

I.e., we change the code logic of check_cfg into:

check_cfg
{
  * DFS traversal:
      - detect back-edge.
      - collect STATE_LIST_MARK.
      - collect subprog destination.

  * check all insns are reachable.
  * check_subprogs (insn traversal removed).
}
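
As a rough sketch of that side job, the DFS visit of insn index t could grow
something like the following (hook placement is illustrative; add_subprog and
BPF_PSEUDO_CALL are the existing names):

	if (insn[t].code == (BPF_JMP | BPF_CALL) &&
	    insn[t].src_reg == BPF_PSEUDO_CALL) {
		ret = add_subprog(env, t + insn[t].imm + 1);
		if (ret < 0)
			return ret;
	}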

I prototyped a patch for the above change; the result looks easier to review
to me. There are however 8 regressions in selftests after this change, because
check_subprogs is moved after some checks in the DFS, so some errors are
detected before the expected ones; these need to be cleaned up.

Does this direction sound good?

And if we want to build basic blocks later, we could just call a new add_bb
(similar to add_subprog) for jump targets etc. (some of those places would
actually be STATE_LIST_MARK_JMPTARGET if we split STATE_LIST_MARK into
STATE_LIST_MARK_JMPTARGET and STATE_LIST_MARK_HEURISTIC).

Regards,
Jiong



Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

2018-02-13 Thread Jiong Wang

On 13/02/2018 05:09, John Fastabend wrote:

To find these we do the following
1.a: Build the CFG of all the instructions so we can walk around the
 program easily.
1.b: Find the basic blocks and build the basic block CFG. After this
 with a few helper calls we can move around the CFG both at block
 level and insn level.
1.c: Build the dom tree and post dom tree. This is a data structure that
 allows us to actually find the natural loops. I can write more about
 this later.
1.d: Using the post dom tree verify all the CFG loops are in fact
 natural loops. I made some simplifications here and threw
 out loops that have multiple jumps to the header node and nested
 loops for now. Both can be handled with some additional complexity.

   Versions of the above can be found in gcc and llvm so it appears to be
   fairly regular stuff from the compiler side. For LLVM the passes are
   --loops, --postdomtree. I dumped the various CFG into dot format and
   and graphed them for fun.


This part of the work is very interesting. It would be very helpful if this
CFG information were kept even after the verification pass finishes, or, if
the storage overhead is a concern, made available as reusable functions.

I plan to implement a similar graph dump in bpftool, i.e. be able to dump the
eBPF CFG into a .dot graph and color the graph according to some simple static
analysis; these features need exactly such CFG information.
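
As a toy illustration of the output format (not bpftool code), the dump would
just emit one digraph node per basic block and one edge per control-flow
successor; a colouring pass would only add node attributes:

	#include <stdio.h>

	struct edge { int from, to; };

	static void dump_dot(const struct edge *edges, int n)
	{
		int i;

		printf("digraph cfg {\n");
		for (i = 0; i < n; i++)
			printf("\tbb%d -> bb%d;\n", edges[i].from, edges[i].to);
		printf("}\n");
	}

	int main(void)
	{
		/* hypothetical CFG: a diamond of four basic blocks */
		const struct edge e[] = { {0, 1}, {0, 2}, {1, 3}, {2, 3} };

		dump_dot(e, 4);
		return 0;
	}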

Regards,
Jiong



Re: [RFC bpf-next] bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info

2018-01-15 Thread Jiong Wang

On Fri, 12 Jan 2018 12:31:15 +0100, Daniel Borkmann wrote:

What kind of string would sit in jited_arch_name for nfp?


The name is purely to let libbfd know which bfd backend to choose for the
disassembler.


Given you know
the {ifindex, netns_dev, netns_ino} 3‑tuple, isn't it possible to infer
the driver info from ethtool (e.g. nfp_net_get_drvinfo()) already and do
the mapping for binutils?

Right, the inference can certainly work.  Probably by matching the PCI
ID of the device?  Or we can just assume it's the NFP since there is
only one upstream BPF offload :)


I checked the ethtool source code and I am thinking (and have tried) that we
could just call the existing "ifindex_to_name_ns" in bpftool, which returns
the device name, then use ioctl/SIOCETHTOOL to get the "struct ethtool_drvinfo"
associated with the device, and finally map the driver name to the bfd name.
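
A minimal userspace sketch of that lookup (ifname_to_driver is an illustrative
helper name of my own; ETHTOOL_GDRVINFO and struct ethtool_drvinfo are the
standard ethtool ioctl interface):

	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <sys/socket.h>
	#include <net/if.h>
	#include <linux/ethtool.h>
	#include <linux/sockios.h>

	static int ifname_to_driver(const char *ifname, char *drv, size_t len)
	{
		struct ethtool_drvinfo info = { .cmd = ETHTOOL_GDRVINFO };
		struct ifreq ifr = {};
		int fd, err;

		fd = socket(AF_INET, SOCK_DGRAM, 0);
		if (fd < 0)
			return -1;
		strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
		ifr.ifr_data = (void *)&info;
		err = ioctl(fd, SIOCETHTOOL, &ifr);
		close(fd);
		if (err)
			return -1;
		/* e.g. "nfp" would then be mapped to the matching bfd arch */
		snprintf(drv, len, "%s", info.driver);
		return 0;
	}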

I will send out new patches following this approach shortly; please let me
know if this is not good.




Given we know when ifindex is 0, then its always host JITed and the
current approach with bfd_openr() works fine, and if ifindex is non-0
it is /always/ device offloaded to the one bound in the ifindex so
JIT dump will be specific to that device only and never host one. Not
at all opposed to extending uapi, just a question on my side to get a
better understanding first wrt arch string (maybe rough but complete
patch with nfp + bpftool bits might help too).

I was advocating for full separate set of fields, Jiong was trying for
a reuse, and we sort of met in the middle :)  Depends on how one looks
at the definition of the jited_prog_insn field, old user space is not
guaranteed to pay attention to ifindex.  We will drop the arch and reuse
host fields if that's the direction you're leaning in.


I also want to make sure that adding the new "jited_image" and "jited_len"
fields is the correct approach.

I was thinking of reusing existing fields for the host JIT, i.e.
bpf_prog->jited_len and bpf_prog->aux->jited_data, but different offload
targets might have different structures for "jited_data", so we would need a
new callback to parse it. Also, I feel that keeping the host fields clean for
the host JIT only might help prevent the code logic for host and offload from
interleaving with each other, so I had gone with adding new fields.

--
Regards,
Jiong