Re: [PATCH v2 bpf-next 08/13] bpf: Add instructions for atomic_[cmp]xchg

2020-12-01 Thread Brendan Jackman
On Sat, Nov 28, 2020 at 05:27:48PM -0800, Alexei Starovoitov wrote:
> On Fri, Nov 27, 2020 at 05:57:33PM +0000, Brendan Jackman wrote:
> >  
> >  /* atomic op type fields (stored in immediate) */
> > -#define BPF_FETCH    0x01    /* fetch previous value into src reg */
> > +#define BPF_XCHG     (0xe0 | BPF_FETCH)  /* atomic exchange */
> > +#define BPF_CMPXCHG  (0xf0 | BPF_FETCH)  /* atomic compare-and-write */
> > +#define BPF_FETCH    0x01    /* fetch previous value into src reg or r0*/
> 
> I think such comment is more confusing than helpful.
> I'd just say that the fetch bit is not valid on its own.
> It's used to build other instructions like cmpxchg and atomic_fetch_add.

OK sounds good.

> > +   } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
> > +  insn->imm == (BPF_CMPXCHG)) {
> 
> redundant ().

Ack, thanks

> > +   verbose(cbs->private_data, "(%02x) r0 = atomic%s_cmpxchg(*(%s *)(r%d %+d), r0, r%d)\n",
> > +   insn->code,
> > +   BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
> > +   bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
> > +   insn->dst_reg, insn->off,
> > +   insn->src_reg);
> > +   } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
> > +  insn->imm == (BPF_XCHG)) {
> 
> redundant ().

Ack, thanks
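
For readers following the encoding discussion, here is a minimal user-space sketch (not part of the patch) of how the new immediates are composed from the fetch bit and which values the verifier switch in this series accepts. The value of BPF_ADD (0x00) is taken from the existing uapi headers rather than from this patch, and atomic_imm_is_valid() is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define BPF_ADD     0x00                /* assumed: value from the existing uapi headers */
#define BPF_FETCH   0x01                /* modifier bit, used to build other ops */
#define BPF_XCHG    (0xe0 | BPF_FETCH)  /* atomic exchange */
#define BPF_CMPXCHG (0xf0 | BPF_FETCH)  /* atomic compare-and-exchange */

/* Both new opcodes always carry the fetch bit. */
static_assert(BPF_XCHG == 0xe1, "XCHG includes BPF_FETCH");
static_assert(BPF_CMPXCHG == 0xf1, "CMPXCHG includes BPF_FETCH");

/* Hypothetical helper mirroring the opcode switch quoted from check_atomic(). */
static bool atomic_imm_is_valid(int imm)
{
    switch (imm) {
    case BPF_ADD:
    case BPF_ADD | BPF_FETCH:
    case BPF_XCHG:
    case BPF_CMPXCHG:
        return true;
    default:
        /* e.g. 0xe0 or 0xf0: XCHG/CMPXCHG without the fetch bit */
        return false;
    }
}
```

Under these assumptions, 0xe0 and 0xf0 (the exchange opcodes without BPF_FETCH) are rejected, which matches the commit message's statement that those variants are not provided.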


Re: [PATCH v2 bpf-next 08/13] bpf: Add instructions for atomic_[cmp]xchg

2020-12-01 Thread Brendan Jackman
On Fri, Nov 27, 2020 at 09:25:53PM -0800, Yonghong Song wrote:
> 
> 
> On 11/27/20 9:57 AM, Brendan Jackman wrote:
> > This adds two atomic opcodes, both of which include the BPF_FETCH
> > flag. XCHG without the BPF_FETCh flag would naturally encode
> 
> BPF_FETCH

Ack, thanks

> > atomic_set. This is not supported because it would be of limited
> > value to userspace (it doesn't imply any barriers). CMPXCHG without
> > BPF_FETCH would be an atomic compare-and-write. We don't have such
> > an operation in the kernel so it isn't provided to BPF either.
> > 
> > There are two significant design decisions made for the CMPXCHG
> > instruction:
> > 
> >   - To solve the issue that this operation fundamentally has 3
> >     operands, but we only have two register fields. Therefore the
> >     operand we compare against (the kernel's API calls it 'old') is
> >     hard-coded to be R0. x86 has similar design (and A64 doesn't
> >     have this problem).
> >
> >     A potential alternative might be to encode the other operand's
> >     register number in the immediate field.
> >
> >   - The kernel's atomic_cmpxchg returns the old value, while the C11
> >     userspace APIs return a boolean indicating the comparison
> >     result. Which should BPF do? A64 returns the old value. x86 returns
> >     the old value in the hard-coded register (and also sets a
> >     flag). That means return-old-value is easier to JIT.
> > 
> > Signed-off-by: Brendan Jackman 
> > ---
> >   arch/x86/net/bpf_jit_comp.c|  8 
> >   include/linux/filter.h | 20 
> >   include/uapi/linux/bpf.h   |  4 +++-
> >   kernel/bpf/core.c  | 20 
> >   kernel/bpf/disasm.c| 15 +++
> >   kernel/bpf/verifier.c  | 19 +--
> >   tools/include/linux/filter.h   | 20 
> >   tools/include/uapi/linux/bpf.h |  4 +++-
> >   8 files changed, 106 insertions(+), 4 deletions(-)
> > 
> [...]
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index cd4c03b25573..c8311cc114ec 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3601,10 +3601,13 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
> >   static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
> >   {
> > int err;
> > +   int load_reg;
> > switch (insn->imm) {
> > case BPF_ADD:
> > case BPF_ADD | BPF_FETCH:
> > +   case BPF_XCHG:
> > +   case BPF_CMPXCHG:
> > break;
> > default:
> > verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
> > @@ -3626,6 +3629,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
> > if (err)
> > return err;
> > +   if (insn->imm == BPF_CMPXCHG) {
> > +   /* check src3 operand */
> 
> better comment about what src3 means here?

Ack,  adding "Check comparison of R0 with memory location"


Re: [PATCH v2 bpf-next 08/13] bpf: Add instructions for atomic_[cmp]xchg

2020-11-28 Thread Alexei Starovoitov
On Fri, Nov 27, 2020 at 05:57:33PM +0000, Brendan Jackman wrote:
>  
>  /* atomic op type fields (stored in immediate) */
> -#define BPF_FETCH    0x01    /* fetch previous value into src reg */
> +#define BPF_XCHG     (0xe0 | BPF_FETCH)  /* atomic exchange */
> +#define BPF_CMPXCHG  (0xf0 | BPF_FETCH)  /* atomic compare-and-write */
> +#define BPF_FETCH    0x01    /* fetch previous value into src reg or r0*/

I think such comment is more confusing than helpful.
I'd just say that the fetch bit is not valid on its own.
It's used to build other instructions like cmpxchg and atomic_fetch_add.

> + } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
> +insn->imm == (BPF_CMPXCHG)) {

redundant ().

> + verbose(cbs->private_data, "(%02x) r0 = atomic%s_cmpxchg(*(%s *)(r%d %+d), r0, r%d)\n",
> + insn->code,
> + BPF_SIZE(insn->code) == BPF_DW ? "64" : "",
> + bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
> + insn->dst_reg, insn->off,
> + insn->src_reg);
> + } else if (BPF_MODE(insn->code) == BPF_ATOMIC &&
> +insn->imm == (BPF_XCHG)) {

redundant ().
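
To make the disassembly format quoted above concrete, here is a minimal user-space sketch (not the kernel's disasm.c) that applies the same format string with snprintf. The opcode constants and the size-name table are reproduced from the uapi headers and disasm.c as assumptions for this sketch, and format_cmpxchg() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, taken from the uapi headers. */
#define BPF_STX        0x03
#define BPF_W          0x00
#define BPF_H          0x08
#define BPF_B          0x10
#define BPF_DW         0x18
#define BPF_ATOMIC     0xc0
#define BPF_SIZE(code) ((code) & 0x18)

/* Size names indexed by the size field shifted down by 3 bits. */
static const char *const bpf_ldst_string[] = {
    [BPF_W >> 3]  = "u32",
    [BPF_H >> 3]  = "u16",
    [BPF_B >> 3]  = "u8",
    [BPF_DW >> 3] = "u64",
};

/* Hypothetical helper: format a CMPXCHG insn as the quoted verbose() call would. */
static int format_cmpxchg(char *buf, size_t len, unsigned int code,
                          int dst_reg, int off, int src_reg)
{
    return snprintf(buf, len,
                    "(%02x) r0 = atomic%s_cmpxchg(*(%s *)(r%d %+d), r0, r%d)\n",
                    code,
                    BPF_SIZE(code) == BPF_DW ? "64" : "",
                    bpf_ldst_string[BPF_SIZE(code) >> 3],
                    dst_reg, off, src_reg);
}
```

With these assumed constants, a 64-bit CMPXCHG with dst_reg 1, offset -8 and src_reg 2 formats as "(db) r0 = atomic64_cmpxchg(*(u64 *)(r1 -8), r0, r2)".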


Re: [PATCH v2 bpf-next 08/13] bpf: Add instructions for atomic_[cmp]xchg

2020-11-28 Thread Yonghong Song




On 11/27/20 9:57 AM, Brendan Jackman wrote:
> This adds two atomic opcodes, both of which include the BPF_FETCH
> flag. XCHG without the BPF_FETCh flag would naturally encode

BPF_FETCH

> atomic_set. This is not supported because it would be of limited
> value to userspace (it doesn't imply any barriers). CMPXCHG without
> BPF_FETCH would be an atomic compare-and-write. We don't have such
> an operation in the kernel so it isn't provided to BPF either.
>
> There are two significant design decisions made for the CMPXCHG
> instruction:
>
>   - To solve the issue that this operation fundamentally has 3
>     operands, but we only have two register fields. Therefore the
>     operand we compare against (the kernel's API calls it 'old') is
>     hard-coded to be R0. x86 has similar design (and A64 doesn't
>     have this problem).
>
>     A potential alternative might be to encode the other operand's
>     register number in the immediate field.
>
>   - The kernel's atomic_cmpxchg returns the old value, while the C11
>     userspace APIs return a boolean indicating the comparison
>     result. Which should BPF do? A64 returns the old value. x86 returns
>     the old value in the hard-coded register (and also sets a
>     flag). That means return-old-value is easier to JIT.
>
> Signed-off-by: Brendan Jackman
> ---
>   arch/x86/net/bpf_jit_comp.c|  8
>   include/linux/filter.h | 20
>   include/uapi/linux/bpf.h   |  4 +++-
>   kernel/bpf/core.c  | 20
>   kernel/bpf/disasm.c| 15 +++
>   kernel/bpf/verifier.c  | 19 +--
>   tools/include/linux/filter.h   | 20
>   tools/include/uapi/linux/bpf.h |  4 +++-
>   8 files changed, 106 insertions(+), 4 deletions(-)
>

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index cd4c03b25573..c8311cc114ec 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3601,10 +3601,13 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>   static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
>   {
>      int err;
> +    int load_reg;
>
>      switch (insn->imm) {
>      case BPF_ADD:
>      case BPF_ADD | BPF_FETCH:
> +    case BPF_XCHG:
> +    case BPF_CMPXCHG:
>          break;
>      default:
>          verbose(env, "BPF_ATOMIC uses invalid atomic opcode %02x\n", insn->imm);
> @@ -3626,6 +3629,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
>      if (err)
>          return err;
>
> +    if (insn->imm == BPF_CMPXCHG) {
> +        /* check src3 operand */

better comment about what src3 means here?

> +        err = check_reg_arg(env, BPF_REG_0, SRC_OP);
> +        if (err)
> +            return err;
> +    }
> +
>      if (is_pointer_value(env, insn->src_reg)) {
>          verbose(env, "R%d leaks addr into mem\n", insn->src_reg);
>          return -EACCES;
> @@ -3656,8 +3666,13 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
>      if (!(insn->imm & BPF_FETCH))
>          return 0;
>
> -    /* check and record load of old value into src reg  */
> -    err = check_reg_arg(env, insn->src_reg, DST_OP);
> +    if (insn->imm == BPF_CMPXCHG)
> +        load_reg = BPF_REG_0;
> +    else
> +        load_reg = insn->src_reg;
> +
> +    /* check and record load of old value */
> +    err = check_reg_arg(env, load_reg, DST_OP);
>      if (err)
>          return err;
  

[...]
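
The register roles checked in the hunks above can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel interpreter: it uses C11 atomics, handles only 64-bit operands, and the names fetch_dest_reg() and model_atomic64() are hypothetical. It shows why the verifier treats R0 as both a source and a destination for CMPXCHG, while XCHG loads the old value into the source register.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define BPF_FETCH   0x01
#define BPF_XCHG    (0xe0 | BPF_FETCH)
#define BPF_CMPXCHG (0xf0 | BPF_FETCH)
#define BPF_REG_0   0

/* Hypothetical helper mirroring the load_reg selection in check_atomic(). */
static int fetch_dest_reg(int imm, int src_reg)
{
    return imm == BPF_CMPXCHG ? BPF_REG_0 : src_reg;
}

/*
 * Hypothetical model of the two new 64-bit ops acting on a register file.
 * Returns 0 on success, -1 for any opcode this sketch does not model.
 */
static int model_atomic64(uint64_t regs[], _Atomic uint64_t *mem,
                          int imm, int src_reg)
{
    uint64_t old;

    switch (imm) {
    case BPF_XCHG:
        /* store src_reg, remember what was in memory */
        old = atomic_exchange(mem, regs[src_reg]);
        break;
    case BPF_CMPXCHG:
        /* R0 is read as the value to compare against */
        old = regs[BPF_REG_0];
        /* on mismatch, C11 overwrites 'old' with the value found */
        (void)atomic_compare_exchange_strong(mem, &old, regs[src_reg]);
        break;
    default:
        return -1;
    }

    /* the old memory value lands in R0 (CMPXCHG) or src_reg (XCHG) */
    regs[fetch_dest_reg(imm, src_reg)] = old;
    return 0;
}
```

In this model a failed CMPXCHG leaves memory untouched and still overwrites R0 with the value found, so a caller can detect failure by comparing R0 before and after the instruction.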


[PATCH v2 bpf-next 08/13] bpf: Add instructions for atomic_[cmp]xchg

2020-11-27 Thread Brendan Jackman
This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCh flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH would be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.

There are two significant design decisions made for the CMPXCHG
instruction:

 - To solve the issue that this operation fundamentally has 3
   operands, but we only have two register fields. Therefore the
   operand we compare against (the kernel's API calls it 'old') is
   hard-coded to be R0. x86 has similar design (and A64 doesn't
   have this problem).

   A potential alternative might be to encode the other operand's
   register number in the immediate field.

 - The kernel's atomic_cmpxchg returns the old value, while the C11
   userspace APIs return a boolean indicating the comparison
   result. Which should BPF do? A64 returns the old value. x86 returns
   the old value in the hard-coded register (and also sets a
   flag). That means return-old-value is easier to JIT.

Signed-off-by: Brendan Jackman 
---
 arch/x86/net/bpf_jit_comp.c|  8 
 include/linux/filter.h | 20 
 include/uapi/linux/bpf.h   |  4 +++-
 kernel/bpf/core.c  | 20 
 kernel/bpf/disasm.c| 15 +++
 kernel/bpf/verifier.c  | 19 +--
 tools/include/linux/filter.h   | 20 
 tools/include/uapi/linux/bpf.h |  4 +++-
 8 files changed, 106 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index d3cd45bcd0c1..7431b2937157 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -831,6 +831,14 @@ static int emit_atomic(u8 **pprog, u8 atomic_op,
/* src_reg = atomic_fetch_add(*(dst_reg + off), src_reg); */
EMIT2(0x0F, 0xC1);
break;
+   case BPF_XCHG:
+   /* src_reg = atomic_xchg(*(u32/u64*)(dst_reg + off), src_reg); */
+   EMIT1(0x87);
+   break;
+   case BPF_CMPXCHG:
+   /* r0 = atomic_cmpxchg(*(u32/u64*)(dst_reg + off), r0, src_reg); */
+   EMIT2(0x0F, 0xB1);
+   break;
default:
pr_err("bpf_jit: unknown atomic opcode %02x\n", atomic_op);
return -EFAULT;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 4e04d0fc454f..6186280715ed 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -280,6 +280,26 @@ static inline bool insn_is_zext(const struct bpf_insn *insn)
.off   = OFF,   \
.imm   = BPF_ADD | BPF_FETCH })
 
+/* Atomic exchange, src_reg = atomic_xchg((dst_reg + off), src_reg) */
+
+#define BPF_ATOMIC_XCHG(SIZE, DST, SRC, OFF)   \
+   ((struct bpf_insn) {\
+   .code  = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+   .dst_reg = DST, \
+   .src_reg = SRC, \
+   .off   = OFF,   \
+   .imm   = BPF_XCHG  })
+
+/* Atomic compare-exchange, r0 = atomic_cmpxchg((dst_reg + off), r0, src_reg) */
+
+#define BPF_ATOMIC_CMPXCHG(SIZE, DST, SRC, OFF)\
+   ((struct bpf_insn) {\
+   .code  = BPF_STX | BPF_SIZE(SIZE) | BPF_ATOMIC, \
+   .dst_reg = DST, \
+   .src_reg = SRC, \
+   .off   = OFF,   \
+   .imm   = BPF_CMPXCHG })
+
 /* Memory store, *(uint *) (dst_reg + off16) = imm32 */
 
 #define BPF_ST_MEM(SIZE, DST, OFF, IMM)\
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 025e377e7229..82039a1176ac 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -45,7 +45,9 @@
 #define BPF_EXIT     0x90    /* function return */
 
 /* atomic op type fields (stored in immediate) */
-#define BPF_FETCH    0x01    /* fetch previous value into src reg */
+#define BPF_XCHG     (0xe0 | BPF_FETCH)  /* atomic exchange */
+#define BPF_CMPXCHG  (0xf0 | BPF_FETCH)  /* atomic compare-and-write */
+#define BPF_FETCH    0x01    /* fetch previous value into src reg or r0*/
 
 /* Register numbers */
 enum {
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 49a2a533db60..05350a8f87c0 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1638,6 +1638,16 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
(u32) SRC,