Re: [PATCH net-next 2/3] arm64: bpf: optimize JMP_CALL

2016-06-06 Thread Z Lim
Hi Will,

On Mon, Jun 6, 2016 at 10:05 AM, Will Deacon  wrote:
> On Sat, Jun 04, 2016 at 03:00:29PM -0700, Zi Shen Lim wrote:
>> Remove superfluous stack frame, saving us 3 instructions for
>> every JMP_CALL.
>>
>> Signed-off-by: Zi Shen Lim 
>> ---
>>  arch/arm64/net/bpf_jit_comp.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>> index 51abc97..7ae304e 100644
>> --- a/arch/arm64/net/bpf_jit_comp.c
>> +++ b/arch/arm64/net/bpf_jit_comp.c
>> @@ -578,11 +578,8 @@ emit_cond_jmp:
>>   const u64 func = (u64)__bpf_call_base + imm;
>>
>>   emit_a64_mov_i64(tmp, func, ctx);
>> - emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
>> - emit(A64_MOV(1, A64_FP, A64_SP), ctx);
>>   emit(A64_BLR(tmp), ctx);
>>   emit(A64_MOV(1, r0, A64_R(0)), ctx);
>> - emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
>>   break;
>>   }
>
> Is the jitted code intended to be unwindable by standard tools?

Before this patch:
bpf_prologue => push stack frame
...
jmp_call => push stack frame, call bpf_helper*, pop stack frame
...
bpf_epilogue => pop stack frame, ret

Now:
bpf_prologue => push stack frame
...
jmp_call => call bpf_helper*
...
bpf_epilogue => pop stack frame, ret

*Note: bpf_helpers in kernel/bpf/helper.c

So yes, it's still unwindable.

>
> Will


Re: [PATCH net-next 1/3] arm64: bpf: implement bpf_tail_call() helper

2016-06-05 Thread Z Lim
Hi Daniel,

On Sun, Jun 5, 2016 at 12:53 AM, Daniel Borkmann  wrote:
> On 06/05/2016 01:46 AM, kbuild test robot wrote:
>>
>> Hi,
>>
>> [auto build test ERROR on net-next/master]
>>
>> url:
>> https://github.com/0day-ci/linux/commits/Zi-Shen-Lim/arm64-bpf-implement-bpf_tail_call-helper/20160605-060435
>> config: arm64-defconfig (attached as .config)
>> compiler: aarch64-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
>> reproduce:
>>  wget
>> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
>> -O ~/bin/make.cross
>>  chmod +x ~/bin/make.cross
>>  # save the attached .config to linux build tree
>>  make.cross ARCH=arm64
>>
>> All errors (new ones prefixed by >>):
>>
>> In file included from arch/arm64/net/bpf_jit_comp.c:21:0:
>> include/linux/bpf.h: In function 'bpf_prog_get':

 include/linux/bpf.h:235:9: error: implicit declaration of function
 'ERR_PTR' [-Werror=implicit-function-declaration]
>>
>>   return ERR_PTR(-EOPNOTSUPP);
>>  ^
>> include/linux/bpf.h:235:9: warning: return makes pointer from integer
>> without a cast [-Wint-conversion]
>> In file included from include/linux/rwsem.h:17:0,
>>  from include/linux/mm_types.h:10,
>>  from include/linux/sched.h:27,
>>  from arch/arm64/include/asm/compat.h:25,
>>  from arch/arm64/include/asm/stat.h:23,
>>  from include/linux/stat.h:5,
>>  from include/linux/compat.h:12,
>>  from include/linux/filter.h:10,
>>  from arch/arm64/net/bpf_jit_comp.c:22:
>> include/linux/err.h: At top level:

 include/linux/err.h:23:35: error: conflicting types for 'ERR_PTR'
>>
>>  static inline void * __must_check ERR_PTR(long error)
>>^
>> In file included from arch/arm64/net/bpf_jit_comp.c:21:0:
>> include/linux/bpf.h:235:9: note: previous implicit declaration of
>> 'ERR_PTR' was here
>>   return ERR_PTR(-EOPNOTSUPP);
>>  ^
>> cc1: some warnings being treated as errors
>
>
> Looks like including linux/bpf.h at the very beginning causes issues when
> bpf
> syscall is disabled. We should probably just include linux/err.h from bpf.h.

How about the attached patch? Fixes compilation error on build
!CONFIG_BPF_SYSCALL.

Also, should this patch be sent to net or net-next (along with this series)?

Thanks,
z
From 0633e3e528e11b09691fbf533ba7fdaf4c52f772 Mon Sep 17 00:00:00 2001
From: Zi Shen Lim 
Date: Sun, 5 Jun 2016 21:43:14 -0700
Subject: [PATCH] bpf: fix missing header inclusion

Commit 0fc174dea545 ("ebpf: make internal bpf API independent of
CONFIG_BPF_SYSCALL ifdefs") introduced usage of ERR_PTR() in
bpf_prog_get(), however did not include linux/err.h.

Without this patch, when compiling arm64 BPF without CONFIG_BPF_SYSCALL:
...
In file included from arch/arm64/net/bpf_jit_comp.c:21:0:
include/linux/bpf.h: In function 'bpf_prog_get':
include/linux/bpf.h:235:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration]
  return ERR_PTR(-EOPNOTSUPP);
 ^
include/linux/bpf.h:235:9: warning: return makes pointer from integer without a cast [-Wint-conversion]
In file included from include/linux/rwsem.h:17:0,
 from include/linux/mm_types.h:10,
 from include/linux/sched.h:27,
 from arch/arm64/include/asm/compat.h:25,
 from arch/arm64/include/asm/stat.h:23,
 from include/linux/stat.h:5,
 from include/linux/compat.h:12,
 from include/linux/filter.h:10,
 from arch/arm64/net/bpf_jit_comp.c:22:
include/linux/err.h: At top level:
include/linux/err.h:23:35: error: conflicting types for 'ERR_PTR'
 static inline void * __must_check ERR_PTR(long error)
   ^
In file included from arch/arm64/net/bpf_jit_comp.c:21:0:
include/linux/bpf.h:235:9: note: previous implicit declaration of 'ERR_PTR' was here
  return ERR_PTR(-EOPNOTSUPP);
 ^
...

Fixes: 0fc174dea545 ("ebpf: make internal bpf API independent of CONFIG_BPF_SYSCALL ifdefs")
Suggested-by: Daniel Borkmann 
Signed-off-by: Zi Shen Lim 
---
 include/linux/bpf.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8ee27b8..1bcae82 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct bpf_map;
 
-- 
1.9.1



Re: [PATCH net-next] bpf: arm64: remove callee-save registers use for tmp registers

2016-05-16 Thread Z Lim
Hi Yang,

On Mon, May 16, 2016 at 4:09 PM, Yang Shi  wrote:
> In the current implementation of ARM64 eBPF JIT, R23 and R24 are used for
> tmp registers, which are callee-saved registers. This leads to variable size
> of JIT prologue and epilogue. The latest blinding constant change prefers to
> constant size of prologue and epilogue. AAPCS reserves R9 ~ R15 for temp
> registers which not need to be saved/restored during function call. So, 
> replace
> R23 and R24 to R10 and R11, and remove tmp_used flag.
>
> CC: Zi Shen Lim 
> CC: Daniel Borkmann 
> Signed-off-by: Yang Shi 
> ---

Couple suggestions, but otherwise:
Acked-by: Zi Shen Lim 

1. Update the diagram. I think it should now be:

-* BPF fp register => -80:+-+ <= (BPF_FP)
+* BPF fp register => -64:+-+ <= (BPF_FP)

2. Add a comment in commit log along the lines of: this is an
optimization saving 2 instructions per jited BPF program.

Thanks :)

z

> Apply on top of Daniel's blinding constant patchset.
>
>  arch/arm64/net/bpf_jit_comp.c | 32 
>  1 file changed, 4 insertions(+), 28 deletions(-)
>


Re: [PATCH net-next 2/5] bpf: move clearing of A/X into classic to eBPF migration prologue

2015-12-17 Thread Z Lim
On Thu, Dec 17, 2015 at 2:51 PM, Daniel Borkmann  wrote:
> Back in the days where eBPF (or back then "internal BPF" ;->) was not
> exposed to user space, and only the classic BPF programs internally
> translated into eBPF programs, we missed the fact that for classic BPF
> A and X needed to be cleared. It was fixed back then via 83d5b7ef99c9
> ("net: filter: initialize A and X registers"), and thus classic BPF
> specifics were added to the eBPF interpreter core to work around it.
>
> This added some confusion for JIT developers later on that take the
> eBPF interpreter code as an example for deriving their JIT. F.e. in
> f75298f5c3fe ("s390/bpf: clear correct BPF accumulator register"), at
> least X could leak stack memory. Furthermore, since this is only needed
> for classic BPF translations and not for eBPF (verifier takes care
> that read access to regs cannot be done uninitialized), more complexity
> is added to JITs as they need to determine whether they deal with
> migrations or native eBPF where they can just omit clearing A/X in
> their prologue and thus reduce image size a bit, see f.e. cde66c2d88da
> ("s390/bpf: Only clear A and X for converted BPF programs"). In other
> cases (x86, arm64), A and X is being cleared in the prologue also for
> eBPF case, which is unnecessary.
>
> Lets move this into the BPF migration in bpf_convert_filter() where it
> actually belongs as long as the number of eBPF JITs are still few. It
> can thus be done generically; allowing us to remove the quirk from
> __bpf_prog_run() and to slightly reduce JIT image size in case of eBPF,
> while reducing code duplication on this matter in current(/future) eBPF
> JITs.
>
> Signed-off-by: Daniel Borkmann 
> Acked-by: Alexei Starovoitov 
> Reviewed-by: Michael Holzheu 
> Tested-by: Michael Holzheu 
> Cc: Zi Shen Lim 
> Cc: Yang Shi 
> ---
>  arch/arm64/net/bpf_jit_comp.c |  6 --

For the arm64 bits:
Acked-by: Zi Shen Lim 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64: bpf: fix buffer pointer

2015-11-18 Thread Z Lim
On Wed, Nov 18, 2015 at 1:07 PM, Shi, Yang  wrote:
> On 11/18/2015 12:56 AM, Zi Shen Lim wrote:
>> emit_a64_mov_i64(r3, size, ctx);
>> -   emit(A64_ADD_I(1, r4, fp, MAX_BPF_STACK), ctx);
>> +   emit(A64_SUB_I(1, r4, fp, STACK_SIZE), ctx);
>
>
> Should not it sub MAX_BPF_STACK?

No, if it's at (BPF_FP - MAX_BPF_STACK), we'll be writing into the BPF
stack area, which should only be used by the BPF program.

> If you sub STACK_SIZE here, the buffer pointer will point to bottom of the
> reserved area.

Yes, that's the idea. The buffer is allocated in here. Right now we're
using this "reserved" space for this buffer only.

>
> You stack layout change also shows this:
>
> +*+-+ <= (BPF_FP - MAX_BPF_STACK)
> +*|RSVD | JIT scratchpad
> +* current A64_SP =>  +-+ <= (BPF_FP - STACK_SIZE)

Yes, this diagram reflects the code and intention.


Thanks for reviewing, we definitely need more of these :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-16 Thread Z Lim
On Mon, Nov 16, 2015 at 2:35 PM, Yang Shi  wrote:
> Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
> in prologue in order to get the correct stack backtrace.
>
> However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
> change during function call so it may cause the BPF prog stack base address
> change too.
>
> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
> saved register, so it will keep intact during function call.
> It is initialized in BPF prog prologue when BPF prog is started to run
> everytime. Save and restore x25/x26 in BPF prologue and epilogue to keep
> them intact for the outside of BPF. Actually, x26 is unnecessary, but SP
> requires 16 bytes alignment.
>
> So, the BPF stack layout looks like:
>
>  high
>  original A64_SP =>   0:+-+ BPF prologue
> |FP/LR|
>  current A64_FP =>  -16:+-+
> | ... | callee saved registers
> +-+
> | | x25/x26
>  BPF fp register => -80:+-+
> | |
> | ... | BPF prog stack
> | |
> | |
>  current A64_SP =>  +-+
> | |
> | ... | Function call stack
> | |
> +-+
>   low
>
> CC: Zi Shen Lim 
> CC: Xi Wang 
> Signed-off-by: Yang Shi 

Acked-by: Zi Shen Lim 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-13 Thread Z Lim
Yang, I noticed another thing...

On Fri, Nov 13, 2015 at 10:09 AM, Yang Shi  wrote:
> Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
> in prologue in order to get the correct stack backtrace.
>
> However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
> change during function call so it may cause the BPF prog stack base address
> change too.
>
> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
> saved register, so it will keep intact during function call.

Can you please add save/restore for x25 also? :)

> It is initialized in BPF prog prologue when BPF prog is started to run
> everytime. When BPF prog exits, it could be just tossed.
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm64: bpf: fix JIT frame pointer setup

2015-11-12 Thread Z Lim
On Thu, Nov 12, 2015 at 1:57 PM, Yang Shi  wrote:
> BPF fp should point to the top of the BPF prog stack. The original
> implementation made it point to the bottom incorrectly.
> Move A64_SP to fp before reserve BPF prog stack space.
>
> CC: Zi Shen Lim 
> CC: Xi Wang 
> Signed-off-by: Yang Shi 
> ---

Reviewed-by: Zi Shen Lim 

Also,

Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
Cc:  # 3.18+
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-12 Thread Z Lim
On Thu, Nov 12, 2015 at 1:57 PM, Yang Shi  wrote:
>
> Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
> in prologue in order to get the correct stack backtrace.
>
> However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
> change during function call so it may cause the BPF prog stack base address
> change too.
>
> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
> saved register, so it will keep intact during function call.
> It is initialized in BPF prog prologue when BPF prog is started to run
> everytime. When BPF prog exits, it could be just tossed.
>
> So, the BPF stack layout looks like:
>
>  high
>  original A64_SP =>   0:+-+ BPF prologue
> | | FP/LR and callee saved registers
>  BPF fp register => -64:+-+
> | |
> | ... | BPF prog stack
> | |
> | |
>  current A64_SP/FP =>   +-+
> | |
> | ... | Function call stack
> | |
> +-+
>   low
>

Yang, for stack unwinding to work, shouldn't it be something like the following?

  | LR |
A64_FP => | FP |
  | .. |
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm64: bpf: add 'store immediate' instruction

2015-11-12 Thread Z Lim
On Thu, Nov 12, 2015 at 11:33 AM, Shi, Yang  wrote:
> On 11/11/2015 4:39 AM, Will Deacon wrote:
>>
>> Wait a second, we're both talking rubbish here :) The STR (immediate)
>> form is referring to the addressing mode, whereas this patch wants to
>> store an immediate value to memory, which does need moving to a register
>> first.
>
>
> Yes, the immediate means immediate offset for addressing index. Doesn't mean
> to store immediate to memory.
>
> I don't think any load-store architecture has store immediate instruction.
>

Indeed. Sorry for the noise.

Somehow Will caught a whiff of whatever I was smoking then :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64: bpf: fix JIT stack setup

2015-11-10 Thread Z Lim
On Tue, Nov 10, 2015 at 11:46 AM, Shi, Yang <yang@linaro.org> wrote:
> On 11/9/2015 12:00 PM, Z Lim wrote:
>>
>> How about splitting this into two patches? One for the BPF-related
>> bug, and another for A64 FP-handling.
>
> I'm not sure if this is a good approach or not. IMHO, they are kind of
> atomic. Without A64 FP-handling, that fix looks incomplete and introduces
> another problem (stack backtrace).
>

The first, even on its own, doesn't make things worse, only better.
The second, which we agree needs to be fixed also, addresses a different issue.

Either way, please also note that these patches fix the original
implementation. We do want -stable to pick these up.

Suggestions for the diagram:
- As an enhancement, would you mind showing the A64_FP also?
- Please revisit "+64:"
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64: bpf: fix JIT stack setup

2015-11-09 Thread Z Lim
On Mon, Nov 9, 2015 at 10:08 AM, Shi, Yang  wrote:
> I added it to stay align with ARMv8 AAPCS to maintain the correct FP during
> function call. It makes us get correct stack backtrace.
>
> I think we'd better to keep compliant with ARMv8 AAPCS in BPF JIT prologue
> too.
>
> If nobody thinks it is necessary, we definitely could remove that change.

Oh no, I don't think anyone will say it's unnecessary!
I agree the A64_FP-related change is a good idea, so stack unwinding works.

How about splitting this into two patches? One for the BPF-related
bug, and another for A64 FP-handling.

Thanks again for tracking this down and improving things overall for arm64 :)

>
> Thanks,
> Yang
>
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64: bpf: fix JIT stack setup

2015-11-08 Thread Z Lim
On Sat, Nov 7, 2015 at 6:27 PM, Alexei Starovoitov
 wrote:
> On Fri, Nov 06, 2015 at 09:36:17PM -0800, Yang Shi wrote:
>> ARM64 JIT used FP (x29) as eBPF fp register, but FP is subjected to
>> change during function call so it may cause the BPF prog stack base address
>> change too. Whenever, it pointed to the bottom of BPF prog stack instead of
>> the top.
>>
>> So, when copying data via bpf_probe_read, it will be copied to (SP - offset),
>> then it may overwrite the saved FP/LR.
>>
>> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
>> saved register, so it will keep intact during function call.
>> It is initialized in BPF prog prologue when BPF prog is started to run
>> everytime. When BPF prog exits, it could be just tossed.
>>
>> Other than this the BPf prog stack base need to be setup before function
>> call stack.
>>
>> So, the BPF stack layout looks like:
>>
>>  high
>>  original A64_SP =>   0:+-+ BPF prologue
>> | | FP/LR and callee saved registers
>>  BPF fp register => +64:+-+
>> | |
>> | ... | BPF prog stack
>> | |
>> | |
>>  current A64_SP =>  +-+
>> | |
>> | ... | Function call stack
>> | |
>> +-+
>>   low
>>
>> Signed-off-by: Yang Shi 
>> CC: Zi Shen Lim 
>> CC: Xi Wang 
>
> Thanks for tracking it down.
> That looks like fundamental bug in arm64 jit. I'm surprised function calls 
> worked at all.
> Zi please review.
>

For function calls (BPF_JMP | BPF_CALL), we are compliant with AAPCS64
[1]. That part is okay.


bpf_probe_read accesses the BPF program stack, which is based on BPF_REG_FP.

This exposes an issue with how BPF_REG_FP was setup, as Yang pointed out.
Instead of having BPF_REG_FP point to top of stack, we erroneously
point it to the bottom of stack. When there are function calls, we run
the risk of clobbering of BPF stack. Bad idea.

Otherwise, since BPF_REG_FP is read-only, and is setup exactly once in
prologue, it remains consistent throughout lifetime of the BPF
program.


Yang, can you please try the following?

8<-
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -161,12 +161,12 @@ static void build_prologue(struct jit_ctx *ctx)
if (ctx->tmp_used)
emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);

-   /* Set up BPF stack */
-   emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
-
/* Set up frame pointer */
emit(A64_MOV(1, fp, A64_SP), ctx);

+   /* Set up BPF stack */
+   emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
+
/* Clear registers A and X */
emit_a64_mov_i64(ra, 0, ctx);
emit_a64_mov_i64(rx, 0, ctx);
->8

[1] 
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] bpf: add mod default A and X test cases

2015-11-04 Thread Z Lim
On Wed, Nov 4, 2015 at 11:36 AM, Yang Shi  wrote:
> When running "mod X" operation, if X is 0 the filter has to be halt.
> Add new test cases to cover A = A mod X if X is 0, and A = A mod 1.
>
> CC: Xi Wang 
> CC: Zi Shen Lim 
> Signed-off-by: Yang Shi 
> ---

Acked-by: Zi Shen Lim 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html