Re: [PATCH v2] arm: eBPF JIT compiler
Okay Kees. I will take a look at it. Best, Shubham Bansal On Fri, Jul 7, 2017 at 10:12 AM, Kees Cookwrote: > On Wed, Jul 5, 2017 at 8:49 PM, Shubham Bansal > wrote: >> Hi Kees, >> >> Problem is my ARM machine don't have clang and iproute2 which is >> keeping me from testing the bpf tail calls. >> >> You should do the following to test it,. >> >> 1. tools/testing/selftests/bpf/ >> 2. make >> 3. sudo ./test_progs >> >> And, before testing, you have to do "make headers_install". >> These tests are for tail calls with the attached patch. If its too >> much work, Can you please upload your arm image so that I can test it? >> I just need a good machine. > > I've got all this set up now, and it faults during the test: > > Unable to handle kernel NULL pointer dereference at virtual address 0008 > ... > CPU: 0 PID: 1922 Comm: test_progs Not tainted 4.12.0+ #60 > ... > PC is at __htab_map_lookup_elem+0x54/0x1f4 > > I'll see if I can send you this disk image... > > -Kees > > > -- > Kees Cook > Pixel Security
Re: [PATCH v2] arm: eBPF JIT compiler
On Wed, Jul 5, 2017 at 8:49 PM, Shubham Bansalwrote: > Hi Kees, > > Problem is my ARM machine don't have clang and iproute2 which is > keeping me from testing the bpf tail calls. > > You should do the following to test it,. > > 1. tools/testing/selftests/bpf/ > 2. make > 3. sudo ./test_progs > > And, before testing, you have to do "make headers_install". > These tests are for tail calls with the attached patch. If its too > much work, Can you please upload your arm image so that I can test it? > I just need a good machine. I've got all this set up now, and it faults during the test: Unable to handle kernel NULL pointer dereference at virtual address 0008 ... CPU: 0 PID: 1922 Comm: test_progs Not tainted 4.12.0+ #60 ... PC is at __htab_map_lookup_elem+0x54/0x1f4 I'll see if I can send you this disk image... -Kees -- Kees Cook Pixel Security
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Kees, Problem is my ARM machine don't have clang and iproute2 which is keeping me from testing the bpf tail calls. You should do the following to test it,. 1. tools/testing/selftests/bpf/ 2. make 3. sudo ./test_progs And, before testing, you have to do "make headers_install". These tests are for tail calls with the attached patch. If its too much work, Can you please upload your arm image so that I can test it? I just need a good machine. -Shubham 0001-Added-Support-for-BPF_CALL-BPF_JMP.patch Description: Binary data
Re: [PATCH v2] arm: eBPF JIT compiler
On Wed, Jul 5, 2017 at 3:11 PM, Kees Cookwrote: > On Fri, Jun 23, 2017 at 3:39 PM, Shubham Bansal > wrote: >> Hi Russell,Daniel and Kees, >> >> I am attaching the latest patch with this mail. It included support >> for BPF_CALL | BPF_JMP tested with and without constant blinding on >> ARMv7 machine. >> Due to the limitation on my machine I can't test the tail call. It >> would be a great help if any of you could help me with this. > > Is this just a matter of running test_bpf? If so: Tested-by: Kees Cook test_bpf: Summary: 316 PASSED, 0 FAILED, [287/308 JIT'ed] -Kees -- Kees Cook Pixel Security
Re: [PATCH v2] arm: eBPF JIT compiler
On Fri, Jun 23, 2017 at 3:39 PM, Shubham Bansalwrote: > Hi Russell,Daniel and Kees, > > I am attaching the latest patch with this mail. It included support > for BPF_CALL | BPF_JMP tested with and without constant blinding on > ARMv7 machine. > Due to the limitation on my machine I can't test the tail call. It > would be a great help if any of you could help me with this. Is this just a matter of running test_bpf? Have you been able to debootstrap a debian chroot for ARMv7? > Its been a long time since this patch is in works, Russell, can you > please help with sending this patch to ARM patch tracker? If some other folks can Ack this, I can throw it at the patch tracker for you. I'll report back on my findings. -Kees -- Kees Cook Pixel Security
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Russell,Daniel and Kees, I am attaching the latest patch with this mail. It included support for BPF_CALL | BPF_JMP tested with and without constant blinding on ARMv7 machine. Due to the limitation on my machine I can't test the tail call. It would be a great help if any of you could help me with this. Its been a long time since this patch is in works, Russell, can you please help with sending this patch to ARM patch tracker? Thanks. Shubham 0001-Added-Support-for-BPF_CALL-BPF_JMP.patch Description: Binary data
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/21/2017 09:37 PM, Shubham Bansal wrote: Hi Daniel, Good news. Got the CALL to work. [ 145.670882] test_bpf: Summary: 316 PASSED, 0 FAILED, [287/308 JIT'ed] Awesome. Do you think with this implementation, the patch could get accepted? If you think so, then I will send the patch in couple of days after some refactoring, if not, then do let me know what more is required? Nice, it's ultimately up to the arm folks to review the set in-depth, but feel free to send out the patch once you're done refactoring. With BPF_CALL support that looks quite good from pov of supported insns.
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Daniel, Good news. Got the CALL to work. [ 145.670882] test_bpf: Summary: 316 PASSED, 0 FAILED, [287/308 JIT'ed] Awesome. Do you think with this implementation, the patch could get accepted? If you think so, then I will send the patch in couple of days after some refactoring, if not, then do let me know what more is required? Best, Shubham Bansal On Wed, Jun 21, 2017 at 10:02 PM, Daniel Borkmannwrote: > On 06/21/2017 04:26 PM, Shubham Bansal wrote: > [...] >> >> So ultimately, we call helper_function which takes u64 as arguments >> only. I know its asking a lot, but can you please confirm this asap? I >> would like to start implementing it. > > > Yes, that is correct. I think it would be the better, more generic > approach going forward to always assume that.
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/21/2017 04:26 PM, Shubham Bansal wrote: [...] So ultimately, we call helper_function which takes u64 as arguments only. I know its asking a lot, but can you please confirm this asap? I would like to start implementing it. Yes, that is correct. I think it would be the better, more generic approach going forward to always assume that.
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Daniel, > > So my question would be, why can't the JIT imitate something > similar to what we do in the interpreter as well? So looking > at the disasm of what gcc compiles for the interpreter when it's > doing the above call could help as well in going forward. Not > sure if that answers your question, but perhaps not sure if I > understand your question yet? I just looked at the code again and I think I completely misunderstood the logic of BPF_JMP | BPF_CALL. I think each helper function is working like this. helper_function(u32 a1, u32 a2){ } helper_function(u64 a1, u64 a2){ helper_function((u32 *)a1, (u32 *)a2); } So ultimately, we call helper_function which takes u64 as arguments only. I know its asking a lot, but can you please confirm this asap? I would like to start implementing it. > > Cheers, > Daniel -Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/20/2017 03:34 AM, Shubham Bansal wrote: Hi Daniel, Sorry, had a travel over the weekend, so didn't read it in time. What is the issue with imitating in JIT what the interpreter is doing as a starting point? That should be generic enough to handle any case. Why not proceeding this way first? Otherwise you'd need some sort of reverse mapping since verifier already converted BPF_CALL insns into relative helper addresses in imm part. Sorry but I don't get what you are trying to say. Can you explain it with an example? Ok, probably the best is to check fixup_bpf_calls() in the verifier, see the fn = prog->aux->ops->get_func_proto(insn->imm). It fetches the helper function specification based on the BPF_FUNC_* enum and converts the imm field into a relative address for the function such that if you look at ___bpf_prog_run(), JMP_CALL label, the call address can be reconstructed again. So you'd need some reverse mapping to get back to the struct bpf_func_proto, so you can check argX_type that needs to be extended with whether its JITable on 32bit or not.
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Daniel, > > Sorry, had a travel over the weekend, so didn't read it in time. > > What is the issue with imitating in JIT what the interpreter is > doing as a starting point? That should be generic enough to handle > any case. > > Otherwise you'd need some sort of reverse mapping since verifier > already converted BPF_CALL insns into relative helper addresses > in imm part. > Sorry but I don't get what you are trying to say. Can you explain it with an example? -Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/17/2017 02:23 PM, Shubham Bansal wrote: Hi Daniel, Not all of the helpers have 4 or less byte arguments only, there are a few with 8 byte arguments, so making that general assumption wouldn't work. I guess what could be done is that helpers have a flag in struct bpf_func_proto which indicates for JITs that all args are 4 byte on 32bit so you could probably use convention similar to case2 for them. Presumably for that information to process, the JIT might need to be reworked to extract that via bpf_analyzer() that does a verifier run to re-analyze the program like in nfp JIT case. Let me try a better solution which can be used to support both 4 byte and 8 byte arguments. I hope it would work out. Are you sure this patch can pass if it only supports 4 byte arguments though? Let me list out what I have to do, so that you can tell me if I am thinking in a wrong way :- * I will add a bit flag in bpf_func_proto to represent whether different arguments in a function call are 4 bytes or 8 bytes. If lsb of bit flag is set then first argument is 8 byte, otherwise its not. I think I can handle this flag properly in build_insn() in my code. Does this sound okay? I don't understand second part of your solution, i.e. Presumably for that information to process, the JIT might need to be reworked to extract that via bpf_analyzer() that does a verifier run to re-analyze the program like in nfp JIT case. Please explain what are you suggesting and how can I extract bit flag from bpf_func_proto(). Please reply asap, as I would like to finish it over the weekend. Please. Sorry, had a travel over the weekend, so didn't read it in time. What is the issue with imitating in JIT what the interpreter is doing as a starting point? That should be generic enough to handle any case. Otherwise you'd need some sort of reverse mapping since verifier already converted BPF_CALL insns into relative helper addresses in imm part. -Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Daniel, > > Not all of the helpers have 4 or less byte arguments only, there are a > few with 8 byte arguments, so making that general assumption wouldn't > work. I guess what could be done is that helpers have a flag in struct > bpf_func_proto which indicates for JITs that all args are 4 byte on 32bit > so you could probably use convention similar to case2 for them. Presumably > for that information to process, the JIT might need to be reworked to > extract that via bpf_analyzer() that does a verifier run to re-analyze > the program like in nfp JIT case. Let me try a better solution which can be used to support both 4 byte and 8 byte arguments. I hope it would work out. Are you sure this patch can pass if it only supports 4 byte arguments though? Let me list out what I have to do, so that you can tell me if I am thinking in a wrong way :- * I will add a bit flag in bpf_func_proto to represent whether different arguments in a function call are 4 bytes or 8 bytes. If lsb of bit flag is set then first argument is 8 byte, otherwise its not. I think I can handle this flag properly in build_insn() in my code. Does this sound okay? I don't understand second part of your solution, i.e. > Presumably > for that information to process, the JIT might need to be reworked to > extract that via bpf_analyzer() that does a verifier run to re-analyze > the program like in nfp JIT case. Please explain what are you suggesting and how can I extract bit flag from bpf_func_proto(). Please reply asap, as I would like to finish it over the weekend. Please. -Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/13/2017 08:56 AM, Shubham Bansal wrote: Hi Daniel, Kees, David, Russel, Any plans to implement above especially BPF_JMP | BPF_CALL in near future? Reason why I'm asking is that i) currently the arm32 cBPF JIT implements all of the cBPF extensions (except SKF_AD_RANDOM and SKF_AD_VLAN_TPID). Some of the programs that were JITed before e.g. using SKF_AD_CPU would now fall back to the eBPF interpreter due to lack of translation in JIT, but also ii) that probably most (if not all) of eBPF programs use BPF helper calls heavily, which will still redirect them to the interpreter right now due to lack of BPF_JMP | BPF_CALL support, so it's really quite essential to have it implemented. I can try for BPF_JMP | BPF_CALL. I didn't do it last time because I thought, it would make the code look messy and become pain to get it through the review. For this, I have to map eBPF arguments with arm ABI arguments and move ebpf arguments to corresponding arm ABI arguments, as eBPF arguments doesn't match with arm ABI arguments. Let me try that if its possible. Okay. I looked at it, tried few different solutions also. There is a problem with implementing BPF_JMP | BPF_CALL. Problem is transition between 4 byte and 8 byte arguments. Lets take a look a the following example to get a more clear look at the problem. Lets consider this function : CASE 1:foo(int a, int b, long long c, int d) For calling this function in arm 32 arch, I have to pass the arguments as following: a -> r0 b -> r1 c -> r2, r3 d -> stack_top Now consider an another example function : CASE 2: bar(int a, int b, int c, int d) For calling this function in arm32 arch, I have to pass the arguments as following: a -> r0 b -> r1 c -> r2 d -> r3 So, you can clearly see the problem with it. There is no way of knowing which of the above way to pass the arguments. There are solutions possible: Right. 1. One thing I can do is look at the address of the function to call and pass the argument accordingly but thats not really a robust solution as we have to change the arm32 JIT each time we add any new BPF helper function. Yeah, that would be rather ugly. 2. Another solution is, if any of you guys can assure/confirm me that there will be only 4 byte argument passed to BPF helper functions in arm32 as of now and in future including the pointer as well, then I can just assume that each argument is passed as 4 byte value and my trimming the 8byte arguments to 4 bytes arguments wouldn't be a problem. In that case, arguments for CASE 1 and CASE 2 will be passed in the same way, i.e. a -> r0 b -> r1 c -> r2 d -> r3 Let me know what you think. I don't think I can find the solution to this problem other than those mentioned above. Would love to here any ideas from you guys. Not all of the helpers have 4 or less byte arguments only, there are a few with 8 byte arguments, so making that general assumption wouldn't work. I guess what could be done is that helpers have a flag in struct bpf_func_proto which indicates for JITs that all args are 4 byte on 32bit so you could probably use convention similar to case2 for them. Presumably for that information to process, the JIT might need to be reworked to extract that via bpf_analyzer() that does a verifier run to re-analyze the program like in nfp JIT case. The other option could perhaps be to check the interpreter disasm of ___bpf_prog_run() with regards to how it handles BPF_JMP | BPF_CALL helper call and do something similarly generic in the JIT as well. Did you also manage to get the BPF selftest suite running in the meantime (tools/testing/selftests/bpf/)? There are a couple of programs that clang will compile (test_pkt_access.o, test_xdp.o, test_l4lb.o, test_tcp_estats.o) and then test run. I will run these tests tonight. Hopefully I will be able to run them. Ok. Any comments are welcome. Would love to here what you think about my solutions above.
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Daniel, Kees, David, Russel, >> Any plans to implement above especially BPF_JMP | BPF_CALL in near future? >> Reason why I'm asking is that i) currently the arm32 cBPF JIT implements >> all of the cBPF extensions (except SKF_AD_RANDOM and SKF_AD_VLAN_TPID). >> Some of the programs that were JITed before e.g. using SKF_AD_CPU would now >> fall back to the eBPF interpreter due to lack of translation in JIT, but >> also ii) that probably most (if not all) of eBPF programs use BPF helper >> calls heavily, which will still redirect them to the interpreter right now >> due to lack of BPF_JMP | BPF_CALL support, so it's really quite essential >> to have it implemented. > > I can try for BPF_JMP | BPF_CALL. I didn't do it last time because I > thought, it would make the code look messy and become pain to get it > through the review. > For this, I have to map eBPF arguments with arm ABI arguments and move > ebpf arguments to corresponding arm ABI arguments, as eBPF arguments > doesn't match with arm ABI arguments. > Let me try that if its possible. Okay. I looked at it, tried few different solutions also. There is a problem with implementing BPF_JMP | BPF_CALL. Problem is transition between 4 byte and 8 byte arguments. Lets take a look a the following example to get a more clear look at the problem. Lets consider this function : CASE 1:foo(int a, int b, long long c, int d) For calling this function in arm 32 arch, I have to pass the arguments as following: a -> r0 b -> r1 c -> r2, r3 d -> stack_top Now consider an another example function : CASE 2: bar(int a, int b, int c, int d) For calling this function in arm32 arch, I have to pass the arguments as following: a -> r0 b -> r1 c -> r2 d -> r3 So, you can clearly see the problem with it. There is no way of knowing which of the above way to pass the arguments. There are solutions possible: 1. One thing I can do is look at the address of the function to call and pass the argument accordingly but thats not really a robust solution as we have to change the arm32 JIT each time we add any new BPF helper function. 2. Another solution is, if any of you guys can assure/confirm me that there will be only 4 byte argument passed to BPF helper functions in arm32 as of now and in future including the pointer as well, then I can just assume that each argument is passed as 4 byte value and my trimming the 8byte arguments to 4 bytes arguments wouldn't be a problem. In that case, arguments for CASE 1 and CASE 2 will be passed in the same way, i.e. a -> r0 b -> r1 c -> r2 d -> r3 Let me know what you think. I don't think I can find the solution to this problem other than those mentioned above. Would love to here any ideas from you guys. >> Did you also manage to get the BPF selftest suite running in the meantime >> (tools/testing/selftests/bpf/)? There are a couple of programs that clang >> will compile (test_pkt_access.o, test_xdp.o, test_l4lb.o, test_tcp_estats.o) >> and then test run. I will run these tests tonight. Hopefully I will be able to run them. Any comments are welcome. Would love to here what you think about my solutions above. Thanks. Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
On 06/12/2017 05:40 PM, Shubham Bansal wrote: [...] Did you manage to get tail calls tested as well (I assume so since you implemented emit_bpf_tail_call() in the patch but just out of curiosity)? I didn't try it exclusively, I thought test_bpf must have tested it. Doesn't it? In samples/bpf/ there's sockex3* that would exercise it, or alternatively in iproute2 repo under examples/bpf/ there's bpf_cyclic.c and bpf_tailcall.c as a prog. Hm, generally, we should really add a test case also to BPF selftest suite to facilitate that. I'll likely do that for the next batch of BPF patches.
Re: [PATCH v2] arm: eBPF JIT compiler
From: Alexander AlemayhuDate: Tue, 13 Jun 2017 00:45:45 +0200 > On Mon, Jun 12, 2017 at 09:10:07PM +0530, Shubham Bansal wrote: >> >> Nope. It looks like a latest addition to testing. Can you please tell >> me how to test with it? >> > cd tools/testing/selftests/bpf/ > make > sudo ./test_progs Also, you might need to do a "make headers_install" at the top level before doing this.
Re: [PATCH v2] arm: eBPF JIT compiler
On Mon, Jun 12, 2017 at 09:10:07PM +0530, Shubham Bansal wrote: > > Nope. It looks like a latest addition to testing. Can you please tell > me how to test with it? > cd tools/testing/selftests/bpf/ make sudo ./test_progs -- Mit freundlichen Grüßen Alexander Alemayhu
Re: [PATCH v2] arm: eBPF JIT compiler
On Mon, Jun 12, 2017 at 3:51 PM, Daniel Borkmannwrote: > On 05/30/2017 09:19 PM, Kees Cook wrote: >>> This patch is essentially changing the current implementation of JIT >>> compiler of Berkeley Packet Filter from classic to internal with almost >>> all instructions from eBPF ISA supported except the following >>> BPF_ALU64 | BPF_DIV | BPF_K >>> BPF_ALU64 | BPF_DIV | BPF_X >>> BPF_ALU64 | BPF_MOD | BPF_K >>> BPF_ALU64 | BPF_MOD | BPF_X >>> BPF_STX | BPF_XADD | BPF_W >>> BPF_STX | BPF_XADD | BPF_DW >>> BPF_JMP | BPF_CALL > > > Any plans to implement above especially BPF_JMP | BPF_CALL in near future? > Reason why I'm asking is that i) currently the arm32 cBPF JIT implements > all of the cBPF extensions (except SKF_AD_RANDOM and SKF_AD_VLAN_TPID). > Some of the programs that were JITed before e.g. using SKF_AD_CPU would now > fall back to the eBPF interpreter due to lack of translation in JIT, but > also ii) that probably most (if not all) of eBPF programs use BPF helper > calls heavily, which will still redirect them to the interpreter right now > due to lack of BPF_JMP | BPF_CALL support, so it's really quite essential > to have it implemented. I can try for BPF_JMP | BPF_CALL. I didn't do it last time because I thought, it would make the code look messy and become pain to get it through the review. For this, I have to map eBPF arguments with arm ABI arguments and move ebpf arguments to corresponding arm ABI arguments, as eBPF arguments doesn't match with arm ABI arguments. Let me try that if its possible. As far as following 4 are concerned : >>> BPF_ALU64 | BPF_DIV | BPF_K >>> BPF_ALU64 | BPF_DIV | BPF_X >>> BPF_ALU64 | BPF_MOD | BPF_K >>> BPF_ALU64 | BPF_MOD | BPF_X I don't think it possible with current constraints over registers. I already tried this. > >>> Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 >>> bit >>> ARM because of deficiency of general purpose registers on ARM. Currently, >>> only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler. >>> >>> Tested on ARMv7 with QEMU by me (Shubham Bansal). >>> Tested on ARMv5 by Andrew Lunn (and...@lunn.ch). >>> Expected to work on ARMv6 as well, as its a part ARMv7 and part ARMv5. >>> Although, a proper testing is not done for ARMv6. >>> >>> Both of these testing are done with and without CONFIG_FRAME_POINTER >>> separately for LITTLE ENDIAN machine. >>> >>> For testing: >>> >>> 1. JIT is enabled with >>> echo 1 > /proc/sys/net/core/bpf_jit_enable >>> 2. Constant Blinding can be enabled along with JIT using >>> echo 1 > /proc/sys/net/core/bpf_jit_enable >>> echo 2 > /proc/sys/net/core/bpf_jit_harden >>> >>> See Documentation/networking/filter.txt for more information. >>> >>> Result : test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed] > > > Did you also manage to get the BPF selftest suite running in the meantime > (tools/testing/selftests/bpf/)? There are a couple of programs that clang > will compile (test_pkt_access.o, test_xdp.o, test_l4lb.o, test_tcp_estats.o) > and then test run. Nope. It looks like a latest addition to testing. Can you please tell me how to test with it? > > Did you manage to get tail calls tested as well (I assume so since you > implemented emit_bpf_tail_call() in the patch but just out of curiosity)? I didn't try it exclusively, I thought test_bpf must have tested it. Doesn't it? > >>> Signed-off-by: Shubham Bansal >>> --- >>> Documentation/networking/filter.txt |4 +- >>> arch/arm/Kconfig|2 +- >>> arch/arm/net/bpf_jit_32.c | 2404 >>> --- >>> arch/arm/net/bpf_jit_32.h | 108 +- >>> 4 files changed, 1713 insertions(+), 805 deletions(-) >>> > [...] > > If arm folks take the patch, there will be two minor (silent) merge > conflicts with net-next: > > 1) In bpf_int_jit_compile(), below the jited = 1 assignment, there >needs to come a prog->jited_len = image_size. Done. > 2) The internal tail call opcode changed from BPF_JMP | BPF_CALL | BPF_X >into BPF_JMP | BPF_TAIL_CALL. Done. > > Two minor things below, could probably also be as follow-up. > > [...] >>> >>> + /* dst = imm64 */ >>> + case BPF_LD | BPF_IMM | BPF_DW: >>> + { >>> + const struct bpf_insn insn1 = insn[1]; >>> + u32 hi, lo = imm; >>> + >>> + if (insn1.code != 0 || insn1.src_reg != 0 || >>> + insn1.dst_reg != 0 || insn1.off != 0) { >>> + /* Note: verifier in BPF core must catch invalid >>> +* instruction. >>> +*/ >>> + pr_err_once("Invalid BPF_LD_IMM64 >>> instruction\n"); >>> + return -EINVAL; >>> + } > > > Nit: this check can be removed
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Russel, On Mon, Jun 12, 2017 at 4:36 PM, Russell King - ARM Linuxwrote: > On Mon, Jun 12, 2017 at 12:21:03PM +0200, Daniel Borkmann wrote: >> On 05/30/2017 09:19 PM, Kees Cook wrote: >> >On Thu, May 25, 2017 at 4:13 PM, Shubham Bansal >> > wrote: >> >>+static int validate_code(struct jit_ctx *ctx) >> >>+{ >> >>+ int i; >> >>+ >> >>+ for (i = 0; i < ctx->idx; i++) { >> >>+ u32 a32_insn = le32_to_cpu(ctx->target[i]); >> >> Given __opcode_to_mem_arm(ARM_INST_UDF) is used to fill the image, >> perhaps use the __mem_to_opcode_arm() helper for the check? >> >> >>+ if (a32_insn == ARM_INST_UDF) > > The following is probably better: > > if (ctx->target[i] == __opcode_to_mem_arm(ARM_INST_UDF)) > > since then you can take advantage of the compiler optimising the > constant rather than having to do a byte swap on an unknown 32-bit > value. Done. Thanks :) Please check if you can find anymore issues with the code. I really appreciate it. -Shubham
Re: [PATCH v2] arm: eBPF JIT compiler
On Mon, Jun 12, 2017 at 12:21:03PM +0200, Daniel Borkmann wrote: > On 05/30/2017 09:19 PM, Kees Cook wrote: > >On Thu, May 25, 2017 at 4:13 PM, Shubham Bansal > >wrote: > >>+static int validate_code(struct jit_ctx *ctx) > >>+{ > >>+ int i; > >>+ > >>+ for (i = 0; i < ctx->idx; i++) { > >>+ u32 a32_insn = le32_to_cpu(ctx->target[i]); > > Given __opcode_to_mem_arm(ARM_INST_UDF) is used to fill the image, > perhaps use the __mem_to_opcode_arm() helper for the check? > > >>+ if (a32_insn == ARM_INST_UDF) The following is probably better: if (ctx->target[i] == __opcode_to_mem_arm(ARM_INST_UDF)) since then you can take advantage of the compiler optimising the constant rather than having to do a byte swap on an unknown 32-bit value. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: [PATCH v2] arm: eBPF JIT compiler
On 05/30/2017 09:19 PM, Kees Cook wrote: Forwarding this to net-dev and eBPF folks, who weren't on CC... Sorry for being late. Some comments below from a cursory look ... -Kees On Thu, May 25, 2017 at 4:13 PM, Shubham Bansalwrote: The JIT compiler emits ARM 32 bit instructions. Currently, It supports eBPF only. Classic BPF is supported because of the conversion by BPF core. This patch is essentially changing the current implementation of JIT compiler of Berkeley Packet Filter from classic to internal with almost all instructions from eBPF ISA supported except the following BPF_ALU64 | BPF_DIV | BPF_K BPF_ALU64 | BPF_DIV | BPF_X BPF_ALU64 | BPF_MOD | BPF_K BPF_ALU64 | BPF_MOD | BPF_X BPF_STX | BPF_XADD | BPF_W BPF_STX | BPF_XADD | BPF_DW BPF_JMP | BPF_CALL Any plans to implement above especially BPF_JMP | BPF_CALL in near future? Reason why I'm asking is that i) currently the arm32 cBPF JIT implements all of the cBPF extensions (except SKF_AD_RANDOM and SKF_AD_VLAN_TPID). Some of the programs that were JITed before e.g. using SKF_AD_CPU would now fall back to the eBPF interpreter due to lack of translation in JIT, but also ii) that probably most (if not all) of eBPF programs use BPF helper calls heavily, which will still redirect them to the interpreter right now due to lack of BPF_JMP | BPF_CALL support, so it's really quite essential to have it implemented. Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 bit ARM because of deficiency of general purpose registers on ARM. Currently, only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler. Tested on ARMv7 with QEMU by me (Shubham Bansal). Tested on ARMv5 by Andrew Lunn (and...@lunn.ch). Expected to work on ARMv6 as well, as its a part ARMv7 and part ARMv5. Although, a proper testing is not done for ARMv6. Both of these testing are done with and without CONFIG_FRAME_POINTER separately for LITTLE ENDIAN machine. For testing: 1. JIT is enabled with echo 1 > /proc/sys/net/core/bpf_jit_enable 2. Constant Blinding can be enabled along with JIT using echo 1 > /proc/sys/net/core/bpf_jit_enable echo 2 > /proc/sys/net/core/bpf_jit_harden See Documentation/networking/filter.txt for more information. Result : test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed] Did you also manage to get the BPF selftest suite running in the meantime (tools/testing/selftests/bpf/)? There are a couple of programs that clang will compile (test_pkt_access.o, test_xdp.o, test_l4lb.o, test_tcp_estats.o) and then test run. Did you manage to get tail calls tested as well (I assume so since you implemented emit_bpf_tail_call() in the patch but just out of curiosity)? Signed-off-by: Shubham Bansal --- Documentation/networking/filter.txt |4 +- arch/arm/Kconfig|2 +- arch/arm/net/bpf_jit_32.c | 2404 --- arch/arm/net/bpf_jit_32.h | 108 +- 4 files changed, 1713 insertions(+), 805 deletions(-) [...] If arm folks take the patch, there will be two minor (silent) merge conflicts with net-next: 1) In bpf_int_jit_compile(), below the jited = 1 assignment, there needs to come a prog->jited_len = image_size. 2) The internal tail call opcode changed from BPF_JMP | BPF_CALL | BPF_X into BPF_JMP | BPF_TAIL_CALL. Two minor things below, could probably also be as follow-up. [...] + /* dst = imm64 */ + case BPF_LD | BPF_IMM | BPF_DW: + { + const struct bpf_insn insn1 = insn[1]; + u32 hi, lo = imm; + + if (insn1.code != 0 || insn1.src_reg != 0 || + insn1.dst_reg != 0 || insn1.off != 0) { + /* Note: verifier in BPF core must catch invalid +* instruction. +*/ + pr_err_once("Invalid BPF_LD_IMM64 instruction\n"); + return -EINVAL; + } Nit: this check can be removed as verifier already takes care of it. (No JIT checks for this anymore.) + hi = insn1.imm; + emit_a32_mov_i(dst_lo, lo, dstk, ctx); + emit_a32_mov_i(dst_hi, hi, dstk, ctx); + + return 1; + } [...] - /* compute offsets only during the first pass */ - if (ctx->target == NULL) - ctx->offsets[i] = ctx->idx * 4; +static int validate_code(struct jit_ctx *ctx) +{ + int i; + + for (i = 0; i < ctx->idx; i++) { + u32 a32_insn = le32_to_cpu(ctx->target[i]); Given __opcode_to_mem_arm(ARM_INST_UDF) is used to fill the image, perhaps use the __mem_to_opcode_arm() helper for the check? + if (a32_insn == ARM_INST_UDF) + return -1; + } return 0; } +void bpf_jit_compile(struct bpf_prog
Re: [PATCH v2] arm: eBPF JIT compiler
On Tue, Jun 6, 2017 at 12:47 PM, Shubham Bansalwrote: > Hi Russell, Alexei, David, Daniel, kees, > > Any update on this patch moving forward? Since this has gotten testing by various people and passes the existing self-tests, I think this can probably go in via the ARM patch tracker? Russell does that sound okay to you? -Kees -- Kees Cook Pixel Security
Re: [PATCH v2] arm: eBPF JIT compiler
Hi Russell, Alexei, David, Daniel, kees, Any update on this patch moving forward? Best, Shubham Bansal On Wed, May 31, 2017 at 12:49 AM, Kees Cookwrote: > Forwarding this to net-dev and eBPF folks, who weren't on CC... > > -Kees > > On Thu, May 25, 2017 at 4:13 PM, Shubham Bansal > wrote: >> The JIT compiler emits ARM 32 bit instructions. Currently, It supports >> eBPF only. Classic BPF is supported because of the conversion by BPF >> core. >> >> This patch is essentially changing the current implementation of JIT >> compiler of Berkeley Packet Filter from classic to internal with almost >> all instructions from eBPF ISA supported except the following >> BPF_ALU64 | BPF_DIV | BPF_K >> BPF_ALU64 | BPF_DIV | BPF_X >> BPF_ALU64 | BPF_MOD | BPF_K >> BPF_ALU64 | BPF_MOD | BPF_X >> BPF_STX | BPF_XADD | BPF_W >> BPF_STX | BPF_XADD | BPF_DW >> BPF_JMP | BPF_CALL >> >> Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 bit >> ARM because of deficiency of general purpose registers on ARM. Currently, >> only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler. >> >> Tested on ARMv7 with QEMU by me (Shubham Bansal). >> Tested on ARMv5 by Andrew Lunn (and...@lunn.ch). >> Expected to work on ARMv6 as well, as its a part ARMv7 and part ARMv5. >> Although, a proper testing is not done for ARMv6. >> >> Both of these testing are done with and without CONFIG_FRAME_POINTER >> separately for LITTLE ENDIAN machine. >> >> For testing: >> >> 1. JIT is enabled with >> echo 1 > /proc/sys/net/core/bpf_jit_enable >> 2. Constant Blinding can be enabled along with JIT using >> echo 1 > /proc/sys/net/core/bpf_jit_enable >> echo 2 > /proc/sys/net/core/bpf_jit_harden >> >> See Documentation/networking/filter.txt for more information. >> >> Result : test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed] >> >> Signed-off-by: Shubham Bansal >> --- >> Documentation/networking/filter.txt |4 +- >> arch/arm/Kconfig|2 +- >> arch/arm/net/bpf_jit_32.c | 2404 >> --- >> arch/arm/net/bpf_jit_32.h | 108 +- >> 4 files changed, 1713 insertions(+), 805 deletions(-) >> >> diff --git a/Documentation/networking/filter.txt >> b/Documentation/networking/filter.txt >> index b69b205..01165ac 100644 >> --- a/Documentation/networking/filter.txt >> +++ b/Documentation/networking/filter.txt >> @@ -596,8 +596,8 @@ skb pointer). All constraints and restrictions from >> bpf_check_classic() apply >> before a conversion to the new layout is being done behind the scenes! >> >> Currently, the classic BPF format is being used for JITing on most 32-bit >> -architectures, whereas x86-64, aarch64, s390x, powerpc64, sparc64 perform >> JIT >> -compilation from eBPF instruction set. >> +architectures, whereas x86-64, aarch64, arm, s390x, powerpc64, sparc64 >> perform >> +JIT compilation from eBPF instruction set. >> >> Some core changes of the new internal format: >> >> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig >> index 8a7ab5e..13ade46 100644 >> --- a/arch/arm/Kconfig >> +++ b/arch/arm/Kconfig >> @@ -47,7 +47,7 @@ config ARM >> select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT) >> select HAVE_ARCH_TRACEHOOK >> select HAVE_ARM_SMCCC if CPU_V7 >> - select HAVE_CBPF_JIT >> + select HAVE_EBPF_JIT >> select HAVE_CC_STACKPROTECTOR >> select HAVE_CONTEXT_TRACKING >> select HAVE_C_RECORDMCOUNT >> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c >> index 93d0b6d..c7476e5 100644 >> --- a/arch/arm/net/bpf_jit_32.c >> +++ b/arch/arm/net/bpf_jit_32.c >> @@ -1,13 +1,15 @@ >> /* >> - * Just-In-Time compiler for BPF filters on 32bit ARM >> + * Just-In-Time compiler for eBPF filters on 32bit ARM >> * >> * Copyright (c) 2011 Mircea Gherzan >> + * Copyright (c) 2017 Shubham Bansal >> * >> * This program is free software; you can redistribute it and/or modify it >> * under the terms of the GNU General Public License as published by the >> * Free Software Foundation; version 2 of the License. >> */ >> >> +#include >> #include >> #include >> #include >> @@ -23,44 +25,91 @@ >> >> #include "bpf_jit_32.h" >> >> +int bpf_jit_enable __read_mostly; >> + >> +#define STACK_OFFSET(k)(k) >> +#define TMP_REG_1 (MAX_BPF_JIT_REG + 0) /* TEMP Register 1 */ >> +#define TMP_REG_2 (MAX_BPF_JIT_REG + 1) /* TEMP Register 2 */ >> +#define TCALL_CNT (MAX_BPF_JIT_REG + 2) /* Tail Call Count */ >> + >> +/* Flags used for JIT optimization */ >> +#define SEEN_CALL (1 << 0) >> + >> +#define FLAG_IMM_OVERFLOW (1 << 0) >> + >> /* >> - * ABI: >> + * Map eBPF registers to ARM 32bit registers or stack scratch space. >> + * >> + * 1. First
Re: [PATCH v2] arm: eBPF JIT compiler
Forwarding this to net-dev and eBPF folks, who weren't on CC... -Kees On Thu, May 25, 2017 at 4:13 PM, Shubham Bansalwrote: > The JIT compiler emits ARM 32 bit instructions. Currently, It supports > eBPF only. Classic BPF is supported because of the conversion by BPF > core. > > This patch is essentially changing the current implementation of JIT > compiler of Berkeley Packet Filter from classic to internal with almost > all instructions from eBPF ISA supported except the following > BPF_ALU64 | BPF_DIV | BPF_K > BPF_ALU64 | BPF_DIV | BPF_X > BPF_ALU64 | BPF_MOD | BPF_K > BPF_ALU64 | BPF_MOD | BPF_X > BPF_STX | BPF_XADD | BPF_W > BPF_STX | BPF_XADD | BPF_DW > BPF_JMP | BPF_CALL > > Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 bit > ARM because of deficiency of general purpose registers on ARM. Currently, > only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler. > > Tested on ARMv7 with QEMU by me (Shubham Bansal). > Tested on ARMv5 by Andrew Lunn (and...@lunn.ch). > Expected to work on ARMv6 as well, as its a part ARMv7 and part ARMv5. > Although, a proper testing is not done for ARMv6. > > Both of these testing are done with and without CONFIG_FRAME_POINTER > separately for LITTLE ENDIAN machine. > > For testing: > > 1. JIT is enabled with > echo 1 > /proc/sys/net/core/bpf_jit_enable > 2. Constant Blinding can be enabled along with JIT using > echo 1 > /proc/sys/net/core/bpf_jit_enable > echo 2 > /proc/sys/net/core/bpf_jit_harden > > See Documentation/networking/filter.txt for more information. > > Result : test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed] > > Signed-off-by: Shubham Bansal > --- > Documentation/networking/filter.txt |4 +- > arch/arm/Kconfig|2 +- > arch/arm/net/bpf_jit_32.c | 2404 > --- > arch/arm/net/bpf_jit_32.h | 108 +- > 4 files changed, 1713 insertions(+), 805 deletions(-) > > diff --git a/Documentation/networking/filter.txt > b/Documentation/networking/filter.txt > index b69b205..01165ac 100644 > --- a/Documentation/networking/filter.txt > +++ b/Documentation/networking/filter.txt > @@ -596,8 +596,8 @@ skb pointer). All constraints and restrictions from > bpf_check_classic() apply > before a conversion to the new layout is being done behind the scenes! > > Currently, the classic BPF format is being used for JITing on most 32-bit > -architectures, whereas x86-64, aarch64, s390x, powerpc64, sparc64 perform JIT > -compilation from eBPF instruction set. > +architectures, whereas x86-64, aarch64, arm, s390x, powerpc64, sparc64 > perform > +JIT compilation from eBPF instruction set. > > Some core changes of the new internal format: > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index 8a7ab5e..13ade46 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -47,7 +47,7 @@ config ARM > select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT) > select HAVE_ARCH_TRACEHOOK > select HAVE_ARM_SMCCC if CPU_V7 > - select HAVE_CBPF_JIT > + select HAVE_EBPF_JIT > select HAVE_CC_STACKPROTECTOR > select HAVE_CONTEXT_TRACKING > select HAVE_C_RECORDMCOUNT > diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c > index 93d0b6d..c7476e5 100644 > --- a/arch/arm/net/bpf_jit_32.c > +++ b/arch/arm/net/bpf_jit_32.c > @@ -1,13 +1,15 @@ > /* > - * Just-In-Time compiler for BPF filters on 32bit ARM > + * Just-In-Time compiler for eBPF filters on 32bit ARM > * > * Copyright (c) 2011 Mircea Gherzan > + * Copyright (c) 2017 Shubham Bansal > * > * This program is free software; you can redistribute it and/or modify it > * under the terms of the GNU General Public License as published by the > * Free Software Foundation; version 2 of the License. > */ > > +#include > #include > #include > #include > @@ -23,44 +25,91 @@ > > #include "bpf_jit_32.h" > > +int bpf_jit_enable __read_mostly; > + > +#define STACK_OFFSET(k)(k) > +#define TMP_REG_1 (MAX_BPF_JIT_REG + 0) /* TEMP Register 1 */ > +#define TMP_REG_2 (MAX_BPF_JIT_REG + 1) /* TEMP Register 2 */ > +#define TCALL_CNT (MAX_BPF_JIT_REG + 2) /* Tail Call Count */ > + > +/* Flags used for JIT optimization */ > +#define SEEN_CALL (1 << 0) > + > +#define FLAG_IMM_OVERFLOW (1 << 0) > + > /* > - * ABI: > + * Map eBPF registers to ARM 32bit registers or stack scratch space. > + * > + * 1. First argument is passed using the arm 32bit registers and rest of the > + * arguments are passed on stack scratch space. > + * 2. First callee-saved aregument is mapped to arm 32 bit registers and rest > + * arguments are mapped to scratch space on stack. > + * 3. We need two 64 bit temp registers to do complex operations