[PATCH v8 net-next 3/3] doc: filter: add Extended BPF documentation
Signed-off-by: Alexei Starovoitov
Reviewed-by: Daniel Borkmann
---
 Documentation/networking/filter.txt | 181 +++
 1 file changed, 181 insertions(+)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index a06b48d2f5cc..6a0e29583a30 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -546,6 +546,186 @@ a0069c8f + :
 For BPF JIT developers, bpf_jit_disasm, bpf_asm and bpf_dbg provides a useful
 toolchain for developing and testing the kernel's JIT compiler.
+Extended BPF
+
+Extended BPF extends BPF in the following ways:
+- from 2 to 10 registers
+  Original BPF has two registers (A and X) and a hidden frame pointer.
+  Extended BPF has ten registers and a read-only frame pointer.
+- from 32-bit registers to 64-bit registers
+  semantics of old 32-bit ALU operations are preserved via 32-bit
+  subregisters
+- if (cond) jump_true; else jump_false;
+  old BPF insns are replaced with:
+  if (cond) jump_true; /* else fallthrough */
+- adds signed > and >= insns
+- 16 4-byte stack slots for register spill-fill replaced with
+  up to 512 bytes of multi-use stack space
+- introduces bpf_call insn and register passing convention for zero
+  overhead calls from/to other kernel functions (not part of this patch)
+- adds arithmetic right shift insn
+- adds swab32/swab64 insns
+- adds atomic_add insn
+- old tax/txa insns are replaced with 'mov dst,src' insn
+
+Extended BPF is designed to be JITed with one to one mapping, which
+allows GCC/LLVM compilers to generate optimized BPF code that performs
+almost as fast as natively compiled code.
+
+sysctl net.core.bpf_ext_enable=1
+controls whether filters attached to sockets will be automatically
+converted to extended BPF or not.
+
+BPF is a safe, dynamically loadable program that can call a fixed set
+of kernel functions and takes a pointer to data as an input,
+where data is skb, seccomp_data, kprobe function arguments, etc.
+
+Extended Instruction Set was designed with these goals:
+- write programs in restricted C and compile into BPF with GCC/LLVM
+- just-in-time map to modern 64-bit CPU with minimal performance overhead
+  over two steps: C -> BPF -> native code
+- guarantee termination and safety of BPF program in kernel
+  with a simple algorithm
+
+GCC/LLVM-bpf backend is optional.
+Extended BPF can be coded with macros from filter.h just like original BPF,
+though the same filter done in C is easier to understand.
+sk_convert_filter() remaps original BPF insns into extended.
+
+Minimal performance overhead is achieved by having one to one mapping
+between BPF insns and native insns, and one to one mapping between BPF
+registers and native registers on 64-bit CPUs.
+
+Extended BPF may allow jumps forward and backward for two reasons:
+to reduce branch mispredict penalty compiler moves cold basic blocks out of
+fall-through path and to reduce code duplication that would be hard to avoid
+if only jump forward was available.
+To guarantee termination a simple non-recursive depth-first-search verifies
+that there are no back-edges (no loops in the program), program is a DAG
+with root at the first insn, all branches end at the last RET insn and
+all instructions are reachable.
+Original BPF actually allows unreachable insns. Though it's safe, it will be
+fixed when extended BPF replaces BPF completely.
+
+Original BPF has two registers (A and X) and a hidden frame pointer.
+Extended BPF has ten registers and a read-only frame pointer.
+Since 64-bit CPUs are passing arguments to the functions via registers
+the number of args from BPF program to in-kernel function is restricted to 5
+and one register is used to accept return value from in-kernel function.
+x86_64 passes first 6 arguments in registers.
+aarch64/sparcv9/mips64 have 7-8 registers for arguments.
+x86_64 has 6 callee saved registers.
+aarch64/sparcv9/mips64 have 11 or more callee saved registers.
+
+Therefore extended BPF calling convention is defined as:
+R0    - return value from in-kernel function
+R1-R5 - arguments from BPF program to in-kernel function
+R6-R9 - callee saved registers that in-kernel function will preserve
+R10   - read-only frame pointer to access stack
+
+so that all BPF registers map one to one to HW registers on x86_64,aarch64,etc
+and BPF calling convention maps directly to ABIs used by kernel on 64-bit
+architectures.
+On 32-bit architectures JIT may map programs that use only 32-bit arithmetic
+and let more complex programs be interpreted.
+
+R0-R5 are scratch registers and BPF program needs to spill/fill them if
+necessary across calls.
+Note that there is only one BPF program == one BPF function and it cannot call
+other BPF functions. It can only call predefined in-kernel functions.
+
+All BPF registers are 64-bit with 32-bit lower subregister that zero-extends
+into 64-bit if written to. That behavior maps directly to x86_64 and arm64
+subregister definition,
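
For illustration only, and not part of the patch: a minimal sketch in C of
the termination check that the documentation text describes, namely a
non-recursive depth-first search that rejects back-edges and unreachable
instructions. The instruction model here (struct insn, the K_* kinds,
successors() and prog_is_dag()) is hypothetical and far simpler than the
kernel's real instruction encoding. It assumes a jump target is computed as
pc + 1 + off, and it only requires that the last instruction is a RET, which
is weaker than "all branches end at the last RET insn".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified instruction model; not the kernel's encoding. */
enum insn_kind { K_ALU, K_JA, K_JCOND, K_RET };

struct insn {
	enum insn_kind kind;
	int off;	/* K_JA/K_JCOND: target is pc + 1 + off (assumption) */
};

enum color { WHITE = 0, GRAY, BLACK };

/* Store successors of insn pc in succ[]; return count, or -1 if out of range. */
static int successors(const struct insn *prog, int len, int pc, int succ[2])
{
	int n = 0;

	switch (prog[pc].kind) {
	case K_RET:
		break;
	case K_ALU:
		succ[n++] = pc + 1;			/* fall through */
		break;
	case K_JA:
		succ[n++] = pc + 1 + prog[pc].off;	/* unconditional jump */
		break;
	case K_JCOND:
		succ[n++] = pc + 1;			/* condition false */
		succ[n++] = pc + 1 + prog[pc].off;	/* condition true */
		break;
	}
	for (int i = 0; i < n; i++)
		if (succ[i] < 0 || succ[i] >= len)
			return -1;
	return n;
}

/* True if the program is a DAG rooted at insn 0 and every insn is reachable. */
bool prog_is_dag(const struct insn *prog, int len)
{
	enum color *color = NULL;
	int *stack = NULL, *next = NULL;
	int sp = 0;
	bool ok = false;

	if (len <= 0 || prog[len - 1].kind != K_RET)
		return false;

	color = calloc(len, sizeof(*color));	/* all WHITE */
	stack = malloc(len * sizeof(*stack));	/* explicit DFS stack */
	next = calloc(len, sizeof(*next));	/* next successor to visit */
	if (!color || !stack || !next)
		goto out;

	color[0] = GRAY;
	stack[sp++] = 0;
	while (sp > 0) {
		int pc = stack[sp - 1];
		int succ[2];
		int n = successors(prog, len, pc, succ);

		if (n < 0)
			goto out;		/* jump outside the program */
		if (next[pc] < n) {
			int t = succ[next[pc]++];

			if (color[t] == GRAY)
				goto out;	/* back-edge: a loop */
			if (color[t] == WHITE) {
				assert(sp < len);	/* each insn is pushed once */
				color[t] = GRAY;
				stack[sp++] = t;
			}
		} else {
			color[pc] = BLACK;	/* all successors explored */
			sp--;
		}
	}
	for (int i = 0; i < len; i++)
		if (color[i] != BLACK)
			goto out;		/* unreachable insn */
	ok = true;
out:
	free(next);
	free(stack);
	free(color);
	return ok;
}
```

Note that in this sketch a jump to a lower instruction index is accepted as
long as it does not close a cycle: the search rejects only edges that lead
back to an instruction still on the DFS stack, which matches the text's
distinction between backward jumps and loops.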