Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
On Wed, Mar 12, 2014 at 3:16 PM, Cong Wang wrote: > (Sorry for jumping into this thread late.) > > On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov wrote: >> >> 3. tracing filters systemtap-like with extended BPF >> >> 4. OVS with extended BPF >> >> 5. nftables with extended BPF > > If you plan to use extended BPF out of networking, does > it make sense to make it independent of networking? > That is, rename those sk_*() APIs? well, v1 and v2 series already demonstrate how ebpf can be used for tracing. That's why in v1 series they were under kernel/ dir, but since it's a natural bpf extension that reuses a lot of bpf pieces, it makes sense for the whole thing be in net/core/filter.c and in linux/filter.h I believe that was a consensus in v2-v3 series. I don't want to rename the whole thing once again. Also seccomp is not networking, but it already uses bpf and after these patches ebpf. So net/core/filter.c location is fine. >> >> +int bpf_ext_enable __read_mostly; >> + > > This could be converted to a jump label, no? yes, but it's not in critical path, so no need. Thank you for taking a look! Even late feedback is always appreciated. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
(Sorry for jumping into this thread late.) On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov wrote: > > 3. tracing filters systemtap-like with extended BPF > > 4. OVS with extended BPF > > 5. nftables with extended BPF If you plan to use extended BPF out of networking, does it make sense to make it independent of networking? That is, rename those sk_*() APIs? > > +int bpf_ext_enable __read_mostly; > + This could be converted to a jump label, no? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
On Wed, Mar 12, 2014 at 12:22 PM, David Miller wrote: > From: Alexei Starovoitov > Date: Mon, 10 Mar 2014 21:41:30 -0700 > >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> index 1a869488b8ae..2c13d000389c 100644 >> --- a/include/linux/netdevice.h >> +++ b/include/linux/netdevice.h >> @@ -3054,6 +3054,7 @@ extern int netdev_max_backlog; >> extern int netdev_tstamp_prequeue; >> extern int weight_p; >> extern int bpf_jit_enable; >> +extern int bpf_ext_enable; > > I don't like seeing config ifdefs in C code, so please adjust the > bpf_jit_enable declaration with something like: > > #ifdef CONFIG_BPF_JIT > extern int bpf_jit_enable; > #else > #define bpf_jit_enable 0 > #endif agree. that's definitely cleaner. >> +#ifdef CONFIG_64BIT >> + DL(BPF_STX, BPF_XADD, BPF_DW) >> +#endif > ... >> +#ifdef CONFIG_64BIT >> +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn->off) += X */ >> + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn->off)); >> + CONT; >> +#endif > > I'm not too happy about instructions which are only available on > certain architectures. > > I'd rather see best effort emulation than wholesale lack of support. > > We have an implementation of 64-bit atomics for 32-bit systems or > those without true atomic compare-and-saw etc. ops in lib/atomic64.c > so the routines are always available. Good point. Will do. Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
From: Alexei Starovoitov Date: Mon, 10 Mar 2014 21:41:30 -0700 > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 1a869488b8ae..2c13d000389c 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -3054,6 +3054,7 @@ extern int netdev_max_backlog; > extern int netdev_tstamp_prequeue; > extern int weight_p; > extern int bpf_jit_enable; > +extern int bpf_ext_enable; I don't like seeing config ifdefs in C code, so please adjust the bpf_jit_enable declaration with something like: #ifdef CONFIG_BPF_JIT extern int bpf_jit_enable; #else #define bpf_jit_enable 0 #endif Then you can kill the ifdefs here: > +#ifdef CONFIG_BPF_JIT > + if (bpf_ext_enable && !bpf_jit_enable) > +#else > + if (bpf_ext_enable) > +#endif And here: > +#ifdef CONFIG_BPF_JIT > + if (bpf_ext_enable && !bpf_jit_enable) { > +#else > + if (bpf_ext_enable) { > +#endif Next: > +#ifdef CONFIG_64BIT > + DL(BPF_STX, BPF_XADD, BPF_DW) > +#endif ... > +#ifdef CONFIG_64BIT > +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn->off) += X */ > + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn->off)); > + CONT; > +#endif I'm not too happy about instructions which are only available on certain architectures. I'd rather see best effort emulation than wholesale lack of support. We have an implementation of 64-bit atomics for 32-bit systems or those without true atomic compare-and-saw etc. ops in lib/atomic64.c so the routines are always available. 64-bit stores and loads are trivial. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
From: Alexei Starovoitov a...@plumgrid.com Date: Mon, 10 Mar 2014 21:41:30 -0700 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 1a869488b8ae..2c13d000389c 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3054,6 +3054,7 @@ extern int netdev_max_backlog; extern int netdev_tstamp_prequeue; extern int weight_p; extern int bpf_jit_enable; +extern int bpf_ext_enable; I don't like seeing config ifdefs in C code, so please adjust the bpf_jit_enable declaration with something like: #ifdef CONFIG_BPF_JIT extern int bpf_jit_enable; #else #define bpf_jit_enable 0 #endif Then you can kill the ifdefs here: +#ifdef CONFIG_BPF_JIT + if (bpf_ext_enable !bpf_jit_enable) +#else + if (bpf_ext_enable) +#endif And here: +#ifdef CONFIG_BPF_JIT + if (bpf_ext_enable !bpf_jit_enable) { +#else + if (bpf_ext_enable) { +#endif Next: +#ifdef CONFIG_64BIT + DL(BPF_STX, BPF_XADD, BPF_DW) +#endif ... +#ifdef CONFIG_64BIT +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn-off) += X */ + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn-off)); + CONT; +#endif I'm not too happy about instructions which are only available on certain architectures. I'd rather see best effort emulation than wholesale lack of support. We have an implementation of 64-bit atomics for 32-bit systems or those without true atomic compare-and-saw etc. ops in lib/atomic64.c so the routines are always available. 64-bit stores and loads are trivial. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
On Wed, Mar 12, 2014 at 12:22 PM, David Miller da...@davemloft.net wrote: From: Alexei Starovoitov a...@plumgrid.com Date: Mon, 10 Mar 2014 21:41:30 -0700 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 1a869488b8ae..2c13d000389c 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3054,6 +3054,7 @@ extern int netdev_max_backlog; extern int netdev_tstamp_prequeue; extern int weight_p; extern int bpf_jit_enable; +extern int bpf_ext_enable; I don't like seeing config ifdefs in C code, so please adjust the bpf_jit_enable declaration with something like: #ifdef CONFIG_BPF_JIT extern int bpf_jit_enable; #else #define bpf_jit_enable 0 #endif agree. that's definitely cleaner. +#ifdef CONFIG_64BIT + DL(BPF_STX, BPF_XADD, BPF_DW) +#endif ... +#ifdef CONFIG_64BIT +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn-off) += X */ + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn-off)); + CONT; +#endif I'm not too happy about instructions which are only available on certain architectures. I'd rather see best effort emulation than wholesale lack of support. We have an implementation of 64-bit atomics for 32-bit systems or those without true atomic compare-and-saw etc. ops in lib/atomic64.c so the routines are always available. Good point. Will do. Thanks -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
(Sorry for jumping into this thread late.) On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov a...@plumgrid.com wrote: 3. tracing filters systemtap-like with extended BPF 4. OVS with extended BPF 5. nftables with extended BPF If you plan to use extended BPF out of networking, does it make sense to make it independent of networking? That is, rename those sk_*() APIs? +int bpf_ext_enable __read_mostly; + This could be converted to a jump label, no? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter
On Wed, Mar 12, 2014 at 3:16 PM, Cong Wang cw...@twopensource.com wrote: (Sorry for jumping into this thread late.) On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov a...@plumgrid.com wrote: 3. tracing filters systemtap-like with extended BPF 4. OVS with extended BPF 5. nftables with extended BPF If you plan to use extended BPF out of networking, does it make sense to make it independent of networking? That is, rename those sk_*() APIs? well, v1 and v2 series already demonstrate how ebpf can be used for tracing. That's why in v1 series they were under kernel/ dir, but since it's a natural bpf extension that reuses a lot of bpf pieces, it makes sense for the whole thing be in net/core/filter.c and in linux/filter.h I believe that was a consensus in v2-v3 series. I don't want to rename the whole thing once again. Also seccomp is not networking, but it already uses bpf and after these patches ebpf. So net/core/filter.c location is fine. +int bpf_ext_enable __read_mostly; + This could be converted to a jump label, no? yes, but it's not in critical path, so no need. Thank you for taking a look! Even late feedback is always appreciated. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/