Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Alexei Starovoitov
On Wed, Mar 12, 2014 at 3:16 PM, Cong Wang  wrote:
> (Sorry for jumping into this thread late.)
>
> On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov  wrote:
>>
>> 3. tracing filters systemtap-like with extended BPF
>>
>> 4. OVS with extended BPF
>>
>> 5. nftables with extended BPF
>
> If you plan to use extended BPF out of networking, does
> it make sense to make it independent of networking?
> That is, rename those sk_*() APIs?

well, v1 and v2 series already demonstrate how ebpf can be used for tracing.
That's why in v1 series they were under kernel/ dir, but since it's
a natural bpf extension that reuses a lot of bpf pieces, it makes sense
for the whole thing be in net/core/filter.c and in linux/filter.h
I believe that was a consensus in v2-v3 series.
I don't want to rename the whole thing once again.
Also seccomp is not networking, but it already uses bpf and
after these patches ebpf. So net/core/filter.c location is fine.

>>
>> +int bpf_ext_enable __read_mostly;
>> +
>
> This could be converted to a jump label, no?

yes, but it's not in critical path, so no need.

Thank you for taking a look!
Even late feedback is always appreciated.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Cong Wang
(Sorry for jumping into this thread late.)

On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov  wrote:
>
> 3. tracing filters systemtap-like with extended BPF
>
> 4. OVS with extended BPF
>
> 5. nftables with extended BPF

If you plan to use extended BPF out of networking, does
it make sense to make it independent of networking?
That is, rename those sk_*() APIs?

>
> +int bpf_ext_enable __read_mostly;
> +

This could be converted to a jump label, no?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Alexei Starovoitov
On Wed, Mar 12, 2014 at 12:22 PM, David Miller  wrote:
> From: Alexei Starovoitov 
> Date: Mon, 10 Mar 2014 21:41:30 -0700
>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 1a869488b8ae..2c13d000389c 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3054,6 +3054,7 @@ extern int  netdev_max_backlog;
>>  extern int   netdev_tstamp_prequeue;
>>  extern int   weight_p;
>>  extern int   bpf_jit_enable;
>> +extern int   bpf_ext_enable;
>
> I don't like seeing config ifdefs in C code, so please adjust the
> bpf_jit_enable declaration with something like:
>
> #ifdef CONFIG_BPF_JIT
> extern int  bpf_jit_enable;
> #else
> #define bpf_jit_enable  0
> #endif

agree. that's definitely cleaner.

>> +#ifdef CONFIG_64BIT
>> + DL(BPF_STX, BPF_XADD, BPF_DW)
>> +#endif
>  ...
>> +#ifdef CONFIG_64BIT
>> +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn->off) += X */
>> + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn->off));
>> + CONT;
>> +#endif
>
> I'm not too happy about instructions which are only available on
> certain architectures.
>
> I'd rather see best effort emulation than wholesale lack of support.
>
> We have an implementation of 64-bit atomics for 32-bit systems or
> those without true atomic compare-and-saw etc. ops in lib/atomic64.c
> so the routines are always available.

Good point. Will do.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread David Miller
From: Alexei Starovoitov 
Date: Mon, 10 Mar 2014 21:41:30 -0700

> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 1a869488b8ae..2c13d000389c 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3054,6 +3054,7 @@ extern int  netdev_max_backlog;
>  extern int   netdev_tstamp_prequeue;
>  extern int   weight_p;
>  extern int   bpf_jit_enable;
> +extern int   bpf_ext_enable;

I don't like seeing config ifdefs in C code, so please adjust the
bpf_jit_enable declaration with something like:

#ifdef CONFIG_BPF_JIT
extern int  bpf_jit_enable;
#else
#define bpf_jit_enable  0
#endif

Then you can kill the ifdefs here:

> +#ifdef CONFIG_BPF_JIT
> + if (bpf_ext_enable && !bpf_jit_enable)
> +#else
> + if (bpf_ext_enable)
> +#endif

And here:

> +#ifdef CONFIG_BPF_JIT
> + if (bpf_ext_enable && !bpf_jit_enable) {
> +#else
> + if (bpf_ext_enable) {
> +#endif

Next:

> +#ifdef CONFIG_64BIT
> + DL(BPF_STX, BPF_XADD, BPF_DW)
> +#endif
 ...
> +#ifdef CONFIG_64BIT
> +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn->off) += X */
> + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn->off));
> + CONT;
> +#endif

I'm not too happy about instructions which are only available on
certain architectures.

I'd rather see best effort emulation than wholesale lack of support.

We have an implementation of 64-bit atomics for 32-bit systems or
those without true atomic compare-and-saw etc. ops in lib/atomic64.c
so the routines are always available.

64-bit stores and loads are trivial.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread David Miller
From: Alexei Starovoitov a...@plumgrid.com
Date: Mon, 10 Mar 2014 21:41:30 -0700

 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
 index 1a869488b8ae..2c13d000389c 100644
 --- a/include/linux/netdevice.h
 +++ b/include/linux/netdevice.h
 @@ -3054,6 +3054,7 @@ extern int  netdev_max_backlog;
  extern int   netdev_tstamp_prequeue;
  extern int   weight_p;
  extern int   bpf_jit_enable;
 +extern int   bpf_ext_enable;

I don't like seeing config ifdefs in C code, so please adjust the
bpf_jit_enable declaration with something like:

#ifdef CONFIG_BPF_JIT
extern int  bpf_jit_enable;
#else
#define bpf_jit_enable  0
#endif

Then you can kill the ifdefs here:

 +#ifdef CONFIG_BPF_JIT
 + if (bpf_ext_enable  !bpf_jit_enable)
 +#else
 + if (bpf_ext_enable)
 +#endif

And here:

 +#ifdef CONFIG_BPF_JIT
 + if (bpf_ext_enable  !bpf_jit_enable) {
 +#else
 + if (bpf_ext_enable) {
 +#endif

Next:

 +#ifdef CONFIG_64BIT
 + DL(BPF_STX, BPF_XADD, BPF_DW)
 +#endif
 ...
 +#ifdef CONFIG_64BIT
 +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn-off) += X */
 + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn-off));
 + CONT;
 +#endif

I'm not too happy about instructions which are only available on
certain architectures.

I'd rather see best effort emulation than wholesale lack of support.

We have an implementation of 64-bit atomics for 32-bit systems or
those without true atomic compare-and-saw etc. ops in lib/atomic64.c
so the routines are always available.

64-bit stores and loads are trivial.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Alexei Starovoitov
On Wed, Mar 12, 2014 at 12:22 PM, David Miller da...@davemloft.net wrote:
 From: Alexei Starovoitov a...@plumgrid.com
 Date: Mon, 10 Mar 2014 21:41:30 -0700

 diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
 index 1a869488b8ae..2c13d000389c 100644
 --- a/include/linux/netdevice.h
 +++ b/include/linux/netdevice.h
 @@ -3054,6 +3054,7 @@ extern int  netdev_max_backlog;
  extern int   netdev_tstamp_prequeue;
  extern int   weight_p;
  extern int   bpf_jit_enable;
 +extern int   bpf_ext_enable;

 I don't like seeing config ifdefs in C code, so please adjust the
 bpf_jit_enable declaration with something like:

 #ifdef CONFIG_BPF_JIT
 extern int  bpf_jit_enable;
 #else
 #define bpf_jit_enable  0
 #endif

agree. that's definitely cleaner.

 +#ifdef CONFIG_64BIT
 + DL(BPF_STX, BPF_XADD, BPF_DW)
 +#endif
  ...
 +#ifdef CONFIG_64BIT
 +BPF_STX_BPF_XADD_BPF_DW: /* lock xadd *(u64 *)(A + insn-off) += X */
 + atomic64_add((u64)X, (atomic64_t *)(ulong)(A + insn-off));
 + CONT;
 +#endif

 I'm not too happy about instructions which are only available on
 certain architectures.

 I'd rather see best effort emulation than wholesale lack of support.

 We have an implementation of 64-bit atomics for 32-bit systems or
 those without true atomic compare-and-saw etc. ops in lib/atomic64.c
 so the routines are always available.

Good point. Will do.

Thanks
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Cong Wang
(Sorry for jumping into this thread late.)

On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 3. tracing filters systemtap-like with extended BPF

 4. OVS with extended BPF

 5. nftables with extended BPF

If you plan to use extended BPF out of networking, does
it make sense to make it independent of networking?
That is, rename those sk_*() APIs?


 +int bpf_ext_enable __read_mostly;
 +

This could be converted to a jump label, no?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v9 net-next 1/3] filter: add Extended BPF interpreter and converter

2014-03-12 Thread Alexei Starovoitov
On Wed, Mar 12, 2014 at 3:16 PM, Cong Wang cw...@twopensource.com wrote:
 (Sorry for jumping into this thread late.)

 On Mon, Mar 10, 2014 at 9:41 PM, Alexei Starovoitov a...@plumgrid.com wrote:

 3. tracing filters systemtap-like with extended BPF

 4. OVS with extended BPF

 5. nftables with extended BPF

 If you plan to use extended BPF out of networking, does
 it make sense to make it independent of networking?
 That is, rename those sk_*() APIs?

well, v1 and v2 series already demonstrate how ebpf can be used for tracing.
That's why in v1 series they were under kernel/ dir, but since it's
a natural bpf extension that reuses a lot of bpf pieces, it makes sense
for the whole thing be in net/core/filter.c and in linux/filter.h
I believe that was a consensus in v2-v3 series.
I don't want to rename the whole thing once again.
Also seccomp is not networking, but it already uses bpf and
after these patches ebpf. So net/core/filter.c location is fine.


 +int bpf_ext_enable __read_mostly;
 +

 This could be converted to a jump label, no?

yes, but it's not in critical path, so no need.

Thank you for taking a look!
Even late feedback is always appreciated.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/