Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-03 Thread Alexei Starovoitov
On Wed, Oct 2, 2013 at 9:57 PM, Eric Dumazet  wrote:
> On Wed, 2013-10-02 at 21:53 -0700, Eric Dumazet wrote:
>> On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:
>>
>> > I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
>> > don't mind whichever way.
>>
>> Its not fair to make sk_filter bigger, because it means that simple (non
>> JIT) filter might need an extra cache line.
>>
>> You could presumably use the following layout instead :
>>
>> struct sk_filter
>> {
>> atomic_trefcnt;
>> struct rcu_head rcu;
>>   struct work_struct  work;
>>
>> unsigned intlen cacheline_aligned;/* Number of 
>> filter blocks */
>> unsigned int(*bpf_func)(const struct sk_buff *skb,
>> const struct sock_filter 
>> *filter);
>> struct sock_filter  insns[0];
>> };
>
> And since @len is not used by sk_run_filter() use :
>
> struct sk_filter {
> atomic_trefcnt;
> int len; /* number of filter blocks */
> struct rcu_head rcu;
> struct work_struct  work;
>
> unsigned int(*bpf_func)(const struct sk_buff *skb,
> const struct sock_filter *filter) 
> cacheline_aligned;
> struct sock_filter  insns[0];
> };

yes. make sense to avoid first insn cache miss inside sk_run_filter()
at the expense
of 8-byte gap between work and bpf_func (on x86_64 w/o lockdep)

Probably even better to overlap work and insns fields.
Pro: sk_filter size the same, no impact on non-jit case
Con: would be harder to understand the code

another problem is that kfree(sk_filter) inside
sk_filter_release_rcu() needs to move inside bpf_jit_free().
so self nack. Let me fix these issues and respin

Thanks
Alexei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-03 Thread Alexei Starovoitov
On Wed, Oct 2, 2013 at 9:57 PM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Wed, 2013-10-02 at 21:53 -0700, Eric Dumazet wrote:
 On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:

  I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
  don't mind whichever way.

 Its not fair to make sk_filter bigger, because it means that simple (non
 JIT) filter might need an extra cache line.

 You could presumably use the following layout instead :

 struct sk_filter
 {
 atomic_trefcnt;
 struct rcu_head rcu;
   struct work_struct  work;

 unsigned intlen cacheline_aligned;/* Number of 
 filter blocks */
 unsigned int(*bpf_func)(const struct sk_buff *skb,
 const struct sock_filter 
 *filter);
 struct sock_filter  insns[0];
 };

 And since @len is not used by sk_run_filter() use :

 struct sk_filter {
 atomic_trefcnt;
 int len; /* number of filter blocks */
 struct rcu_head rcu;
 struct work_struct  work;

 unsigned int(*bpf_func)(const struct sk_buff *skb,
 const struct sock_filter *filter) 
 cacheline_aligned;
 struct sock_filter  insns[0];
 };

yes. make sense to avoid first insn cache miss inside sk_run_filter()
at the expense
of 8-byte gap between work and bpf_func (on x86_64 w/o lockdep)

Probably even better to overlap work and insns fields.
Pro: sk_filter size the same, no impact on non-jit case
Con: would be harder to understand the code

another problem is that kfree(sk_filter) inside
sk_filter_release_rcu() needs to move inside bpf_jit_free().
so self nack. Let me fix these issues and respin

Thanks
Alexei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 21:53 -0700, Eric Dumazet wrote:
> On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:
> 
> > I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
> > don't mind whichever way.
> 
> Its not fair to make sk_filter bigger, because it means that simple (non
> JIT) filter might need an extra cache line.
> 
> You could presumably use the following layout instead :
> 
> struct sk_filter
> {
> atomic_trefcnt;
> struct rcu_head rcu;
>   struct work_struct  work;
> 
> unsigned intlen cacheline_aligned;/* Number of 
> filter blocks */
> unsigned int(*bpf_func)(const struct sk_buff *skb,
> const struct sock_filter *filter);
> struct sock_filter  insns[0];
> };

And since @len is not used by sk_run_filter() use :

struct sk_filter {
atomic_trefcnt;
int len; /* number of filter blocks */
struct rcu_head rcu;
struct work_struct  work;

unsigned int(*bpf_func)(const struct sk_buff *skb,
const struct sock_filter *filter) 
cacheline_aligned;
struct sock_filter  insns[0];
};



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:

> I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
> don't mind whichever way.

Its not fair to make sk_filter bigger, because it means that simple (non
JIT) filter might need an extra cache line.

You could presumably use the following layout instead :

struct sk_filter
{
atomic_trefcnt;
struct rcu_head rcu;
struct work_struct  work;

unsigned intlen cacheline_aligned;/* Number of 
filter blocks */
unsigned int(*bpf_func)(const struct sk_buff *skb,
const struct sock_filter *filter);
struct sock_filter  insns[0];
};



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Alexei Starovoitov
On Wed, Oct 2, 2013 at 9:23 PM, Eric Dumazet  wrote:
> On Wed, 2013-10-02 at 20:50 -0700, Alexei Starovoitov wrote:
>> on x86 system with net.core.bpf_jit_enable = 1
>
>> diff --git a/include/linux/filter.h b/include/linux/filter.h
>> index a6ac848..378fa03 100644
>> --- a/include/linux/filter.h
>> +++ b/include/linux/filter.h
>> @@ -27,6 +27,7 @@ struct sk_filter
>>   unsigned intlen;/* Number of filter blocks */
>>   unsigned int(*bpf_func)(const struct sk_buff *skb,
>>   const struct sock_filter *filter);
>> + struct work_struct  work;
>>   struct rcu_head rcu;
>>   struct sock_filter  insns[0];
>>  };
>
> Nice catch !
>
> It seems only x86 and s390 needs this work_struct.

I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
don't mind whichever way.

> (and you might CC Heiko Carstens  to ask him
> to make the s390 part, of Ack it if you plan to do it)

set_memory_rw() on s390 is a simple page table walker that doesn't do
any IPI unlike x86
Heiko, please confirm that it's not an issue there.

Thanks
Alexei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 20:50 -0700, Alexei Starovoitov wrote:
> on x86 system with net.core.bpf_jit_enable = 1

> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index a6ac848..378fa03 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -27,6 +27,7 @@ struct sk_filter
>   unsigned intlen;/* Number of filter blocks */
>   unsigned int(*bpf_func)(const struct sk_buff *skb,
>   const struct sock_filter *filter);
> + struct work_struct  work;
>   struct rcu_head rcu;
>   struct sock_filter  insns[0];
>  };

Nice catch !

It seems only x86 and s390 needs this work_struct.

(and you might CC Heiko Carstens  to ask him
to make the s390 part, of Ack it if you plan to do it)

Thanks


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 20:50 -0700, Alexei Starovoitov wrote:
 on x86 system with net.core.bpf_jit_enable = 1

 diff --git a/include/linux/filter.h b/include/linux/filter.h
 index a6ac848..378fa03 100644
 --- a/include/linux/filter.h
 +++ b/include/linux/filter.h
 @@ -27,6 +27,7 @@ struct sk_filter
   unsigned intlen;/* Number of filter blocks */
   unsigned int(*bpf_func)(const struct sk_buff *skb,
   const struct sock_filter *filter);
 + struct work_struct  work;
   struct rcu_head rcu;
   struct sock_filter  insns[0];
  };

Nice catch !

It seems only x86 and s390 needs this work_struct.

(and you might CC Heiko Carstens heiko.carst...@de.ibm.com to ask him
to make the s390 part, of Ack it if you plan to do it)

Thanks


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Alexei Starovoitov
On Wed, Oct 2, 2013 at 9:23 PM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Wed, 2013-10-02 at 20:50 -0700, Alexei Starovoitov wrote:
 on x86 system with net.core.bpf_jit_enable = 1

 diff --git a/include/linux/filter.h b/include/linux/filter.h
 index a6ac848..378fa03 100644
 --- a/include/linux/filter.h
 +++ b/include/linux/filter.h
 @@ -27,6 +27,7 @@ struct sk_filter
   unsigned intlen;/* Number of filter blocks */
   unsigned int(*bpf_func)(const struct sk_buff *skb,
   const struct sock_filter *filter);
 + struct work_struct  work;
   struct rcu_head rcu;
   struct sock_filter  insns[0];
  };

 Nice catch !

 It seems only x86 and s390 needs this work_struct.

I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
don't mind whichever way.

 (and you might CC Heiko Carstens heiko.carst...@de.ibm.com to ask him
 to make the s390 part, of Ack it if you plan to do it)

set_memory_rw() on s390 is a simple page table walker that doesn't do
any IPI unlike x86
Heiko, please confirm that it's not an issue there.

Thanks
Alexei
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:

 I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
 don't mind whichever way.

Its not fair to make sk_filter bigger, because it means that simple (non
JIT) filter might need an extra cache line.

You could presumably use the following layout instead :

struct sk_filter
{
atomic_trefcnt;
struct rcu_head rcu;
struct work_struct  work;

unsigned intlen cacheline_aligned;/* Number of 
filter blocks */
unsigned int(*bpf_func)(const struct sk_buff *skb,
const struct sock_filter *filter);
struct sock_filter  insns[0];
};



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH net-next] fix unsafe set_memory_rw from softirq

2013-10-02 Thread Eric Dumazet
On Wed, 2013-10-02 at 21:53 -0700, Eric Dumazet wrote:
 On Wed, 2013-10-02 at 21:44 -0700, Alexei Starovoitov wrote:
 
  I think ifdef config_x86 is a bit ugly inside struct sk_filter, but
  don't mind whichever way.
 
 Its not fair to make sk_filter bigger, because it means that simple (non
 JIT) filter might need an extra cache line.
 
 You could presumably use the following layout instead :
 
 struct sk_filter
 {
 atomic_trefcnt;
 struct rcu_head rcu;
   struct work_struct  work;
 
 unsigned intlen cacheline_aligned;/* Number of 
 filter blocks */
 unsigned int(*bpf_func)(const struct sk_buff *skb,
 const struct sock_filter *filter);
 struct sock_filter  insns[0];
 };

And since @len is not used by sk_run_filter() use :

struct sk_filter {
atomic_trefcnt;
int len; /* number of filter blocks */
struct rcu_head rcu;
struct work_struct  work;

unsigned int(*bpf_func)(const struct sk_buff *skb,
const struct sock_filter *filter) 
cacheline_aligned;
struct sock_filter  insns[0];
};



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/