Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()

2018-01-10 Thread Kirill Tkhai
On 10.01.2018 12:58, Eric Dumazet wrote:
> On Wed, 2018-01-10 at 10:37 +0300, Kirill Tkhai wrote:
>> On 09.01.2018 21:52, Eric Dumazet wrote:
>>> On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
>>>> Since people may mistakenly obtain a net that is being destroyed
>>>> from net_namespace_list and from net::netns_ids
>>>> without checking its net::count, let's protect
>>>> against such situations and insert BUG_ON() to stop
>>>> execution at that point.
>>>>
>>>> Panic is better than memory corruption and undefined
>>>> behavior.
>>>>
>>>> Signed-off-by: Kirill Tkhai
>>>> ---
>>>>  include/net/net_namespace.h |    2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>>>> index 10f99dafd5ac..ff0e47471d5b 100644
>>>> --- a/include/net/net_namespace.h
>>>> +++ b/include/net/net_namespace.h
>>>> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>>>>  
>>>>  static inline struct net *get_net(struct net *net)
>>>>  {
>>>> -	atomic_inc(&net->count);
>>>> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>>>>  	return net;
>>>>  }
>>>
>>>
>>> Why not simply use refcount_t instead of duplicating its logic?
>>
>> The main goal of the change is to catch rare races that happen on
>> production nodes under real load, and to prevent memory corruption.
>> You can't simply use the refcount_t primitives, as none of them uses
>> BUG_ON(). The WARN_ON() in those primitives doesn't protect against
>> memory corruption.
>>
>> Also, keep in mind that CONFIG_REFCOUNT_FULL is usually disabled on
>> non-debug kernels; I've checked both Fedora and Debian. So the only
>> way to catch such races is if someone is lucky enough to hit them on
>> a test kernel with a test workload, which is very unlikely.
>>
> 
> Keep in mind that most of these bugs are found by syzkaller or other
> fuzzer bots that have CONFIG_REFCOUNT_FULL enabled.
> 
> Do not rely on production workloads to find such bugs for you; their
> coverage is very, very low.
> 
> Resistance is futile, because this refcount will be converted to
> refcount_t some day.

OK, I'll do v3.


Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()

2018-01-10 Thread Eric Dumazet
On Wed, 2018-01-10 at 10:37 +0300, Kirill Tkhai wrote:
> On 09.01.2018 21:52, Eric Dumazet wrote:
> > On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
> > > > Since people may mistakenly obtain a net that is being destroyed
> > > > from net_namespace_list and from net::netns_ids
> > > > without checking its net::count, let's protect
> > > > against such situations and insert BUG_ON() to stop
> > > > execution at that point.
> > > > 
> > > > Panic is better than memory corruption and undefined
> > > > behavior.
> > > 
> > > Signed-off-by: Kirill Tkhai 
> > > ---
> > >  include/net/net_namespace.h |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> > > index 10f99dafd5ac..ff0e47471d5b 100644
> > > --- a/include/net/net_namespace.h
> > > +++ b/include/net/net_namespace.h
> > > @@ -195,7 +195,7 @@ void __put_net(struct net *net);
> > >  
> > >  static inline struct net *get_net(struct net *net)
> > >  {
> > > > -	atomic_inc(&net->count);
> > > > +	BUG_ON(atomic_inc_return(&net->count) <= 1);
> > >   return net;
> > >  }
> > 
> > 
> > Why not simply use refcount_t instead of duplicating its logic?
> 
> The main goal of the change is to catch rare races that happen on
> production nodes under real load, and to prevent memory corruption.
> You can't simply use the refcount_t primitives, as none of them uses
> BUG_ON(). The WARN_ON() in those primitives doesn't protect against
> memory corruption.
> 
> Also, keep in mind that CONFIG_REFCOUNT_FULL is usually disabled on
> non-debug kernels; I've checked both Fedora and Debian. So the only
> way to catch such races is if someone is lucky enough to hit them on
> a test kernel with a test workload, which is very unlikely.
> 

Keep in mind that most of these bugs are found by syzkaller or other
fuzzer bots that have CONFIG_REFCOUNT_FULL enabled.

Do not rely on production workloads to find such bugs for you; their
coverage is very, very low.

Resistance is futile, because this refcount will be converted to
refcount_t some day.




Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()

2018-01-09 Thread Kirill Tkhai
On 09.01.2018 21:52, Eric Dumazet wrote:
> On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
>> Since people may mistakenly obtain a net that is being destroyed
>> from net_namespace_list and from net::netns_ids
>> without checking its net::count, let's protect
>> against such situations and insert BUG_ON() to stop
>> execution at that point.
>>
>> Panic is better than memory corruption and undefined
>> behavior.
>>
>> Signed-off-by: Kirill Tkhai 
>> ---
>>  include/net/net_namespace.h |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
>> index 10f99dafd5ac..ff0e47471d5b 100644
>> --- a/include/net/net_namespace.h
>> +++ b/include/net/net_namespace.h
>> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>>  
>>  static inline struct net *get_net(struct net *net)
>>  {
>> -	atomic_inc(&net->count);
>> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>>  return net;
>>  }
> 
> 
> Why not simply use refcount_t instead of duplicating its logic?

The main goal of the change is to catch rare races that happen on
production nodes under real load, and to prevent memory corruption.
You can't simply use the refcount_t primitives, as none of them uses
BUG_ON(). The WARN_ON() in those primitives doesn't protect against
memory corruption.

Also, keep in mind that CONFIG_REFCOUNT_FULL is usually disabled on
non-debug kernels; I've checked both Fedora and Debian. So the only
way to catch such races is if someone is lucky enough to hit them on
a test kernel with a test workload, which is very unlikely.

Kirill


Re: [PATCH v2 2/3] net: Add BUG_ON() to get_net()

2018-01-09 Thread Eric Dumazet
On Tue, 2018-01-09 at 18:00 +0300, Kirill Tkhai wrote:
> Since people may mistakenly obtain a net that is being destroyed
> from net_namespace_list and from net::netns_ids
> without checking its net::count, let's protect
> against such situations and insert BUG_ON() to stop
> execution at that point.
> 
> Panic is better than memory corruption and undefined
> behavior.
> 
> Signed-off-by: Kirill Tkhai 
> ---
>  include/net/net_namespace.h |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 10f99dafd5ac..ff0e47471d5b 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -195,7 +195,7 @@ void __put_net(struct net *net);
>  
>  static inline struct net *get_net(struct net *net)
>  {
> -	atomic_inc(&net->count);
> +	BUG_ON(atomic_inc_return(&net->count) <= 1);
>   return net;
>  }


Why not simply use refcount_t instead of duplicating its logic?
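
For readers unfamiliar with the logic in question: refcount-style counters generally refuse to move a counter off zero, since a zero count means the object may already be on its way to being freed. What follows is a rough userspace sketch of that increment-unless-zero pattern, not the kernel implementation. It assumes C11 <stdatomic.h> in place of the kernel types, and inc_not_zero() and checked_inc() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Increment *count unless it is currently zero.
 * Returns true if a reference was taken, false if the counter was zero.
 */
static bool inc_not_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Checked increment (illustrative): warn and leave the counter at zero
 * instead of resurrecting an object that may already be going away.
 */
static void checked_inc(atomic_int *count)
{
	if (!inc_not_zero(count))
		fprintf(stderr, "warning: increment on 0; possible use-after-free\n");
}
```

In this sketch a caller that hits the zero case only gets a warning and continues running, which is the behavior the patch author objects to later in the thread.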




[PATCH v2 2/3] net: Add BUG_ON() to get_net()

2018-01-09 Thread Kirill Tkhai
Since people may mistakenly obtain a net that is being destroyed
from net_namespace_list and from net::netns_ids
without checking its net::count, let's protect
against such situations and insert BUG_ON() to stop
execution at that point.

Panic is better than memory corruption and undefined
behavior.

Signed-off-by: Kirill Tkhai 
---
 include/net/net_namespace.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99dafd5ac..ff0e47471d5b 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -195,7 +195,7 @@ void __put_net(struct net *net);
 
 static inline struct net *get_net(struct net *net)
 {
-	atomic_inc(&net->count);
+	BUG_ON(atomic_inc_return(&net->count) <= 1);
 	return net;
 }
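
To make the effect of the one-line change concrete, here is a minimal userspace sketch, not kernel code. It assumes C11 <stdatomic.h> in place of the kernel's atomic_t, a stand-in BUG_ON() macro that simply calls abort(), and a hypothetical struct net reduced to its counter. get_net_would_bug() is an illustrative helper that does not exist in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's BUG_ON(): report and abort. */
#define BUG_ON(cond)                                         \
	do {                                                 \
		if (cond) {                                  \
			fprintf(stderr, "BUG: %s\n", #cond); \
			abort();                             \
		}                                            \
	} while (0)

/* Hypothetical miniature of struct net: only the reference counter. */
struct net {
	atomic_int count;
};

/* Mirrors the patched get_net(): trap if the count was zero or less. */
static inline struct net *get_net(struct net *net)
{
	/*
	 * atomic_fetch_add() returns the old value, so adding 1 yields the
	 * new value, which is what atomic_inc_return() gives in the kernel.
	 */
	BUG_ON(atomic_fetch_add(&net->count, 1) + 1 <= 1);
	return net;
}

/* Illustrative helper: would get_net() trap on this net right now? */
static inline int get_net_would_bug(struct net *net)
{
	return atomic_load(&net->count) <= 0;
}
```

A net that still holds at least one reference passes through get_net() unchanged apart from the increment; a net whose count has already dropped to zero aborts the process in this sketch, which corresponds to the panic the patch opts for.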