Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-19 Thread David Ahern
On 11/19/18 7:16 PM, David Miller wrote:
> From: Xin Long 
> Date: Thu, 15 Nov 2018 16:23:38 +0900
> 
>> The attachment is the ip6_dst.sh with IPVS.
>>
>> # sh ip6_dst.sh
> 
> Maybe a selftests candidate?
> 

That script was not a reliable reproducer for me.

I created a much simpler one that shows the problem every time - and
proves the change. It needs some adjustments to work as a selftest.
Specifically, it needs to check dmesg for net_device reference counts
after the rcu delay. I'll add to the to-do list.


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-19 Thread David Miller
From: Xin Long 
Date: Thu, 15 Nov 2018 16:23:38 +0900

> The attachment is the ip6_dst.sh with IPVS.
> 
> # sh ip6_dst.sh

Maybe a selftests candidate?


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-16 Thread David Miller
From: Xin Long 
Date: Wed, 14 Nov 2018 00:48:28 +0800

> These is no need to hold dst before calling rt6_remove_exception_rt().
> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> handling of FIB entries from dst based routes"). Otherwise, it will
> cause a dst leak.
> 
> This patch is to simply remove the dst_hold_safe() call before calling
> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> It's safe, because the removal of the exception that holds its dst's
> refcnt is protected by rt6_exception_lock.
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> based routes")
> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
> Reported-by: Li Shuang 
> Signed-off-by: Xin Long 

Applied and queued up for -stable.


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread David Ahern
On 11/13/18 8:48 AM, Xin Long wrote:
> These is no need to hold dst before calling rt6_remove_exception_rt().
> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> handling of FIB entries from dst based routes"). Otherwise, it will
> cause a dst leak.
> 
> This patch is to simply remove the dst_hold_safe() call before calling
> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> It's safe, because the removal of the exception that holds its dst's
> refcnt is protected by rt6_exception_lock.
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> based routes")
> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
> Reported-by: Li Shuang 
> Signed-off-by: Xin Long 
> ---
>  net/ipv6/route.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 

Ok, I see now. commit ad65a2f05695 add the dst_hold_safe with
ip6_del_rt. ip6_del_rt called ip6_rt_put to release the reference taken
by the hold_safe. Those paths are gone now.

Reviewed-by: David Ahern 


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread Mika Penttilä

On 15.11.2018 20.17, David Ahern wrote:
> On 11/14/18 11:23 PM, Xin Long wrote:
>> On Thu, Nov 15, 2018 at 3:33 PM David Ahern  wrote:
>>> On 11/14/18 11:03 AM, David Ahern wrote:
 On 11/13/18 8:48 AM, Xin Long wrote:
> These is no need to hold dst before calling rt6_remove_exception_rt().
> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> handling of FIB entries from dst based routes"). Otherwise, it will
> cause a dst leak.
>
> This patch is to simply remove the dst_hold_safe() call before calling
> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> It's safe, because the removal of the exception that holds its dst's
> refcnt is protected by rt6_exception_lock.
>
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> based routes")
> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route 
> handling")
> Reported-by: Li Shuang 
> Signed-off-by: Xin Long 
> ---
>  net/ipv6/route.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
 was this problem actually hit or is this patch based on a code analysis?

>>> I ask because I have not been able to reproduce the leak using existing
>>> tests (e.g., pmtu) that I know create exceptions.
>>>
>>> If this problem was hit, it would be good to get a test case for it.
>> The attachment is the ip6_dst.sh with IPVS.
>>
>> # sh ip6_dst.sh
>>
>> But this one triggers the kernel warnings caused by 2 places:
>>unregister_netdevice: waiting for br0 to become free. Usage count = 3
>>
>> 1. one is IPVS, I just posted the fix:
>> https://patchwork.ozlabs.org/patch/998123/  [1]
>> 2. the other one is IPv6,
>> ip6_link_failure() will be hit.
>>
>> So to make this reproduce clearly, you may want to apply
>> patch [1] firstly.
>>
> Thanks for the script. It does not replicate the problem using net-next
> tree as of
>
> commit 6d5db6c37929cb0a84e64ba0590a74593e5ce3b8
> Merge: 15cef30974c5 bd3b5d462add
> Author: David S. Miller 
> Date:   Wed Nov 14 08:51:28 2018 -0800
>
> Merge branch 'nfp-abm-track-all-Qdiscs'
>
>
> I would be really surprised if the fib6_info change introduced a need to
> change the dst hold's for exception routes. I am not seeing the
> connection, so I really want to see it reproduced.
>
> Thanks


Maybe it's not 100% reproducer then, but I think the fix is obviously right.

--Mika



Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread David Ahern
On 11/14/18 11:23 PM, Xin Long wrote:
> On Thu, Nov 15, 2018 at 3:33 PM David Ahern  wrote:
>>
>> On 11/14/18 11:03 AM, David Ahern wrote:
>>> On 11/13/18 8:48 AM, Xin Long wrote:
 These is no need to hold dst before calling rt6_remove_exception_rt().
 The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
 which has been removed in Commit 93531c674315 ("net/ipv6: separate
 handling of FIB entries from dst based routes"). Otherwise, it will
 cause a dst leak.

 This patch is to simply remove the dst_hold_safe() call before calling
 rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
 It's safe, because the removal of the exception that holds its dst's
 refcnt is protected by rt6_exception_lock.

 Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
 based routes")
 Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route 
 handling")
 Reported-by: Li Shuang 
 Signed-off-by: Xin Long 
 ---
  net/ipv6/route.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> was this problem actually hit or is this patch based on a code analysis?
>>>
>>
>> I ask because I have not been able to reproduce the leak using existing
>> tests (e.g., pmtu) that I know create exceptions.
>>
>> If this problem was hit, it would be good to get a test case for it.
> The attachment is the ip6_dst.sh with IPVS.
> 
> # sh ip6_dst.sh
> 
> But this one triggers the kernel warnings caused by 2 places:
>unregister_netdevice: waiting for br0 to become free. Usage count = 3
> 
> 1. one is IPVS, I just posted the fix:
> https://patchwork.ozlabs.org/patch/998123/  [1]
> 2. the other one is IPv6,
> ip6_link_failure() will be hit.
> 
> So to make this reproduce clearly, you may want to apply
> patch [1] firstly.
> 

Thanks for the script. It does not replicate the problem using net-next
tree as of

commit 6d5db6c37929cb0a84e64ba0590a74593e5ce3b8
Merge: 15cef30974c5 bd3b5d462add
Author: David S. Miller 
Date:   Wed Nov 14 08:51:28 2018 -0800

Merge branch 'nfp-abm-track-all-Qdiscs'


I would be really surprised if the fib6_info change introduced a need to
change the dst hold's for exception routes. I am not seeing the
connection, so I really want to see it reproduced.

Thanks


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-14 Thread Xin Long
On Thu, Nov 15, 2018 at 3:33 PM David Ahern  wrote:
>
> On 11/14/18 11:03 AM, David Ahern wrote:
> > On 11/13/18 8:48 AM, Xin Long wrote:
> >> These is no need to hold dst before calling rt6_remove_exception_rt().
> >> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> >> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> >> handling of FIB entries from dst based routes"). Otherwise, it will
> >> cause a dst leak.
> >>
> >> This patch is to simply remove the dst_hold_safe() call before calling
> >> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> >> It's safe, because the removal of the exception that holds its dst's
> >> refcnt is protected by rt6_exception_lock.
> >>
> >> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> >> based routes")
> >> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route 
> >> handling")
> >> Reported-by: Li Shuang 
> >> Signed-off-by: Xin Long 
> >> ---
> >>  net/ipv6/route.c | 7 +++
> >>  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > was this problem actually hit or is this patch based on a code analysis?
> >
>
> I ask because I have not been able to reproduce the leak using existing
> tests (e.g., pmtu) that I know create exceptions.
>
> If this problem was hit, it would be good to get a test case for it.
The attachment is the ip6_dst.sh with IPVS.

# sh ip6_dst.sh

But this one triggers the kernel warnings caused by 2 places:
   unregister_netdevice: waiting for br0 to become free. Usage count = 3

1. one is IPVS, I just posted the fix:
https://patchwork.ozlabs.org/patch/998123/  [1]
2. the other one is IPv6,
ip6_link_failure() will be hit.

So to make this reproduce clearly, you may want to apply
patch [1] firstly.


ip6_dst.sh
Description: Bourne shell script


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-14 Thread David Ahern
On 11/14/18 11:03 AM, David Ahern wrote:
> On 11/13/18 8:48 AM, Xin Long wrote:
>> These is no need to hold dst before calling rt6_remove_exception_rt().
>> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
>> which has been removed in Commit 93531c674315 ("net/ipv6: separate
>> handling of FIB entries from dst based routes"). Otherwise, it will
>> cause a dst leak.
>>
>> This patch is to simply remove the dst_hold_safe() call before calling
>> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
>> It's safe, because the removal of the exception that holds its dst's
>> refcnt is protected by rt6_exception_lock.
>>
>> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
>> based routes")
>> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
>> Reported-by: Li Shuang 
>> Signed-off-by: Xin Long 
>> ---
>>  net/ipv6/route.c | 7 +++
>>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> was this problem actually hit or is this patch based on a code analysis?
> 

I ask because I have not been able to reproduce the leak using existing
tests (e.g., pmtu) that I know create exceptions.

If this problem was hit, it would be good to get a test case for it.


Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-14 Thread David Ahern
On 11/13/18 8:48 AM, Xin Long wrote:
> These is no need to hold dst before calling rt6_remove_exception_rt().
> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> handling of FIB entries from dst based routes"). Otherwise, it will
> cause a dst leak.
> 
> This patch is to simply remove the dst_hold_safe() call before calling
> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> It's safe, because the removal of the exception that holds its dst's
> refcnt is protected by rt6_exception_lock.
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst 
> based routes")
> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
> Reported-by: Li Shuang 
> Signed-off-by: Xin Long 
> ---
>  net/ipv6/route.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

was this problem actually hit or is this patch based on a code analysis?