Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Tue, Sep 26, 2017 at 6:20 AM, Eric Dumazetwrote: > On Mon, Sep 25, 2017 at 10:52 PM, Wei Wang wrote: >> On Mon, Sep 25, 2017 at 7:23 PM, Eric Dumazet wrote: >>> On Mon, Sep 25, 2017 at 7:07 PM, Martin KaFai Lau wrote: >>> I am probably still missing something. Considering the del operation should be under the writer lock, if rt->rt6i_node should be NULL (for rt that has already been removed from fib6), why this WARN_ON() is triggered? An example may help. >>> >>> Look at the stack trace, you'll find the answers... >>> >>> ip6_link_failure() -> ip6_del_rt() >>> >>> Note that rt might have been deleted from the _tree_ already. >> >> Had a brief talk with Martin. >> He has a valid point. >> The current WARN_ON() code is as follows: >> #if RT6_DEBUG >= 2 >>if (rt->dst.obsolete > 0) { >>WARN_ON(fn); >>return -ENOENT; >>} >> #endif >> >> The WARN_ON() only triggers when fn is not NULL. (I missed it before.) >> In theory, fib6_del() calls fib6_del_route() which should set >> rt->rt6i_node to NULL and rt->dst.obsolete to DST_OBSOLETE_DEAD within >> the same write_lock session. >> If those 2 values are inconsistent, it indicates something is wrong. >> Will need more time to root cause the issue. >> >> Please ignore this patch. Sorry about the confusion. > > Oh well, for some reason I was seeing WARN_ON(1) here, since this is > a construct I often add in my tests ... Just an update on this issue: This WARNING issue should already be fixed by commit 7483cea79957312e9f8e9cf760a1bc5d6c507113: Author: Ido Schimmel Date: Thu Aug 3 13:28:22 2017 +0200 ipv6: fib: Unlink replaced routes from their nodes When a route is deleted its node pointer is set to NULL to indicate it's no longer linked to its node. Do the same for routes that are replaced. This will later allow us to test if a route is still in the FIB by checking its node pointer instead of its reference count. Signed-off-by: Ido Schimmel Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller So no further action is needed on this. Thanks. Wei
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Mon, Sep 25, 2017 at 10:52 PM, Wei Wangwrote: > On Mon, Sep 25, 2017 at 7:23 PM, Eric Dumazet wrote: >> On Mon, Sep 25, 2017 at 7:07 PM, Martin KaFai Lau wrote: >> >>> I am probably still missing something. >>> >>> Considering the del operation should be under the writer lock, >>> if rt->rt6i_node should be NULL (for rt that has already been >>> removed from fib6), why this WARN_ON() is triggered? >>> >>> An example may help. >>> >> >> Look at the stack trace, you'll find the answers... >> >> ip6_link_failure() -> ip6_del_rt() >> >> Note that rt might have been deleted from the _tree_ already. > > Had a brief talk with Martin. > He has a valid point. > The current WARN_ON() code is as follows: > #if RT6_DEBUG >= 2 >if (rt->dst.obsolete > 0) { >WARN_ON(fn); >return -ENOENT; >} > #endif > > The WARN_ON() only triggers when fn is not NULL. (I missed it before.) > In theory, fib6_del() calls fib6_del_route() which should set > rt->rt6i_node to NULL and rt->dst.obsolete to DST_OBSOLETE_DEAD within > the same write_lock session. > If those 2 values are inconsistent, it indicates something is wrong. > Will need more time to root cause the issue. > > Please ignore this patch. Sorry about the confusion. Oh well, for some reason I was seeing WARN_ON(1) here, since this is a construct I often add in my tests ...
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Mon, Sep 25, 2017 at 7:23 PM, Eric Dumazetwrote: > On Mon, Sep 25, 2017 at 7:07 PM, Martin KaFai Lau wrote: > >> I am probably still missing something. >> >> Considering the del operation should be under the writer lock, >> if rt->rt6i_node should be NULL (for rt that has already been >> removed from fib6), why this WARN_ON() is triggered? >> >> An example may help. >> > > Look at the stack trace, you'll find the answers... > > ip6_link_failure() -> ip6_del_rt() > > Note that rt might have been deleted from the _tree_ already. Had a brief talk with Martin. He has a valid point. The current WARN_ON() code is as follows: #if RT6_DEBUG >= 2 if (rt->dst.obsolete > 0) { WARN_ON(fn); return -ENOENT; } #endif The WARN_ON() only triggers when fn is not NULL. (I missed it before.) In theory, fib6_del() calls fib6_del_route() which should set rt->rt6i_node to NULL and rt->dst.obsolete to DST_OBSOLETE_DEAD within the same write_lock session. If those 2 values are inconsistent, it indicates something is wrong. Will need more time to root cause the issue. Please ignore this patch. Sorry about the confusion.
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Mon, Sep 25, 2017 at 7:07 PM, Martin KaFai Lauwrote: > I am probably still missing something. > > Considering the del operation should be under the writer lock, > if rt->rt6i_node should be NULL (for rt that has already been > removed from fib6), why this WARN_ON() is triggered? > > An example may help. > Look at the stack trace, you'll find the answers... ip6_link_failure() -> ip6_del_rt() Note that rt might have been deleted from the _tree_ already.
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Tue, Sep 26, 2017 at 01:16:05AM +, Wei Wang wrote: > On Mon, Sep 25, 2017 at 5:56 PM, Martin KaFai Lauwrote: > > On Mon, Sep 25, 2017 at 05:35:22PM +, Wei Wang wrote: > >> From: Wei Wang > >> > >> fib6_del() generates WARN_ON() when rt->dst.obsolete > 0. This does not > >> make sense because it is possible that the route passed in is already > >> deleted by some other thread and rt->dst.obsolete is set to > >> DST_OBSOLETE_DEAD. > >> So this commit deletes this WARN_ON() and also remove the > >> "#ifdef RT6_DEBUG >= 2" condition so that if the route is already > >> obsolete, we return right at the beginning of fib6_del(). > >> > >> > >> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c > >> index e5308d7cbd75..693bcd7ef6d2 100644 > >> --- a/net/ipv6/ip6_fib.c > >> +++ b/net/ipv6/ip6_fib.c > >> @@ -1592,13 +1592,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info > >> *info) > >> struct net *net = info->nl_net; > >> struct rt6_info **rtp; > >> > >> -#if RT6_DEBUG >= 2 > >> - if (rt->dst.obsolete > 0) { > >> - WARN_ON(fn); > > fn should have already been set to NULL if it is removed > > from the fib6 tree? > > > > That is true. rt->rt6i_node (fn) should already be marked as NULL. I am probably still missing something. Considering the del operation should be under the writer lock, if rt->rt6i_node should be NULL (for rt that has already been removed from fib6), why this WARN_ON() is triggered? An example may help. > That means the check on rt->dst.obsolete is redundant. > I will remove it in v2. > Thanks Martin. > > > >> - return -ENOENT; > >> - } > >> -#endif > >> - if (!fn || rt == net->ipv6.ip6_null_entry) > >> + if (!fn || rt->dst.obsolete > 0 || rt == net->ipv6.ip6_null_entry) > >> return -ENOENT; > >> > >> WARN_ON(!(fn->fn_flags & RTN_RTINFO)); > >> -- > >> 2.14.1.821.g8fa685d3b7-goog > >>
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Mon, Sep 25, 2017 at 5:56 PM, Martin KaFai Lauwrote: > On Mon, Sep 25, 2017 at 05:35:22PM +, Wei Wang wrote: >> From: Wei Wang >> >> fib6_del() generates WARN_ON() when rt->dst.obsolete > 0. This does not >> make sense because it is possible that the route passed in is already >> deleted by some other thread and rt->dst.obsolete is set to >> DST_OBSOLETE_DEAD. >> So this commit deletes this WARN_ON() and also remove the >> "#ifdef RT6_DEBUG >= 2" condition so that if the route is already >> obsolete, we return right at the beginning of fib6_del(). >> >> >> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c >> index e5308d7cbd75..693bcd7ef6d2 100644 >> --- a/net/ipv6/ip6_fib.c >> +++ b/net/ipv6/ip6_fib.c >> @@ -1592,13 +1592,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info >> *info) >> struct net *net = info->nl_net; >> struct rt6_info **rtp; >> >> -#if RT6_DEBUG >= 2 >> - if (rt->dst.obsolete > 0) { >> - WARN_ON(fn); > fn should have already been set to NULL if it is removed > from the fib6 tree? > That is true. rt->rt6i_node (fn) should already be marked as NULL. That means the check on rt->dst.obsolete is redundant. I will remove it in v2. Thanks Martin. >> - return -ENOENT; >> - } >> -#endif >> - if (!fn || rt == net->ipv6.ip6_null_entry) >> + if (!fn || rt->dst.obsolete > 0 || rt == net->ipv6.ip6_null_entry) >> return -ENOENT; >> >> WARN_ON(!(fn->fn_flags & RTN_RTINFO)); >> -- >> 2.14.1.821.g8fa685d3b7-goog >>
Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
On Mon, Sep 25, 2017 at 05:35:22PM +, Wei Wang wrote: > From: Wei Wang> > fib6_del() generates WARN_ON() when rt->dst.obsolete > 0. This does not > make sense because it is possible that the route passed in is already > deleted by some other thread and rt->dst.obsolete is set to > DST_OBSOLETE_DEAD. > So this commit deletes this WARN_ON() and also remove the > "#ifdef RT6_DEBUG >= 2" condition so that if the route is already > obsolete, we return right at the beginning of fib6_del(). > > > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c > index e5308d7cbd75..693bcd7ef6d2 100644 > --- a/net/ipv6/ip6_fib.c > +++ b/net/ipv6/ip6_fib.c > @@ -1592,13 +1592,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info) > struct net *net = info->nl_net; > struct rt6_info **rtp; > > -#if RT6_DEBUG >= 2 > - if (rt->dst.obsolete > 0) { > - WARN_ON(fn); fn should have already been set to NULL if it is removed from the fib6 tree? > - return -ENOENT; > - } > -#endif > - if (!fn || rt == net->ipv6.ip6_null_entry) > + if (!fn || rt->dst.obsolete > 0 || rt == net->ipv6.ip6_null_entry) > return -ENOENT; > > WARN_ON(!(fn->fn_flags & RTN_RTINFO)); > -- > 2.14.1.821.g8fa685d3b7-goog >
[PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
From: Wei Wangfib6_del() generates WARN_ON() when rt->dst.obsolete > 0. This does not make sense because it is possible that the route passed in is already deleted by some other thread and rt->dst.obsolete is set to DST_OBSOLETE_DEAD. So this commit deletes this WARN_ON() and also remove the "#ifdef RT6_DEBUG >= 2" condition so that if the route is already obsolete, we return right at the beginning of fib6_del(). Syzkaller hit this WARN_ON() in the following call trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 panic+0x1e4/0x417 kernel/panic.c:180 __warn+0x1c4/0x1d9 kernel/panic.c:541 report_bug+0x211/0x2d0 lib/bug.c:183 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846 RIP: 0010:fib6_del+0x947/0xca0 net/ipv6/ip6_fib.c:1477 RSP: 0018:8801db2074d8 EFLAGS: 00010206 RAX: 8801d1500080 RBX: 8801d01638c0 RCX: RDX: 0100 RSI: 8801db207650 RDI: 8801d0163924 RBP: 8801db2075f0 R08: 86df5f98 R09: 0002 R10: 8801db2074b8 R11: 11003a2a026b R12: dc00 R13: 8801db207650 R14: 8801a0748180 R15: 11003b640ea5 __ip6_del_rt+0xc7/0x120 net/ipv6/route.c:2136 ip6_del_rt+0x132/0x1a0 net/ipv6/route.c:2149 ip6_link_failure+0x244/0x380 net/ipv6/route.c:1359 dst_link_failure include/net/dst.h:454 [inline] ndisc_error_report+0xae/0x180 net/ipv6/ndisc.c:682 neigh_invalidate+0x225/0x530 net/core/neighbour.c:883 neigh_timer_handler+0x883/0xca0 net/core/neighbour.c:969 call_timer_fn+0x233/0x830 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 [inline] __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614 __do_softirq+0x2f5/0xba3 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:638 [inline] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702 RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline] RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline] RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:199 RSP: 0018:8801d0407040 EFLAGS: 0286 ORIG_RAX: ff10 RAX: dc00 RBX: 8801db225780 RCX: RDX: 10b59433 RSI: 0001 RDI: 85aca198 RBP: 8801d0407048 R08: 0001 R09: R10: R11: R12: 8801c6820400 R13: 11003a080e11 R14: 8801d1500080 R15: 8801d1500080 finish_lock_switch kernel/sched/sched.h:1334 [inline] finish_task_switch+0x1d3/0x740 kernel/sched/core.c:2638 context_switch kernel/sched/core.c:2774 [inline] __schedule+0x8f0/0x2070 kernel/sched/core.c:3332 schedule+0x108/0x440 kernel/sched/core.c:3391 schedule_hrtimeout_range_clock+0x23e/0x810 kernel/time/hrtimer.c:1708 schedule_hrtimeout_range+0x2a/0x40 kernel/time/hrtimer.c:1753 poll_schedule_timeout+0x10f/0x1f0 fs/select.c:242 do_select+0x11ea/0x1710 fs/select.c:581 core_sys_select+0x480/0x960 fs/select.c:655 do_pselect fs/select.c:732 [inline] SYSC_pselect6 fs/select.c:773 [inline] SyS_pselect6+0x54a/0x650 fs/select.c:758 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x45f181 RSP: 002b:7f91306e1db0 EFLAGS: 0246 ORIG_RAX: 010e RAX: ffda RBX: RCX: 0045f181 RDX: RSI: RDI: RBP: 0086 R08: 7f91306e1db0 R09: R10: R11: 0246 R12: 7ffdd9621670 R13: 7f91306e29c0 R14: 7f9130eac040 R15: 0003 Note: there is no Fixes tag because this bug was introduced long ago. Signed-off-by: Wei Wang Acked-by: Eric Dumazet --- net/ipv6/ip6_fib.c | 8 +--- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index e5308d7cbd75..693bcd7ef6d2 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1592,13 +1592,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info) struct net *net = info->nl_net; struct rt6_info **rtp; -#if RT6_DEBUG >= 2 - if (rt->dst.obsolete > 0) { - WARN_ON(fn); - return -ENOENT; - } -#endif - if (!fn || rt == net->ipv6.ip6_null_entry) + if (!fn || rt->dst.obsolete > 0 || rt == net->ipv6.ip6_null_entry) return -ENOENT; WARN_ON(!(fn->fn_flags & RTN_RTINFO)); -- 2.14.1.821.g8fa685d3b7-goog