Re: netlink: NULL timer crash

2017-07-26 Thread ChunYu Wang
Wow, thanks!

On Wed, Jul 26, 2017 at 9:13 PM, Dmitry Vyukov  wrote:
> On Wed, Jul 26, 2017 at 3:09 PM,   wrote:
>> Hi Dmitry,
>>
>> When I run your reproducer on normal kernels, this scenario cannot be
>> reproduced (on Fedora). Does this C source only work on KASAN kernels?
>
> No, NULL derefs are detected without KASAN.
>
>
>> On Thursday, March 23, 2017 at 8:55:52 PM UTC+8, Dmitry Vyukov wrote:
>>>
>>> Hello,
>>>
>>> The following program triggers call of NULL timer func:
>>>
>>>
>>> https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt
>>>
>>>
>>> BUG: unable to handle kernel NULL pointer dereference at   (null)
>>> IP:   (null)
>>> PGD 0
>>> Oops: 0010 [#1] SMP KASAN
>>> Modules linked in:
>>> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
>>> 01/01/2011
>>> task: 88006c634300 task.stack: 88006c64
>>> RIP: 0010:  (null)
>>> RSP: 0018:88006d1077c8 EFLAGS: 00010246
>>> RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
>>> RDX: 1090c1f1 RSI:  RDI: 880062bddb00
>>> RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
>>> R10: 0001 R11: fbfff0a936a7 R12: 84860f80
>>> R13:  R14: 880062bddb60 R15: 11000da20f05
>>> FS:  () GS:88006d10()
>>> knlGS:
>>> CS:  0010 DS:  ES:  CR0: 80050033
>>> CR2:  CR3: 04e21000 CR4: 001406e0
>>> Call Trace:
>>>  
>>>  neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
>>>  call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
>>>  expire_timers kernel/time/timer.c:1307 [inline]
>>>  __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
>>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>>>  __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
>>>  invoke_softirq kernel/softirq.c:364 [inline]
>>>  irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
>>>  exiting_irq arch/x86/include/asm/apic.h:657 [inline]
>>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
>>> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>>> RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
>>> RAX: dc00 RBX: 11000d8c8fbb RCX: 
>>> RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
>>> RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
>>> R10:  R11:  R12: fbfff09d8ed2
>>> R13: 88006c647e78 R14: 84ec7690 R15: 0002
>>>  
>>>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>>>  default_idle+0xba/0x450 arch/x86/kernel/process.c:275
>>>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
>>>  default_idle_call+0x37/0x80 kernel/sched/idle.c:97
>>>  cpuidle_idle_call kernel/sched/idle.c:155 [inline]
>>>  do_idle+0x230/0x380 kernel/sched/idle.c:244
>>>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
>>>  start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
>>>  start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
>>> Code:  Bad RIP value.
>>> RIP:   (null) RSP: 88006d1077c8
>>> CR2: 
>>> ---[ end trace 845120b8a0d21411 ]---
>>>
>>> On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "syzkaller" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to syzkaller+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.



-- 
CHUNYU WANG

ASSOCIATE QE

KERNEL ENG


Re: netlink: NULL timer crash

2017-07-26 Thread Dmitry Vyukov
On Wed, Jul 26, 2017 at 3:09 PM,   wrote:
> Hi Dmitry,
>
> When I run your reproducer on normal kernels, this scenario cannot be
> reproduced (on Fedora). Does this C source only work on KASAN kernels?

No, NULL derefs are detected without KASAN.


> On Thursday, March 23, 2017 at 8:55:52 PM UTC+8, Dmitry Vyukov wrote:
>>
>> Hello,
>>
>> The following program triggers call of NULL timer func:
>>
>>
>> https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt
>>
>>
>> BUG: unable to handle kernel NULL pointer dereference at   (null)
>> IP:   (null)
>> PGD 0
>> Oops: 0010 [#1] SMP KASAN
>> Modules linked in:
>> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
>> 01/01/2011
>> task: 88006c634300 task.stack: 88006c64
>> RIP: 0010:  (null)
>> RSP: 0018:88006d1077c8 EFLAGS: 00010246
>> RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
>> RDX: 1090c1f1 RSI:  RDI: 880062bddb00
>> RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
>> R10: 0001 R11: fbfff0a936a7 R12: 84860f80
>> R13:  R14: 880062bddb60 R15: 11000da20f05
>> FS:  () GS:88006d10()
>> knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2:  CR3: 04e21000 CR4: 001406e0
>> Call Trace:
>>  
>>  neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
>>  call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
>>  expire_timers kernel/time/timer.c:1307 [inline]
>>  __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>>  __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
>>  invoke_softirq kernel/softirq.c:364 [inline]
>>  irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
>>  exiting_irq arch/x86/include/asm/apic.h:657 [inline]
>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
>> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>> RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
>> RAX: dc00 RBX: 11000d8c8fbb RCX: 
>> RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
>> RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
>> R10:  R11:  R12: fbfff09d8ed2
>> R13: 88006c647e78 R14: 84ec7690 R15: 0002
>>  
>>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>>  default_idle+0xba/0x450 arch/x86/kernel/process.c:275
>>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
>>  default_idle_call+0x37/0x80 kernel/sched/idle.c:97
>>  cpuidle_idle_call kernel/sched/idle.c:155 [inline]
>>  do_idle+0x230/0x380 kernel/sched/idle.c:244
>>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
>>  start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
>>  start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
>> Code:  Bad RIP value.
>> RIP:   (null) RSP: 88006d1077c8
>> CR2: 
>> ---[ end trace 845120b8a0d21411 ]---
>>
>> On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae
>


Re: netlink: NULL timer crash

2017-03-23 Thread Eric Dumazet
On Thu, 2017-03-23 at 12:00 -0700, David Miller wrote:
> From: Eric Dumazet 
> Date: Thu, 23 Mar 2017 09:00:58 -0700
> 
> > On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:
> > 
> >> Nice !
> >> 
> >> Looks like neigh->ops->solicit is NULL
> > 
> > Apparently we allow admins to do really stupid things with neighbours
> > on tunnels.
> > 
> > Following patch should avoid the crash.
> > 
> > Anyone has better ideas ?
> 
> This is probably good enough for now, but you need to also handle
> dn_neigh_ops.
> 
> Another way to solve this is to add a NULL method check to the
> one spot where we invoke this method.  That clearly shows that
> the method is optional.

Yes, this would be a one liner. I will post this in a minute.

Thanks.





Re: netlink: NULL timer crash

2017-03-23 Thread David Miller
From: Eric Dumazet 
Date: Thu, 23 Mar 2017 09:00:58 -0700

> On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:
> 
>> Nice !
>> 
>> Looks like neigh->ops->solicit is NULL
> 
> Apparently we allow admins to do really stupid things with neighbours
> on tunnels.
> 
> Following patch should avoid the crash.
> 
> Anyone has better ideas ?

This is probably good enough for now, but you need to also handle
dn_neigh_ops.

Another way to solve this is to add a NULL method check to the
one spot where we invoke this method.  That clearly shows that
the method is optional.
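
For illustration, the two options discussed here (a no-op stub versus a NULL check at the call site) can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the kernel code: the structs are simplified, hypothetical stand-ins for struct neighbour and struct neigh_ops, the solicit signature drops the sk_buff argument, and the names demo_solicit, demo_no_solicit and demo_probe are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures;
 * only the field relevant to this thread is modelled. */
struct neighbour;

struct neigh_ops {
	void (*solicit)(struct neighbour *neigh);	/* may be NULL */
};

struct neighbour {
	const struct neigh_ops *ops;
	int probes;	/* counts solicit calls, for demonstration only */
};

/* A "real" solicit method: here it only records that it ran. */
static void demo_solicit(struct neighbour *neigh)
{
	neigh->probes++;
}

/* Option 1: a no-op stub, so the pointer in the ops table is never NULL. */
static void demo_no_solicit(struct neighbour *neigh)
{
	(void)neigh;
}

/* Option 2: check at the single call site, which documents that the
 * method is optional. Returns 1 if the method was invoked, 0 if skipped. */
static int demo_probe(struct neighbour *neigh)
{
	assert(neigh && neigh->ops);

	if (!neigh->ops->solicit)
		return 0;
	neigh->ops->solicit(neigh);
	return 1;
}
```

With the check in demo_probe, an ops table that leaves .solicit unset is skipped instead of jumping to address 0. The stub approach keeps the call site unconditional but needs one stub per ops table.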



Re: netlink: NULL timer crash

2017-03-23 Thread Eric Dumazet
On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:

> Nice !
> 
> Looks like neigh->ops->solicit is NULL

Apparently we allow admins to do really stupid things with neighbours
on tunnels.

Following patch should avoid the crash.

Anyone has better ideas ?


 net/ipv4/arp.c   |    5 +++++
 net/ipv6/ndisc.c |    4 ++++
 2 files changed, 9 insertions(+)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 51b27ae09fbd725bcd8030982e5850215ac4ce5c..963191b12e28041bf5df6f37f222a7155f83a414 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -146,8 +146,13 @@ static const struct neigh_ops arp_hh_ops = {
 	.connected_output =	neigh_resolve_output,
 };
 
+static void arp_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}
+
 static const struct neigh_ops arp_direct_ops = {
 	.family =		AF_INET,
+	.solicit =		arp_no_solicit,
 	.output =		neigh_direct_output,
 	.connected_output =	neigh_direct_output,
 };
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 7ebac630d3c603186be2fc0dcbaac7d7e74bfde6..86f290b749d5ca0db4310b17ebeff35d847540c7 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -99,9 +99,13 @@ static const struct neigh_ops ndisc_hh_ops = {
 	.connected_output =	neigh_resolve_output,
 };
 
+static void ndisc_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}
 
 static const struct neigh_ops ndisc_direct_ops = {
 	.family =		AF_INET6,
+	.solicit =		ndisc_no_solicit,
 	.output =		neigh_direct_output,
 	.connected_output =	neigh_direct_output,
 };




Re: netlink: NULL timer crash

2017-03-23 Thread Eric Dumazet
On Thu, Mar 23, 2017 at 5:55 AM, Dmitry Vyukov  wrote:
> Hello,
>
> The following program triggers call of NULL timer func:
>
> https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt
>
>
> BUG: unable to handle kernel NULL pointer dereference at   (null)
> IP:   (null)
> PGD 0
> Oops: 0010 [#1] SMP KASAN
> Modules linked in:
> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 88006c634300 task.stack: 88006c64
> RIP: 0010:  (null)
> RSP: 0018:88006d1077c8 EFLAGS: 00010246
> RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
> RDX: 1090c1f1 RSI:  RDI: 880062bddb00
> RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
> R10: 0001 R11: fbfff0a936a7 R12: 84860f80
> R13:  R14: 880062bddb60 R15: 11000da20f05
> FS:  () GS:88006d10() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 04e21000 CR4: 001406e0
> Call Trace:
>  
>  neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
>  call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
>  expire_timers kernel/time/timer.c:1307 [inline]
>  __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>  __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:657 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
> RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
> RAX: dc00 RBX: 11000d8c8fbb RCX: 
> RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
> RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
> R10:  R11:  R12: fbfff09d8ed2
> R13: 88006c647e78 R14: 84ec7690 R15: 0002
>  
>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>  default_idle+0xba/0x450 arch/x86/kernel/process.c:275
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
>  default_idle_call+0x37/0x80 kernel/sched/idle.c:97
>  cpuidle_idle_call kernel/sched/idle.c:155 [inline]
>  do_idle+0x230/0x380 kernel/sched/idle.c:244
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
>  start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
>  start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
> Code:  Bad RIP value.
> RIP:   (null) RSP: 88006d1077c8
> CR2: 
> ---[ end trace 845120b8a0d21411 ]---
>
> On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae

Nice !

Looks like neigh->ops->solicit is NULL


netlink: NULL timer crash

2017-03-23 Thread Dmitry Vyukov
Hello,

The following program triggers call of NULL timer func:

https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt


BUG: unable to handle kernel NULL pointer dereference at   (null)
IP:   (null)
PGD 0
Oops: 0010 [#1] SMP KASAN
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 88006c634300 task.stack: 88006c64
RIP: 0010:  (null)
RSP: 0018:88006d1077c8 EFLAGS: 00010246
RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
RDX: 1090c1f1 RSI:  RDI: 880062bddb00
RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
R10: 0001 R11: fbfff0a936a7 R12: 84860f80
R13:  R14: 880062bddb60 R15: 11000da20f05
FS:  () GS:88006d10() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2:  CR3: 04e21000 CR4: 001406e0
Call Trace:
 <IRQ>
 neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
 call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:657 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
RAX: dc00 RBX: 11000d8c8fbb RCX: 
RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
R10:  R11:  R12: fbfff09d8ed2
R13: 88006c647e78 R14: 84ec7690 R15: 0002
 </IRQ>
 arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
 default_idle+0xba/0x450 arch/x86/kernel/process.c:275
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
 default_idle_call+0x37/0x80 kernel/sched/idle.c:97
 cpuidle_idle_call kernel/sched/idle.c:155 [inline]
 do_idle+0x230/0x380 kernel/sched/idle.c:244
 cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
 start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
 start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
Code:  Bad RIP value.
RIP:   (null) RSP: 88006d1077c8
CR2: 
---[ end trace 845120b8a0d21411 ]---

On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae

