Re: BUG: free active (active state 0) object type: work_struct hint: strp_work

2018-02-14 Thread Tom Herbert
On Tue, Feb 13, 2018 at 12:15 PM, Dmitry Vyukov wrote:
>
> On Thu, Jan 4, 2018 at 8:36 PM, Tom Herbert wrote:
> > On Thu, Jan 4, 2018 at 4:10 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzkaller hit the following crash on
> >> 6bb8824732f69de0f233ae6b1a8158e149627b38
> >> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> >> compiler: gcc (GCC) 7.1.1 20170620
> >> .config is attached
> >> Raw console output is attached.
> >> Unfortunately, I don't have any reproducer for this bug yet.
> >>
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+3c6c745b0d2f341bb...@syzkaller.appspotmail.com
> >> It will help syzbot understand when the bug is fixed. See footer for
> >> details.
> >> If you forward the report, please keep this part and the footer.
> >>
> >> Use struct sctp_assoc_value instead
> >> sctp: [Deprecated]: syz-executor4 (pid 12483) Use of int in maxseg socket
> >> option.
> >> Use struct sctp_assoc_value instead
> >> ------------[ cut here ]------------
> >> ODEBUG: free active (active state 0) object type: work_struct hint:
> >> strp_work+0x0/0xf0 net/strparser/strparser.c:381
> >> WARNING: CPU: 1 PID: 3502 at lib/debugobjects.c:291
> >> debug_print_object+0x166/0x220 lib/debugobjects.c:288
> >> Kernel panic - not syncing: panic_on_warn set ...
> >>
> >> CPU: 1 PID: 3502 Comm: kworker/u4:4 Not tainted 4.15.0-rc5+ #170
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> Google 01/01/2011
> >> Workqueue: kkcmd kcm_tx_work
> >> Call Trace:
> >>  __dump_stack lib/dump_stack.c:17 [inline]
> >>  dump_stack+0x194/0x257 lib/dump_stack.c:53
> >>  panic+0x1e4/0x41c kernel/panic.c:183
> >>  __warn+0x1dc/0x200 kernel/panic.c:547
> >>  report_bug+0x211/0x2d0 lib/bug.c:184
> >>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
> >>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
> >>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
> >>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
> >>  invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1061
> >> RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
> >> RSP: 0018:8801c0ee7068 EFLAGS: 00010086
> >> RAX: dc08 RBX: 0003 RCX: 8159bc3e
> >> RDX:  RSI: 1100381dcdc8 RDI: 8801db317dd0
> >> RBP: 8801c0ee70a8 R08:  R09: 1100381dcd9a
> >> R10: ed00381dce3c R11: 86137ad8 R12: 0001
> >> R13: 86113480 R14: 8560dc40 R15: 8146e5f0
> >>  __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
> >>  debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
> >>  kmem_cache_free+0x253/0x2a0 mm/slab.c:3745
> >
> > I believe we just need to defer kmem_cache_free to call_rcu.
>
>
> Hi Tom,
>
> Was this ever submitted? I don't see any such change in net/kcm/kcmsock.c.


Hi Dmitry,

I am looking at it. I am not yet convinced that call_rcu is the right fix.

Tom


Re: BUG: free active (active state 0) object type: work_struct hint: strp_work

2018-02-13 Thread Dmitry Vyukov
On Thu, Jan 4, 2018 at 8:36 PM, Tom Herbert wrote:
> On Thu, Jan 4, 2018 at 4:10 AM, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 6bb8824732f69de0f233ae6b1a8158e149627b38
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+3c6c745b0d2f341bb...@syzkaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for
>> details.
>> If you forward the report, please keep this part and the footer.
>>
>> Use struct sctp_assoc_value instead
>> sctp: [Deprecated]: syz-executor4 (pid 12483) Use of int in maxseg socket
>> option.
>> Use struct sctp_assoc_value instead
>> ------------[ cut here ]------------
>> ODEBUG: free active (active state 0) object type: work_struct hint:
>> strp_work+0x0/0xf0 net/strparser/strparser.c:381
>> WARNING: CPU: 1 PID: 3502 at lib/debugobjects.c:291
>> debug_print_object+0x166/0x220 lib/debugobjects.c:288
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 1 PID: 3502 Comm: kworker/u4:4 Not tainted 4.15.0-rc5+ #170
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Workqueue: kkcmd kcm_tx_work
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:17 [inline]
>>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>>  panic+0x1e4/0x41c kernel/panic.c:183
>>  __warn+0x1dc/0x200 kernel/panic.c:547
>>  report_bug+0x211/0x2d0 lib/bug.c:184
>>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
>>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>>  invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1061
>> RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
>> RSP: 0018:8801c0ee7068 EFLAGS: 00010086
>> RAX: dc08 RBX: 0003 RCX: 8159bc3e
>> RDX:  RSI: 1100381dcdc8 RDI: 8801db317dd0
>> RBP: 8801c0ee70a8 R08:  R09: 1100381dcd9a
>> R10: ed00381dce3c R11: 86137ad8 R12: 0001
>> R13: 86113480 R14: 8560dc40 R15: 8146e5f0
>>  __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
>>  debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
>>  kmem_cache_free+0x253/0x2a0 mm/slab.c:3745
>
> I believe we just need to defer kmem_cache_free to call_rcu.


Hi Tom,

Was this ever submitted? I don't see any such change in net/kcm/kcmsock.c.


Re: BUG: free active (active state 0) object type: work_struct hint: strp_work

2018-01-04 Thread Tom Herbert
On Thu, Jan 4, 2018 at 4:10 AM, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 6bb8824732f69de0f233ae6b1a8158e149627b38
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3c6c745b0d2f341bb...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> Use struct sctp_assoc_value instead
> sctp: [Deprecated]: syz-executor4 (pid 12483) Use of int in maxseg socket
> option.
> Use struct sctp_assoc_value instead
> ------------[ cut here ]------------
> ODEBUG: free active (active state 0) object type: work_struct hint:
> strp_work+0x0/0xf0 net/strparser/strparser.c:381
> WARNING: CPU: 1 PID: 3502 at lib/debugobjects.c:291
> debug_print_object+0x166/0x220 lib/debugobjects.c:288
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 1 PID: 3502 Comm: kworker/u4:4 Not tainted 4.15.0-rc5+ #170
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: kkcmd kcm_tx_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  panic+0x1e4/0x41c kernel/panic.c:183
>  __warn+0x1dc/0x200 kernel/panic.c:547
>  report_bug+0x211/0x2d0 lib/bug.c:184
>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>  invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1061
> RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
> RSP: 0018:8801c0ee7068 EFLAGS: 00010086
> RAX: dc08 RBX: 0003 RCX: 8159bc3e
> RDX:  RSI: 1100381dcdc8 RDI: 8801db317dd0
> RBP: 8801c0ee70a8 R08:  R09: 1100381dcd9a
> R10: ed00381dce3c R11: 86137ad8 R12: 0001
> R13: 86113480 R14: 8560dc40 R15: 8146e5f0
>  __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
>  debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
>  kmem_cache_free+0x253/0x2a0 mm/slab.c:3745

I believe we just need to defer kmem_cache_free to call_rcu.

Tom
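
For readers unfamiliar with the pattern being proposed, here is a rough
userspace model of "defer the free to a callback that runs later". It is
only a sketch: every name in it (model_psock, model_call_deferred,
model_grace_period_end, and so on) is hypothetical and is not a kernel
API, and it is not the change that was discussed for net/kcm/kcmsock.c.
The sketch also clears the pending-work flag before queuing the free.
That step is an assumption added here, since a deferred free on its own
does not deactivate work that is still pending.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct rcu_head: one queued callback. */
struct deferred_head {
    struct deferred_head *next;
    void (*func)(struct deferred_head *head);
};

/* Hypothetical psock-like object that embeds a work item. */
struct model_psock {
    bool work_pending;              /* models an active work_struct */
    struct deferred_head deferred;  /* models the embedded rcu_head */
};

static struct deferred_head *pending_frees;
static int objects_freed;

/* Models call_rcu(): queue the callback instead of running it now. */
static void model_call_deferred(struct deferred_head *head,
                                void (*func)(struct deferred_head *))
{
    head->func = func;
    head->next = pending_frees;
    pending_frees = head;
}

/* Models the end of a grace period: run every queued callback once. */
static void model_grace_period_end(void)
{
    struct deferred_head *head = pending_frees;

    pending_frees = NULL;
    while (head) {
        struct deferred_head *next = head->next;

        head->func(head);
        head = next;
    }
}

/* Deferred callback: recover the containing object and release it. */
static void model_psock_free(struct deferred_head *head)
{
    struct model_psock *psock = (struct model_psock *)
        ((char *)head - offsetof(struct model_psock, deferred));

    /* Freeing with work still pending is the case ODEBUG flags. */
    assert(!psock->work_pending);
    free(psock);
    objects_freed++;
}

/* Teardown: quiesce the work (assumed step), then defer the free. */
static void model_psock_release(struct model_psock *psock)
{
    psock->work_pending = false;
    model_call_deferred(&psock->deferred, model_psock_free);
}
```

In this model, calling model_psock_release() leaves the object allocated
until model_grace_period_end() runs, which is the property the deferral
is meant to provide. Whether that property is enough to address the
report above is the open question in this thread.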

>  unreserve_psock+0x5a1/0x780 net/kcm/kcmsock.c:547
>  kcm_write_msgs+0xbae/0x1b80 net/kcm/kcmsock.c:590
>  kcm_tx_work+0x2e/0x190 net/kcm/kcmsock.c:731
>  process_one_work+0xbbf/0x1b10 kernel/workqueue.c:2112
>  worker_thread+0x223/0x1990 kernel/workqueue.c:2246
>  kthread+0x33c/0x400 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:515
>
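> ======================================================
> WARNING: possible circular locking dependency detected
> 4.15.0-rc5+ #170 Not tainted
> ------------------------------------------------------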
> kworker/u4:4/3502 is trying to acquire lock:
>  ((console_sem).lock){-.-.}, at: [<91214b42>] down_trylock+0x13/0x70
> kernel/locking/semaphore.c:136
>
> but task is already holding lock:
>  (&obj_hash[i].lock){-.-.}, at: []
> __debug_check_no_obj_freed lib/debugobjects.c:736 [inline]
>  (&obj_hash[i].lock){-.-.}, at: []
> debug_check_no_obj_freed+0x1e9/0xf1f lib/debugobjects.c:774
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&obj_hash[i].lock){-.-.}:
>        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
>        __debug_object_init+0x109/0x1040 lib/debugobjects.c:343
>        debug_object_init+0x17/0x20 lib/debugobjects.c:391
>        debug_hrtimer_init kernel/time/hrtimer.c:396 [inline]
>        debug_init kernel/time/hrtimer.c:441 [inline]
>        hrtimer_init+0x8c/0x410 kernel/time/hrtimer.c:1122
>        init_dl_task_timer+0x1b/0x50 kernel/sched/deadline.c:1023
>        __sched_fork+0x2c4/0xb70 kernel/sched/core.c:2188
>        init_idle+0x75/0x820 kernel/sched/core.c:5279
>        sched_init+0xb19/0xc43 kernel/sched/core.c:5976
>        start_kernel+0x452/0x819 init/main.c:582
>        x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
>        x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
>        secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:237
>
> -> #2 (&rq->lock){-.-.}:
>        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>        _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
>        rq_lock kernel/sched/sched.h:1766 [inline]
>        task_fork_fair+0x7a/0x690 kernel/sched/fair.c:9449
>        sched_fork+0x435/0xc00 kernel/sched/core.c:2404
>        copy_process.part