Re: INFO: task hung in stop_sync_thread (2)
On Fri, Mar 30, 2018 at 6:46 PM, syzbot wrote: > syzbot has found reproducer for the following crash on net-next commit /\/\/\/\/\/\ now syzbot can test fixes > 18845557fd6fc1998f2d0d8c30467f86db587529 (Thu Mar 29 20:24:06 2018 +) > Merge tag 'wireless-drivers-next-for-davem-2018-03-29' of > git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next > syzbot dashboard link: > https://syzkaller.appspot.com/bug?extid=5fe074c01b2032ce9618 > > So far this crash happened 2 times on net-next. > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5531070485233664 > syzkaller reproducer: > https://syzkaller.appspot.com/x/repro.syz?id=5632385408303104 > Raw console output: > https://syzkaller.appspot.com/x/log.txt?id=6520566391504896 > Kernel config: https://syzkaller.appspot.com/x/.config?id=-37309782588693906 > compiler: gcc (GCC) 7.1.1 20170620 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+5fe074c01b2032ce9...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed.
Re: INFO: task hung in stop_sync_thread (2)
syzbot has found reproducer for the following crash on net-next commit 18845557fd6fc1998f2d0d8c30467f86db587529 (Thu Mar 29 20:24:06 2018 +) Merge tag 'wireless-drivers-next-for-davem-2018-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=5fe074c01b2032ce9618 So far this crash happened 2 times on net-next. C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5531070485233664 syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5632385408303104 Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6520566391504896 Kernel config: https://syzkaller.appspot.com/x/.config?id=-37309782588693906 compiler: gcc (GCC) 7.1.1 20170620 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+5fe074c01b2032ce9...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. INFO: task syzkaller923914:4319 blocked for more than 120 seconds. Not tainted 4.16.0-rc6+ #286 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syzkaller923914 D23312 4319 4316 0x Call Trace: context_switch kernel/sched/core.c:2862 [inline] __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 schedule+0xf5/0x430 kernel/sched/core.c:3499 schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 do_wait_for_common kernel/sched/completion.c:86 [inline] __wait_for_common kernel/sched/completion.c:107 [inline] wait_for_common kernel/sched/completion.c:118 [inline] wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 kthread_stop+0x14a/0x7a0 kernel/kthread.c:530 stop_sync_thread+0x3d9/0x740 net/netfilter/ipvs/ip_vs_sync.c:1996 do_ip_vs_set_ctl+0x2b1/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2394 nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1253 udp_setsockopt+0x45/0x80 net/ipv4/udp.c:2400 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:3039 SYSC_setsockopt net/socket.c:1850 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1829 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4459d9 RSP: 002b:7f1d6f47cdb8 EFLAGS: 0297 ORIG_RAX: 0036 RAX: ffda RBX: 006dac24 RCX: 004459d9 RDX: 048c RSI: RDI: 0008 RBP: 006dac20 R08: 0018 R09: R10: 2000 R11: 0297 R12: R13: 7fffcf128acf R14: 7f1d6f47d9c0 R15: 0001 INFO: lockdep is turned off. NMI backtrace for cpu 1 CPU: 1 PID: 869 Comm: khungtaskd Not tainted 4.16.0-rc6+ #286 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103 nmi_trigger_cpumask_backtrace+0x123/0x180 lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline] check_hung_task kernel/hung_task.c:132 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline] watchdog+0x90c/0xd60 kernel/hung_task.c:249 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406 Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 4317 Comm: syzkaller923914 Not tainted 4.16.0-rc6+ #286 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50 RSP: 0018:8801b56df138 EFLAGS: 0093 RAX: 8801cbaae640 RBX: RCX: 866a3971 RDX: RSI: 0040 RDI: 8801db218038 RBP: 8801b56df168 R08: 88021fff801c R09: 88021fff8008 R10: 88021fff8010 R11: 88021fff801d R12: 8801db218038 R13: 0040 R14: 8801db218038 R15: FS: 01968880() GS:8801db20() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: ff600400 CR3: 0001c94dc001 CR4: 001606f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: cpumask_next+0x24/0x30 lib/cpumask.c:21 select_idle_smt kernel/sched/fair.c:6116 [inline] select_idle_sibling+0x86d/0xda0 kernel/sched/fair.c:6238 select_task_rq_fair+0xe0a/0x2910 kernel/sched/fair.c:6394 select_task_rq kernel/sched/core.c:1554 [inline] try_to_wake_up+0x4ee/0x15f0 kernel/sched/core.c:2064 default_wake_function+0x30/0x50 kernel/sched/core.c:3693 autoremove_wake_function+0x78/0x350 kernel/sched/wait.c:377 __wake_up_common+0x18e/0x780 kernel/sched/wait.c:97 __wake_up_common_lock+0x1b4/0
Re: INFO: task hung in stop_sync_thread (2)
Hello, On Thu, 29 Mar 2018, syzbot wrote: > syzbot hit the following crash on net-next commit > 5d22d47b9ed96eddb35821dc2cc4f629f45827f7 (Tue Mar 27 17:33:21 2018 +) > Merge branch 'sfc-filter-locking' > syzbot dashboard link: > https://syzkaller.appspot.com/bug?extid=5fe074c01b2032ce9618 > > Unfortunately, I don't have any reproducer for this crash yet. > Raw console output: > https://syzkaller.appspot.com/x/log.txt?id=6119456711376896 > Kernel config: https://syzkaller.appspot.com/x/.config?id=4372867303600475372 > compiler: gcc (GCC) 7.1.1 20170620 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+5fe074c01b2032ce9...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for details. > If you forward the report, please keep this part and the footer. > > IPVS: sync thread started: state = BACKUP, mcast_ifn = syz_tun, syncid = 4, id > IPVS: = 0 > IPVS: stopping backup sync thread 25415 ... > INFO: task syz-executor7:25421 blocked for more than 120 seconds. I think, I know what happens: start_sync_thread holds rtnl_lock while calling kthread_stop on error. The backup kthread is signalled and it tries to rtnl_lock in sock_release => lockup. I think, my fix will solve both reports, only that there is no reproducer for this one to test the fix with test command. > Not tainted 4.16.0-rc6+ #284 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > syz-executor7 D23688 25421 4408 0x0004 > Call Trace: > context_switch kernel/sched/core.c:2862 [inline] > __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440 > schedule+0xf5/0x430 kernel/sched/core.c:3499 > schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777 > do_wait_for_common kernel/sched/completion.c:86 [inline] > __wait_for_common kernel/sched/completion.c:107 [inline] > wait_for_common kernel/sched/completion.c:118 [inline] > wait_for_completion+0x415/0x770 kernel/sched/completion.c:139 > kthread_stop+0x14a/0x7a0 kernel/kthread.c:530 > stop_sync_thread+0x3d9/0x740 net/netfilter/ipvs/ip_vs_sync.c:1996 > do_ip_vs_set_ctl+0x2b1/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2394 > nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] > nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 > ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1253 > sctp_setsockopt+0x2ca/0x63e0 net/sctp/socket.c:4154 > sock_common_setsockopt+0x95/0xd0 net/core/sock.c:3039 > SYSC_setsockopt net/socket.c:1850 [inline] > SyS_setsockopt+0x189/0x360 net/socket.c:1829 > do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 > entry_SYSCALL_64_after_hwframe+0x42/0xb7 > RIP: 0033:0x454889 > RSP: 002b:7fc927626c68 EFLAGS: 0246 ORIG_RAX: 0036 > RAX: ffda RBX: 7fc9276276d4 RCX: 00454889 > RDX: 048c RSI: RDI: 0017 > RBP: 0072bf58 R08: 0018 R09: > R10: 2000 R11: 0246 R12: > R13: 051c R14: 006f9b40 R15: 0001 > > Showing all locks held in the system: > 2 locks held by khungtaskd/868: > #0: (rcu_read_lock){}, at: [] > check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline] > #0: (rcu_read_lock){}, at: [ ] watchdog+0x1c5/0xd60 > kernel/hung_task.c:249 > #1: (tasklist_lock){.+.+}, at: [<37c2f8f9>] > debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470 > 1 lock held by rsyslogd/4247: > #0: (&f->f_pos_lock){+.+.}, at: [<0d8d6983>] __fdget_pos+0x12b/0x190 > fs/file.c:765 > 2 locks held by getty/4338: > #0: (&tty->ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (&ldata->atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > 2 locks held by getty/4339: > #0: (&tty->ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (&ldata->atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > 2 locks held by getty/4340: > #0: (&tty->ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (&ldata->atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > 2 locks held by getty/4341: > #0: (&tty->ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (&ldata->atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > 2 locks held by getty/4342: > #0: (&tty->ldisc_sem){}, at: [ ] > ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365 > #1: (&ldata->atomic_read_lock){+.+.}, at: [ ] > n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131 > 2 locks held by getty/4343: > #0: (&tty->ldisc_se