Re: INFO: task hung in addrconf_verify_work (2)

2019-10-14 Thread Cong Wang
On Sun, Oct 13, 2019 at 10:37 PM Eric Dumazet wrote:
> Infinite loop because tcf_add_notify() returns -EAGAIN as the message cannot
> be delivered to the socket, since its SO_RCVBUF has been set to 0.

Interesting corner case...

>
> Perhaps we need this patch ?

This patch looks reasonable to me, as the -EAGAIN here is mainly (if not
totally) for the locking retry logic.
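
To make the failure mode concrete, here is a minimal userspace sketch, not the
kernel code and not the patch under discussion (which is not quoted in this
thread). It assumes a replay loop that treats every -EAGAIN as "lock was busy,
try again", and a notification step that also returns -EAGAIN when the
receiver's buffer is too small. The names fake_sock, notify(), do_action() and
replay(), the 64-byte message size, and the max_rounds cap are all
hypothetical and exist only for illustration.

```c
#include <errno.h>

/* Hypothetical stand-in for the receiving netlink socket. */
struct fake_sock {
	int rcvbuf;	/* bytes the receiver accepts; 0 models SO_RCVBUF = 0 */
};

/* Delivery fails with -EAGAIN when the message does not fit the buffer. */
static int notify(const struct fake_sock *sk, int msg_len)
{
	return msg_len <= sk->rcvbuf ? 0 : -EAGAIN;
}

/*
 * One attempt at the operation. While *busy_rounds > 0 it returns -EAGAIN
 * to model the legitimate "lock contention, please replay" case; after
 * that it tries to notify the listener and passes the result through.
 */
static int do_action(const struct fake_sock *sk, int *busy_rounds)
{
	if (*busy_rounds > 0) {
		(*busy_rounds)--;
		return -EAGAIN;
	}
	return notify(sk, 64);
}

/*
 * Replay loop. Returns the number of rounds needed, or -1 if max_rounds
 * was exhausted. Without the cap (an assumption of this sketch), a
 * receiver with rcvbuf == 0 would keep the loop spinning forever.
 */
static int replay(const struct fake_sock *sk, int busy_rounds, int max_rounds)
{
	for (int round = 1; round <= max_rounds; round++) {
		if (do_action(sk, &busy_rounds) != -EAGAIN)
			return round;
	}
	return -1;
}
```

With a receiver that has room, replay() ends one round after the contention
clears. With rcvbuf set to 0 it only ends because of the cap, which is the
shape of the hang reported here if the caller holds a lock such as rtnl
while replaying.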

Thanks.


Re: INFO: task hung in addrconf_verify_work (2)

2019-10-13 Thread Eric Dumazet



On 10/13/19 9:42 PM, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    c208bdb9 tcp: improve recv_skip_hint for tcp_zerocopy_rece..
> git tree:   net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=15b6133b60
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d9be300620399522
> dashboard link: https://syzkaller.appspot.com/bug?extid=cf0adbb9c28c8866c788
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1548666f60
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11957d3b60
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cf0adbb9c28c8866c...@syzkaller.appspotmail.com
> 
> INFO: task kworker/0:2:2913 blocked for more than 143 seconds.
>   Not tainted 5.4.0-rc1+ #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/0:2 D27000  2913  2 0x80004000
> Workqueue: ipv6_addrconf addrconf_verify_work
> Call Trace:
>  context_switch kernel/sched/core.c:3384 [inline]
>  __schedule+0x94f/0x1e70 kernel/sched/core.c:4069
>  schedule+0xd9/0x260 kernel/sched/core.c:4136
>  schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:4195
>  __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
>  __mutex_lock+0x7b0/0x13c0 kernel/locking/mutex.c:1103
>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118
>  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72
>  addrconf_verify_work+0xe/0x20 net/ipv6/addrconf.c:4520
>  process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
>  worker_thread+0x98/0xe40 kernel/workqueue.c:2415
>  kthread+0x361/0x430 kernel/kthread.c:255
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> 
> Showing all locks held in the system:
> 1 lock held by khungtaskd/1054:
>  #0: 88faae40 (rcu_read_lock){}, at: 
> debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337
> 3 locks held by kworker/0:2/2913:
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: 
> __write_once_size include/linux/compiler.h:226 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: 
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: atomic64_set 
> include/asm-generic/atomic-instrumented.h:855 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: 
> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: set_work_data 
> kernel/workqueue.c:620 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: 
> set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
>  #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at: 
> process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
>  #1: 8880a05b7dc0 ((addr_chk_work).work){+.+.}, at: 
> process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
>  #2: 89993b20 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20 
> net/core/rtnetlink.c:72
> 1 lock held by rsyslogd/8744:
>  #0: 8880899fa120 (>f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110 
> fs/file.c:801
> 2 locks held by getty/8833:
>  #0: 888090baedd0 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f292e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8834:
>  #0: 88808d0f6dd0 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f392e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8835:
>  #0: 888090148e10 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f252e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8836:
>  #0: 8880a7ab3750 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f412e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8837:
>  #0: 8880a7accf10 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f3d2e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8838:
>  #0: 88808d0f7650 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f352e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 2 locks held by getty/8839:
>  #0: 88808d162bd0 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c90005f112e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156
> 1 lock held by syz-executor910/8859:
> 
> 

INFO: task hung in addrconf_verify_work (2)

2019-10-13 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:    c208bdb9 tcp: improve recv_skip_hint for tcp_zerocopy_rece..
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15b6133b60
kernel config:  https://syzkaller.appspot.com/x/.config?x=d9be300620399522
dashboard link: https://syzkaller.appspot.com/bug?extid=cf0adbb9c28c8866c788
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1548666f60
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11957d3b60

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+cf0adbb9c28c8866c...@syzkaller.appspotmail.com

INFO: task kworker/0:2:2913 blocked for more than 143 seconds.
  Not tainted 5.4.0-rc1+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:2 D27000  2913  2 0x80004000
Workqueue: ipv6_addrconf addrconf_verify_work
Call Trace:
 context_switch kernel/sched/core.c:3384 [inline]
 __schedule+0x94f/0x1e70 kernel/sched/core.c:4069
 schedule+0xd9/0x260 kernel/sched/core.c:4136
 schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:4195
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x7b0/0x13c0 kernel/locking/mutex.c:1103
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118
 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72
 addrconf_verify_work+0xe/0x20 net/ipv6/addrconf.c:4520
 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Showing all locks held in the system:
1 lock held by khungtaskd/1054:
 #0: 88faae40 (rcu_read_lock){}, at:  
debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337

3 locks held by kworker/0:2/2913:
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
__write_once_size include/linux/compiler.h:226 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
set_work_data kernel/workqueue.c:620 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
set_work_pool_and_clear_pending kernel/workqueue.c:647 [inline]
 #0: 888216019428 ((wq_completion)ipv6_addrconf){+.+.}, at:  
process_one_work+0x88b/0x1740 kernel/workqueue.c:2240
 #1: 8880a05b7dc0 ((addr_chk_work).work){+.+.}, at:  
process_one_work+0x8c1/0x1740 kernel/workqueue.c:2244
 #2: 89993b20 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:72

1 lock held by rsyslogd/8744:
 #0: 8880899fa120 (>f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110  
fs/file.c:801

2 locks held by getty/8833:
 #0: 888090baedd0 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f292e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8834:
 #0: 88808d0f6dd0 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f392e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8835:
 #0: 888090148e10 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f252e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8836:
 #0: 8880a7ab3750 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f412e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8837:
 #0: 8880a7accf10 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f3d2e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8838:
 #0: 88808d0f7650 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f352e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8839:
 #0: 88808d162bd0 (>ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f112e0 (>atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

1 lock held by syz-executor910/8859:

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google