On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote:
> On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote:
> > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote:
> > > On Fri, Sep 27, 2024 at 4:57 PM syzbot
> > > <[email protected]> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: [email protected]
> [...]
> > > +PeterZ, who added the KCSAN assertion.
> > 
> > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for
> > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split
> > DEQUEUE_SLEEP from deactivate_task()").
> > 
> > I'm not at all sure I have any inkling as to what the annotation does
> > nor what KCSAN is trying to tell us above.
> 
> ASSERT_EXCLUSIVE_WRITER(var) is to say that there should be no
> concurrent writes to var; other readers are allowed. If KCSAN is
> enabled, it then goes and reports any violations of that assertion.
> The main use case is for already marked accesses where concurrent
> accesses are _not_ data races, but the algorithm does not assume
> concurrent writers regardless.
> 
> In this case it seems that Paul was trying to say that there should be
> no concurrent writers to this variable. But KCSAN disproved that.

Just confirming that this was my intent.

And for all I know, maybe it is now OK to have concurrent writers to
that variable, but if so, would we please have an explanatory comment
(or a reference to one)?
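
For readers who have not met the annotation before, the semantics
described above (writers must be exclusive, readers are fine) can be
modeled in user space roughly as follows. This is only a sketch: the
model_* and concurrent_* names are made up for illustration, the
"concurrent" context is simulated by a plain function call, and none of
it reflects how KCSAN is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of an exclusive-writer assertion.
 * All names here are illustrative; this is not kernel or KCSAN code.
 */
static const void *watched;	/* address currently under assertion, or NULL */
static int violations;		/* writes observed while the address was watched */

/* Stand-in for a marked write such as WRITE_ONCE(). */
static void model_write(int *var, int val)
{
	if (watched == var)
		violations++;	/* another writer showed up in the window */
	*var = val;
}

/* Stand-in for a marked read such as READ_ONCE(); readers never trip. */
static int model_read(const int *var)
{
	return *var;
}

/*
 * Open an assertion window over var, run other() as the simulated
 * concurrent context, and return how many violations were seen.
 */
static int model_assert_exclusive_writer(int *var, void (*other)(int *))
{
	watched = var;
	violations = 0;
	other(var);
	watched = NULL;
	return violations;
}

/* Simulated wakeup path: writes the flag, like activate_task(). */
static void concurrent_waker(int *on_rq)
{
	model_write(on_rq, 1);
}

/* Simulated observer: only reads the flag. */
static void concurrent_reader(int *on_rq)
{
	(void)model_read(on_rq);
}

/* Blocking path clears the flag, then a waker writes it: one violation. */
static int demo_concurrent_writer(void)
{
	int on_rq = 1;

	model_write(&on_rq, 0);	/* like __block_task(); no window open yet */
	return model_assert_exclusive_writer(&on_rq, concurrent_waker);
}

/* Same window, but the other context only reads: no violation. */
static int demo_concurrent_reader(void)
{
	int on_rq = 1;

	model_write(&on_rq, 0);
	return model_assert_exclusive_writer(&on_rq, concurrent_reader);
}

/* Call from main() to exercise both cases. */
static void run_model_checks(void)
{
	assert(demo_concurrent_writer() == 1);
	assert(demo_concurrent_reader() == 0);
}
```

Calling run_model_checks() from a main() exercises both cases. The
writer case mirrors the shape of the report: one context clears the
flag and asserts, while another context writes the same flag.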

                                                        Thanx, Paul

> > Can someone please translate?
> 
> We can get the 2nd stack trace with:
> 
>       --- a/kernel/sched/Makefile
>       +++ b/kernel/sched/Makefile
>       @@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n
> 
>        # Disable KCSAN to avoid excessive noise and performance degradation. To avoid
>        # false positives ensure barriers implied by sched functions are instrumented.
>       -KCSAN_SANITIZE := n
>       -KCSAN_INSTRUMENT_BARRIERS := y
>       +#KCSAN_SANITIZE := n
>       +#KCSAN_INSTRUMENT_BARRIERS := y
> 
> Which gives us:
> 
>  | ==================================================================
>  | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate
>  | 
>  | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0:
>  |  activate_task kernel/sched/core.c:2064 [inline]
> 
> This is this one:
> 
>       void activate_task(struct rq *rq, struct task_struct *p, int flags)
>       {
>               if (task_on_rq_migrating(p))
>                       flags |= ENQUEUE_MIGRATED;
>               if (flags & ENQUEUE_MIGRATED)
>                       sched_mm_cid_migrate_to(rq, p);
> 
>               enqueue_task(rq, p, flags);
> 
>               WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
>               ASSERT_EXCLUSIVE_WRITER(p->on_rq);
>       }
> 
>  |  ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671
>  |  ttwu_queue kernel/sched/core.c:3944 [inline]
>  |  try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270
>  |  default_wake_function+0x25/0x30 kernel/sched/core.c:7009
>  |  __pollwake fs/select.c:205 [inline]
>  |  pollwake+0xc0/0x100 fs/select.c:215
>  |  __wake_up_common kernel/sched/wait.c:89 [inline]
>  |  __wake_up_common_lock kernel/sched/wait.c:106 [inline]
>  |  __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173
>  |  sock_def_readable+0x6f/0x180 net/core/sock.c:3442
>  |  tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193
>  |  tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283
>  |  tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237
>  |  tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915
>  |  tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350
>  |  ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205
>  |  ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233
>  |  NF_HOOK include/linux/netfilter.h:314 [inline]
>  |  ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254
>  |  dst_input include/net/dst.h:460 [inline]
>  |  ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
>  |  ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline]
>  |  ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638
>  |  ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672
>  |  __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline]
>  |  __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756
>  |  __netif_receive_skb_list net/core/dev.c:5808 [inline]
>  |  netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899
>  |  gro_normal_list include/net/gro.h:515 [inline]
>  |  napi_complete_done+0x161/0x3a0 net/core/dev.c:6250
>  |  e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
>  |  __napi_poll+0x66/0x360 net/core/dev.c:6775
>  |  napi_poll net/core/dev.c:6844 [inline]
>  |  net_rx_action+0x3d9/0x820 net/core/dev.c:6966
>  |  handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554
>  |  __do_softirq kernel/softirq.c:588 [inline]
>  |  invoke_softirq kernel/softirq.c:428 [inline]
>  |  __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637
>  |  common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278
>  |  asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
>  | 
>  | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3:
>  |  __block_task kernel/sched/sched.h:2770 [inline]
> 
> And that's:
> 
>       static inline void __block_task(struct rq *rq, struct task_struct *p)
>       {
>               WRITE_ONCE(p->on_rq, 0);
>               ASSERT_EXCLUSIVE_WRITER(p->on_rq);
>               if (p->sched_contributes_to_load)
>                       rq->nr_uninterruptible++;
> 
>  |  dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177
>  |  pick_next_entity kernel/sched/fair.c:5627 [inline]
>  |  pick_task_fair kernel/sched/fair.c:8856 [inline]
>  |  pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876
>  |  __pick_next_task kernel/sched/core.c:5955 [inline]
>  |  pick_next_task kernel/sched/core.c:6477 [inline]
>  |  __schedule+0x47a/0x1130 kernel/sched/core.c:6629
>  |  __schedule_loop kernel/sched/core.c:6752 [inline]
>  |  schedule+0x7b/0x130 kernel/sched/core.c:6767
>  |  do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032
>  |  hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080
>  |  common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365
>  |  __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline]
>  |  __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388
>  |  __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388
>  |  x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231
>  |  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  |  do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83
>  |  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>  | 
>  | Reported by Kernel Concurrency Sanitizer on:
>  | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7
>  | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>  | ==================================================================
