On Wed, Sep 02, 2015 at 03:44:05PM -0400, Tejun Heo wrote:
> (cc'ing peterz)
>
> Ooh, this is from irq_work which doesn't have much to do with
> workqueue. Peter?
>
> On Mon, Aug 24, 2015 at 05:16:11PM -0700, Paul E. McKenney wrote:
> > Hello, Tejun,
> >
> > As discussed last week, I am getting an occasional warning out of
> > irq_work_queue_on() WARN_ON_ONCE(cpu_is_offline(cpu)). The repeat-by
> > seems to be a week or so of rcutorture runs on 16-CPU KVM instances
> > on x86. So please see below on the off-chance that this is of use.
> > I have also attached a .config file.
> >
> > Thoughts?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > [ 875.702254] ------------[ cut here ]------------
> > [ 875.703111] WARNING: CPU: 0 PID: 768 at
> > /home/paulmck/public_git/bisect-linux-rcu/kernel/irq_work.c:69
> > irq_work_queue_on+0xd4/0x110()
> > [ 875.703227] Modules linked in:
> > [ 875.703227] CPU: 0 PID: 768 Comm: rcu_torture_rea Tainted: G W
> > 4.1.0-rc4+ #1
> > [ 875.703227] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > Bochs 01/01/2011
> > [ 875.703227] ffffffff81baadd8 ffff88001dc5fce8 ffffffff81895418
> > 00000000000000aa
> > [ 875.703227] 0000000000000000 ffff88001dc5fd28 ffffffff810517d5
> > 0000000000015bc0
> > [ 875.703227] 0000000000000004 0000000000000004 ffff88001fc8f980
> > ffff88001fc8d500
> > [ 875.703227] Call Trace:
> > [ 875.703227] [<ffffffff81895418>] dump_stack+0x45/0x57
> > [ 875.703227] [<ffffffff810517d5>] warn_slowpath_common+0x85/0xc0
> > [ 875.703227] [<ffffffff810518b5>] warn_slowpath_null+0x15/0x20
> > [ 875.703227] [<ffffffff811119a4>] irq_work_queue_on+0xd4/0x110
> > [ 875.703227] [<ffffffff810c2d74>] tick_nohz_full_kick_cpu+0x44/0x50

It happens in nohz full, but I'm not sure nohz full is the culprit.

The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
But this shouldn't happen. Either it selects a CPU that is in the domain
tree, and I suspect offline CPUs aren't supposed to be there, or it selects
the current CPU. And if the CPU is offline, it shouldn't be running a
kthread...

> > [ 875.703227] [<ffffffff81076384>] wake_up_nohz_cpu+0xb4/0x100
> > [ 875.703227] [<ffffffff810b1196>] internal_add_timer+0x86/0xa0
> > [ 875.703227] [<ffffffff810b30f1>] mod_timer+0xf1/0x1e0
> > [ 875.703227] [<ffffffff810a63a4>] rcu_torture_reader+0x2a4/0x2e0
> > [ 875.703227] [<ffffffff810a63e0>] ? rcu_torture_reader+0x2e0/0x2e0
> > [ 875.703227] [<ffffffff810a6100>] ?
> > rcutorture_trace_dump.part.10+0x20/0x20
> > [ 875.703227] [<ffffffff8106d75d>] kthread+0xcd/0xf0
> > [ 875.703227] [<ffffffff8106d690>] ? kthread_create_on_node+0x180/0x180
> > [ 875.703227] [<ffffffff8189fb92>] ret_from_fork+0x42/0x70
> > [ 875.703227] [<ffffffff8106d690>] ? kthread_create_on_node+0x180/0x180
> > [ 875.703227] ---[ end trace 74175128740d0113 ]---