Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop
On Thu, Jan 14, 2021 at 11:46:06AM +, Chris Wilson wrote:
> Quoting Frederic Weisbecker (2021-01-09 02:05:34)
> > +void noinstr sched_resched_local_assert_allowed(void)
> > +{
> > +	if (this_rq()->resched_local_allow)
> > +		return;
> > +
> > +	/*
> > +	 * Idle interrupts break the CPU from its pause and
> > +	 * rescheduling happens on idle loop exit.
> > +	 */
> > +	if (in_hardirq())
> > +		return;
> > +
> > +	/*
> > +	 * What applies to hardirq also applies to softirq as
> > +	 * we assume they execute on hardirq tail. Ksoftirqd
> > +	 * shouldn't have resched_local_allow == 0.
> > +	 * We also assume that no local_bh_enable() call may
> > +	 * execute softirqs inline on fragile idle/entry
> > +	 * path...
> > +	 */
> > +	if (in_serving_softirq())
> > +		return;
> > +
> > +	WARN_ONCE(1, "Late current task rescheduling may be lost\n");
> > +}
>
> This warn is impacting all suspend testing in linux-next:

Does the patch series here help?

https://lore.kernel.org/lkml/20210112144344.850850...@infradead.org/

And in Frederic's defense, his patch isn't causing the problem, but rather
calling attention to a condition that can lead to hangs.
							Thanx, Paul

> <4> [124.226839] Late current task rescheduling may be lost
> <4> [124.226847] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:628 sched_resched_local_assert_allowed+0x53/0x60
> <4> [124.226854] Modules linked in: vgem btusb btrtl btbcm btintel snd_hda_codec_hdmi bluetooth snd_hda_codec_realtek snd_hda_codec_generic coretemp ledtrig_audio crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ecdh_generic ecc r8169 realtek i915 lpc_ich snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_cherryview prime_numbers
> <4> [124.226905] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-rc3-next-20210113-gaa515cdce7a15-next-20210113 #1
> <4> [124.226909] Hardware name: /NUC5CPYB, BIOS PYBSWCEL.86A.0058.2016.1102.1842 11/02/2016
> <4> [124.226912] RIP: 0010:sched_resched_local_assert_allowed+0x53/0x60
> <4> [124.226918] Code: a9 00 00 0f 00 75 0e f6 c4 01 75 09 80 3d 55 42 d9 00 00 74 02 5b c3 48 c7 c7 68 0e 31 82 c6 05 43 42 d9 00 01 e8 9c 39 fb ff <0f> 0b 5b c3 cc cc cc cc cc cc cc cc cc 48 39 77 10 0f 84 97 00 00
> <4> [124.226922] RSP: 0018:82603cf8 EFLAGS: 00010082
> <4> [124.226926] RAX: RBX: 0003b280 RCX: 0007
> <4> [124.226929] RDX: 0007 RSI: 8112d041 RDI:
> <4> [124.226931] RBP: 888121b9a840 R08: R09:
> <4> [124.226934] R10: 0c3c R11: 0001 R12: 82603d50
> <4> [124.226937] R13: 888121b9b0c8 R14: 0083 R15: 88817b83b280
> <4> [124.226940] FS: () GS:88817b80() knlGS:
> <4> [124.226943] CS: 0010 DS: ES: CR0: 80050033
> <4> [124.226946] CR2: 5645075dd1b0 CR3: 05612000 CR4: 001006f0
> <4> [124.226949] Call Trace:
> <4> [124.226951]  check_preempt_curr+0x44/0x90
> <4> [124.226956]  ttwu_do_wakeup+0x14/0x220
> <4> [124.226961]  try_to_wake_up+0x1ef/0x7b0
> <4> [124.226966]  ? get_pwq.isra.21+0x2c/0x40
> <4> [124.226972]  __queue_work+0x180/0x610
> <4> [124.226977]  queue_work_on+0x65/0x70
> <4> [124.226981]  timekeeping_resume+0x150/0x1b0
> <4> [124.226986]  tick_unfreeze+0x3c/0x120
> <4> [124.226990]  cpuidle_enter_s2idle+0xec/0x180
> <4> [124.226995]  do_idle+0x1f3/0x2b0
> <4> [124.227000]  cpu_startup_entry+0x14/0x20
> <4> [124.227004]  start_kernel+0x551/0x576
> <4> [124.227009]  secondary_startup_64_no_verify+0xb0/0xbb
> <4> [124.227017] irq event stamp: 595206
> <4> [124.227019] hardirqs last enabled at (595205): [] asm_sysvec_apic_timer_interrupt+0x12/0x20
> <4> [124.227024] hardirqs last disabled at (595206): [] do_idle+0xaa/0x2b0
> <4> [124.227028] softirqs last enabled at (595182): [] __do_softirq+0x342/0x48e
> <4> [124.227033] softirqs last disabled at (595171): [] asm_call_irq_on_stack+0x12/0x20
> <4> [124.227038] ---[ end trace 6fff00bd318698a4 ]---
> <4> [124.227050]
> <4> [124.227052] ======================================================
> <4> [124.227054] WARNING: possible circular locking dependency detected
> <4> [124.227055] 5.11.0-rc3-next-20210113-gaa515cdce7a15-next-20210113 #1 Not tainted
> <4> [124.227056] ------------------------------------------------------
> <4> [124.227058] swapper/0/0 is trying to acquire lock:
> <4> [124.227059] 8272fdb8 ((console_sem).lock){-...}-{2:2}, at: down_trylock+0xa/0x30
> <4> [124.227063]
> <4> [124.227064] but task is already holding lock:
> <4> [124.227065] 88817b83b298 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0x1c9/0x7b0
> <4> [124.227069]
> <4>
Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop
On Mon, Jan 11, 2021 at 01:25:59PM +0100, Peter Zijlstra wrote:
> On Sat, Jan 09, 2021 at 03:05:34AM +0100, Frederic Weisbecker wrote:
> > The idle loop has several need_resched() checks that make sure we don't
> > miss a rescheduling request. This means that any wake up performed on
> > the local runqueue after the last generic need_resched() check is going
> > to have its rescheduling silently ignored. This has happened in the
> > past with rcu kthreads awaken from rcu_idle_enter() for example.
> >
> > Perform sanity checks to report these situations.
>
> I really don't like this..
>
>  - it's too specific to the actual reschedule condition, any wakeup this
>    late is dodgy, not only those that happen to cause a local
>    reschedule.

Right.

>  - we can already test this with unwind and checking against __cpuidle
>
>  - moving all of __cpuidle into noinstr would also cover this. And we're
>    going to have to do that anyway.

Ok then, I'll wait for that instead.

> > +void noinstr sched_resched_local_assert_allowed(void)
> > +{
> > +	if (this_rq()->resched_local_allow)
> > +		return;
> > +
> > +	/*
> > +	 * Idle interrupts break the CPU from its pause and
> > +	 * rescheduling happens on idle loop exit.
> > +	 */
> > +	if (in_hardirq())
> > +		return;
> > +
> > +	/*
> > +	 * What applies to hardirq also applies to softirq as
> > +	 * we assume they execute on hardirq tail. Ksoftirqd
> > +	 * shouldn't have resched_local_allow == 0.
> > +	 * We also assume that no local_bh_enable() call may
> > +	 * execute softirqs inline on fragile idle/entry
> > +	 * path...
> > +	 */
> > +	if (in_serving_softirq())
> > +		return;
> > +
> > +	WARN_ONCE(1, "Late current task rescheduling may be lost\n");
>
> That seems like it wants to be:
>
> 	WARN_ONCE(in_task(), "...");

Right! But I guess I'll drop that patch now.

Thanks.
Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop
On Sat, Jan 09, 2021 at 03:05:34AM +0100, Frederic Weisbecker wrote:
> The idle loop has several need_resched() checks that make sure we don't
> miss a rescheduling request. This means that any wake up performed on
> the local runqueue after the last generic need_resched() check is going
> to have its rescheduling silently ignored. This has happened in the
> past with rcu kthreads awaken from rcu_idle_enter() for example.
>
> Perform sanity checks to report these situations.

I really don't like this..

 - it's too specific to the actual reschedule condition, any wakeup this
   late is dodgy, not only those that happen to cause a local
   reschedule.

 - we can already test this with unwind and checking against __cpuidle

 - moving all of __cpuidle into noinstr would also cover this. And we're
   going to have to do that anyway.

> +void noinstr sched_resched_local_assert_allowed(void)
> +{
> +	if (this_rq()->resched_local_allow)
> +		return;
> +
> +	/*
> +	 * Idle interrupts break the CPU from its pause and
> +	 * rescheduling happens on idle loop exit.
> +	 */
> +	if (in_hardirq())
> +		return;
> +
> +	/*
> +	 * What applies to hardirq also applies to softirq as
> +	 * we assume they execute on hardirq tail. Ksoftirqd
> +	 * shouldn't have resched_local_allow == 0.
> +	 * We also assume that no local_bh_enable() call may
> +	 * execute softirqs inline on fragile idle/entry
> +	 * path...
> +	 */
> +	if (in_serving_softirq())
> +		return;
> +
> +	WARN_ONCE(1, "Late current task rescheduling may be lost\n");

That seems like it wants to be:

	WARN_ONCE(in_task(), "...");

> +}
[RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop
The idle loop has several need_resched() checks that make sure we don't
miss a rescheduling request. This means that any wake up performed on
the local runqueue after the last generic need_resched() check is going
to have its rescheduling silently ignored. This has happened in the
past with rcu kthreads awaken from rcu_idle_enter() for example.

Perform sanity checks to report these situations.

Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Paul E. McKenney
Cc: Rafael J. Wysocki
---
 include/linux/sched.h | 11 +++++++++++
 kernel/sched/core.c   | 42 ++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/idle.c   |  3 +++
 kernel/sched/sched.h  |  3 +++
 4 files changed, 59 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..83fedda54943 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1917,6 +1917,17 @@ static __always_inline bool need_resched(void)
 	return unlikely(tif_need_resched());
 }
 
+#ifdef CONFIG_SCHED_DEBUG
+extern void sched_resched_local_allow(void);
+extern void sched_resched_local_forbid(void);
+extern void sched_resched_local_assert_allowed(void);
+#else
+static inline void sched_resched_local_allow(void) { }
+static inline void sched_resched_local_forbid(void) { }
+static inline void sched_resched_local_assert_allowed(void) { }
+#endif
+
+
 /*
  * Wrappers for p->thread_info->cpu access. No-op on UP.
  */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 15d2562118d1..6056f0374674 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -591,6 +591,44 @@ void wake_up_q(struct wake_q_head *head)
 	}
 }
 
+#ifdef CONFIG_SCHED_DEBUG
+void noinstr sched_resched_local_allow(void)
+{
+	this_rq()->resched_local_allow = 1;
+}
+
+void noinstr sched_resched_local_forbid(void)
+{
+	this_rq()->resched_local_allow = 0;
+}
+
+void noinstr sched_resched_local_assert_allowed(void)
+{
+	if (this_rq()->resched_local_allow)
+		return;
+
+	/*
+	 * Idle interrupts break the CPU from its pause and
+	 * rescheduling happens on idle loop exit.
+	 */
+	if (in_hardirq())
+		return;
+
+	/*
+	 * What applies to hardirq also applies to softirq as
+	 * we assume they execute on hardirq tail. Ksoftirqd
+	 * shouldn't have resched_local_allow == 0.
+	 * We also assume that no local_bh_enable() call may
+	 * execute softirqs inline on fragile idle/entry
+	 * path...
+	 */
+	if (in_serving_softirq())
+		return;
+
+	WARN_ONCE(1, "Late current task rescheduling may be lost\n");
+}
+#endif
+
 /*
  * resched_curr - mark rq's current task 'to be rescheduled now'.
  *
@@ -613,6 +651,7 @@ void resched_curr(struct rq *rq)
 	if (cpu == smp_processor_id()) {
 		set_tsk_need_resched(curr);
 		set_preempt_need_resched();
+		sched_resched_local_assert_allowed();
 		return;
 	}
 
@@ -7796,6 +7835,9 @@ void __init sched_init(void)
 #endif /* CONFIG_SMP */
 		hrtick_rq_init(rq);
 		atomic_set(&rq->nr_iowait, 0);
+#ifdef CONFIG_SCHED_DEBUG
+		rq->resched_local_allow = 1;
+#endif
 	}
 
 	set_load_weight(&init_task, false);
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index b601a3aa2152..cdffd32812bd 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -185,6 +185,8 @@ static void cpuidle_idle_call(void)
 		return;
 	}
 
+	sched_resched_local_forbid();
+
 	/*
 	 * The RCU framework needs to be told that we are entering an idle
 	 * section, so no more rcu read side critical sections and one more
@@ -247,6 +249,7 @@ static void cpuidle_idle_call(void)
 	}
 
 exit_idle:
+	sched_resched_local_allow();
 	__current_set_polling();
 
 	/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 12ada79d40f3..a9416c383451 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1060,6 +1060,9 @@ struct rq {
 #endif
 	unsigned int		push_busy;
 	struct cpu_stop_work	push_work;
+#ifdef CONFIG_SCHED_DEBUG
+	unsigned int		resched_local_allow;
+#endif
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-- 
2.25.1