Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop

2021-01-14 Thread Paul E. McKenney
On Thu, Jan 14, 2021 at 11:46:06AM +, Chris Wilson wrote:
> Quoting Frederic Weisbecker (2021-01-09 02:05:34)
> > +void noinstr sched_resched_local_assert_allowed(void)
> > +{
> > +   if (this_rq()->resched_local_allow)
> > +   return;
> > +
> > +   /*
> > +* Idle interrupts break the CPU from its pause and
> > +* rescheduling happens on idle loop exit.
> > +*/
> > +   if (in_hardirq())
> > +   return;
> > +
> > +   /*
> > +* What applies to hardirq also applies to softirq as
> > +* we assume they execute on hardirq tail. Ksoftirqd
> > +* shouldn't have resched_local_allow == 0.
> > +* We also assume that no local_bh_enable() call may
> > +* execute softirqs inline on fragile idle/entry
> > +* path...
> > +*/
> > +   if (in_serving_softirq())
> > +   return;
> > +
> > +   WARN_ONCE(1, "Late current task rescheduling may be lost\n");
> > +}
> 
> This warn is impacting all suspend testing in linux-next:

Does the patch series here help?

https://lore.kernel.org/lkml/20210112144344.850850...@infradead.org/

And in Frederic's defense, his patch isn't causing the problem, but
rather calling attention to a condition that can lead to hangs.

Thanx, Paul

> <4> [124.226839] Late current task rescheduling may be lost
> <4> [124.226847] WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:628 
> sched_resched_local_assert_allowed+0x53/0x60
> <4> [124.226854] Modules linked in: vgem btusb btrtl btbcm btintel 
> snd_hda_codec_hdmi bluetooth snd_hda_codec_realtek snd_hda_codec_generic 
> coretemp ledtrig_audio crct10dif_pclmul crc32_pclmul ghash_clmulni_intel 
> ecdh_generic ecc r8169 realtek i915 lpc_ich snd_hda_intel snd_intel_dspcfg 
> snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_cherryview prime_numbers
> <4> [124.226905] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 5.11.0-rc3-next-20210113-gaa515cdce7a15-next-20210113 #1
> <4> [124.226909] Hardware name:  /NUC5CPYB, BIOS 
> PYBSWCEL.86A.0058.2016.1102.1842 11/02/2016
> <4> [124.226912] RIP: 0010:sched_resched_local_assert_allowed+0x53/0x60
> <4> [124.226918] Code: a9 00 00 0f 00 75 0e f6 c4 01 75 09 80 3d 55 42 d9 00 
> 00 74 02 5b c3 48 c7 c7 68 0e 31 82 c6 05 43 42 d9 00 01 e8 9c 39 fb ff <0f> 
> 0b 5b c3 cc cc cc cc cc cc cc cc cc 48 39 77 10 0f 84 97 00 00
> <4> [124.226922] RSP: 0018:82603cf8 EFLAGS: 00010082
> <4> [124.226926] RAX:  RBX: 0003b280 RCX: 
> 0007
> <4> [124.226929] RDX: 0007 RSI: 8112d041 RDI: 
> 
> <4> [124.226931] RBP: 888121b9a840 R08:  R09: 
> 
> <4> [124.226934] R10: 0c3c R11: 0001 R12: 
> 82603d50
> <4> [124.226937] R13: 888121b9b0c8 R14: 0083 R15: 
> 88817b83b280
> <4> [124.226940] FS:  () GS:88817b80() 
> knlGS:
> <4> [124.226943] CS:  0010 DS:  ES:  CR0: 80050033
> <4> [124.226946] CR2: 5645075dd1b0 CR3: 05612000 CR4: 
> 001006f0
> <4> [124.226949] Call Trace:
> <4> [124.226951]  check_preempt_curr+0x44/0x90
> <4> [124.226956]  ttwu_do_wakeup+0x14/0x220
> <4> [124.226961]  try_to_wake_up+0x1ef/0x7b0
> <4> [124.226966]  ? get_pwq.isra.21+0x2c/0x40
> <4> [124.226972]  __queue_work+0x180/0x610
> <4> [124.226977]  queue_work_on+0x65/0x70
> <4> [124.226981]  timekeeping_resume+0x150/0x1b0
> <4> [124.226986]  tick_unfreeze+0x3c/0x120
> <4> [124.226990]  cpuidle_enter_s2idle+0xec/0x180
> <4> [124.226995]  do_idle+0x1f3/0x2b0
> <4> [124.227000]  cpu_startup_entry+0x14/0x20
> <4> [124.227004]  start_kernel+0x551/0x576
> <4> [124.227009]  secondary_startup_64_no_verify+0xb0/0xbb
> <4> [124.227017] irq event stamp: 595206
> <4> [124.227019] hardirqs last  enabled at (595205): [] 
> asm_sysvec_apic_timer_interrupt+0x12/0x20
> <4> [124.227024] hardirqs last disabled at (595206): [] 
> do_idle+0xaa/0x2b0
> <4> [124.227028] softirqs last  enabled at (595182): [] 
> __do_softirq+0x342/0x48e
> <4> [124.227033] softirqs last disabled at (595171): [] 
> asm_call_irq_on_stack+0x12/0x20
> <4> [124.227038] ---[ end trace 6fff00bd318698a4 ]---
> <4> [124.227050] 
> <4> [124.227052] ==
> <4> [124.227054] WARNING: possible circular locking dependency detected
> <4> [124.227055] 5.11.0-rc3-next-20210113-gaa515cdce7a15-next-20210113 #1 Not 
> tainted
> <4> [124.227056] --
> <4> [124.227058] swapper/0/0 is trying to acquire lock:
> <4> [124.227059] 8272fdb8 ((console_sem).lock){-...}-{2:2}, at: 
> down_trylock+0xa/0x30
> <4> [124.227063] 
> <4> [124.227064] but task is already holding lock:
> <4> [124.227065] 88817b83b298 (>lock){-.-.}-{2:2}, at: 
> try_to_wake_up+0x1c9/0x7b0
> <4> [124.227069] 
> <4> 

Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop

2021-01-11 Thread Frederic Weisbecker
On Mon, Jan 11, 2021 at 01:25:59PM +0100, Peter Zijlstra wrote:
> On Sat, Jan 09, 2021 at 03:05:34AM +0100, Frederic Weisbecker wrote:
> > The idle loop has several need_resched() checks that make sure we don't
> > miss a rescheduling request. This means that any wake up performed on
> > the local runqueue after the last generic need_resched() check is going
> > to have its rescheduling silently ignored. This has happened in the
> > past with rcu kthreads awaken from rcu_idle_enter() for example.
> > 
> > Perform sanity checks to report these situations.
> 
> I really don't like this..
> 
>  - it's too specific to the actual reschedule condition, any wakeup this
>late is dodgy, not only those that happen to cause a local
>reschedule.

Right.

> 
>  - we can already test this with unwind and checking against __cpuidle
> 
>  - moving all of __cpuidle into noinstr would also cover this. And we're
>going to have to do that anyway.

Ok then, I'll wait for that instead.

> 
> > +void noinstr sched_resched_local_assert_allowed(void)
> > +{
> > +   if (this_rq()->resched_local_allow)
> > +   return;
> > +
> 
> > +   /*
> > +* Idle interrupts break the CPU from its pause and
> > +* rescheduling happens on idle loop exit.
> > +*/
> > +   if (in_hardirq())
> > +   return;
> > +
> > +   /*
> > +* What applies to hardirq also applies to softirq as
> > +* we assume they execute on hardirq tail. Ksoftirqd
> > +* shouldn't have resched_local_allow == 0.
> > +* We also assume that no local_bh_enable() call may
> > +* execute softirqs inline on fragile idle/entry
> > +* path...
> > +*/
> > +   if (in_serving_softirq())
> > +   return;
> > +
> > +   WARN_ONCE(1, "Late current task rescheduling may be lost\n");
> 
> That seems like it wants to be:
> 
>   WARN_ONCE(in_task(), "...");

Right! But I guess I'll drop that patch now.

Thanks.


Re: [RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop

2021-01-11 Thread Peter Zijlstra
On Sat, Jan 09, 2021 at 03:05:34AM +0100, Frederic Weisbecker wrote:
> The idle loop has several need_resched() checks that make sure we don't
> miss a rescheduling request. This means that any wake up performed on
> the local runqueue after the last generic need_resched() check is going
> to have its rescheduling silently ignored. This has happened in the
> past with rcu kthreads awaken from rcu_idle_enter() for example.
> 
> Perform sanity checks to report these situations.

I really don't like this..

 - it's too specific to the actual reschedule condition, any wakeup this
   late is dodgy, not only those that happen to cause a local
   reschedule.

 - we can already test this with unwind and checking against __cpuidle

 - moving all of __cpuidle into noinstr would also cover this. And we're
   going to have to do that anyway.

> +void noinstr sched_resched_local_assert_allowed(void)
> +{
> + if (this_rq()->resched_local_allow)
> + return;
> +

> + /*
> +  * Idle interrupts break the CPU from its pause and
> +  * rescheduling happens on idle loop exit.
> +  */
> + if (in_hardirq())
> + return;
> +
> + /*
> +  * What applies to hardirq also applies to softirq as
> +  * we assume they execute on hardirq tail. Ksoftirqd
> +  * shouldn't have resched_local_allow == 0.
> +  * We also assume that no local_bh_enable() call may
> +  * execute softirqs inline on fragile idle/entry
> +  * path...
> +  */
> + if (in_serving_softirq())
> + return;
> +
> + WARN_ONCE(1, "Late current task rescheduling may be lost\n");

That seems like it wants to be:

WARN_ONCE(in_task(), "...");

> +}




[RFC PATCH 6/8] sched: Report local wake up on resched blind zone within idle loop

2021-01-08 Thread Frederic Weisbecker
The idle loop has several need_resched() checks that make sure we don't
miss a rescheduling request. This means that any wake up performed on
the local runqueue after the last generic need_resched() check is going
to have its rescheduling silently ignored. This has happened in the
past with rcu kthreads awaken from rcu_idle_enter() for example.

Perform sanity checks to report these situations.

Signed-off-by: Frederic Weisbecker 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar
Cc: Paul E. McKenney 
Cc: Rafael J. Wysocki 
---
 include/linux/sched.h | 11 +++
 kernel/sched/core.c   | 42 ++
 kernel/sched/idle.c   |  3 +++
 kernel/sched/sched.h  |  3 +++
 4 files changed, 59 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..83fedda54943 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1917,6 +1917,17 @@ static __always_inline bool need_resched(void)
return unlikely(tif_need_resched());
 }
 
+#ifdef CONFIG_SCHED_DEBUG
+extern void sched_resched_local_allow(void);
+extern void sched_resched_local_forbid(void);
+extern void sched_resched_local_assert_allowed(void);
+#else
+static inline void sched_resched_local_allow(void) { }
+static inline void sched_resched_local_forbid(void) { }
+static inline void sched_resched_local_assert_allowed(void) { }
+#endif
+
+
 /*
  * Wrappers for p->thread_info->cpu access. No-op on UP.
  */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 15d2562118d1..6056f0374674 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -591,6 +591,44 @@ void wake_up_q(struct wake_q_head *head)
}
 }
 
+#ifdef CONFIG_SCHED_DEBUG
+void noinstr sched_resched_local_allow(void)
+{
+   this_rq()->resched_local_allow = 1;
+}
+
+void noinstr sched_resched_local_forbid(void)
+{
+   this_rq()->resched_local_allow = 0;
+}
+
+void noinstr sched_resched_local_assert_allowed(void)
+{
+   if (this_rq()->resched_local_allow)
+   return;
+
+   /*
+* Idle interrupts break the CPU from its pause and
+* rescheduling happens on idle loop exit.
+*/
+   if (in_hardirq())
+   return;
+
+   /*
+* What applies to hardirq also applies to softirq as
+* we assume they execute on hardirq tail. Ksoftirqd
+* shouldn't have resched_local_allow == 0.
+* We also assume that no local_bh_enable() call may
+* execute softirqs inline on fragile idle/entry
+* path...
+*/
+   if (in_serving_softirq())
+   return;
+
+   WARN_ONCE(1, "Late current task rescheduling may be lost\n");
+}
+#endif
+
 /*
  * resched_curr - mark rq's current task 'to be rescheduled now'.
  *
@@ -613,6 +651,7 @@ void resched_curr(struct rq *rq)
if (cpu == smp_processor_id()) {
set_tsk_need_resched(curr);
set_preempt_need_resched();
+   sched_resched_local_assert_allowed();
return;
}
 
@@ -7796,6 +7835,9 @@ void __init sched_init(void)
 #endif /* CONFIG_SMP */
hrtick_rq_init(rq);
atomic_set(>nr_iowait, 0);
+#ifdef CONFIG_SCHED_DEBUG
+   rq->resched_local_allow = 1;
+#endif
}
 
set_load_weight(_task, false);
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index b601a3aa2152..cdffd32812bd 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -185,6 +185,8 @@ static void cpuidle_idle_call(void)
return;
}
 
+   sched_resched_local_forbid();
+
/*
 * The RCU framework needs to be told that we are entering an idle
 * section, so no more rcu read side critical sections and one more
@@ -247,6 +249,7 @@ static void cpuidle_idle_call(void)
}
 
 exit_idle:
+   sched_resched_local_allow();
__current_set_polling();
 
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 12ada79d40f3..a9416c383451 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1060,6 +1060,9 @@ struct rq {
 #endif
unsigned intpush_busy;
struct cpu_stop_workpush_work;
+#ifdef CONFIG_SCHED_DEBUG
+   unsigned intresched_local_allow;
+#endif
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-- 
2.25.1