Re: [PATCH v4 10/17] watchdog/hardlockup: Move perf hardlockup watchdog petting to watchdog.c

2023-05-19 Thread Doug Anderson
Hi,

On Thu, May 11, 2023 at 8:46 AM Petr Mladek  wrote:
>
> > @@ -111,6 +125,11 @@ static void watchdog_hardlockup_interrupt_count(void)
> >
> >  void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
> >  {
> > + if (__this_cpu_read(watchdog_hardlockup_touch)) {
> > + __this_cpu_write(watchdog_hardlockup_touch, false);
> > + return;
> > + }
>
> If we clear watchdog_hardlockup_touch() here then
> watchdog_hardlockup_check() won't be called yet another
> watchdog_hrtimer_sample_threshold perior.
>
> It means that any touch will cause ignoring one full period.
> The is_hardlockup() check will be done after full two periods.
>
> It is not ideal, see below.
>
> > +
> >   /*
> >* Check for a hardlockup by making sure the CPU's timer
> >* interrupt is incrementing. The timer interrupt should have
> > diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
> > index 9be90b2a2ea7..547917ebd5d3 100644
> > --- a/kernel/watchdog_perf.c
> > +++ b/kernel/watchdog_perf.c
> > @@ -112,11 +98,6 @@ static void watchdog_overflow_callback(struct 
> > perf_event *event,
> >   /* Ensure the watchdog never gets throttled */
> >   event->hw.interrupts = 0;
> >
> > - if (__this_cpu_read(watchdog_nmi_touch) == true) {
> > - __this_cpu_write(watchdog_nmi_touch, false);
> > - return;
> > - }
>
> The original code looks wrong. arch_touch_nmi_watchdog() caused
> skipping only one period of the perf event.
>
> I would expect that it caused restarting the period,
> something like:
>
> if (__this_cpu_read(watchdog_nmi_touch) == true) {
> /*
>  * Restart the period after which the interrupt
>  * counter is checked.
>  */
> __this_cpu_write(nmi_rearmed, 0);
> __this_cpu_write(last_timestamp, now);
> __this_cpu_write(watchdog_nmi_touch, false);
> return;
> }
>
> By other words, we should restart the period in the very next perf
> event after the watchdog was touched.
>
> That said, the new code looks better than the original.
> IMHO, the original code was prone to false positives.

I had a little bit of a hard time following, but I _think_ the "tl;dr"
of all the above is that my change is fine. If I misunderstood, please
yell.


> Best Regards,
> Petr
>
> PS: It might be worth fixing this problem in a separate patch at the
> beginning of this patchset. It might be a candidate for stable
> backports.

Done. It's now its own patch and early in the series.


Re: [PATCH v4 10/17] watchdog/hardlockup: Move perf hardlockup watchdog petting to watchdog.c

2023-05-11 Thread Petr Mladek
On Thu 2023-05-04 15:13:42, Douglas Anderson wrote:
> In preparation for the buddy hardlockup detector, which wants the same
> petting logic as the current perf hardlockup detector, move the code
> to watchdog.c. While doing this, rename the global variable to match
> others nearby. The arch_touch_nmi_watchdog() function is not renamed
> since that is exported with "EXPORT_SYMBOL" and is thus ABI.
> 
> Currently the code in watchdog.c is guarded by
> CONFIG_HARDLOCKUP_DETECTOR_PERF, which makes this change seem
> silly. However, a future patch will change this.
> 
> NOTE: there is a slight side effect to this change, though from code
> analysis I believe it to be a beneficial one. Specifically the perf
> hardlockup detector will now do check the "timestamp" before clearing
> any watchdog pets. Previously the order was reversed. With the old
> order if the watchdog perf event was firing super fast then it would
> also be clearing existing watchdog pets super fast. The new behavior
> of handling super-fast perf before clearing watchdog pets seems
> better.

Ah, I think that this was actually pretty serious bug in the perf
detector. But I think that it should work another way, see below.

> Signed-off-by: Douglas Anderson 
> ---
> 
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -111,6 +125,11 @@ static void watchdog_hardlockup_interrupt_count(void)
>  
>  void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>  {
> + if (__this_cpu_read(watchdog_hardlockup_touch)) {
> + __this_cpu_write(watchdog_hardlockup_touch, false);
> + return;
> + }

If we clear watchdog_hardlockup_touch() here then
watchdog_hardlockup_check() won't be called yet another
watchdog_hrtimer_sample_threshold perior.

It means that any touch will cause ignoring one full period.
The is_hardlockup() check will be done after full two periods.

It is not ideal, see below.

> +
>   /*
>* Check for a hardlockup by making sure the CPU's timer
>* interrupt is incrementing. The timer interrupt should have
> diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
> index 9be90b2a2ea7..547917ebd5d3 100644
> --- a/kernel/watchdog_perf.c
> +++ b/kernel/watchdog_perf.c
> @@ -112,11 +98,6 @@ static void watchdog_overflow_callback(struct perf_event 
> *event,
>   /* Ensure the watchdog never gets throttled */
>   event->hw.interrupts = 0;
>  
> - if (__this_cpu_read(watchdog_nmi_touch) == true) {
> - __this_cpu_write(watchdog_nmi_touch, false);
> - return;
> - }

The original code looks wrong. arch_touch_nmi_watchdog() caused
skipping only one period of the perf event.

I would expect that it caused restarting the period,
something like:

if (__this_cpu_read(watchdog_nmi_touch) == true) {
/*
 * Restart the period after which the interrupt
 * counter is checked.
 */
__this_cpu_write(nmi_rearmed, 0);
__this_cpu_write(last_timestamp, now);
__this_cpu_write(watchdog_nmi_touch, false);
return;
}

By other words, we should restart the period in the very next perf
event after the watchdog was touched.

That said, the new code looks better than the original.
IMHO, the original code was prone to false positives.

Best Regards,
Petr

PS: It might be worth fixing this problem in a separate patch at the
beginning of this patchset. It might be a candidate for stable
backports.

> -
>   if (!watchdog_check_timestamp())
>   return;
>  


[PATCH v4 10/17] watchdog/hardlockup: Move perf hardlockup watchdog petting to watchdog.c

2023-05-04 Thread Douglas Anderson
In preparation for the buddy hardlockup detector, which wants the same
petting logic as the current perf hardlockup detector, move the code
to watchdog.c. While doing this, rename the global variable to match
others nearby. The arch_touch_nmi_watchdog() function is not renamed
since that is exported with "EXPORT_SYMBOL" and is thus ABI.

Currently the code in watchdog.c is guarded by
CONFIG_HARDLOCKUP_DETECTOR_PERF, which makes this change seem
silly. However, a future patch will change this.

NOTE: there is a slight side effect to this change, though from code
analysis I believe it to be a beneficial one. Specifically the perf
hardlockup detector will now do check the "timestamp" before clearing
any watchdog pets. Previously the order was reversed. With the old
order if the watchdog perf event was firing super fast then it would
also be clearing existing watchdog pets super fast. The new behavior
of handling super-fast perf before clearing watchdog pets seems
better.

Signed-off-by: Douglas Anderson 
---

Changes in v4:
- ("Move perf hardlockup watchdog petting to watchdog.c") new for v4.

 include/linux/nmi.h|  5 +++--
 kernel/watchdog.c  | 19 +++
 kernel/watchdog_perf.c | 19 ---
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 2c9ea1ba285c..94ff84e1c8ec 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -88,7 +88,10 @@ static inline void hardlockup_detector_disable(void) {}
 #endif
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
+void arch_touch_nmi_watchdog(void);
 void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs);
+#elif !defined(CONFIG_HAVE_NMI_WATCHDOG)
+static inline void arch_touch_nmi_watchdog(void) { }
 #endif
 
 #if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
@@ -98,7 +101,6 @@ void watchdog_hardlockup_check(unsigned int cpu, struct 
pt_regs *regs);
 #endif
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
-extern void arch_touch_nmi_watchdog(void);
 extern void hardlockup_detector_perf_stop(void);
 extern void hardlockup_detector_perf_restart(void);
 extern void hardlockup_detector_perf_disable(void);
@@ -113,7 +115,6 @@ static inline void hardlockup_detector_perf_enable(void) { }
 static inline void hardlockup_detector_perf_cleanup(void) { }
 # if !defined(CONFIG_HAVE_NMI_WATCHDOG)
 static inline int hardlockup_detector_perf_init(void) { return -ENODEV; }
-static inline void arch_touch_nmi_watchdog(void) {}
 # else
 static inline int hardlockup_detector_perf_init(void) { return 0; }
 # endif
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 367bea0167a5..9c17090611f2 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -90,8 +90,22 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
 static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
 static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);
 static DEFINE_PER_CPU(bool, watchdog_hardlockup_processed);
+static DEFINE_PER_CPU(bool, watchdog_hardlockup_touch);
 static unsigned long watchdog_hardlockup_dumped_stacks;
 
+notrace void arch_touch_nmi_watchdog(void)
+{
+   /*
+* Using __raw here because some code paths have
+* preemption enabled.  If preemption is enabled
+* then interrupts should be enabled too, in which
+* case we shouldn't have to worry about the watchdog
+* going off.
+*/
+   raw_cpu_write(watchdog_hardlockup_touch, true);
+}
+EXPORT_SYMBOL(arch_touch_nmi_watchdog);
+
 static bool watchdog_hardlockup_is_lockedup(unsigned int cpu)
 {
unsigned long hrint = per_cpu(hrtimer_interrupts, cpu);
@@ -111,6 +125,11 @@ static void watchdog_hardlockup_interrupt_count(void)
 
 void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 {
+   if (__this_cpu_read(watchdog_hardlockup_touch)) {
+   __this_cpu_write(watchdog_hardlockup_touch, false);
+   return;
+   }
+
/*
 * Check for a hardlockup by making sure the CPU's timer
 * interrupt is incrementing. The timer interrupt should have
diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index 9be90b2a2ea7..547917ebd5d3 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -20,26 +20,12 @@
 #include 
 #include 
 
-static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
 static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
 static DEFINE_PER_CPU(struct perf_event *, dead_event);
 static struct cpumask dead_events_mask;
 
 static atomic_t watchdog_cpus = ATOMIC_INIT(0);
 
-notrace void arch_touch_nmi_watchdog(void)
-{
-   /*
-* Using __raw here because some code paths have
-* preemption enabled.  If preemption is enabled
-* then interrupts should be enabled too, in which
-* case we shouldn't have to worry about the watchdog
-* going off.
-*/
-   raw_cpu_write(watchdog_nmi_touch, true);
-}