On Fri, Apr 21, 2023 at 3:54 PM Douglas Anderson <diand...@chromium.org> wrote: > > From: Colin Cross <ccr...@android.com> > > Implement a hardlockup detector that can be enabled on SMP systems > that don't have an arch provided one or one implemented atop perf by > using interrupts on other cpus. Each cpu will use its softlockup > hrtimer to check that the next cpu is processing hrtimer interrupts by > verifying that a counter is increasing. > > NOTE: unlike the other hard lockup detectors, the buddy one can't > easily provide a backtrace on the CPU that locked up. It relies on > some other mechanism in the system to get information about the locked > up CPUs. This could be support for NMI backtraces like [1], it could > be a mechanism for printing the PC of locked CPUs like [2], or it > could be something else. > > This style of hardlockup detector originated in some downstream > Android trees and has been rebased on / carried in ChromeOS trees for > quite a long time for use on arm and arm64 boards. Historically on > these boards we've leveraged mechanism [2] to get information about > hung CPUs, but we could move to [1]. > > NOTE: the buddy system is not really useful to enable on any > architectures that have a better mechanism. On arm64 folks have been > trying to get a better mechanism for years and there has even been > recent posts of patches adding support [3]. However, nothing about the > buddy system is tied to arm64 and several archs (even arm32, where it > was originally developed) could find it useful. > > [1] https://lore.kernel.org/r/20230419225604.21204-1-diand...@chromium.org > [2] https://issuetracker.google.com/172213129 > [3] > https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.c...@mediatek.com/
There is another proposal to use timers for lockup detection but not the buddy system: https://lore.kernel.org/lkml/20230413035844.ga31...@ranerica-svr.sc.intel.com/ It'd be very good to free up the counter used by the current NMI watchdog. Thanks, Ian > Signed-off-by: Colin Cross <ccr...@android.com> > Signed-off-by: Matthias Kaehlcke <m...@chromium.org> > Signed-off-by: Guenter Roeck <gro...@chromium.org> > Signed-off-by: Tzung-Bi Shih <tzun...@chromium.org> > Signed-off-by: Douglas Anderson <diand...@chromium.org> > --- > This patch has been rebased in ChromeOS kernel trees many times, and > each time someone had to do work on it they added their > Signed-off-by. I've included those here. I've also left the author as > Colin Cross since the core code is still his. > > I'll also note that the CC list is pretty giant, but that's what > get_maintainers came up with (plus a few other folks I thought would > be interested). As far as I can tell, there's no true MAINTAINER > listed for the existing watchdog code. Assuming people don't hate > this, maybe it would go through Andrew Morton's tree? > > include/linux/nmi.h | 18 ++++- > kernel/Makefile | 1 + > kernel/watchdog.c | 24 ++++-- > kernel/watchdog_buddy_cpu.c | 141 ++++++++++++++++++++++++++++++++++++ > lib/Kconfig.debug | 19 ++++- > 5 files changed, 192 insertions(+), 11 deletions(-) > create mode 100644 kernel/watchdog_buddy_cpu.c > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h > index 048c0b9aa623..35f6c5c2378b 100644 > --- a/include/linux/nmi.h > +++ b/include/linux/nmi.h > @@ -45,6 +45,8 @@ extern void touch_softlockup_watchdog(void); > extern void touch_softlockup_watchdog_sync(void); > extern void touch_all_softlockup_watchdogs(void); > extern unsigned int softlockup_panic; > +DECLARE_PER_CPU(unsigned long, hrtimer_interrupts); > +DECLARE_PER_CPU(unsigned long, hrtimer_interrupts_saved); > > extern int lockup_detector_online_cpu(unsigned int cpu); > extern int lockup_detector_offline_cpu(unsigned int cpu); > @@ -81,14 +83,14 @@ static inline void reset_hung_task_detector(void) { } > #define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT) > #define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT) > > -#if defined(CONFIG_HARDLOCKUP_DETECTOR) > +#if defined(CONFIG_HARDLOCKUP_DETECTOR_CORE) > extern void hardlockup_detector_disable(void); > extern unsigned int hardlockup_panic; > #else > static inline void hardlockup_detector_disable(void) {} > #endif > > -#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR) > +#if defined(CONFIG_HAVE_NMI_WATCHDOG) || > defined(CONFIG_HARDLOCKUP_DETECTOR_CORE) > # define NMI_WATCHDOG_SYSCTL_PERM 0644 > #else > # define NMI_WATCHDOG_SYSCTL_PERM 0444 > @@ -124,6 +126,14 @@ void watchdog_nmi_disable(unsigned int cpu); > > void lockup_detector_reconfigure(void); > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_BUDDY_CPU > +void buddy_cpu_touch_watchdog(void); > +void watchdog_check_hardlockup(void); > +#else > +static inline void buddy_cpu_touch_watchdog(void) {} > +static inline void watchdog_check_hardlockup(void) {} > +#endif > + > /** > * touch_nmi_watchdog - restart NMI watchdog timeout. > * > @@ -134,6 +144,7 @@ void lockup_detector_reconfigure(void); > static inline void touch_nmi_watchdog(void) > { > arch_touch_nmi_watchdog(); > + buddy_cpu_touch_watchdog(); > touch_softlockup_watchdog(); > } > > @@ -196,8 +207,7 @@ static inline bool trigger_single_cpu_backtrace(int cpu) > u64 hw_nmi_get_sample_period(int watchdog_thresh); > #endif > > -#if defined(CONFIG_HARDLOCKUP_CHECK_TIMESTAMP) && \ > - defined(CONFIG_HARDLOCKUP_DETECTOR) > +#if defined(CONFIG_HARDLOCKUP_CHECK_TIMESTAMP) && > defined(CONFIG_HARDLOCKUP_DETECTOR_PERF) > void watchdog_update_hrtimer_threshold(u64 period); > #else > static inline void watchdog_update_hrtimer_threshold(u64 period) { } > diff --git a/kernel/Makefile b/kernel/Makefile > index 10ef068f598d..a2054f16f9f4 100644 > --- a/kernel/Makefile > +++ b/kernel/Makefile > @@ -91,6 +91,7 @@ obj-$(CONFIG_FAIL_FUNCTION) += fail_function.o > obj-$(CONFIG_KGDB) += debug/ > obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o > obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o > +obj-$(CONFIG_HARDLOCKUP_DETECTOR_BUDDY_CPU) += watchdog_buddy_cpu.o > obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o > obj-$(CONFIG_SECCOMP) += seccomp.o > obj-$(CONFIG_RELAY) += relay.o > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 8e61f21e7e33..1199043689ae 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -29,7 +29,7 @@ > > static DEFINE_MUTEX(watchdog_mutex); > > -#if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG) > +#if defined(CONFIG_HARDLOCKUP_DETECTOR_CORE) || > defined(CONFIG_HAVE_NMI_WATCHDOG) > # define WATCHDOG_DEFAULT (SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED) > # define NMI_WATCHDOG_DEFAULT 1 > #else > @@ -47,7 +47,7 @@ static int __read_mostly nmi_watchdog_available; > struct cpumask watchdog_cpumask __read_mostly; > unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask); > > -#ifdef CONFIG_HARDLOCKUP_DETECTOR > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_CORE > > # ifdef CONFIG_SMP > int __read_mostly sysctl_hardlockup_all_cpu_backtrace; > @@ -85,7 +85,9 @@ static int __init hardlockup_panic_setup(char *str) > } > __setup("nmi_watchdog=", hardlockup_panic_setup); > > -#endif /* CONFIG_HARDLOCKUP_DETECTOR */ > +#endif /* CONFIG_HARDLOCKUP_DETECTOR_CORE */ > + > +#ifdef CONFIG_HARDLOCKUP_DETECTOR > > /* > * These functions can be overridden if an architecture implements its > @@ -106,6 +108,13 @@ void __weak watchdog_nmi_disable(unsigned int cpu) > hardlockup_detector_perf_disable(); > } > > +#else > + > +int __weak watchdog_nmi_enable(unsigned int cpu) { return 0; } > +void __weak watchdog_nmi_disable(unsigned int cpu) { return; } > + > +#endif /* CONFIG_HARDLOCKUP_DETECTOR */ > + > /* Return 0, if a NMI watchdog is available. Error code otherwise */ > int __weak __init watchdog_nmi_probe(void) > { > @@ -179,8 +188,8 @@ static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts); > static DEFINE_PER_CPU(unsigned long, watchdog_report_ts); > static DEFINE_PER_CPU(struct hrtimer, watchdog_hrtimer); > static DEFINE_PER_CPU(bool, softlockup_touch_sync); > -static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts); > -static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); > +DEFINE_PER_CPU(unsigned long, hrtimer_interrupts); > +DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); > static unsigned long soft_lockup_nmi_warn; > > static int __init nowatchdog_setup(char *str) > @@ -364,6 +373,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct > hrtimer *hrtimer) > /* kick the hardlockup detector */ > watchdog_interrupt_count(); > > + /* test for hardlockups */ > + watchdog_check_hardlockup(); > + > /* kick the softlockup detector */ > if (completion_done(this_cpu_ptr(&softlockup_completion))) { > reinit_completion(this_cpu_ptr(&softlockup_completion)); > @@ -820,7 +832,7 @@ static struct ctl_table watchdog_sysctls[] = { > }, > #endif /* CONFIG_SMP */ > #endif > -#ifdef CONFIG_HARDLOCKUP_DETECTOR > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_CORE > { > .procname = "hardlockup_panic", > .data = &hardlockup_panic, > diff --git a/kernel/watchdog_buddy_cpu.c b/kernel/watchdog_buddy_cpu.c > new file mode 100644 > index 000000000000..db813b00e6ef > --- /dev/null > +++ b/kernel/watchdog_buddy_cpu.c > @@ -0,0 +1,141 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +#include <linux/cpu.h> > +#include <linux/cpumask.h> > +#include <linux/kernel.h> > +#include <linux/nmi.h> > +#include <linux/percpu-defs.h> > + > +static DEFINE_PER_CPU(bool, watchdog_touch); > +static DEFINE_PER_CPU(bool, hard_watchdog_warn); > +static cpumask_t __read_mostly watchdog_cpus; > + > +static unsigned long hardlockup_allcpu_dumped; > + > +int __init watchdog_nmi_probe(void) > +{ > + return 0; > +} > + > +notrace void buddy_cpu_touch_watchdog(void) > +{ > + /* > + * Using __raw here because some code paths have > + * preemption enabled. If preemption is enabled > + * then interrupts should be enabled too, in which > + * case we shouldn't have to worry about the watchdog > + * going off. > + */ > + raw_cpu_write(watchdog_touch, true); > +} > +EXPORT_SYMBOL_GPL(buddy_cpu_touch_watchdog); > + > +static unsigned int watchdog_next_cpu(unsigned int cpu) > +{ > + cpumask_t cpus = watchdog_cpus; > + unsigned int next_cpu; > + > + next_cpu = cpumask_next(cpu, &cpus); > + if (next_cpu >= nr_cpu_ids) > + next_cpu = cpumask_first(&cpus); > + > + if (next_cpu == cpu) > + return nr_cpu_ids; > + > + return next_cpu; > +} > + > +int watchdog_nmi_enable(unsigned int cpu) > +{ > + /* > + * The new cpu will be marked online before the first hrtimer > interrupt > + * runs on it. If another cpu tests for a hardlockup on the new cpu > + * before it has run its first hrtimer, it will get a false positive. > + * Touch the watchdog on the new cpu to delay the first check for at > + * least 3 sampling periods to guarantee one hrtimer has run on the > new > + * cpu. > + */ > + per_cpu(watchdog_touch, cpu) = true; > + /* Match with smp_rmb() in watchdog_check_hardlockup() */ > + smp_wmb(); > + cpumask_set_cpu(cpu, &watchdog_cpus); > + return 0; > +} > + > +void watchdog_nmi_disable(unsigned int cpu) > +{ > + unsigned int next_cpu = watchdog_next_cpu(cpu); > + > + /* > + * Offlining this cpu will cause the cpu before this one to start > + * checking the one after this one. If this cpu just finished > checking > + * the next cpu and updating hrtimer_interrupts_saved, and then the > + * previous cpu checks it within one sample period, it will trigger a > + * false positive. Touch the watchdog on the next cpu to prevent it. > + */ > + if (next_cpu < nr_cpu_ids) > + per_cpu(watchdog_touch, next_cpu) = true; > + /* Match with smp_rmb() in watchdog_check_hardlockup() */ > + smp_wmb(); > + cpumask_clear_cpu(cpu, &watchdog_cpus); > +} > + > +static int is_hardlockup_buddy_cpu(unsigned int cpu) > +{ > + unsigned long hrint = per_cpu(hrtimer_interrupts, cpu); > + > + if (per_cpu(hrtimer_interrupts_saved, cpu) == hrint) > + return 1; > + > + per_cpu(hrtimer_interrupts_saved, cpu) = hrint; > + return 0; > +} > + > +void watchdog_check_hardlockup(void) > +{ > + unsigned int next_cpu; > + > + /* > + * Test for hardlockups every 3 samples. The sample period is > + * watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly > over > + * watchdog_thresh (over by 20%). > + */ > + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0) > + return; > + > + /* check for a hardlockup on the next cpu */ > + next_cpu = watchdog_next_cpu(smp_processor_id()); > + if (next_cpu >= nr_cpu_ids) > + return; > + > + /* Match with smp_wmb() in watchdog_nmi_enable() / > watchdog_nmi_disable() */ > + smp_rmb(); > + > + if (per_cpu(watchdog_touch, next_cpu) == true) { > + per_cpu(watchdog_touch, next_cpu) = false; > + return; > + } > + > + if (is_hardlockup_buddy_cpu(next_cpu)) { > + /* only warn once */ > + if (per_cpu(hard_watchdog_warn, next_cpu) == true) > + return; > + > + /* > + * Perform all-CPU dump only once to avoid multiple > hardlockups > + * generating interleaving traces > + */ > + if (sysctl_hardlockup_all_cpu_backtrace && > + !test_and_set_bit(0, > &hardlockup_allcpu_dumped)) > + trigger_allbutself_cpu_backtrace(); > + > + if (hardlockup_panic) > + panic("Watchdog detected hard LOCKUP on cpu %u", > next_cpu); > + else > + WARN(1, "Watchdog detected hard LOCKUP on cpu %u", > next_cpu); > + > + per_cpu(hard_watchdog_warn, next_cpu) = true; > + } else { > + per_cpu(hard_watchdog_warn, next_cpu) = false; > + } > +} > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 39d1d93164bd..9eb86bc9f5ee 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -1036,6 +1036,9 @@ config HARDLOCKUP_DETECTOR_PERF > config HARDLOCKUP_CHECK_TIMESTAMP > bool > > +config HARDLOCKUP_DETECTOR_CORE > + bool > + > # > # arch/ can define HAVE_HARDLOCKUP_DETECTOR_ARCH to provide their own hard > # lockup detector rather than the perf based detector. > @@ -1045,6 +1048,7 @@ config HARDLOCKUP_DETECTOR > depends on DEBUG_KERNEL && !S390 > depends on HAVE_HARDLOCKUP_DETECTOR_PERF || > HAVE_HARDLOCKUP_DETECTOR_ARCH > select LOCKUP_DETECTOR > + select HARDLOCKUP_DETECTOR_CORE > select HARDLOCKUP_DETECTOR_PERF if HAVE_HARDLOCKUP_DETECTOR_PERF > help > Say Y here to enable the kernel to act as a watchdog to detect > @@ -1055,9 +1059,22 @@ config HARDLOCKUP_DETECTOR > chance to run. The current stack trace is displayed upon detection > and the system will stay locked up. > > +config HARDLOCKUP_DETECTOR_BUDDY_CPU > + bool "Buddy CPU hardlockup detector" > + depends on DEBUG_KERNEL && SMP > + depends on !HARDLOCKUP_DETECTOR && !HAVE_NMI_WATCHDOG > + depends on !S390 > + select HARDLOCKUP_DETECTOR_CORE > + select SOFTLOCKUP_DETECTOR > + help > + Say Y here to enable a hardlockup detector where CPUs check > + each other for lockup. Each cpu uses its softlockup hrtimer > + to check that the next cpu is processing hrtimer interrupts by > + verifying that a counter is increasing. > + > config BOOTPARAM_HARDLOCKUP_PANIC > bool "Panic (Reboot) On Hard Lockups" > - depends on HARDLOCKUP_DETECTOR > + depends on HARDLOCKUP_DETECTOR_CORE > help > Say Y here to enable the kernel to panic on "hard lockups", > which are bugs that cause the kernel to loop in kernel > -- > 2.40.0.634.g4ca3ef3211-goog > _______________________________________________ Kgdb-bugreport mailing list Kgdb-bugreport@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport