Re: [RFC PATCH v3 11/21] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

2019-05-15 Thread Ricardo Neri
On Tue, May 14, 2019 at 07:26:58AM -0700, Randy Dunlap wrote:
> On 5/14/19 7:02 AM, Ricardo Neri wrote:
> > diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> > index 15d0fbe27872..376a5db81aec 100644
> > --- a/arch/x86/Kconfig.debug
> > +++ b/arch/x86/Kconfig.debug
> > @@ -169,6 +169,17 @@ config IOMMU_LEAK
> >  config HAVE_MMIOTRACE_SUPPORT
> > def_bool y
> >  
> > +config X86_HARDLOCKUP_DETECTOR_HPET
> > +   bool "Use HPET Timer for Hard Lockup Detection"
> > +   select SOFTLOCKUP_DETECTOR
> > +   select HARDLOCKUP_DETECTOR
> > +   select HARDLOCKUP_DETECTOR_CORE
> > +   depends on HPET_TIMER && HPET && X86_64
> > +   help
> > + Say y to enable a hardlockup detector that is driven by an High-
> 
>  by a
> 
I'll correct.

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v3 11/21] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

2019-05-14 Thread Randy Dunlap
On 5/14/19 7:02 AM, Ricardo Neri wrote:
> diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
> index 15d0fbe27872..376a5db81aec 100644
> --- a/arch/x86/Kconfig.debug
> +++ b/arch/x86/Kconfig.debug
> @@ -169,6 +169,17 @@ config IOMMU_LEAK
>  config HAVE_MMIOTRACE_SUPPORT
>   def_bool y
>  
> +config X86_HARDLOCKUP_DETECTOR_HPET
> + bool "Use HPET Timer for Hard Lockup Detection"
> + select SOFTLOCKUP_DETECTOR
> + select HARDLOCKUP_DETECTOR
> + select HARDLOCKUP_DETECTOR_CORE
> + depends on HPET_TIMER && HPET && X86_64
> + help
> +   Say y to enable a hardlockup detector that is driven by an High-

   by a

> +   Precision Event Timer. This option is helpful to not use counters
> +   from the Performance Monitoring Unit to drive the detector.
> +
>  config X86_DECODER_SELFTEST
>   bool "x86 instruction decoder selftest"
>   depends on DEBUG_KERNEL && KPROBES


-- 
~Randy


[RFC PATCH v3 11/21] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

2019-05-14 Thread Ricardo Neri
This is the initial implementation of a hardlockup detector driven by an
HPET timer. This initial implementation includes functions to control the
timer via its registers. It also requests such timer, installs an NMI
interrupt handler and performs the initial configuration of the timer.

The detector is not functional at this stage. A subsequent changeset will
invoke the interfaces provides by this detector as well as functionality
to determine if the HPET timer caused the NMI.

In order to detect hardlockups in all the monitored CPUs, move the
interrupt to the next monitored CPU while handling the NMI interrupt; wrap
around when reaching the highest CPU in the mask. This rotation is
achieved by setting the affinity mask to only contain the next CPU to
monitor. A cpumask keeps track of all the CPUs that need to be monitored.
Such cpumask is updated when the watchdog is enabled or disabled in a
particular CPU.

This detector relies on an HPET timer that is capable of using Front Side
Bus interrupts. In order to avoid using the generic interrupt code,
program directly the MSI message register of the HPET timer.

HPET registers are only accessed to kick the timer after looking for
hardlockups. This happens every watchdog_thresh seconds. A subsequent
changeset will determine whether the HPET timer caused the interrupt based
on the value of the time-stamp counter. For now, just add a stub function.

Cc: "H. Peter Anvin" 
Cc: Ashok Raj 
Cc: Andi Kleen 
Cc: Tony Luck 
Cc: Peter Zijlstra 
Cc: Clemens Ladisch 
Cc: Arnd Bergmann 
Cc: Philippe Ombredanne 
Cc: Kate Stewart 
Cc: "Rafael J. Wysocki" 
Cc: "Ravi V. Shankar" 
Cc: Mimi Zohar 
Cc: Jan Kiszka 
Cc: Nick Desaulniers 
Cc: Masahiro Yamada 
Cc: Nayna Jain 
Cc: Stephane Eranian 
Cc: Suravee Suthikulpanit 
Cc: x...@kernel.org
Signed-off-by: Ricardo Neri 
---
 arch/x86/Kconfig.debug  |  11 +
 arch/x86/include/asm/hpet.h |  13 ++
 arch/x86/kernel/Makefile|   1 +
 arch/x86/kernel/hpet.c  |   3 +-
 arch/x86/kernel/watchdog_hld_hpet.c | 334 
 5 files changed, 361 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kernel/watchdog_hld_hpet.c

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 15d0fbe27872..376a5db81aec 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -169,6 +169,17 @@ config IOMMU_LEAK
 config HAVE_MMIOTRACE_SUPPORT
def_bool y
 
+config X86_HARDLOCKUP_DETECTOR_HPET
+   bool "Use HPET Timer for Hard Lockup Detection"
+   select SOFTLOCKUP_DETECTOR
+   select HARDLOCKUP_DETECTOR
+   select HARDLOCKUP_DETECTOR_CORE
+   depends on HPET_TIMER && HPET && X86_64
+   help
+ Say y to enable a hardlockup detector that is driven by an High-
+ Precision Event Timer. This option is helpful to not use counters
+ from the Performance Monitoring Unit to drive the detector.
+
 config X86_DECODER_SELFTEST
bool "x86 instruction decoder selftest"
depends on DEBUG_KERNEL && KPROBES
diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 20abdaa5372d..31fc27508cf3 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -114,12 +114,25 @@ struct hpet_hld_data {
boolhas_periodic;
u32 num;
u64 ticks_per_second;
+   u32 handling_cpu;
+   u32 enabled_cpus;
+   struct msi_msg  msi_msg;
+   unsigned long   cpu_monitored_mask[0];
 };
 
 extern struct hpet_hld_data *hpet_hardlockup_detector_assign_timer(void);
+extern int hardlockup_detector_hpet_init(void);
+extern void hardlockup_detector_hpet_stop(void);
+extern void hardlockup_detector_hpet_enable(unsigned int cpu);
+extern void hardlockup_detector_hpet_disable(unsigned int cpu);
 #else
 static inline struct hpet_hld_data *hpet_hardlockup_detector_assign_timer(void)
 { return NULL; }
+static inline int hardlockup_detector_hpet_init(void)
+{ return -ENODEV; }
+static inline void hardlockup_detector_hpet_stop(void) {}
+static inline void hardlockup_detector_hpet_enable(unsigned int cpu) {}
+static inline void hardlockup_detector_hpet_disable(unsigned int cpu) {}
 #endif /* CONFIG_X86_HARDLOCKUP_DETECTOR_HPET */
 
 #else /* CONFIG_HPET_TIMER */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 62e78a3fd31e..f9222769d84b 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -106,6 +106,7 @@ obj-$(CONFIG_VM86)  += vm86_32.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 
 obj-$(CONFIG_HPET_TIMER)   += hpet.o
+obj-$(CONFIG_X86_HARDLOCKUP_DETECTOR_HPET) += watchdog_hld_hpet.o
 obj-$(CONFIG_APB_TIMER)+= apb_timer.o
 
 obj-$(CONFIG_AMD_NB)   += amd_nb.o
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 20a16a304f89..44459b36d333 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -181,7 +181,8 @@ struct