[PATCHv13 5/5] watchdog/softlockup: report the most frequent interrupts

2024-04-11 Thread Bitao Hu
irq#102 ... [ 638.875313] Call trace: [ 638.875315] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu Reviewed-by: Liu Song Reviewed-by: Douglas Anderson --- kernel/watchdog.c | 116 -- 1 file changed, 112 insertions(+), 4 deletions

[PATCHv13 4/5] watchdog/softlockup: low-overhead detection of interrupt storm

2024-04-11 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 99 ++- lib/Kconfig.debug | 14 +++ 2 files changed, 112 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv13 3/5] genirq: Avoid summation loops for /proc/interrupts

2024-04-11 Thread Bitao Hu
logic is already implemented in kstat_irqs(). Split the inner access logic out of kstat_irqs() and use it for kstat_irqs() and show_interrupts() to avoid the accumulation loop when possible. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu Reviewed-by: Liu Song Reviewed-by: Douglas

[PATCHv13 2/5] genirq: Provide a snapshot mechanism for interrupt statistics

2024-04-11 Thread Bitao Hu
to the per CPU irq_desc::kstat_irq structure and provide interfaces to take a snapshot of all interrupts on the current CPU and to retrieve the delta of a specific interrupt later on. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu --- include/linux/irqdesc.h | 4 include/linux
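The snapshot-and-delta idea described in this snippet can be sketched in plain userspace C. This is only a model under stated assumptions, not the patch's code: the `irq_stat_model` layout, the `model_*` function names, and the fixed table of four IRQs are all hypothetical, and a single table stands in for one CPU's statistics.

```c
#include <stdint.h>

#define MODEL_NR_IRQS 4 /* hypothetical: tiny fixed IRQ table for the model */

/* Hypothetical per-IRQ record for one CPU: running counter plus snapshot. */
struct irq_stat_model {
	uint32_t cnt; /* incremented on every interrupt */
	uint32_t ref; /* value of cnt when the last snapshot was taken */
};

static struct irq_stat_model model_stats[MODEL_NR_IRQS];

/* Account one occurrence of the given interrupt. */
static void model_count_irq(unsigned int irq)
{
	model_stats[irq].cnt++;
}

/* Take a snapshot of all interrupt counters on the modelled CPU. */
static void model_snapshot_irqs(void)
{
	for (unsigned int i = 0; i < MODEL_NR_IRQS; i++)
		model_stats[i].ref = model_stats[i].cnt;
}

/* Retrieve how often one interrupt fired since the last snapshot. */
static uint32_t model_irq_since_snapshot(unsigned int irq)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	return model_stats[irq].cnt - model_stats[irq].ref;
}
```

Storing the reference value next to the counter means the later delta lookup touches a single record and needs no separate saved array.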

[PATCHv13 0/5] *** Detect interrupt storm in softlockup ***

2024-04-11 Thread Bitao Hu
ustats. With the maximum number of CPUs, that's now this. 2 * 8192 * 4 + 1 * 8192 * 5 * 4 + 1 * 8192 = 237,568 bytes. - From Liu Song, refactor the code format and add necessary comments. - From Douglas, use interrupt counts instead of interrupt time to determine the cause of softlockup. - Re
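The byte total quoted in this snippet can be checked with a few lines of C. The function name and the reading of each term as "arrays x CPUs x bytes" are assumptions made for illustration; the snippet itself only gives the expression and the CPU count of 8192.

```c
#include <stddef.h>

/*
 * Evaluate the expression from the cover letter for a given CPU count:
 *   2 * nr_cpus * 4  +  1 * nr_cpus * 5 * 4  +  1 * nr_cpus
 * What each term stores is not visible in the truncated snippet, so the
 * terms are kept as anonymous parts here.
 */
static size_t model_footprint_bytes(size_t nr_cpus)
{
	size_t part_a = 2 * nr_cpus * 4;
	size_t part_b = 1 * nr_cpus * 5 * 4;
	size_t part_c = 1 * nr_cpus;

	return part_a + part_b + part_c;
}
```

For 8192 CPUs the three parts are 65,536, 163,840 and 8,192 bytes, which sum to the quoted 237,568 bytes, or 29 bytes per CPU.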

[PATCHv13 1/5] genirq: Convert kstat_irqs to a struct

2024-04-11 Thread Bitao Hu
containing only the count. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu --- arch/mips/dec/setup.c| 2 +- arch/parisc/kernel/smp.c | 2 +- arch/powerpc/kvm/book3s_hv_rm_xics.c | 2 +- include/linux/irqdesc.h | 12 ++-- kernel/irq/internals.h

Re: [PATCHv12 1/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-04-10 Thread Bitao Hu
issues with this set of patches? I hope we can resolve all points of contention in v13. Best Regards, Bitao Hu

Re: [PATCHv12 4/4] watchdog/softlockup: report the most frequent interrupts

2024-03-25 Thread Bitao Hu
Hi, Thomas On 2024/3/24 04:43, Thomas Gleixner wrote: On Wed, Mar 06 2024 at 20:52, Bitao Hu wrote: + if (__this_cpu_read(snapshot_taken)) { + for_each_active_irq(i) { + count = kstat_get_irq_since_snapshot(i
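The quoted loop walks the active interrupts and reads each one's count since the snapshot. One way to turn such counts into a "most frequent interrupts" report is to keep a small table sorted by count, sketched below in userspace C. The struct, the `model_tabulate()` helper, and the report size of 3 are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

#define MODEL_NUM_REPORT 3 /* hypothetical: how many interrupts to report */

struct irq_count_model {
	unsigned int irq;
	unsigned int count;
};

/*
 * Offer one (irq, count) pair to a table kept sorted by descending count.
 * The table must start zero-initialised; pairs with a count of 0 are
 * never inserted, and the smallest entry falls off the end when a larger
 * one arrives.
 */
static void model_tabulate(struct irq_count_model *top, size_t n,
			   unsigned int irq, unsigned int count)
{
	for (size_t i = 0; i < n; i++) {
		if (count > top[i].count) {
			/* Shift lower-ranked entries down by one slot. */
			for (size_t j = n - 1; j > i; j--)
				top[j] = top[j - 1];
			top[i].irq = irq;
			top[i].count = count;
			return;
		}
	}
}
```

Because the table has a fixed, small size, each offered pair costs a bounded amount of work and no allocation, which matters in a path that runs while the system is already in trouble.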

[PATCHv12 4/4] watchdog/softlockup: report the most frequent interrupts

2024-03-06 Thread Bitao Hu
irq#102 ... [ 638.875313] Call trace: [ 638.875315] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu Reviewed-by: Liu Song Reviewed-by: Douglas Anderson --- kernel/watchdog.c | 115 -- 1 file changed, 111 insertions(+), 4 deletions

[PATCHv12 3/4] watchdog/softlockup: low-overhead detection of interrupt storm

2024-03-06 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 98 ++- lib/Kconfig.debug | 14 +++ 2 files changed, 111 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv12 2/4] genirq: Avoid summation loops for /proc/interrupts

2024-03-06 Thread Bitao Hu
logic is already implemented in kstat_irqs(). Split the inner access logic out of kstat_irqs() and use it for kstat_irqs() and show_interrupts() to avoid the accumulation loop when possible. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu Reviewed-by: Liu Song Reviewed-by: Douglas

[PATCHv12 1/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-03-06 Thread Bitao Hu
::kstat_irq member to a data structure which contains the counter plus a snapshot member and provide interfaces to take a snapshot of all interrupts on the current CPU and to retrieve the delta of a specific interrupt later on. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu Reviewed

[PATCHv12 0/4] *** Detect interrupt storm in softlockup ***

2024-03-06 Thread Bitao Hu
nstead of interrupt time to determine the cause of softlockup. - Remove the cmdline parameter added in PATCHv1. Bitao Hu (4): genirq: Provide a snapshot mechanism for interrupt statistics genirq: Avoid summation loops for /proc/interrupts watchdog/softlockup: low-overhead detection o

Re: [PATCHv11 2/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-03-06 Thread Bitao Hu
to the patch which adds the lockup detector parts. OK, I will implement this in the next version. Best Regards, Bitao Hu

Re: [PATCHv11 2/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-03-05 Thread Bitao Hu
posing was to directly disable "GENERIC_IRQ_STAT_SNAPSHOT" when "SOFTLOCKUP_DETECTOR_INTR_STORM" is not enabled, as a way to save memory. If my current understanding is correct, then the code for that part would look something like the following. Does this align with your expectations? B

Re: [PATCHv11 2/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-03-04 Thread Bitao Hu
APSHOT" while enabling "config SOFTLOCKUP_DETECTOR_INTR_STORM". Best Regards, Bitao Hu diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 2531f3496ab6..9cf3b2d4c2a8 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -108,6 +108,15 @@ config GENERIC_IRQ_M

[PATCHv11 4/4] watchdog/softlockup: report the most frequent interrupts

2024-02-27 Thread Bitao Hu
irq#102 ... [ 638.875313] Call trace: [ 638.875315] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu Reviewed-by: Liu Song --- kernel/watchdog.c | 115 -- 1 file changed, 111 insertions(+), 4 deletions(-) diff --git a/kernel/watchdog.c b

[PATCHv11 3/4] genirq: Avoid summation loops for /proc/interrupts

2024-02-27 Thread Bitao Hu
logic is already implemented in kstat_irqs(). Split the inner access logic out of kstat_irqs() and use it for kstat_irqs() and show_interrupts() to avoid the accumulation loop when possible. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu Reviewed-by: Liu Song --- kernel/irq

[PATCHv11 2/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-02-27 Thread Bitao Hu
::kstat_irq member to a data structure which contains the counter plus a snapshot member and provide interfaces to take a snapshot of all interrupts on the current CPU and to retrieve the delta of a specific interrupt later on. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu Reviewed

[PATCHv11 1/4] watchdog/softlockup: low-overhead detection of interrupt storm

2024-02-27 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 98 ++- lib/Kconfig.debug | 13 +++ 2 files changed, 110 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv11 0/4] *** Detect interrupt storm in softlockup ***

2024-02-27 Thread Bitao Hu
ustats. With the maximum number of CPUs, that's now this. 2 * 8192 * 4 + 1 * 8192 * 5 * 4 + 1 * 8192 = 237,568 bytes. - From Liu Song, refactor the code format and add necessary comments. - From Douglas, use interrupt counts instead of interrupt time to determine the cause of softlockup. - Remove

Re: [PATCHv10 3/4] genirq: Avoid summation loops for /proc/interrupts

2024-02-27 Thread Bitao Hu
On 2024/2/27 23:39, Thomas Gleixner wrote: On Tue, Feb 27 2024 at 19:20, Bitao Hu wrote: On 2024/2/27 17:26, Thomas Gleixner wrote: and then let kstat_irqs() and show_interrupts() use it. See? I have a concern. kstat_irqs() uses for_each_possible_cpu() for summation. However

Re: [PATCHv10 3/4] genirq: Avoid summation loops for /proc/interrupts

2024-02-27 Thread Bitao Hu
Hi, On 2024/2/27 17:26, Thomas Gleixner wrote: On Mon, Feb 26 2024 at 10:09, Bitao Hu wrote: We could use the irq_desc::tot_count member to avoid the summation loop for interrupts which are not marked as 'PER_CPU' interrupts in 'show_interrupts'. This could reduce the time overhead of reading

[PATCHv10 4/4] watchdog/softlockup: report the most frequent interrupts

2024-02-25 Thread Bitao Hu
irq#102 ... [ 638.875313] Call trace: [ 638.875315] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu --- kernel/watchdog.c | 115 -- 1 file changed, 111 insertions(+), 4 deletions(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index

[PATCHv10 3/4] genirq: Avoid summation loops for /proc/interrupts

2024-02-25 Thread Bitao Hu
We could use the irq_desc::tot_count member to avoid the summation loop for interrupts which are not marked as 'PER_CPU' interrupts in 'show_interrupts'. This could reduce the time overhead of reading /proc/interrupts. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu --- include/linux
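The optimisation described here amounts to returning a maintained total for ordinary interrupts and falling back to the per-CPU summation only for interrupts marked per-CPU. The userspace sketch below models that choice; the `irq_desc_model` struct, its field names other than `tot_count`, and the four-CPU array are assumptions for illustration, and the real kernel conditions may be more involved.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for the model */

struct irq_desc_model {
	bool per_cpu;           /* interrupt is marked as per-CPU */
	unsigned int tot_count; /* maintained total, meaningful when !per_cpu */
	unsigned int cpu_count[MODEL_NR_CPUS];
};

/* Return the total count, avoiding the summation loop when possible. */
static unsigned int model_irq_total(const struct irq_desc_model *desc)
{
	unsigned int sum = 0;

	if (!desc->per_cpu)
		return desc->tot_count; /* fast path: no loop over CPUs */

	/* Slow path: per-CPU interrupts have no maintained total. */
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += desc->cpu_count[cpu];

	return sum;
}
```

With many CPUs and many interrupt lines, skipping the inner loop for the common non-per-CPU case is where the saving in reading /proc/interrupts would come from.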

[PATCHv10 2/4] genirq: Provide a snapshot mechanism for interrupt statistics

2024-02-25 Thread Bitao Hu
::kstat_irq member to a data structure which contains the counter plus a snapshot member and provide interfaces to take a snapshot of all interrupts on the current CPU and to retrieve the delta of a specific interrupt later on. Originally-by: Thomas Gleixner Signed-off-by: Bitao Hu --- arch/mips

[PATCHv10 1/4] watchdog/softlockup: low-overhead detection of interrupt storm

2024-02-25 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 98 ++- lib/Kconfig.debug | 13 +++ 2 files changed, 110 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv10 0/4] *** Detect interrupt storm in softlockup ***

2024-02-25 Thread Bitao Hu
CPUs, that's now this. 2 * 8192 * 4 + 1 * 8192 * 5 * 4 + 1 * 8192 = 237,568 bytes. - From Liu Song, refactor the code format and add necessary comments. - From Douglas, use interrupt counts instead of interrupt time to determine the cause of softlockup. - Remove the cmdline parameter added in PATCHv

Re: [PATCHv9 2/3] irq: use a struct for the kstat_irqs in the interrupt descriptor

2024-02-22 Thread Bitao Hu
Hi, On 2024/2/22 21:22, Thomas Gleixner wrote: On Thu, Feb 22 2024 at 17:34, Bitao Hu wrote: First of all the subsystem prefix is 'genirq:'. 'git log kernel/irq/' gives you a pretty good hint. It's documented. Secondly the subject line does not match what this patch is about. It's

[PATCHv9 3/3] watchdog/softlockup: report the most frequent interrupts

2024-02-22 Thread Bitao Hu
irq#102 ... [ 638.875313] Call trace: [ 638.875315] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu --- kernel/watchdog.c | 115 -- 1 file changed, 111 insertions(+), 4 deletions(-) diff --git a/kernel/watchdog.c b/kernel/watchdog.c index

[PATCHv9 2/3] irq: use a struct for the kstat_irqs in the interrupt descriptor

2024-02-22 Thread Bitao Hu
and providing sensible interfaces for the watchdog code can keep it self-contained to the interrupt core code. Signed-off-by: Bitao Hu --- arch/mips/dec/setup.c| 2 +- arch/parisc/kernel/smp.c | 2 +- arch/powerpc/kvm/book3s_hv_rm_xics.c | 2 +- include/linux/irqdesc.h

[PATCHv9 1/3] watchdog/softlockup: low-overhead detection of interrupt storm

2024-02-22 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 98 ++- lib/Kconfig.debug | 13 +++ 2 files changed, 110 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv9 0/3] *** Detect interrupt storm in softlockup ***

2024-02-22 Thread Bitao Hu
ary comments. - From Douglas, use interrupt counts instead of interrupt time to determine the cause of softlockup. - Remove the cmdline parameter added in PATCHv1. Bitao Hu (3): watchdog/softlockup: low-overhead detection of interrupt storm irq: use a struct for the kstat_irqs in the int

Re: [PATCHv8 2/2] watchdog/softlockup: report the most frequent interrupts

2024-02-20 Thread Bitao Hu
Hi, On 2024/2/20 17:35, Thomas Gleixner wrote: On Tue, Feb 20 2024 at 00:19, Bitao Hu wrote: arch/mips/dec/setup.c| 2 +- arch/parisc/kernel/smp.c | 2 +- arch/powerpc/kvm/book3s_hv_rm_xics.c | 2 +- include/linux/irqdesc.h | 9

[PATCHv8 1/2] watchdog/softlockup: low-overhead detection of interrupt

2024-02-19 Thread Bitao Hu
er of CPUs is <= 128. Signed-off-by: Bitao Hu Reviewed-by: Douglas Anderson Reviewed-by: Liu Song --- kernel/watchdog.c | 98 ++- lib/Kconfig.debug | 13 +++ 2 files changed, 110 insertions(+), 1 deletion(-) diff --git a/kernel/watchd

[PATCHv8 0/2] *** Detect interrupt storm in softlockup ***

2024-02-19 Thread Bitao Hu
192 * 5 * 4 + 1 * 8192 = 237,568 bytes. - From Liu Song, refactor the code format and add necessary comments. - From Douglas, use interrupt counts instead of interrupt time to determine the cause of softlockup. - Remove the cmdline parameter added in PATCHv1. Bitao Hu (2): watchdog/softlo

[PATCHv8 2/2] watchdog/softlockup: report the most frequent interrupts

2024-02-19 Thread Bitao Hu
... [ 2987.492728] Call trace: [ 2987.492729] __do_softirq+0xa8/0x364 Signed-off-by: Bitao Hu --- arch/mips/dec/setup.c| 2 +- arch/parisc/kernel/smp.c | 2 +- arch/powerpc/kvm/book3s_hv_rm_xics.c | 2 +- include/linux/irqdesc.h | 9 ++- include/linux