On June 26, 2026 1:14:14 PM GMT+01:00, Petr Mladek <[email protected]> wrote:
>On Fri 2026-06-26 12:23:50, Petr Mladek wrote:
>> On Thu 2026-06-25 15:25:58, Bradley Morgan wrote:
>> > panic_other_cpus_shutdown() handles SYS_INFO_ALL_BT before stopping the
>> > other CPUs. Do not ask sys_info() to handle that bit again later in the
>> > panic path.
>> >
>> > Use sys_info_with_filter() so panic_print=all_bt does not request more
>> > output after the CPUs are stopped.
>> >
>> > Fixes: a9af76a78760 ("watchdog: add sys_info sysctls to dump sys info on system lockup")
>> > Cc: [email protected]
>> > Signed-off-by: Bradley Morgan <[email protected]>
>> > ---
>> >  kernel/panic.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/panic.c b/kernel/panic.c
>> > index 213725b612aa..eb842823df61 100644
>> > --- a/kernel/panic.c
>> > +++ b/kernel/panic.c
>> > @@ -680,7 +680,7 @@ void vpanic(const char *fmt, va_list args)
>> >  	 */
>> >  	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>> >
>> > -	sys_info(panic_print);
>> > +	sys_info_with_filter(panic_print, SYS_INFO_ALL_BT);
>>
>> Hmm, this prevents printing backtraces from all CPUs completely.
>> But what if they were not printed?
>>
>> They might be printed by:
>>
>> static void panic_other_cpus_shutdown(bool crash_kexec)
>> {
>> 	if (panic_print & SYS_INFO_ALL_BT)
>> 		panic_trigger_all_cpu_backtrace();
>>
>> 	[...]
>> }
>>
>> But it checks only "panic_print" variable. It won't do anything
>> when (panic_print == 0).
>>
>> In this case, we might still want to print the backtraces when
>> SYS_INFO_ALL_BT is set in kernel_si_info.
>>
>> >  	kmsg_dump_desc(KMSG_DUMP_PANIC, buf);
>>
>> Of course, we might fix panic_other_cpus_shutdown() to check also
>> kernel_si_info.
>>
>> But it all becomes very hairy. We have several levels:
>>
>>   + watchdog-all_bt-specific option, e.g. sysctl_hardlockup_all_cpu_backtrace
>>
>>   + watchdog-specific si_info preferences, e.g. hardlockup_si_mask
>>
>>   + panic-specific si_info: panic_print
>>
>>   + universal fallback for any layer: kernel_si_info
>>
>> Now, we try to check all these variables back and forth to
>> trigger all backtraces or to avoid triggering them.
>> And it clearly does not work well and the code is more and more
>> hairy.
>>
>> I think about another approach. The word "waterfall" comes to my mind.
>> Instead of checking all the settings back and forth, let's process
>> each setting one by one and just remember what has been done and
>> skip this in the next level.
>>
>> All the si_info actions seem to dump a global system state.
>> So, it would make sense to remember the state in a global variable
>> even when it might be modified by more CPUs in parallel.
>>
>> I am going to think more about it.
>
>I have created a POC using Gemini. I haven't tested it.
>But it looks acceptable. And the logic seems to be more
>straightforward.
>
>One drawback is that it requires adding the _reset()
>call for all sys_info() callers. It is fine in principle
>but it might complicate back-porting because all changes
>have to be done in one patch.
>
>But honestly, this is a nice to have fix. Most people could
>live happily without it.
>From 3c66436d9978030845a96bfaedd6b914536e2ac4 Mon Sep 17 00:00:00 2001
>From: Petr Mladek <[email protected]>
>Date: Fri, 26 Jun 2026 13:55:41 +0200
>Subject: [POC] sys_info: Introduce state-tracking APIs to prevent duplicate
> backtraces
>
>In watchdog, panic, and hung task detection scenarios, sys_info() can
>be called multiple times or alongside direct backtrace triggers like
>trigger_allbutcpu_cpu_backtrace(). This results in identical backtraces
>being dumped repeatedly from all CPUs, cluttering the kernel log and
>delaying or obscuring critical debug details.
>
>Introduce a state tracking bitmask and associated helpers:
>- sys_info_done(mask): Marks specific sys_info bits as already printed.
>- sys_info_reset(): Resets the tracking state.
>- sys_info_is_done(mask): Checks if all bits in the mask have been printed.
>
>Update sys_info() to automatically filter out already printed bits
>using this state. Integrate these APIs with the generic hardlockup
>and softlockup watchdogs, the PowerPC watchdog, the hung task detector,
>and the panic core. This ensures that each piece of system information
>and backtrace output is printed at most once per lockup/panic event,
>and the state is reset cleanly when a lockup does not trigger a panic.
>
>Races between sys_info() callers are ignored. It should be acceptable
>because the output from various watchdogs has never been synchronized.
>And panic() never returns.
>
>Assisted-by: gemini-1.5-flash ?

Why not use gemini 3.5 flash? I can try if you want.
Could I have the prompt you used? :)

>Signed-off-by: Petr Mladek <[email protected]>
>---
> arch/powerpc/kernel/watchdog.c | 13 ++++++++++---
> include/linux/sys_info.h       |  3 +++
> kernel/hung_task.c             |  2 ++
> kernel/panic.c                 |  4 +++-
> kernel/watchdog.c              | 10 ++++++++--
> lib/sys_info.c                 | 30 +++++++++++++++++++++++++++++-
> 6 files changed, 55 insertions(+), 7 deletions(-)
>
>diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
>index c40c69368476..0eab7894b9dc 100644
>--- a/arch/powerpc/kernel/watchdog.c
>+++ b/arch/powerpc/kernel/watchdog.c
>@@ -239,6 +239,7 @@ static void watchdog_smp_panic(int cpu)
> 	if (sysctl_hardlockup_all_cpu_backtrace ||
> 	    (hardlockup_si_mask & SYS_INFO_ALL_BT)) {
> 		trigger_allbutcpu_cpu_backtrace(cpu);
>+		sys_info_done(SYS_INFO_ALL_BT);
> 		cpumask_clear(&wd_smp_cpus_ipi);
> 	} else {
> 		/*
>@@ -251,10 +252,12 @@ static void watchdog_smp_panic(int cpu)
> 		}
> 	}
>
>-	sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+	sys_info(hardlockup_si_mask);
> 	if (hardlockup_panic)
> 		nmi_panic(NULL, "Hard LOCKUP");
>
>+	sys_info_reset();
>+
> 	wd_end_reporting();
>
> 	return;
>@@ -419,13 +422,17 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
> 		xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi
>
> 		if (sysctl_hardlockup_all_cpu_backtrace ||
>-		    (hardlockup_si_mask & SYS_INFO_ALL_BT))
>+		    (hardlockup_si_mask & SYS_INFO_ALL_BT)) {
> 			trigger_allbutcpu_cpu_backtrace(cpu);
>+			sys_info_done(SYS_INFO_ALL_BT);
>+		}
>
>-		sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+		sys_info(hardlockup_si_mask);
> 		if (hardlockup_panic)
> 			nmi_panic(regs, "Hard LOCKUP");
>
>+		sys_info_reset();
>+
> 		wd_end_reporting();
> 	}
> 	/*
>diff --git a/include/linux/sys_info.h b/include/linux/sys_info.h
>index a5bc3ea3d44b..ad43548c75dd 100644
>--- a/include/linux/sys_info.h
>+++ b/include/linux/sys_info.h
>@@ -18,6 +18,9 @@
> #define SYS_INFO_BLOCKED_TASKS 0x00000080
>
> void sys_info(unsigned long si_mask);
>+void sys_info_done(unsigned long si_mask);
>+void sys_info_reset(void);
>+bool sys_info_is_done(unsigned long si_mask);
> unsigned long sys_info_parse_param(char *str);
>
> #ifdef CONFIG_SYSCTL
>diff --git a/kernel/hung_task.c b/kernel/hung_task.c
>index 6fcc94ce4ca9..dbb6a27770f5 100644
>--- a/kernel/hung_task.c
>+++ b/kernel/hung_task.c
>@@ -354,6 +354,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>
> 	if (hung_task_call_panic)
> 		panic("hung_task: blocked tasks");
>+
>+	sys_info_reset();
> }
>
> static long hung_timeout_jiffies(unsigned long last_checked,
>diff --git a/kernel/panic.c b/kernel/panic.c
>index 213725b612aa..86ce17f03da2 100644
>--- a/kernel/panic.c
>+++ b/kernel/panic.c
>@@ -550,8 +550,10 @@ static void panic_trigger_all_cpu_backtrace(void)
>  */
> static void panic_other_cpus_shutdown(bool crash_kexec)
> {
>-	if (panic_print & SYS_INFO_ALL_BT)
>+	if ((panic_print & SYS_INFO_ALL_BT) && !sys_info_is_done(SYS_INFO_ALL_BT)) {
> 		panic_trigger_all_cpu_backtrace();
>+		sys_info_done(SYS_INFO_ALL_BT);
>+	}
>
> 	/*
> 	 * Note that smp_send_stop() is the usual SMP shutdown function,
>diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>index 87dd5e0f6968..f431087c68a7 100644
>--- a/kernel/watchdog.c
>+++ b/kernel/watchdog.c
>@@ -282,14 +282,17 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
>
> 		if (hardlockup_all_cpu_backtrace) {
> 			trigger_allbutcpu_cpu_backtrace(cpu);
>+			sys_info_done(SYS_INFO_ALL_BT);
> 			if (!hardlockup_panic)
> 				clear_bit_unlock(0, &hard_lockup_nmi_warn);
> 		}
>
>-		sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
>+		sys_info(hardlockup_si_mask);
> 		if (hardlockup_panic)
> 			nmi_panic(regs, "Hard LOCKUP");
>
>+		sys_info_reset();
>+
> 		per_cpu(watchdog_hardlockup_warned, cpu) = true;
> 	}
>
>@@ -895,16 +898,19 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>
> 		if (softlockup_all_cpu_backtrace) {
> 			trigger_allbutcpu_cpu_backtrace(smp_processor_id());
>+			sys_info_done(SYS_INFO_ALL_BT);
> 			if (!softlockup_panic)
> 				clear_bit_unlock(0, &soft_lockup_nmi_warn);
> 		}
>
> 		add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
>-		sys_info(softlockup_si_mask & ~SYS_INFO_ALL_BT);
>+		sys_info(softlockup_si_mask);
> 		thresh_count = duration / get_softlockup_thresh();
>
> 		if (softlockup_panic && thresh_count >= softlockup_panic)
> 			panic("softlockup: hung tasks");
>+
>+		sys_info_reset();
> 	}
>
> 	return HRTIMER_RESTART;
>diff --git a/lib/sys_info.c b/lib/sys_info.c
>index f32a06ec9ed4..f8e6176fae75 100644
>--- a/lib/sys_info.c
>+++ b/lib/sys_info.c
>@@ -160,7 +160,35 @@ static void __sys_info(unsigned long si_mask)
> 		show_state_filter(TASK_UNINTERRUPTIBLE);
> }
>
>+static unsigned long sys_info_done_mask;
>+
>+void sys_info_done(unsigned long si_mask)
>+{
>+	sys_info_done_mask |= si_mask;
>+}
>+
>+void sys_info_reset(void)
>+{
>+	sys_info_done_mask = 0;
>+}
>+
>+bool sys_info_is_done(unsigned long si_mask)
>+{
>+	return (sys_info_done_mask & si_mask) == si_mask;
>+}
>+
> void sys_info(unsigned long si_mask)
> {
>-	__sys_info(si_mask ? : kernel_si_mask);
>+	unsigned long mask;
>+
>+	if (si_mask)
>+		mask = si_mask & ~sys_info_done_mask;
>+	else
>+		mask = kernel_si_mask & ~sys_info_done_mask;
>+
>+	if (!mask)
>+		return;
>+
>+	__sys_info(mask);
>+	sys_info_done(mask);
> }
>

Thanks!
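
For anyone following the thread, the "remember what has been done" logic of the POC can be modelled in userspace. This is only a rough sketch under stated assumptions: the demo_* names, the bit values and the dump counter are hypothetical and are not part of the patch or of lib/sys_info.c. Only the done/reset/is_done bookkeeping mirrors the quoted diff, and like the POC it ignores races between callers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bit values for the demo; not copied from sys_info.h. */
#define DEMO_INFO_TASKS   0x01UL
#define DEMO_INFO_MEM     0x02UL
#define DEMO_INFO_ALL_BT  0x40UL

/* Bits already handled during the current lockup/panic event. */
static unsigned long demo_done_mask;
/* Stand-in for the global fallback mask used when the caller passes 0. */
static unsigned long demo_fallback_mask;
/* Number of bits actually dumped, kept only so the demo can be checked. */
static unsigned int demo_dump_count;

static void demo_done(unsigned long mask)
{
	demo_done_mask |= mask;
}

static void demo_reset(void)
{
	demo_done_mask = 0;
}

static bool demo_is_done(unsigned long mask)
{
	return (demo_done_mask & mask) == mask;
}

/* Stand-in for __sys_info(): report every requested bit once. */
static void demo_dump(unsigned long mask)
{
	for (unsigned long bit = 1; bit && bit <= mask; bit <<= 1) {
		if (mask & bit) {
			printf("dump 0x%02lx\n", bit);
			demo_dump_count++;
		}
	}
}

/* Same shape as the POC's sys_info(): choose a mask, drop done bits, dump, record. */
static void demo_sys_info(unsigned long mask)
{
	unsigned long todo = (mask ? mask : demo_fallback_mask) & ~demo_done_mask;

	if (!todo)
		return;

	demo_dump(todo);
	demo_done(todo);
}

/* One lockup that escalates to a panic; returns how many bits were dumped. */
static unsigned int demo_lockup_then_panic(void)
{
	demo_reset();
	demo_dump_count = 0;
	demo_fallback_mask = DEMO_INFO_TASKS;

	/* Watchdog level: backtraces were triggered directly, so mark them done. */
	demo_done(DEMO_INFO_ALL_BT);
	demo_sys_info(DEMO_INFO_ALL_BT | DEMO_INFO_MEM);

	/* Panic level: skip the backtraces if an earlier level handled them. */
	if (!demo_is_done(DEMO_INFO_ALL_BT)) {
		demo_dump(DEMO_INFO_ALL_BT);
		demo_done(DEMO_INFO_ALL_BT);
	}
	demo_sys_info(DEMO_INFO_ALL_BT | DEMO_INFO_MEM | DEMO_INFO_TASKS);

	return demo_dump_count;
}
```

Calling demo_lockup_then_panic() from a small main() should dump only the MEM bit at the watchdog level and only the TASKS bit at the panic level. A caller that does not panic would call demo_reset() afterwards, which corresponds to the sys_info_reset() calls the POC adds to each sys_info() caller.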
