With pseudo NMI support available, it is possible to configure SGIs to be triggered as pseudo NMIs that run in NMI context. Kernel features such as kgdb rely on NMI support to round up CPUs that are stuck in a hard lockup state with interrupts disabled.
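
To illustrate why an NMI-class IPI matters for the round-up, here is a minimal userspace sketch. It is a toy model, not kernel code, and it assumes priority-based masking in which a lower numeric value means a higher priority. Every name and constant in it (model_cpu, model_roundup, the MODEL_* values) is hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative values for this model only; lower value = higher priority.
 * They are assumptions made for the sketch, not taken from the patches.
 */
enum {
	MODEL_PRIO_NMI     = 0x20, /* priority given to a pseudo NMI         */
	MODEL_PRIO_IRQ     = 0xa0, /* priority of a normal interrupt         */
	MODEL_MASK_IRQ_ON  = 0xe0, /* mask while interrupts are "enabled"    */
	MODEL_MASK_IRQ_OFF = 0x60, /* mask while interrupts are "disabled"   */
};

struct model_cpu {
	uint8_t prio_mask;   /* current priority mask of this CPU */
	bool in_debugger;    /* set once the round-up IPI was taken */
};

/* An interrupt is taken only if its priority is strictly above the mask. */
static bool model_irq_deliverable(const struct model_cpu *cpu, uint8_t prio)
{
	return prio < cpu->prio_mask;
}

/*
 * Send a round-up IPI of the given priority to every CPU and return how
 * many CPUs actually took it and entered the (modelled) debugger.
 */
static unsigned int model_roundup(struct model_cpu *cpus, unsigned int ncpus,
				  uint8_t ipi_prio)
{
	unsigned int i, stopped = 0;

	for (i = 0; i < ncpus; i++) {
		if (model_irq_deliverable(&cpus[i], ipi_prio)) {
			cpus[i].in_debugger = true;
			stopped++;
		}
	}
	return stopped;
}
```

In this model a CPU whose mask is MODEL_MASK_IRQ_OFF ignores a round-up sent at MODEL_PRIO_IRQ but still takes one sent at MODEL_PRIO_NMI, which is the property the round-up needs.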

This patch-set adds support for IPI_CALL_NMI_FUNC, which can be triggered as a pseudo NMI and which kgdb in turn uses to round up CPUs.

After this patch-set we should be able to get a backtrace for a CPU stuck in HARDLOCKUP. Here is an example from a testcase run on Developerbox:

$ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

# Enter kdb via Magic SysRq

[11]kdb> btc
btc: cpu status: Currently on cpu 10
Available cpus: 0-7(I), 8, 9(I), 10, 11-23(I)
<snip>
Stack traceback for pid 619
0xffff000871bc9c00 619 618 1 8 R 0xffff000871bca5c0 bash
CPU: 8 PID: 619 Comm: bash Not tainted 5.7.0-rc6-00762-g3804420 #77
Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr 6 2020
Call trace:
 dump_backtrace+0x0/0x198
 show_stack+0x18/0x28
 dump_stack+0xb8/0x100
 kgdb_cpu_enter+0x5c0/0x5f8
 kgdb_nmicallback+0xa0/0xa8
 ipi_kgdb_nmicallback+0x24/0x30
 ipi_handler+0x160/0x1b8
 handle_percpu_devid_fasteoi_ipi+0x44/0x58
 generic_handle_irq+0x30/0x48
 handle_domain_nmi+0x44/0x80
 gic_handle_irq+0x140/0x2a0
 el1_irq+0xcc/0x180
 lkdtm_HARDLOCKUP+0x10/0x18
 direct_entry+0x124/0x1c0
 full_proxy_write+0x60/0xb0
 __vfs_write+0x1c/0x48
 vfs_write+0xe4/0x1d0
 ksys_write+0x6c/0xf8
 __arm64_sys_write+0x1c/0x28
 el0_svc_common.constprop.0+0x74/0x1f0
 do_el0_svc+0x24/0x90
 el0_sync_handler+0x178/0x2b8
 el0_sync+0x158/0x180

Changes in v3:
- Rebased to Marc's latest IPIs patch-set [1].

[1] https://lkml.org/lkml/2020/9/1/603

Changes since RFC version [1]:
- Switch to use generic interrupt framework to turn an IPI into an NMI.
- Dependent on Marc's patch-set [2] which turns IPIs into normal interrupts.
- Addressed misc. comments from Doug on patch #4.
- Posted kgdb NMI printk() fixup separately. It has since evolved to use a
  different approach: changing kgdb interception of printk() in common
  printk() code (see patch [3]).

[1] https://lkml.org/lkml/2020/4/24/328
[2] https://lkml.org/lkml/2020/5/19/710
[3] https://lkml.org/lkml/2020/5/20/418

Sumit Garg (4):
  arm64: smp: Introduce a new IPI as IPI_CALL_NMI_FUNC
  irqchip/gic-v3: Enable support for SGIs to act as NMIs
  arm64: smp: Setup IPI_CALL_NMI_FUNC as a pseudo NMI
  arm64: kgdb: Round up cpus using IPI_CALL_NMI_FUNC

 arch/arm64/include/asm/kgdb.h |  8 +++++++
 arch/arm64/include/asm/smp.h  |  1 +
 arch/arm64/kernel/kgdb.c      | 21 ++++++++++++++++++
 arch/arm64/kernel/smp.c       | 50 ++++++++++++++++++++++++++++++++++---------
 drivers/irqchip/irq-gic-v3.c  | 13 +++++++++--
 5 files changed, 81 insertions(+), 12 deletions(-)

--
2.7.4