On Thu, 29 Oct 2020 at 22:09, Daniel Thompson <daniel.thomp...@linaro.org> wrote: > > On Thu, Oct 29, 2020 at 04:22:34PM +0000, Daniel Thompson wrote: > > On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote: > > > arm64 platforms with GICv3 or later supports pseudo NMIs which can be > > > leveraged to roundup CPUs which are stuck in hard lockup state with > > > interrupts disabled that wouldn't be possible with a normal IPI. > > > > > > So instead switch to roundup CPUs using IPI turned as NMI. And in > > > case a particular arm64 platform doesn't supports pseudo NMIs, > > > it will switch back to default kgdb CPUs roundup mechanism. > > > > > > Signed-off-by: Sumit Garg <sumit.g...@linaro.org> > > > --- > > > arch/arm64/include/asm/kgdb.h | 9 +++++++++ > > > arch/arm64/kernel/ipi_nmi.c | 5 +++++ > > > arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++++++++++++++++++++ > > > 3 files changed, 49 insertions(+) > > > > > > diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h > > > index 21fc85e..c3d2425 100644 > > > --- a/arch/arm64/include/asm/kgdb.h > > > +++ b/arch/arm64/include/asm/kgdb.h > > > @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void) > > > extern void kgdb_handle_bus_error(void); > > > extern int kgdb_fault_expected; > > > > > > +#ifdef CONFIG_KGDB > > > +extern bool kgdb_ipi_nmicallback(int cpu, void *regs); > > > +#else > > > +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs) > > > +{ > > > + return false; > > > +} > > > +#endif > > > + > > > #endif /* !__ASSEMBLY__ */ > > > > > > /* > > > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c > > > index 597dcf7..6ace182 100644 > > > --- a/arch/arm64/kernel/ipi_nmi.c > > > +++ b/arch/arm64/kernel/ipi_nmi.c > > > @@ -8,6 +8,7 @@ > > > > > > #include <linux/interrupt.h> > > > #include <linux/irq.h> > > > +#include <linux/kgdb.h> > > > #include <linux/nmi.h> > > > #include <linux/smp.h> > > > > > > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t > > > *mask, bool exclude_self) > > > static irqreturn_t ipi_nmi_handler(int irq, void *data) > > > { > > > irqreturn_t ret = IRQ_NONE; > > > + unsigned int cpu = smp_processor_id(); > > > > > > if (nmi_cpu_backtrace(get_irq_regs())) > > > ret = IRQ_HANDLED; > > > > > > + if (kgdb_ipi_nmicallback(cpu, get_irq_regs())) > > > + ret = IRQ_HANDLED; > > > + > > > return ret; > > > > It would be better to declare existing return value for > > kgdb_nmicallback() to be dangerously stupid and fix it so it returns an > > irqreturn_t (that's easy since most callers do not need to check the > > return value). > > > > Then this code simply becomes: > > > > return kgdb_nmicallback(cpu, get_irq_regs()); > > Actually, reflecting on this maybe it is better to keep kgdb_nmicallin() > and kgdb_nmicallback() aligned w.r.t. return codes (even if they are a > little unusual). > > I'm still not sure why we'd keep kgdb_ipi_nmicallback() though. > kgdb_nmicallback() is intended to be called from arch code... >
I added kgdb_ipi_nmicallback() just to add a check for "kgdb_active" prior to entry into kgdb as here we are sharing NMI among backtrace and kgdb. But after your comments, I looked carefully into kgdb_nmicallback() and I see the "raw_spin_is_locked(&dbg_master_lock)" check as well. So it looked sufficient to me for calling kgdb_nmicallback() directly from the arch code and hence I will remove kgdb_ipi_nmicallback() in the next version. > > Daniel. > > > > > > > > > } > > > > > > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c > > > index 1a157ca3..c26e710 100644 > > > --- a/arch/arm64/kernel/kgdb.c > > > +++ b/arch/arm64/kernel/kgdb.c > > > @@ -17,6 +17,7 @@ > > > > > > #include <asm/debug-monitors.h> > > > #include <asm/insn.h> > > > +#include <asm/nmi.h> > > > #include <asm/traps.h> > > > > > > struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = { > > > @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt > > > *bpt) > > > return aarch64_insn_write((void *)bpt->bpt_addr, > > > *(u32 *)bpt->saved_instr); > > > } > > > + > > > +bool kgdb_ipi_nmicallback(int cpu, void *regs) > > > +{ > > > + if (atomic_read(&kgdb_active) != -1) { > > > + kgdb_nmicallback(cpu, regs); > > > + return true; > > > + } > > > + > > > + return false; > > > +} > > > > I *really* don't like this function. > > > > If the return code of kgdb_nmicallback() is broken then fix it, don't > > just wrap it and invent a new criteria for the return code. > > > > To be honest I don't actually think the logic in kgdb_nmicallback() is > > broken. As mentioned above the return value has a weird definition (0 > > for "handled it OK" and 1 for "nothing for me to do") but the logic to > > calculate the return code looks OK. > > Makes sense, will remove it instead. > > > > > + > > > +static void kgdb_smp_callback(void *data) > > > +{ > > > + unsigned int cpu = smp_processor_id(); > > > + > > > + if (atomic_read(&kgdb_active) != -1) > > > + kgdb_nmicallback(cpu, get_irq_regs()); > > > +} > > > > This is Unused. I presume it is litter from a previous revision of the > > code and can be deleted? > > Yeah. > > > > > + > > > +bool kgdb_arch_roundup_cpus(void) > > > +{ > > > + struct cpumask mask; > > > + > > > + if (!arm64_supports_nmi()) > > > + return false; > > > + > > > + cpumask_copy(&mask, cpu_online_mask); > > > + cpumask_clear_cpu(raw_smp_processor_id(), &mask); > > > + if (cpumask_empty(&mask)) > > > + return false; > > > > Why do we need to fallback if there is no work to do? There will still > > be no work to do when we call the fallback. Okay, won't switch back to fallback mode here. -Sumit > > > > > > Daniel. _______________________________________________ Kgdb-bugreport mailing list Kgdb-bugreport@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport