On Tue, 14 Dec 2021 at 17:28, Petr Pavlu <petr.pa...@suse.com> wrote: > When running Linux on a machine with GICv2, the kernel can crash while > processing an interrupt and can subsequently start a kdump kernel from > the active interrupt handler. In such a case, the crashed kernel might > not gracefully signal the end of interrupt to the GICv2 hardware. The > kdump kernel will however try to reset the GIC state on startup to get > the controller into a sane state, in particular the kernel writes ones > to GICD_ICACTIVERn and wipes out GICC_APRn to make sure that no > interrupt is active. > > The patch makes sure that this reset works for the GICv2 emulation in > QEMU too and the kdump kernel starts receiving interrupts. It adds > a logic to recalculate the running priority when GICC_APRn/GICC_NSAPRn > is written and implements read of GICC_IIDR so the kernel can recognize > that the GICv2 with GICC_APRn is present. > > The described scenario can be reproduced on an AArch64 QEMU virt machine > with a kdump-enabled Linux system by using the softdog module. The kdump > kernel will hang at some point because QEMU still thinks the running > priority is that of the timer interrupt and asserts no new interrupts to > the system: > $ modprobe softdog soft_margin=10 soft_panic=1 > $ cat > /dev/watchdog > [Press Enter to start the watchdog, wait for its timeout and observe > that the kdump kernel hangs on startup.]
Hi; thanks for this patch and sorry I haven't got to it earlier (I've been on holiday). Both the mishandling of the cached priority and the failure to implement GICC_IIDR are definitely bugs, but I think they are distinct bugs. Could you split this into a two patch series, please ? (If you don't have time to do the respin, let me know and I can do it at this end.) > Signed-off-by: Petr Pavlu <petr.pa...@suse.com> > --- > hw/intc/arm_gic.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c > index a994b1f024..2edbc4cb46 100644 > --- a/hw/intc/arm_gic.c > +++ b/hw/intc/arm_gic.c > @@ -1662,6 +1662,10 @@ static MemTxResult gic_cpu_read(GICState *s, int cpu, > int offset, > } > break; > } > + case 0xfc: > + /* GICv1/2, ARM implementation */ > + *data = (s->revision << 16) + 0x43b; This is correct for GICv1 and GICv2, but not for 11MPCore, whose interrupt controller has no GICC_IIDR: https://developer.arm.com/documentation/ddi0360/e/mpcore-distributed-interrupt-controller/cpu-interface-registers-definition?lang=en So this should be if (s->revision == REV_11MPCORE) { /* Reserved on 11MPCore */ *data = 0; } else { /* GICv1 or v2; Arm implementation */ *data = (s->revision << 16) | 0x43b; } (also using '|' rather than '+' since we're assembling a value as a bunch of bit operations, not doing arithmetic on it. The end result is the same but I think '|' is clearer stylistically.) > + break; > default: > qemu_log_mask(LOG_GUEST_ERROR, > "gic_cpu_read: Bad offset %x\n", (int)offset); > @@ -1727,6 +1731,7 @@ static MemTxResult gic_cpu_write(GICState *s, int cpu, > int offset, > } else { > s->apr[regno][cpu] = value; > } > + s->running_priority[cpu] = gic_get_prio_from_apr_bits(s, cpu); > break; > } > case 0xe0: case 0xe4: case 0xe8: case 0xec: > @@ -1743,6 +1748,7 @@ static MemTxResult gic_cpu_write(GICState *s, int cpu, > int offset, > return MEMTX_OK; > } > s->nsapr[regno][cpu] = value; > + s->running_priority[cpu] = gic_get_prio_from_apr_bits(s, cpu); > break; > } > case 0x1000: These parts are correct and just need to be in their own patch. thanks -- PMM