On Thu, Oct 24, 2019 at 04:27:16PM +0200, Greg Kurz wrote:
> The interrupt presenters are currently parented to their associated
> VCPU, and we rely on CPU_FOREACH() when we need to perform a specific
> task with them. Like exposing their state with 'info pic', or finding
> the target VCPU for an interrupt when using the XIVE controller.
>
> We recently realized that the latter could crash QEMU because CPU_FOREACH()
> can race with CPU hotplug. This got fixed by checking the presenter pointer
> under the CPU was set (commit 627fa61746f7), but I'm not that sure that
> this is enough since the presenter pointers also get stale at some point
> during CPU unplug. And we still have other users of CPU_FOREACH(), namely
> 'info pic' with both XICS and XIVE, that have the very same problem:
>
> With XIVE:
>
> Thread 1 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault.
> 0x00000001003d2848 in xive_tctx_pic_print_info (tctx=0x101ae5280,
>     mon=0x7fffffffe180) at /home/greg/Work/qemu/qemu-spapr/hw/intc/xive.c:526
> 526         int cpu_index = tctx->cs ? tctx->cs->cpu_index : -1;
> (gdb) p tctx
> $1 = (XiveTCTX *) 0x101ae5280
> (gdb) p tctx->cs
> $2 = (CPUState *) 0x2057512020203a5d <-- tctx is stale
> (gdb) p tctx->cs->cpu_index
> Cannot access memory at address 0x205751202020bead
>
> With XICS:
>
> Thread 1 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault.
> 0x00000001003cc39c in icp_pic_print_info (icp=0x10244ccf0, mon=0x7fffffffe940)
>     at /home/greg/Work/qemu/qemu-spapr/hw/intc/xics.c:47
> 47          int cpu_index = icp->cs ? icp->cs->cpu_index : -1;
> (gdb) p icp
> $1 = (ICPState *) 0x10244ccf0
> (gdb) p icp->cs
> $2 = (CPUState *) 0x524958203220 <-- icp is stale
> (gdb) p icp->cs->cpu_index
> Cannot access memory at address 0x52495820b670
>
> It may be worth finding a way to address this globally instead of
> open-coding the check of the presenter pointer everywhere because
> this is fragile.
> I gave a try with this series:
>
> [0/6] ppc: Reparent the interrupt presenter
>
> https://patchwork.ozlabs.org/cover/1182224/
>
> but it requires some more reflexion. Also, we're about to enter
> softfreeze, and it seems better to come up with a simpler fix.
>
> Let's forget the reparenting and check the presenter pointers
> where needed instead. Patch 1 from the previous series was changed
> to also NULLify presenter pointers, so that they can be used to
> filter out unwanted vCPUs in patch 3. I've kept patch 2 because
> it's a fix in the same area, but it isn't related to the QEMU
> crashes.
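
For illustration only, the approach described in the cover letter (clear
the presenter pointer on unplug, then skip any vCPU whose pointer is NULL
while iterating) might look roughly like the standalone C sketch below.
The names Vcpu, Presenter, vcpu_attach_presenter(), vcpu_detach_presenter()
and count_live_presenters() are hypothetical stand-ins, not QEMU's actual
types or functions, and the sketch does not model locking or the real
CPU_FOREACH() macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interrupt presenter (not QEMU's type). */
typedef struct Presenter {
    int cpu_index;
} Presenter;

/* Hypothetical stand-in for a vCPU in a singly linked list. */
typedef struct Vcpu {
    int index;
    Presenter *presenter;   /* NULL while no valid presenter is attached */
    struct Vcpu *next;
} Vcpu;

/* Plug side: the pointer only becomes non-NULL once the presenter is ready. */
void vcpu_attach_presenter(Vcpu *cpu)
{
    Presenter *p = malloc(sizeof(*p));
    assert(p != NULL);
    p->cpu_index = cpu->index;
    cpu->presenter = p;
}

/* Unplug side: clear the pointer before freeing so it never dangles. */
void vcpu_detach_presenter(Vcpu *cpu)
{
    Presenter *p = cpu->presenter;
    cpu->presenter = NULL;
    free(p);
}

/* Walk every vCPU and count those that still have a presenter attached. */
int count_live_presenters(const Vcpu *head)
{
    int n = 0;
    for (const Vcpu *cpu = head; cpu != NULL; cpu = cpu->next) {
        if (cpu->presenter == NULL) {
            continue;   /* filter out vCPUs that are mid-plug or mid-unplug */
        }
        n++;
    }
    return n;
}
```

With two vCPUs attached, count_live_presenters() returns 2. After
vcpu_detach_presenter() on one of them it returns 1, and the walk never
dereferences a stale pointer.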

Applied to ppc-for-4.2, thanks.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson