On 01/10/2019 13:06, Greg Kurz wrote: > On Tue, 1 Oct 2019 10:57:22 +0200 > Cédric Le Goater <c...@kaod.org> wrote: > >> When vCPUs are hotplugged, they are added to the QEMU CPU list before >> being fully realized. This can crash the XIVE presenter because the >> 'tctx' pointer is not necessarily initialized when looking for a >> matching target. >> > > Ouch... :-\ > >> These vCPUs are not valid targets for the presenter. Skip them. >> > > This likely fixes this specific issue, but more generally, this > seems to indicate that using CPU_FOREACH() is rather fragile. > > What about tracking XIVE TM contexts with a QLIST in xive.c ?
This is a good idea. On HW, the thread interrupt contexts belong to the XIVE presenter subengine. This is the logic doing the CAM line matching to find a target for an event notification. So we should model the context list below the XiveRouter in QEMU which models both router and presenter subengines. We have done without a presenter model for the moment and I don't think we will need to introduce one. This would be a nice improvements of my patchset adding support for xive escalations and better support of multi chip systems. I have introduced a PNV_CHIP_CPU_FOREACH in this patchset which would become useless with a list of tctx under the XIVE interrupt controller, XiveRouter, SpaprXive, PnvXive. Next step would be to get rid of the tctx->cs pointer. In my latest patches, it is only used to calculate the HW CAM line. There might be some consequences on the object hierarchy and it will break migration. Thanks, C. > > ================================================================================ > diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h > index 6d38755f8459..89b9ef7f20b1 100644 > --- a/include/hw/ppc/xive.h > +++ b/include/hw/ppc/xive.h > @@ -319,6 +319,8 @@ typedef struct XiveTCTX { > qemu_irq os_output; > > uint8_t regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE]; > + > + QTAILQ_ENTRY(XiveTCTX) next; > } XiveTCTX; > > /* > diff --git a/hw/intc/xive.c b/hw/intc/xive.c > index b7417210d817..f7721c711041 100644 > --- a/hw/intc/xive.c > +++ b/hw/intc/xive.c > @@ -568,6 +568,8 @@ static void xive_tctx_reset(void *dev) > ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]); > } > > +static QTAILQ_HEAD(, XiveTCTX) xive_tctx_list = > QTAILQ_HEAD_INITIALIZER(xive_tctx_list); > + > static void xive_tctx_realize(DeviceState *dev, Error **errp) > { > XiveTCTX *tctx = XIVE_TCTX(dev); > @@ -609,10 +611,14 @@ static void xive_tctx_realize(DeviceState *dev, Error > **errp) > } > > qemu_register_reset(xive_tctx_reset, dev); > + QTAILQ_INSERT_HEAD(&xive_tctx_list, tctx, next); > } > > static void xive_tctx_unrealize(DeviceState *dev, Error **errp) > { > + XiveTCTX *tctx = XIVE_TCTX(dev); > + > + QTAILQ_REMOVE(&xive_tctx_list, tctx, next); > qemu_unregister_reset(xive_tctx_reset, dev); > } > > @@ -1385,15 +1391,14 @@ static bool xive_presenter_match(XiveRouter *xrtr, > uint8_t format, > bool cam_ignore, uint8_t priority, > uint32_t logic_serv, XiveTCTXMatch *match) > { > - CPUState *cs; > + XiveTCTX *tctx; > > /* > * TODO (PowerNV): handle chip_id overwrite of block field for > * hardwired CAM compares > */ > > - CPU_FOREACH(cs) { > - XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs); > + QTAILQ_FOREACH(tctx, &xive_tctx_list, next) { > int ring; > > /* > ================================================================================ > >> Signed-off-by: Cédric Le Goater <c...@kaod.org> >> --- >> hw/intc/xive.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/hw/intc/xive.c b/hw/intc/xive.c >> index b7417210d817..29df06df1136 100644 >> --- a/hw/intc/xive.c >> +++ b/hw/intc/xive.c >> @@ -1396,6 +1396,14 @@ static bool xive_presenter_match(XiveRouter *xrtr, >> uint8_t format, >> XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs); >> int ring; >> >> + /* >> + * Skip partially initialized vCPUs. This can happen when >> + * vCPUs are hotplugged. >> + */ >> + if (!tctx) { >> + continue; >> + } >> + >> /* >> * HW checks that the CPU is enabled in the Physical Thread >> * Enable Register (PTER). >