On Fri, 17 Nov 2017 14:21:27 -0200 Daniel Henrique Barboza <danie...@linux.vnet.ibm.com> wrote:
> On 11/17/2017 10:56 AM, Greg Kurz wrote: > > A DRC with a pending unplug request releases its associated device at > > machine reset time. > > > > In the case of LMB, when all DRCs for a DIMM device have been reset, > > the DIMM gets unplugged, causing guest memory to disappear. This may > > be very confusing for anything still using this memory. > > > > This is exactly what happens with vhost backends, and QEMU aborts > > with: > > > > qemu-system-ppc64: used ring relocated for ring 2 > > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion > > `r >= 0' failed. > > > > The issue is that each DRC registers a QEMU reset handler, and we > > don't control the order in which these handlers are called (ie, > > a LMB DRC will unplug a DIMM before the virtio device using the > > memory on this DIMM could stop its vhost backend). > > > > To avoid such situations, let's reset DRCs after all devices > > have been reset. > > > > Reported-by: Mallesh N. Koti <mall...@linux.vnet.ibm.com> > > Signed-off-by: Greg Kurz <gr...@kaod.org> > > --- > I believe this has to do with the problem discussed at the memory > unplug reboot with vhost=on, right? > > I don't see any problem with this patch and it's probably the best solution > short-term for the problem (and potentially other related LMB release > problems), but that said, how hard it is to forbid LMB unplug at all if the > memory is being used in vhost? > I don't know yet but I'll dig some more. > The way I see it, the root problem is that a device unplug failed at the > guest > side and we don't know about it, leaving drc->unplug_requested flags set > around the code when it shouldn't. Reading that thread I see a comment from > Michael S. Tsirkin saying that this bug is somewhat expected, that vhost > backend will > not behave well if memory is going away. Unlike other LMB unplug > failures from > the guest side that might happen for any other reason, this one sees > predicable > and we can avoid it. > > I think this is something worth considering. But for now, > > > Reviewed-by: Daniel Henrique Barboza <danie...@linux.vnet.ibm.com> > Thanks! > > hw/ppc/spapr.c | 21 +++++++++++++++++++++ > > hw/ppc/spapr_drc.c | 7 ------- > > 2 files changed, 21 insertions(+), 7 deletions(-) > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > index 6841bd294b3c..6285f7211f9a 100644 > > --- a/hw/ppc/spapr.c > > +++ b/hw/ppc/spapr.c > > @@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice > > *sbdev, void *opaque) > > } > > } > > > > +static int spapr_reset_drcs(Object *child, void *opaque) > > +{ > > + sPAPRDRConnector *drc = > > + (sPAPRDRConnector *) object_dynamic_cast(child, > > + TYPE_SPAPR_DR_CONNECTOR); > > + > > + if (drc) { > > + spapr_drc_reset(drc); > > + } > > + > > + return 0; > > +} > > + > > static void ppc_spapr_reset(void) > > { > > MachineState *machine = MACHINE(qdev_get_machine()); > > @@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void) > > } > > > > qemu_devices_reset(); > > + > > + /* DRC reset may cause a device to be unplugged. This will cause > > troubles > > + * if this device is used by another device (eg, a running vhost > > backend > > + * will crash QEMU if the DIMM holding the vring goes away). To avoid > > such > > + * situations, we reset DRCs after all devices have been reset. > > + */ > > + object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, > > NULL); > > + > > spapr_clear_pending_events(spapr); > > > > /* > > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c > > index 915e9b51c40c..e3b122968e89 100644 > > --- a/hw/ppc/spapr_drc.c > > +++ b/hw/ppc/spapr_drc.c > > @@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc) > > } > > } > > > > -static void drc_reset(void *opaque) > > -{ > > - spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque)); > > -} > > - > > bool spapr_drc_needed(void *opaque) > > { > > sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque; > > @@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp) > > } > > vmstate_register(DEVICE(drc), spapr_drc_index(drc), > > &vmstate_spapr_drc, > > drc); > > - qemu_register_reset(drc_reset, drc); > > trace_spapr_drc_realize_complete(spapr_drc_index(drc)); > > } > > > > @@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp) > > gchar *name; > > > > trace_spapr_drc_unrealize(spapr_drc_index(drc)); > > - qemu_unregister_reset(drc_reset, drc); > > vmstate_unregister(DEVICE(drc), &vmstate_spapr_drc, drc); > > root_container = container_get(object_get_root(), DRC_CONTAINER_PATH); > > name = g_strdup_printf("%x", spapr_drc_index(drc)); > > > > >