On Wed, 24 May 2017 15:07:54 +1000 David Gibson <da...@gibson.dropbear.id.au> wrote:
> On Tue, May 23, 2017 at 01:18:11PM +0200, Laurent Vivier wrote: > > If the OS is not started, QEMU sends an event to the OS > > that is lost and cannot be recovered. An unplug is not > > able to restore QEMU in a coherent state. > > So, while the OS is not started, disable CPU and memory hotplug. > > We use option vector 6 to know if the OS is started > > > > Signed-off-by: Laurent Vivier <lviv...@redhat.com> > > Urgh.. I'm not terribly confident that this is really correct. As > discussed on the previous patch, you're essentially using OV6 as a > flag that CAS is complete. > > But while it undoubtedly makes the race window much smaller, I don't > see that there's any guarantee the guest OS will really be able to > handle hotplug events immediately after CAS. > > In particular if the CAS process completes partially but then needs to > trigger a reboot, I think that would end up setting the ov6 variable, > but the OS would definitely not be in a state to accept events. > We never have any guarantee that the OS will process an event that we've sent actually (think of a kernel crash just after a successful CAS negotiation for example, or any failure with the various guest components involved in the process of hotplug). > Mike, I really think we need some input from someone familiar with how > these hotplug events are supposed to work. What do we need to do to > handle lost or stale events, such as those delivered when an OS is not > booted. > AFAIK, in the PowerVM world, the HMC exposes a user configurable timeout. https://www.ibm.com/support/knowledgecenter/POWER8/p8hat/p8hat_dlparprocpoweraddp6.htm I'm not sure we can do anything better than being able to "cancel" a previous hotplug attempt if it takes too long, but I'm not necessarily the expert you're looking for :) > > --- > > hw/ppc/spapr.c | 22 +++++++++++++++++++--- > > 1 file changed, 19 insertions(+), 3 deletions(-) > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > index eceb4cc..2e9320d 100644 > > --- a/hw/ppc/spapr.c > > +++ b/hw/ppc/spapr.c > > @@ -2625,6 +2625,7 @@ out: > > static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState > > *dev, > > Error **errp) > > { > > + sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev); > > PCDIMMDevice *dimm = PC_DIMM(dev); > > PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); > > MemoryRegion *mr = ddc->get_memory_region(dimm); > > @@ -2645,6 +2646,13 @@ static void spapr_memory_pre_plug(HotplugHandler > > *hotplug_dev, DeviceState *dev, > > goto out; > > } > > > > + if (dev->hotplugged) { > > + if (!ms->os_name) { > > + error_setg(&local_err, "Memory hotplug not supported without > > OS"); > > + goto out; > > + } > > + } > > + > > out: > > error_propagate(errp, local_err); > > } > > @@ -2874,6 +2882,7 @@ static void spapr_core_pre_plug(HotplugHandler > > *hotplug_dev, DeviceState *dev, > > Error **errp) > > { > > MachineState *machine = MACHINE(OBJECT(hotplug_dev)); > > + sPAPRMachineState *ms = SPAPR_MACHINE(machine); > > MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev); > > Error *local_err = NULL; > > CPUCore *cc = CPU_CORE(dev); > > @@ -2884,9 +2893,16 @@ static void spapr_core_pre_plug(HotplugHandler > > *hotplug_dev, DeviceState *dev, > > int node_id; > > int index; > > > > - if (dev->hotplugged && !mc->has_hotpluggable_cpus) { > > - error_setg(&local_err, "CPU hotplug not supported for this > > machine"); > > - goto out; > > + if (dev->hotplugged) { > > + if (!mc->has_hotpluggable_cpus) { > > + error_setg(&local_err, > > + "CPU hotplug not supported for this machine"); > > + goto out; > > + } > > + if (!ms->os_name) { > > + error_setg(&local_err, "CPU hotplug not supported without OS"); > > + goto out; > > + } > > } > > > > if (strcmp(base_core_type, type)) { >
pgpvqcoIvgmxW.pgp
Description: OpenPGP digital signature