On Wed, Feb 13, 2019 at 09:22:46AM +0100, Cédric Le Goater wrote:
> On 2/13/19 2:32 AM, David Gibson wrote:
> > On Tue, Feb 12, 2019 at 08:18:19AM +0100, Cédric Le Goater wrote:
> >> On 2/12/19 2:11 AM, David Gibson wrote:
> >>> On Mon, Jan 07, 2019 at 07:39:46PM +0100, Cédric Le Goater wrote:
> >>>> The interrupt mode is chosen by the CAS negotiation process and
> >>>> activated after a reset to take into account the required changes in
> >>>> the machine. This brings new constraints on how the associated KVM IRQ
> >>>> device is initialized.
> >>>>
> >>>> Currently, each model takes care of the initialization of the KVM
> >>>> device in their realize method but this is not possible anymore as the
> >>>> initialization needs to be done globally when the interrupt mode is
> >>>> known, i.e. when the machine is reset. It also means that we need a way
> >>>> to delete a KVM device when another mode is chosen.
> >>>>
> >>>> Also, to support migration, the QEMU objects holding the state to
> >>>> transfer should always be available but not necessarily activated.
> >>>>
> >>>> The overall approach of this proposal is to initialize both interrupt
> >>>> modes at the QEMU level and keep the IRQ number space in sync to allow
> >>>> switching from one mode to another. For the KVM side of things, the
> >>>> whole initialization of the KVM device, sources and presenters, is
> >>>> grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
> >>>> handlers are modified accordingly to handle the init and the delete
> >>>> sequences of the KVM device.
> >>>>
> >>>> As KVM is now initialized at reset, we lose the possibility to
> >>>> fall back to the QEMU emulated mode in case of failure and failures
> >>>> become fatal to the machine.
> >>>>
> >>>> Signed-off-by: Cédric Le Goater <c...@kaod.org>
> >>>> ---
> >>>>  hw/intc/spapr_xive.c     |  8 +---
> >>>>  hw/intc/spapr_xive_kvm.c | 27 ++++++++++++++
> >>>>  hw/intc/xics_kvm.c       | 25 +++++++++++++
> >>>>  hw/intc/xive.c           |  4 --
> >>>>  hw/ppc/spapr_irq.c       | 79 ++++++++++++++++++++++++++++------------
> >>>>  5 files changed, 109 insertions(+), 34 deletions(-)
> >>>>
> >>>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >>>> index 21f3c1ef0901..0661aca35900 100644
> >>>> --- a/hw/intc/spapr_xive.c
> >>>> +++ b/hw/intc/spapr_xive.c
> >>>> @@ -330,13 +330,7 @@ static void spapr_xive_realize(DeviceState *dev,
> >>>> Error **errp)
> >>>>      xive->eat = g_new0(XiveEAS, xive->nr_irqs);
> >>>>      xive->endt = g_new0(XiveEND, xive->nr_ends);
> >>>>
> >>>> -    if (kvmppc_xive_enabled()) {
> >>>> -        kvmppc_xive_connect(xive, &local_err);
> >>>> -        if (local_err) {
> >>>> -            error_propagate(errp, local_err);
> >>>> -            return;
> >>>> -        }
> >>>> -    } else {
> >>>> +    if (!kvmppc_xive_enabled()) {
> >>>>          /* TIMA initialization */
> >>>>          memory_region_init_io(&xive->tm_mmio, OBJECT(xive),
> >>>> &xive_tm_ops, xive,
> >>>>                                "xive.tima", 4ull << TM_SHIFT);
> >>>> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >>>> index d35814c1992e..3ebc947f2be7 100644
> >>>> --- a/hw/intc/spapr_xive_kvm.c
> >>>> +++ b/hw/intc/spapr_xive_kvm.c
> >>>> @@ -737,6 +737,15 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error
> >>>> **errp)
> >>>>      Error *local_err = NULL;
> >>>>      size_t esb_len;
> >>>>      size_t tima_len;
> >>>> +    CPUState *cs;
> >>>> +
> >>>> +    /*
> >>>> +     * The KVM XIVE device already in use. This is the case when
> >>>> +     * rebooting XIVE -> XIVE
> >>>
> >>> Can this case actually occur?  Further down you appear to
> >>> unconditionally destroy both KVM devices at reset time.
> >>
> >> I guess you are right. I will check.
> >>
> >>>> +     */
> >>>> +    if (xive->fd != -1) {
> >>>> +        return;
> >>>> +    }
> >>>>
> >>>>      if (!kvm_enabled() || !kvmppc_has_cap_xive()) {
> >>>>          error_setg(errp, "IRQ_XIVE capability must be present for KVM");
> >>>> @@ -800,6 +809,24 @@ void kvmppc_xive_connect(sPAPRXive *xive, Error
> >>>> **errp)
> >>>>      xive->change = qemu_add_vm_change_state_handler(
> >>>>          kvmppc_xive_change_state_handler, xive);
> >>>>
> >>>> +    /* Connect the presenters to the initial VCPUs of the machine */
> >>>> +    CPU_FOREACH(cs) {
> >>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>>> +
> >>>> +        kvmppc_xive_cpu_connect(cpu->tctx, &local_err);
> >>>> +        if (local_err) {
> >>>> +            error_propagate(errp, local_err);
> >>>> +            return;
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    /* Update the KVM sources */
> >>>> +    kvmppc_xive_source_reset(xsrc, &local_err);
> >>>> +    if (local_err) {
> >>>> +        error_propagate(errp, local_err);
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>>      kvm_kernel_irqchip = true;
> >>>>      kvm_msi_via_irqfd_allowed = true;
> >>>>      kvm_gsi_direct_mapping = true;
> >>>> diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
> >>>> index 1d21ff217b82..bfc35d71df7f 100644
> >>>> --- a/hw/intc/xics_kvm.c
> >>>> +++ b/hw/intc/xics_kvm.c
> >>>> @@ -448,6 +448,16 @@ static void rtas_dummy(PowerPCCPU *cpu,
> >>>> sPAPRMachineState *spapr,
> >>>>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp)
> >>>>  {
> >>>>      int rc;
> >>>> +    CPUState *cs;
> >>>> +    Error *local_err = NULL;
> >>>> +
> >>>> +    /*
> >>>> +     * The KVM XICS device already in use. This is the case when
> >>>> +     * rebooting XICS -> XICS
> >>>> +     */
> >>>> +    if (kernel_xics_fd != -1) {
> >>>> +        return 0;
> >>>> +    }
> >>>>
> >>>>      if (!kvm_enabled() || !kvm_check_extension(kvm_state,
> >>>> KVM_CAP_IRQ_XICS)) {
> >>>>          error_setg(errp,
> >>>> @@ -496,6 +506,21 @@ int xics_kvm_init(sPAPRMachineState *spapr, Error
> >>>> **errp)
> >>>>      kvm_msi_via_irqfd_allowed = true;
> >>>>      kvm_gsi_direct_mapping = true;
> >>>>
> >>>> +    /* Connect the presenters to the initial VCPUs of the machine */
> >>>> +    CPU_FOREACH(cs) {
> >>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>>> +
> >>>> +        icp_kvm_connect(cpu->icp, &local_err);
> >>>> +        if (local_err) {
> >>>> +            error_propagate(errp, local_err);
> >>>> +            goto fail;
> >>>> +        }
> >>>> +        icp_set_kvm_state(cpu->icp, 1);
> >>>> +    }
> >>>> +
> >>>> +    /* Update the KVM sources */
> >>>> +    ics_set_kvm_state(ICS_KVM(spapr->ics), 1);
> >>>> +
> >>>>      return 0;
> >>>>
> >>>>  fail:
> >>>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> >>>> index c5c2fbc3f8bc..c166eab5b210 100644
> >>>> --- a/hw/intc/xive.c
> >>>> +++ b/hw/intc/xive.c
> >>>> @@ -932,10 +932,6 @@ static void xive_source_reset(void *dev)
> >>>>
> >>>>      /* PQs are initialized to 0b01 (Q=1) which corresponds to "ints
> >>>> off" */
> >>>>      memset(xsrc->status, XIVE_ESB_OFF, xsrc->nr_irqs);
> >>>> -
> >>>> -    if (kvmppc_xive_enabled()) {
> >>>> -        kvmppc_xive_source_reset(xsrc, &error_fatal);
> >>>> -    }
> >>>>  }
> >>>>
> >>>>  static void xive_source_realize(DeviceState *dev, Error **errp)
> >>>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> >>>> index ba27d9d8e972..5592eec3787b 100644
> >>>> --- a/hw/ppc/spapr_irq.c
> >>>> +++ b/hw/ppc/spapr_irq.c
> >>>> @@ -98,20 +98,14 @@ static void spapr_irq_init_xics(sPAPRMachineState
> >>>> *spapr, Error **errp)
> >>>>      int nr_irqs = spapr->irq->nr_irqs;
> >>>>      Error *local_err = NULL;
> >>>>
> >>>> -    if (kvm_enabled()) {
> >>>> -        if (machine_kernel_irqchip_allowed(machine) &&
> >>>> -            !xics_kvm_init(spapr, &local_err)) {
> >>>> -            spapr->icp_type = TYPE_KVM_ICP;
> >>>> -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs,
> >>>> -                                          &local_err);
> >>>> -        }
> >>>> -        if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> >>>> -            error_prepend(&local_err,
> >>>> -                          "kernel_irqchip requested but unavailable: ");
> >>>> -            goto error;
> >>>
> >>> I don't see anything that replaces the irqchip_required logic, which
> >>> doesn't seem right.
> >>
> >> Yes. We do lose the ability to fall back to the emulated device in case
> >> of failure. It is not impossible to do but it will require more changes
> >> to check what the KVM capabilities are before starting the machine.
> >
> > Uh... it seems more like it's the other way around.  We'll always fall
> > back to emulated, even if we've explicitly said on the command line
> > that we don't want that.
>
> Ah yes. The init function might be also broken.
>
> XICS mode is a bit more difficult to handle than XIVE because we have
> different object types for the KVM device and the QEMU emulated device,
> and with the 'dual' mode, we activate the device at CAS reset time.
Yeah.. we should probably fix that.

> Failures being handled at reset time, should we keep the same logic and
> abort the machine at reset if the kernel irqchip is required ?
>
> But we won't be able to fall back on the QEMU emulated device if KVM
> XICS fails and if the kernel irqchip is only allowed. It should work for
> XIVE though.

That's fine.  If we've said that kernel irqchip is required, we
shouldn't fall back to emulation.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson