On Tue, Oct 08, 2019 at 05:59:31PM +0200, Igor Mammedov wrote: > On Tue, 8 Oct 2019 12:52:58 +0200 > Laszlo Ersek <ler...@redhat.com> wrote: > > > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware, > > due to historical reasons. That value is not useful to edk2, however. For > > supporting VCPU hotplug, edk2 needs: > > > > - the boot CPU count (already exposed in FW_CFG_NB_CPUS), > > > > - and the maximum foreseen CPU count (tracked in > > "MachineState.smp.max_cpus", but not currently exposed). > one can get it with current QEMU without adding new fgcfg > (albeit in a bit awkward way) > > max_cpu count can be derived indirectly as result of cpu hotplug > enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading > to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR) > until read value stops changing values (i.e. max cpu count > is reached). One also can figure out present/non-present > cpu status by reading flags register.
Right but so far ACPI was the only driver of CPU hotplug, right? I don't think firmware should poke it directly, it is not really clean or especially scalable. Fine for ACPI but not great as a HV/guest interface. > > Add a new fw-cfg file to expose "max_cpus". > > > > While at it, expose the rest of the topology too (die / core / thread > > counts), because I expect that the VCPU hotplug feature for OVMF will > > ultimately need those too, and the data size is not large. This is > > slightly complicated by the fact that the die count is specific to > > PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see > > commit 149c50cabcc4). > Could you clarify why topology info is necessary? > > Potentially it's possible to extend cpu hotplug ABI to report > arch specific apic-id (x86) or mpidr (arm) if firmware really > needs to know topology and let guest to decode it according > to CPU's spec. > > So far there were no need for it as all possible cpus are > described in ACPI tables passed to guest, but I'm not going > to suggest to parse them on firmware side as it's too complicated :) We can always add a QEMU specific data table by the way. Format would be up to us and would be easy to parse. I don't see a big advantage as compared to fw cfg though. > PS: > The reason we started building ACPI tables in QEMU, was never > ending story of adding more ABI and supporting it afterwards. > So I'd try to avoid doing it if it can be helped. Absolutely. We should try to keep all ACPI generation in QEMU. But this value is not for ACPI, is it? > > For now, the feature is temporarily disabled. > > > > Cc: "Michael S. Tsirkin" <m...@redhat.com> > > Cc: Eduardo Habkost <ehabk...@redhat.com> > > Cc: Igor Mammedov <imamm...@redhat.com> > > Cc: Marcel Apfelbaum <marcel.apfelb...@gmail.com> > > Cc: Paolo Bonzini <pbonz...@redhat.com> > > Cc: Philippe Mathieu-Daudé <phi...@redhat.com> > > Cc: Richard Henderson <r...@twiddle.net> > > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515 > > Signed-off-by: Laszlo Ersek <ler...@redhat.com> > > --- > > hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++- > > hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++-- > > hw/i386/pc.c | 4 ++-- > > 3 files changed, 55 insertions(+), 5 deletions(-) > > > > diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h > > index e0856a376996..d742435b9793 100644 > > --- a/hw/i386/fw_cfg.h > > +++ b/hw/i386/fw_cfg.h > > @@ -18,9 +18,37 @@ > > #define FW_CFG_E820_TABLE (FW_CFG_ARCH_LOCAL + 3) > > #define FW_CFG_HPET (FW_CFG_ARCH_LOCAL + 4) > > > > +/** > > + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over > > fw-cfg. > > + * > > + * All fields have little-endian encoding. > > + * > > + * @dies: Number of dies per package (aka socket). Set it to 1 unless > > the > > + * concrete MachineState subclass defines it differently. > > + * @cores: Corresponds to @CpuTopology.@cores. > > + * @threads: Corresponds to @CpuTopology.@threads. > > + * @max_cpus: Corresponds to @CpuTopology.@max_cpus. > > + * > > + * Firmware can derive the package (aka socket) count with the following > > + * formula: > > + * > > + * DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads) > > + * > > + * Firmware can derive APIC ID field widths and offsets per the standard > > + * calculations in "include/hw/i386/topology.h". > > + */ > > +typedef struct FWCfgX86Topology { > > + uint32_t dies; > > + uint32_t cores; > > + uint32_t threads; > > + uint32_t max_cpus; > > +} QEMU_PACKED FWCfgX86Topology; > > + > > FWCfgState *fw_cfg_arch_create(MachineState *ms, > > uint16_t boot_cpus, > > - uint16_t apic_id_limit); > > + uint16_t apic_id_limit, > > + unsigned smp_dies, > > + bool expose_topology); > > void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg); > > void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg); > > > > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c > > index 39b6bc60520c..33d09875014f 100644 > > --- a/hw/i386/fw_cfg.c > > +++ b/hw/i386/fw_cfg.c > > @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState > > *fw_cfg) > > } > > } > > > > +static void fw_cfg_expose_topology(FWCfgState *fw_cfg, > > + unsigned dies, > > + unsigned cores, > > + unsigned threads, > > + unsigned max_cpus) > > +{ > > + FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1); > > + > > + topo->dies = cpu_to_le32(dies); > > + topo->cores = cpu_to_le32(cores); > > + topo->threads = cpu_to_le32(threads); > > + topo->max_cpus = cpu_to_le32(max_cpus); > > + fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo); > > +} > > + > > FWCfgState *fw_cfg_arch_create(MachineState *ms, > > - uint16_t boot_cpus, > > - uint16_t apic_id_limit) > > + uint16_t boot_cpus, > > + uint16_t apic_id_limit, > > + unsigned smp_dies, > > + bool expose_topology) > > { > > FWCfgState *fw_cfg; > > uint64_t *numa_fw_cfg; > > @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms, > > (1 + apic_id_limit + nb_numa_nodes) * > > sizeof(*numa_fw_cfg)); > > > > + if (expose_topology) { > > + fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores, > > + ms->smp.threads, ms->smp.max_cpus); > > + } > > + > > return fw_cfg; > > } > > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > > index bcda50efcc23..bb72b12edad2 100644 > > --- a/hw/i386/pc.c > > +++ b/hw/i386/pc.c > > @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms, > > option_rom_mr, > > 1); > > > > - fw_cfg = fw_cfg_arch_create(machine, > > - pcms->boot_cpus, pcms->apic_id_limit); > > + fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, > > pcms->apic_id_limit, > > + pcms->smp_dies, false); > > > > rom_set_fw(fw_cfg); > >