On 10/10/19 17:06, Michael S. Tsirkin wrote:
> On Thu, Oct 10, 2019 at 04:56:18PM +0200, Laszlo Ersek wrote:
>> On 10/09/19 15:22, Igor Mammedov wrote:
>>> Extend CPU hotplug interface to return architecture specific
>>> identifier for current CPU (in case of x86, it's APIC ID).
>>>
>>> Signed-off-by: Igor Mammedov <imamm...@redhat.com>
>>> ---
>>> TODO:
>>>   * cripple it to behave old way on old machine types so that
>>>     new firmware started on new QEMU won't see a difference
>>>     when migrated to an old QEMU (i.e. QEMU that doesn't support
>>>     this command)
>>> ---
>>>  docs/specs/acpi_cpu_hotplug.txt | 10 +++++++++-
>>>  hw/acpi/cpu.c                   | 15 +++++++++++++++
>>>  hw/acpi/trace-events            |  1 +
>>>  3 files changed, 25 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/docs/specs/acpi_cpu_hotplug.txt
>>> b/docs/specs/acpi_cpu_hotplug.txt
>>> index 43c5a193f0..0438678249 100644
>>> --- a/docs/specs/acpi_cpu_hotplug.txt
>>> +++ b/docs/specs/acpi_cpu_hotplug.txt
>>> @@ -32,7 +32,9 @@ Register block size:
>>>
>>>  read access:
>>>      offset:
>>> -    [0x0-0x3] reserved
>>> +    [0x0-0x3] Command data 2: (DWORD access)
>>> +              upper 32 bit of 'Command data' if returned data value is 64
>>> bit.
>>> +              in case of error or unsupported command reads is 0x0
>>
>> How about
>>
>>   [0x0] Command data 2: (DWORD access, little endian)
>>         If the "CPU selector" value last stored by the guest refers to
>>         an impossible CPU, then 0.
>>         Otherwise, if the "Command field" value last stored by the
>>         guest differs from 3, then 0.
>>         Otherwise, the most significant 32 bits of the selected CPU's
>>         architecture specific ID.
>>
>>   [0x8] Command data: (DWORD access, little endian)
>>         If the "CPU selector" value last stored by the guest refers to
>>         an impossible CPU, then 0.
>>         Otherwise,
>>         - if the "Command field" value last stored by the guest is 0,
>>           then the selector of the currently selected CPU;
>>         - if the "Command field" value last stored by the guest is 3,
>>           then the least significant 32 bits of the selected CPU's
>>           architecture specific ID;
>>         - otherwise, 0.
>>
>>>      [0x4] CPU device status fields: (1 byte access)
>>>          bits:
>>>             0: Device is enabled and may be used by guest
>>> @@ -87,6 +89,8 @@ write access:
>>>            2: stores value into OST status register, triggers
>>>               ACPI_DEVICE_OST QMP event from QEMU to external
>>> applications
>>>               with current values of OST event and status registers.
>>> +          3: OSPM reads architecture specific value identifying CPU
>>> +             (x86: APIC ID)
>>>            other values: reserved
>>>
>>
>> Seems OK.
>>
>>>  Selecting CPU device beyond possible range has no effect on platform:
>>> @@ -115,3 +119,7 @@ Typical usecases:
>>>         5.2 if 'Command data' register has not changed, there is not CPU
>>>             corresponding to iterator value and the last valid iterator value
>>>             equals to 'max_cpus' + 1
>>> +    - Get architecture specific id for a CPU
>>> +      1. pick a CPU to read from using 'CPU selector' register
>>> +      2. write 0x3 int0 'Command field' register
>>> +      3. read architecture specific id from 'Command data' register
>>
>> Looks good, except for:
>>
>> - typo: "int0"
>>
>> - in step 3, we should reference 'Command data 2' as well.
>>
>>
>> In fact, in
>> <http://mid.mail-archive.com/2b10ca48-c734-4f41-9521-136c44060812@redhat.com>,
>> I wrote, for the "Get a cpu with pending event" use case:
>>
>>> 1. Store 0x0 to the 'CPU selector' register.
>>> 2. Store 0x0 to the 'Command field' register.
>>> 3. Read the 'CPU device status fields' register.
>>> 4. If both bit#1 and bit#2 are clear in the value read, there is no
>>>    CPU with a pending event.
>>> 5. Otherwise, read the 'Command data' register. The value read is the
>>>    selector of the CPU with the pending event (which is already
>>>    selected).
>>
>> and your steps #2 and #3, for getting the arch specific ID, can be
>> directly appended as steps 6. and 7.!
>>
>>
>>> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
>>> index 87f30a31d7..701542d860 100644
>>> --- a/hw/acpi/cpu.c
>>> +++ b/hw/acpi/cpu.c
>>> @@ -12,11 +12,13 @@
>>>  #define ACPI_CPU_FLAGS_OFFSET_RW 4
>>>  #define ACPI_CPU_CMD_OFFSET_WR 5
>>>  #define ACPI_CPU_CMD_DATA_OFFSET_RW 8
>>> +#define ACPI_CPU_CMD_DATA2_OFFSET_RW 0
>>>
>>>  enum {
>>>      CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
>>>      CPHP_OST_EVENT_CMD = 1,
>>>      CPHP_OST_STATUS_CMD = 2,
>>> +    CPHP_GET_CPU_ID_CMD = 3,
>>>      CPHP_CMD_MAX
>>>  };
>>>
>>> @@ -74,11 +76,24 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr
>>> addr, unsigned size)
>>>          case CPHP_GET_NEXT_CPU_WITH_EVENT_CMD:
>>>              val = cpu_st->selector;
>>>              break;
>>> +        case CPHP_GET_CPU_ID_CMD:
>>> +            val = cpu_to_le64(cdev->arch_id) & 0xFFFFFFFF;
>>> +            break;
>>>          default:
>>>              break;
>>>          }
>>>          trace_cpuhp_acpi_read_cmd_data(cpu_st->selector, val);
>>>          break;
>>> +    case ACPI_CPU_CMD_DATA2_OFFSET_RW:
>>> +        switch (cpu_st->command) {
>>> +        case CPHP_GET_CPU_ID_CMD:
>>> +            val = cpu_to_le64(cdev->arch_id) >> 32;
>>> +            break;
>>> +        default:
>>> +            break;
>>> +        }
>>> +        trace_cpuhp_acpi_read_cmd_data2(cpu_st->selector, val);
>>> +        break;
>>>      default:
>>>          break;
>>>      }
>>> diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
>>> index 96b8273297..afbc77de1c 100644
>>> --- a/hw/acpi/trace-events
>>> +++ b/hw/acpi/trace-events
>>> @@ -23,6 +23,7 @@ cpuhp_acpi_read_flags(uint32_t idx, uint8_t flags)
>>> "idx[0x%"PRIx32"] flags: 0x%"
>>>  cpuhp_acpi_write_idx(uint32_t idx) "set active cpu idx: 0x%"PRIx32
>>>  cpuhp_acpi_write_cmd(uint32_t idx, uint8_t cmd) "idx[0x%"PRIx32"] cmd:
>>> 0x%"PRIx8
>>>  cpuhp_acpi_read_cmd_data(uint32_t idx, uint32_t data) "idx[0x%"PRIx32"]
>>> data: 0x%"PRIx32
>>> +cpuhp_acpi_read_cmd_data2(uint32_t idx, uint32_t data) "idx[0x%"PRIx32"]
>>> data: 0x%"PRIx32
>>>  cpuhp_acpi_cpu_has_events(uint32_t idx, bool ins, bool rm)
>>> "idx[0x%"PRIx32"] inserting: %d, removing: %d"
>>>  cpuhp_acpi_clear_inserting_evt(uint32_t idx) "idx[0x%"PRIx32"]"
>>>  cpuhp_acpi_clear_remove_evt(uint32_t idx) "idx[0x%"PRIx32"]"
>>>
>>
>> Looks plausible to me, thanks (discounting the TODO item).
>>
>> Right now, I can't offer testing for patch#3 (I'm quite far from the
>> point where I'll be actually looking for a hotplugged CPU :) ), but
>> based on the docs patches #1 and #2, and my proposed updates, I can
>> rework my "possible CPU count detection" in OVMF.
>>
>> Do I need to check in OVMF specifically whether the "modern" CPU hotplug
>> register block is available? Can you tell me what the oldest machine
>> types are that support the modern interface?
>>
>> Hmm... Commit abd49bc2ed2f ("docs: update ACPI CPU hotplug spec with new
>> protocol", 2016-06-24) seems relevant. First released in v2.7.0. I think
>> I should detect whether this interface is available.
>>
>> Can I use the following sequence to detect whether the interface is
>> available?
>>
>> 1. Store 0x0 to command register.
>> 2. Store 0x0 to selector register.
>> 3. Read 'command data' register.
>> 4. If value read is 0, the interface is available.
>>
>> (Because I assume that unmapped IO ports read as all-bits-one. Is that
>> right?)
>>
>> BTW, can I dynamically detect support for the GET_CPU_ID command too?
>> I'm thinking, when I enumerate / count all possible CPUs, I can at once
>> fetch the arch IDs for all of them. If I only get zeros from the command
>> data registers, across all CPUs, in response to GET_CPU_ID, then the
>> command is not available.
>>
>> Thanks
>> Laszlo
>
> Laszlo, won't we need to add topology info anyway?
> if yes then this patch is just a stopgap, so let's do
> fw cfg and be done with it?

No, CPU topology is not relevant *per se* for this OVMF feature (CPU
hotplug with SMM). CPU topology is only interesting as a building block.

What OVMF really needs, for the hotplug SMI handler, is the ability to:

- learn a QEMU-specific unique identifier for the CPU that has just been
  hot-plugged,

- translate said QEMU-specific unique CPU identifier to the new CPU's
  APIC-ID.

So how do we get there?

- If the hotplug register block (the hotplug hardware interface) lets me
  fetch the APIC-ID for the new CPU immediately, in the hotplug SMI
  handler, that's best (most convenient).

- If the QEMU-specific unique identifier fetched from the register block
  is a sequential CPU index (or "selector"), then I *could* use the CPU
  topology as a building block, for constructing the APIC-ID myself. I
  would decompose the CPU index into thread / core / die / socket
  offsets, based on the topology, and re-compose those offsets into an
  APIC-ID.

- If the QEMU-specific unique identifier fetched from the register block
  is a sequential CPU index (or "selector"), then I could *also* use a
  complete mapping table (index to APIC-ID), fetched earlier from
  fw_cfg. I would use the CPU index as a subscript into that table.

What really matters for OVMF is the APIC-ID of the just-plugged CPU.

I definitely have to access the CPU hotplug register block in the SMI
handler, because that's where I can get *any kind* of identifier for the
new CPU. (Unless we want to implement a dedicated CPU hotplug controller
for OVMF, that is.)

Thanks
Laszlo
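
For illustration only, the second option above (decomposing a sequential
CPU index by topology and re-composing the offsets into an APIC-ID) could
look roughly like the sketch below. It assumes that CPU indices are
assigned thread-first, then core, then die, then socket, and that each
topology level occupies a bit field just wide enough for its count, as in
the usual x86 APIC-ID layout. The struct, its field names, and both
function names are hypothetical; none of this is taken from QEMU or OVMF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical topology description; the field names are illustrative. */
typedef struct {
    uint32_t threads_per_core;
    uint32_t cores_per_die;
    uint32_t dies_per_socket;
} CpuTopology;

/* Number of bits needed to encode the values 0 .. count-1. */
static unsigned bits_for_count(uint32_t count)
{
    unsigned bits = 0;

    while (bits < 32 && ((uint32_t)1 << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Assumption: indices enumerate threads first, then cores, dies, sockets.
 * Split the index into per-level offsets, then pack each offset into its
 * own bit field, lowest level in the least significant bits.
 */
static uint32_t apic_id_from_cpu_index(const CpuTopology *t, uint32_t idx)
{
    assert(t->threads_per_core && t->cores_per_die && t->dies_per_socket);

    uint32_t per_die    = t->threads_per_core * t->cores_per_die;
    uint32_t per_socket = per_die * t->dies_per_socket;

    uint32_t thread = idx % t->threads_per_core;
    uint32_t core   = (idx / t->threads_per_core) % t->cores_per_die;
    uint32_t die    = (idx / per_die) % t->dies_per_socket;
    uint32_t socket = idx / per_socket;

    unsigned thread_bits = bits_for_count(t->threads_per_core);
    unsigned core_bits   = bits_for_count(t->cores_per_die);
    unsigned die_bits    = bits_for_count(t->dies_per_socket);

    return (socket << (thread_bits + core_bits + die_bits)) |
           (die    << (thread_bits + core_bits)) |
           (core   << thread_bits) |
           thread;
}
```

With 2 threads per core, 3 cores per die and 1 die per socket, the core
field is 2 bits wide, so index 6 (socket 1, core 0, thread 0) maps to
APIC-ID 8 rather than 6. That gap is why a plain index cannot stand in
for the APIC-ID when the counts are not powers of two.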