Re: [PATCH v7 25/35] nvdimm acpi: init the resource used by NVDIMM ACPI

Xiao Guangrong Fri, 06 Nov 2015 00:38:36 -0800


On 11/05/2015 10:49 PM, Igor Mammedov wrote:

On Thu, 5 Nov 2015 21:33:39 +0800
Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:



On 11/05/2015 09:03 PM, Igor Mammedov wrote:

On Thu, 5 Nov 2015 18:15:31 +0800
Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:



On 11/05/2015 05:58 PM, Igor Mammedov wrote:

On Mon,  2 Nov 2015 17:13:27 +0800
Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:

A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are

                                  ^^ missing one 0???

reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
for detailed design

A parameter, 'nvdimm-support', is introduced for PIIX4_PM and ICH9-LPC
that controls if nvdimm support is enabled, it is true on default and
it is false on 2.4 and its earlier version to keep compatibility

Signed-off-by: Xiao Guangrong <guangrong.x...@linux.intel.com>

[...]

@@ -33,6 +33,15 @@
     */
    #define MIN_NAMESPACE_LABEL_SIZE      (128UL << 10)

+/*
+ * A page staring from 0xFF00000 and IO port 0x0a18 - 0xa1b in guest are

                                    ^^^ missing 0 or value in define below has 
an extra 0

+ * reserved for NVDIMM ACPI emulation, refer to docs/specs/acpi_nvdimm.txt
+ * for detailed design.
+ */
+#define NVDIMM_ACPI_MEM_BASE          0xFF000000ULL

it still maps RAM at arbitrary place,
that's the reason why VMGenID patches hasn't been merged for
more than several months.
I'm not against of using (more exactly I'm for it) direct mapping
but we should reach consensus when and how to use it first.

I'd wouldn't use addresses below 4G as it may be used firmware or
legacy hardware and I won't bet that 0xFF000000ULL won't conflict
with anything.
An alternative place to allocate reserve from could be high memory.
For pc we have "reserved-memory-end" which currently makes sure
that hotpluggable memory range isn't used by firmware.

How about making API that allows to map additional memory
ranges before reserved-memory-end and pushes it up as mappings are
added.


That what i did in the v1/v2 versions, however, as you noticed, using 64-bit
address in ACPI in QEMU is not a easy work - we can not simply make
SSDT.rev = 2 to apply 64 bit address, the reason i have documented in
v3's changelog:

     3) we figure out a unused memory hole below 4G that is 0xFF00000 ~
        0xFFF00000, this range is large enough for NVDIMM ACPI as build 64-bit
        ACPI SSDT/DSDT table will break windows XP.
        BTW, only make SSDT.rev = 2 can not work since the width is only 
depended
        on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
        in ACPI spec:
| Note: For compatibility with ACPI versions before ACPI 2.0, the bit
| width of Integer objects is dependent on the ComplianceRevision of the DSDT.
| If the ComplianceRevision is less than 2, all integers are restricted to 32
| bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets
| the global integer width for all integers, including integers in SSDTs.
     4) use the lowest ACPI spec version to document AML terms.

The only way introducing 64 bit address is adding XSDT support that what
Michael did before, however, it seems it need great efforts to do it as
it will break OVMF. It's a long term workload. :(

to enable 64-bit integers in AML it's sufficient to change DSDT revision to 2,
I already have a patch that switches DSDT/SSDT to rev2.
Tests show it doesn't break WindowsXP (which is rev1) and uses 64-bit integers
on linux & later Windows versions.


Great, i remembered i did the similar test (directly change DSDT to rev2) and it
caused winXP blue screen. Could you please tell me where i can find your patch?

https://github.com/imammedo/qemu/commits/mhpt_table_v2
following changes revision:
  pc: acpi: bump DSDT/SSDT compliance revision to v2
and here is user:
  acpi: memhp: simplify MCRS() using 64-bit math

when writing ASL one shall make sure that only XP supported
features are in global scope, which is evaluated when tables
are loaded and features of rev2 and higher are inside methods.
That way XP doesn't crash as far as it doesn't evaluate unsupported
opcode and one can guard those opcodes checking _REV object if neccesary.


Really a good study case to me, i tried your patch and moved the 64 bit
staffs to the private method, it worked. :)

Igor, is this the API you want?

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6bf569a..aba29df 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1291,6 +1291,38 @@ FWCfgState *xen_load_linux(PCMachineState *pcms,
     return fw_cfg;
 }

+static void pc_reserve_high_memory_init(PCMachineState *pcms,
+                                        uint64_t base, uint64_t align)
+{
+    pcms->reserve_high_memory.current_addr = ROUND_UP(base, align);
+}
+
+static uint64_t
+pc_reserve_high_memory_end(PCMachineState *pcms, int64_t align)
+{
+    return ROUND_UP(pcms->reserve_high_memory.current_addr, align);
+}
+
+uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
+                                int64_t align, Error **errp)
+{
+    uint64_t end_addr, current_addr = pcms->reserve_high_memory.current_addr;
+
+    if (!current_addr) {
+        error_setg(errp, "reserved high memory is not initialized.");
+        return 0;
+    }
+
+    end_addr = pc_reserve_high_memory_end(pcms, align) + size;
+    if (current_addr > end_addr) {
+        error_setg(errp, "reserved high memory is not enough.");
+        return 0;
+    }
+
+    pcms->reserve_high_memory.current_addr = end_addr;
+    return end_addr;
+}
+
 FWCfgState *pc_memory_init(PCMachineState *pcms,
                            MemoryRegion *system_memory,
                            MemoryRegion *rom_memory,
@@ -1379,6 +1411,8 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
                            "hotplug-memory", hotplug_mem_size);
         memory_region_add_subregion(system_memory, pcms->hotplug_memory.base,
                                     &pcms->hotplug_memory.mr);
+        pc_reserve_high_memory_init(pcms, pcms->hotplug_memory.base +
+                                    hotplug_mem_size, 1ULL << 30);
     }

     /* Initialize PC system firmware */
@@ -1403,7 +1437,7 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
         uint64_t res_mem_end = pcms->hotplug_memory.base;

         if (!pcmc->broken_reserved_end) {
-            res_mem_end += memory_region_size(&pcms->hotplug_memory.mr);
+            res_mem_end = pc_reserve_high_memory_end(pcms, 0x1ULL << 30);
         }
         *val = cpu_to_le64(ROUND_UP(res_mem_end, 0x1ULL << 30));
         fw_cfg_add_file(fw_cfg, "etc/reserved-memory-end", val, sizeof(*val));
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 47162cf..fae3fea 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -20,6 +20,11 @@

 #define HPET_INTCAP "hpet-intcap"

+struct PCReserveHighMemory {
+    uint64_t current_addr;
+};
+typedef struct PCReserveHighMemory PCReserveHighMemory;
+
 /**
  * PCMachineState:
  * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
@@ -41,6 +46,7 @@ struct PCMachineState {
     OnOffAuto smm;
     bool enforce_aligned_dimm;
     ram_addr_t below_4g_mem_size, above_4g_mem_size;
+    PCReserveHighMemory reserve_high_memory;
 };

 #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
@@ -175,6 +181,9 @@ PcGuestInfo *pc_guest_info_init(PCMachineState *pcms);

 void pc_set_legacy_acpi_data_size(void);

+uint64_t pc_reserve_high_memory(PCMachineState *pcms, uint64_t size,
+                                int64_t align, Error **errp);
+
 #define PCI_HOST_PROP_PCI_HOLE_START   "pci-hole-start"
 #define PCI_HOST_PROP_PCI_HOLE_END     "pci-hole-end"
 #define PCI_HOST_PROP_PCI_HOLE64_START "pci-hole64-start"


The luck thing is, the ACPI part is not ABI, we can move it to the high
memory if QEMU supports XSDT is ready in future development.

But mapped control region is ABI and we can't change it if we find out later
that it breaks something.


But the ACPI code is completely built by QEMU, which is transparent to guest
and guest should not depend on it, no?

unfortunately no, think about cross-version migration.


It makes sense. Stupid me. :(


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 25/35] nvdimm acpi: init the resource used by NVDIMM ACPI

Reply via email to