> -----Original Message----- > From: Laszlo Ersek [mailto:ler...@redhat.com] > Sent: 09 April 2019 16:09 > To: Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>; > qemu-devel@nongnu.org; qemu-...@nongnu.org; eric.au...@redhat.com; > imamm...@redhat.com > Cc: peter.mayd...@linaro.org; shannon.zha...@gmail.com; > sa...@linux.intel.com; sebastien.bo...@intel.com; xuwei (O) > <xuw...@huawei.com>; ard.biesheu...@linaro.org; Linuxarm > <linux...@huawei.com> > Subject: Re: [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in the > DT > > On 04/09/19 12:29, Shameer Kolothum wrote: > > This patch adds memory nodes corresponding to PC-DIMM regions. > > This will enable support for cold plugged device memory for Guests > > with DT boot. > > > > Signed-off-by: Shameer Kolothum <shameerali.kolothum.th...@huawei.com> > > Signed-off-by: Eric Auger <eric.au...@redhat.com> > > --- > > hw/arm/boot.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 42 insertions(+) > > > > diff --git a/hw/arm/boot.c b/hw/arm/boot.c > > index 8c840ba..150e1ed 100644 > > --- a/hw/arm/boot.c > > +++ b/hw/arm/boot.c > > @@ -19,6 +19,7 @@ > > #include "sysemu/numa.h" > > #include "hw/boards.h" > > #include "hw/loader.h" > > +#include "hw/mem/memory-device.h" > > #include "elf.h" > > #include "sysemu/device_tree.h" > > #include "qemu/config-file.h" > > @@ -538,6 +539,41 @@ static void fdt_add_psci_node(void *fdt) > > qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn); > > } > > > > +static int fdt_add_hotpluggable_memory_nodes(void *fdt, > > + uint32_t acells, > uint32_t scells) { > > + MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list(); > > + MemoryDeviceInfo *mi; > > + int ret = 0; > > + > > + for (info = info_list; info != NULL; info = info->next) { > > + mi = info->value; > > + switch (mi->type) { > > + case MEMORY_DEVICE_INFO_KIND_DIMM: > > + { > > + PCDIMMDeviceInfo *di = mi->u.dimm.data; > > + > > + ret = fdt_add_memory_node(fdt, acells, di->addr, scells, > > + di->size, di->node, true); > > + if (ret) { > > + fprintf(stderr, > > + "couldn't add PCDIMM /memory@%"PRIx64" > node\n", > > + di->addr); > > + goto out; > > + } > > + break; > > + } > > + default: > > + fprintf(stderr, "%s memory nodes are not yet supported\n", > > + MemoryDeviceInfoKind_str(mi->type)); > > + ret = -ENOENT; > > + goto out; > > + } > > + } > > +out: > > + qapi_free_MemoryDeviceInfoList(info_list); > > + return ret; > > +} > > + > > int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo, > > hwaddr addr_limit, AddressSpace *as) > > { > > @@ -637,6 +673,12 @@ int arm_load_dtb(hwaddr addr, const struct > arm_boot_info *binfo, > > } > > } > > > > + rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells); > > + if (rc < 0) { > > + fprintf(stderr, "couldn't add hotpluggable memory nodes\n"); > > + goto fail; > > + } > > + > > rc = fdt_path_offset(fdt, "/chosen"); > > if (rc < 0) { > > qemu_fdt_add_subnode(fdt, "/chosen"); > > > > > Given patches #7 and #8, as I understand them, the firmware cannot > distinguish hotpluggable & present, from hotpluggable & absent. The firmware > can only skip both hotpluggable cases. That's fine in that the firmware will > hog > neither type -- but is that OK for the OS as well, for both ACPI boot and DT > boot?
Right. This only handles the hotpluggable-and-present condition. > Consider in particular the "hotpluggable & present, ACPI boot" case. Assuming > we modify the firmware to skip "hotpluggable" altogether, the UEFI memmap > will not include the range despite it being present at boot. Presumably, ACPI > will refer to the range somehow, however. Will that not confuse the OS? From my testing so far, without patches #7 and #8(ie, no UEFI memmap entry), ACPI boots fine. I think ACPI only relies on aml and SRAT. > When Igor raised this earlier, I suggested that hotpluggable-and-present > should be added by the firmware, but also allocated immediately, as > EfiBootServicesData type memory. This will prevent other drivers in the > firmware from allocating AcpiNVS or Reserved chunks from the same memory > range, the UEFI memmap will contain the range as EfiBootServicesData, and > then the OS can release that allocation in one go early during boot. Ok. Agree that hotpluggable-and-present case it may make sense to make use of that memory rather than just hiding it altogether. On another note, Does hotpluggable-and-absent case make any valid use case for a DT boot? I am not sure how we can hot-add memory in the case of DT boot later. I have not verified the sysfs probe interface yet and there are discussions of dropping that interface altogether. > But this really has to be clarified from the Linux kernel's expectations. > Please > formalize all of the following cases: Sure. I will wait for suggestions here and work on it. Thanks, Shameer > OS boot (DT/ACPI) hotpluggable & ... GetMemoryMap() should report as > DT/ACPI should report as > ----------------- ------------------ ------------------------------- > ------------------------ > DT > present ? ? > DT > absent ? ? > ACPI > present ? ? > ACPI > absent ? ? > > Again, this table is dictated by Linux. > > Thanks > Laszlo