RE: [PATCH v3 5/6] device-dax: use fallback nid when numa_node is invalid
Hi Dan

> -----Original Message-----
> From: Dan Williams
> Sent: Thursday, July 9, 2020 11:39 AM
> To: Justin He
> Cc: Catalin Marinas; Will Deacon; Tony Luck; Fenghua Yu; Yoshinori Sato;
> Rich Felker; Dave Hansen; Andy Lutomirski; Peter Zijlstra;
> Thomas Gleixner; Ingo Molnar; Borislav Petkov; David Hildenbrand;
> X86 ML; H. Peter Anvin; Vishal Verma; Dave Jiang; Andrew Morton;
> Baoquan He; Chuhong Yuan; Mike Rapoport; Logan Gunthorpe;
> Masahiro Yamada; Michal Hocko; Linux ARM ker...@lists.infradead.org;
> Linux Kernel Mailing List ker...@vger.kernel.org;
> linux-i...@vger.kernel.org; Linux-sh s...@vger.kernel.org; linux-nvdimm;
> Linux MM; Jonathan Cameron; Kaly Xin
> Subject: Re: [PATCH v3 5/6] device-dax: use fallback nid when numa_node is
> invalid
>
> On Wed, Jul 8, 2020 at 7:07 PM Jia He wrote:
> >
> > numa_off is set unconditionally at the end of dummy_numa_init(),
> > even with a fake numa node. ACPI detects node id as NUMA_NO_NODE(-1) in
> > acpi_map_pxm_to_node() because it regards numa_off as turning off the numa
> > node. Hence dev_dax->target_node is NUMA_NO_NODE on arm64 with fake numa.
> >
> > Without this patch, pmem can't be probed as a RAM device on arm64 if SRAT table
> > isn't present:
> > $ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
> > kmem dax0.0: rejecting DAX region [mem 0x24040-0x2bfff] with invalid node: -1
> > kmem: probe of dax0.0 failed with error -22
> >
> > This fixes it by using fallback memory_add_physaddr_to_nid() as nid.
> >
> > Suggested-by: David Hildenbrand
> > Signed-off-by: Jia He
> > ---
> >  drivers/dax/kmem.c | 21 +++++++++++++--------
> >  1 file changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> > index 275aa5f87399..218f66057994 100644
> > --- a/drivers/dax/kmem.c
> > +++ b/drivers/dax/kmem.c
> > @@ -31,22 +31,23 @@ int dev_dax_kmem_probe(struct device *dev)
> >  	int numa_node;
> >  	int rc;
> >
> > +	/* Hotplug starting at the beginning of the next block: */
> > +	kmem_start = ALIGN(res->start, memory_block_size_bytes());
> > +
> >  	/*
> >  	 * Ensure good NUMA information for the persistent memory.
> >  	 * Without this check, there is a risk that slow memory
> >  	 * could be mixed in a node with faster memory, causing
> > -	 * unavoidable performance issues.
> > +	 * unavoidable performance issues. Furthermore, fallback node
> > +	 * id can be used when numa_node is invalid.
> >  	 */
> >  	numa_node = dev_dax->target_node;
> >  	if (numa_node < 0) {
> > -		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
> > -			 res, numa_node);
> > -		return -EINVAL;
> > +		numa_node = memory_add_physaddr_to_nid(kmem_start);
>
> I think this fixup belongs to the core to set a fallback value for
> dev_dax->target_node.
>
> I'm close to having patches to provide a functional
> phys_addr_to_target_node() for arm64.

Should this patch (5/6) wait on your new phys_addr_to_target_node() patch?
Thanks for the clarification.

--
Cheers,
Justin (Jia He)
Re: [PATCH v3 5/6] device-dax: use fallback nid when numa_node is invalid
On Wed, Jul 8, 2020 at 7:07 PM Jia He wrote:
>
> numa_off is set unconditionally at the end of dummy_numa_init(),
> even with a fake numa node. ACPI detects node id as NUMA_NO_NODE(-1) in
> acpi_map_pxm_to_node() because it regards numa_off as turning off the numa
> node. Hence dev_dax->target_node is NUMA_NO_NODE on arm64 with fake numa.
>
> Without this patch, pmem can't be probed as a RAM device on arm64 if SRAT table
> isn't present:
> $ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
> kmem dax0.0: rejecting DAX region [mem 0x24040-0x2bfff] with invalid node: -1
> kmem: probe of dax0.0 failed with error -22
>
> This fixes it by using fallback memory_add_physaddr_to_nid() as nid.
>
> Suggested-by: David Hildenbrand
> Signed-off-by: Jia He
> ---
>  drivers/dax/kmem.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> index 275aa5f87399..218f66057994 100644
> --- a/drivers/dax/kmem.c
> +++ b/drivers/dax/kmem.c
> @@ -31,22 +31,23 @@ int dev_dax_kmem_probe(struct device *dev)
>  	int numa_node;
>  	int rc;
>
> +	/* Hotplug starting at the beginning of the next block: */
> +	kmem_start = ALIGN(res->start, memory_block_size_bytes());
> +
>  	/*
>  	 * Ensure good NUMA information for the persistent memory.
>  	 * Without this check, there is a risk that slow memory
>  	 * could be mixed in a node with faster memory, causing
> -	 * unavoidable performance issues.
> +	 * unavoidable performance issues. Furthermore, fallback node
> +	 * id can be used when numa_node is invalid.
>  	 */
>  	numa_node = dev_dax->target_node;
>  	if (numa_node < 0) {
> -		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
> -			 res, numa_node);
> -		return -EINVAL;
> +		numa_node = memory_add_physaddr_to_nid(kmem_start);

I think this fixup belongs to the core to set a fallback value for
dev_dax->target_node.

I'm close to having patches to provide a functional
phys_addr_to_target_node() for arm64.
[PATCH v3 5/6] device-dax: use fallback nid when numa_node is invalid
numa_off is set unconditionally at the end of dummy_numa_init(),
even with a fake numa node. ACPI detects node id as NUMA_NO_NODE(-1) in
acpi_map_pxm_to_node() because it regards numa_off as turning off the numa
node. Hence dev_dax->target_node is NUMA_NO_NODE on arm64 with fake numa.

Without this patch, pmem can't be probed as a RAM device on arm64 if SRAT table
isn't present:
$ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
kmem dax0.0: rejecting DAX region [mem 0x24040-0x2bfff] with invalid node: -1
kmem: probe of dax0.0 failed with error -22

This fixes it by using fallback memory_add_physaddr_to_nid() as nid.

Suggested-by: David Hildenbrand
Signed-off-by: Jia He
---
 drivers/dax/kmem.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 275aa5f87399..218f66057994 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -31,22 +31,23 @@ int dev_dax_kmem_probe(struct device *dev)
 	int numa_node;
 	int rc;
 
+	/* Hotplug starting at the beginning of the next block: */
+	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+
 	/*
 	 * Ensure good NUMA information for the persistent memory.
 	 * Without this check, there is a risk that slow memory
 	 * could be mixed in a node with faster memory, causing
-	 * unavoidable performance issues.
+	 * unavoidable performance issues. Furthermore, fallback node
+	 * id can be used when numa_node is invalid.
 	 */
 	numa_node = dev_dax->target_node;
 	if (numa_node < 0) {
-		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
-			 res, numa_node);
-		return -EINVAL;
+		numa_node = memory_add_physaddr_to_nid(kmem_start);
+		dev_info(dev, "using nid %d for DAX region with undefined nid %pR\n",
+			numa_node, res);
 	}
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
-
 	kmem_size = resource_size(res);
 	/* Adjust the size down to compensate for moving up kmem_start: */
 	kmem_size -= kmem_start - res->start;
@@ -100,15 +101,19 @@ static int dev_dax_kmem_remove(struct device *dev)
 	resource_size_t kmem_start = res->start;
 	resource_size_t kmem_size = resource_size(res);
 	const char *res_name = res->name;
+	int numa_node = dev_dax->target_node;
 	int rc;
 
+	if (numa_node < 0)
+		numa_node = memory_add_physaddr_to_nid(kmem_start);
+
 	/*
 	 * We have one shot for removing memory, if some memory blocks were not
 	 * offline prior to calling this function remove_memory() will fail, and
 	 * there is no way to hotremove this memory until reboot because device
 	 * unbind will succeed even if we return failure.
 	 */
-	rc = remove_memory(dev_dax->target_node, kmem_start, kmem_size);
+	rc = remove_memory(numa_node, kmem_start, kmem_size);
 	if (rc) {
 		any_hotremove_failed = true;
 		dev_err(dev,
-- 
2.17.1
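
For readers following the probe-path change, here is a minimal userspace sketch of the ordering the patch introduces: align the start address first, shrink the size by the alignment gap, then fall back to an address-derived node id only when the target node is negative. This is not kernel code. The names align_up(), model_physaddr_to_nid(), plan_hotplug() and struct hotplug_range are hypothetical, and model_physaddr_to_nid() is only a stand-in that assumes every address maps to node 0; the real memory_add_physaddr_to_nid() is architecture-specific.

```c
#include <assert.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)

/*
 * Round addr up to the next multiple of block, in the spirit of the
 * kernel's ALIGN(). block must be a non-zero power of two.
 */
uint64_t align_up(uint64_t addr, uint64_t block)
{
	assert(block != 0 && (block & (block - 1)) == 0);
	return (addr + block - 1) & ~(block - 1);
}

/*
 * Hypothetical stand-in for memory_add_physaddr_to_nid().
 * Assumption for this model only: every address belongs to node 0.
 */
int model_physaddr_to_nid(uint64_t addr)
{
	(void)addr;
	return 0;
}

/* Result of planning a hotplug range (hypothetical helper type). */
struct hotplug_range {
	uint64_t start;	/* block-aligned start address */
	uint64_t size;	/* size left after moving the start up */
	int nid;	/* node id that would be used for the range */
};

/*
 * Mirror the order used in the patched probe path. Assumes the
 * region is larger than the gap created by aligning the start.
 */
struct hotplug_range plan_hotplug(uint64_t res_start, uint64_t res_size,
				  uint64_t block, int target_node)
{
	struct hotplug_range r;

	/* Hotplug starts at the beginning of the next block. */
	r.start = align_up(res_start, block);
	/* Shrink the size to compensate for moving the start up. */
	r.size = res_size - (r.start - res_start);
	/* Fall back to an address-derived node only when none was given. */
	r.nid = target_node < 0 ? model_physaddr_to_nid(r.start) : target_node;
	return r;
}
```

Calling plan_hotplug() with NUMA_NO_NODE exercises the fallback, while any non-negative target_node is passed through unchanged. Computing the aligned start before the node lookup matters because the fallback takes the aligned address as its argument, which is why the patch moves the ALIGN() call above the NUMA check.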