Re: [PATCH 16/21] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
Hi Mike and Baoquan, On 4/22/20 6:13 PM, Baoquan He wrote: On 04/12/20 at 10:48pm, Mike Rapoport wrote: From: Mike Rapoport The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions This commit id should be a temporary one, will be changed when merged into maintainer's tree and linus's tree. Only saying last patch plus the patch subject is OK? rather that check each PFN") made early_pfn_in_nid() obsolete and since CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real implementation of early_pfn_in_nid() it is also not needed anymore. Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES. Co-developed-by: Hoan Tran Signed-off-by: Hoan Tran Signed-off-by: Mike Rapoport --- arch/powerpc/Kconfig | 9 - arch/sparc/Kconfig | 9 - arch/x86/Kconfig | 9 - mm/page_alloc.c | 20 4 files changed, 47 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 5f86b22b7d2c..74f316deeae1 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -685,15 +685,6 @@ config ARCH_MEMORY_PROBE def_bool y depends on MEMORY_HOTPLUG -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config STDBINUTILS bool "Using standard binutils settings" depends on 44x diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 795206b7b552..0e4f3891b904 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -286,15 +286,6 @@ config NODES_SHIFT Specify the maximum number of NUMA Nodes available on the target system. Increases memory reserved to accommodate various tables. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config ARCH_SPARSEMEM_ENABLE def_bool y if SPARC64 select SPARSEMEM_VMEMMAP_ENABLE diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9d3e95b4fb85..37dac095659e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA ---help--- Enable ACPI SRAT based node topology detection. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on X86_64_ACPI_NUMA - config NUMA_EMU bool "NUMA emulation" depends on NUMA diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c43ce8709457..343d87b8697d 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1541,26 +1541,6 @@ int __meminit early_pfn_to_nid(unsigned long pfn) } #endif /* CONFIG_NEED_MULTIPLE_NODES */ -#ifdef CONFIG_NODES_SPAN_OTHER_NODES -/* Only safe to use early in boot when initialisation is single-threaded */ -static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node) -{ - int nid; - - nid = __early_pfn_to_nid(pfn, _pfnnid_cache); - if (nid >= 0 && nid != node) - return false; - return true; -} - -#else -static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node) -{ - return true; -} -#endif And macro early_pfn_valid() is not needed either, we may need remove it too. Otherwise, removing NODES_SPAN_OTHER_NODES in this patch looks good. Reviewed-by: Baoquan He I have tested this patch set on Arm64, and it worked as expected with the case where the node memory spans to other nodes or the old NODES_SPAN_OTHER_NODES config. Hope to the whole patch set will be upstream soon. Thanks and Regards Hoan - - void __init memblock_free_pages(struct page *page, unsigned long pfn, unsigned int order) { -- 2.25.1
Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Hi, On 4/3/20 12:09 AM, Baoquan He wrote: On 04/02/20 at 09:46pm, Hoan Tran wrote: Hi All, On 3/31/20 7:31 AM, Baoquan He wrote: On 03/31/20 at 04:21pm, Michal Hocko wrote: On Tue 31-03-20 22:03:32, Baoquan He wrote: Hi Michal, On 03/31/20 at 10:55am, Michal Hocko wrote: On Tue 31-03-20 11:14:23, Mike Rapoport wrote: Maybe I mis-read the code, but I don't see how this could happen. In the HAVE_MEMBLOCK_NODE_MAP=y case, free_area_init_node() calls calculate_node_totalpages() that ensures that node->node_zones are entirely within the node because this is checked in zone_spanned_pages_in_node(). zone_spanned_pages_in_node does chech the zone boundaries are within the node boundaries. But that doesn't really tell anything about other potential zones interleaving with the physical memory range. zone->spanned_pages simply gives the physical range for the zone including holes. Interleaving nodes are essentially a hole (__absent_pages_in_range is going to skip those). That means that when free_area_init_core simply goes over the whole physical zone range including holes and that is why we need to check both for physical and logical holes (aka other nodes). The life would be so much easier if the whole thing would simply iterate over memblocks... The memblock iterating sounds a great idea. I tried with putting the memblock iterating in the upper layer, memmap_init(), which is used for boot mem only anyway. Do you think it's doable and OK? It yes, I can work out a formal patch to make this simpler as you said. The draft code is as below. Like this it uses the existing code and involves little change. Doing this would be a step in the right direction! I haven't checked the code very closely though. The below sounds way too simple to be truth I am afraid. First for_each_mem_pfn_range is available only for CONFIG_HAVE_MEMBLOCK_NODE_MAP (which is one of the reasons why I keep saying that I really hate that being conditional). Also I haven't really checked the deferred initialization path - I have a very vague recollection that it has been converted to the memblock api but I have happilly dropped all that memory. Thanks for your quick response and pointing out the rest suspect aspects, I will investigate what you mentioned, see if they impact. I would like to check if we still move on with my patch to remove CONFIG_NODES_SPAN_OTHER_NODES and have another patch on top it? I think we would like to replace CONFIG_NODES_SPAN_OTHER_NODES with CONFIG_NUMA, and just let UMA return 0 as node id, as Michal replied in another mail. Anyway, your patch 2~5 are still needed to sit on top of the change of this new plan. Got it. Thanks for quick response. Regards Hoan
Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Hi All, On 3/31/20 7:31 AM, Baoquan He wrote: On 03/31/20 at 04:21pm, Michal Hocko wrote: On Tue 31-03-20 22:03:32, Baoquan He wrote: Hi Michal, On 03/31/20 at 10:55am, Michal Hocko wrote: On Tue 31-03-20 11:14:23, Mike Rapoport wrote: Maybe I mis-read the code, but I don't see how this could happen. In the HAVE_MEMBLOCK_NODE_MAP=y case, free_area_init_node() calls calculate_node_totalpages() that ensures that node->node_zones are entirely within the node because this is checked in zone_spanned_pages_in_node(). zone_spanned_pages_in_node does chech the zone boundaries are within the node boundaries. But that doesn't really tell anything about other potential zones interleaving with the physical memory range. zone->spanned_pages simply gives the physical range for the zone including holes. Interleaving nodes are essentially a hole (__absent_pages_in_range is going to skip those). That means that when free_area_init_core simply goes over the whole physical zone range including holes and that is why we need to check both for physical and logical holes (aka other nodes). The life would be so much easier if the whole thing would simply iterate over memblocks... The memblock iterating sounds a great idea. I tried with putting the memblock iterating in the upper layer, memmap_init(), which is used for boot mem only anyway. Do you think it's doable and OK? It yes, I can work out a formal patch to make this simpler as you said. The draft code is as below. Like this it uses the existing code and involves little change. Doing this would be a step in the right direction! I haven't checked the code very closely though. The below sounds way too simple to be truth I am afraid. First for_each_mem_pfn_range is available only for CONFIG_HAVE_MEMBLOCK_NODE_MAP (which is one of the reasons why I keep saying that I really hate that being conditional). Also I haven't really checked the deferred initialization path - I have a very vague recollection that it has been converted to the memblock api but I have happilly dropped all that memory. Thanks for your quick response and pointing out the rest suspect aspects, I will investigate what you mentioned, see if they impact. I would like to check if we still move on with my patch to remove CONFIG_NODES_SPAN_OTHER_NODES and have another patch on top it? Thanks Hoan diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 138a56c0f48f..558d421f294b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6007,14 +6007,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * function. They do not exist on hotplugged memory. */ if (context == MEMMAP_EARLY) { - if (!early_pfn_valid(pfn)) { - pfn = next_pfn(pfn); - continue; - } - if (!early_pfn_in_nid(pfn, nid)) { - pfn++; - continue; - } if (overlap_memmap_init(zone, )) continue; if (defer_init(nid, pfn, end_pfn)) @@ -6130,9 +6122,17 @@ static void __meminit zone_init_free_lists(struct zone *zone) } void __meminit __weak memmap_init(unsigned long size, int nid, - unsigned long zone, unsigned long start_pfn) + unsigned long zone, unsigned long range_start_pfn) { - memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL); + unsigned long start_pfn, end_pfn; + unsigned long range_end_pfn = range_start_pfn + size; + int i; + for_each_mem_pfn_range(i, nid, _pfn, _pfn, NULL) { + start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn); + end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn); + if (end_pfn > start_pfn) + memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL); + } } static int zone_batchsize(struct zone *zone) -- Michal Hocko SUSE Labs
[PATCH v3 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
In NUMA layout which nodes have memory ranges that span across other nodes, the mm driver can detect the memory node id incorrectly. For example, with layout below Node 0 address: Node 1 address: Note: - Memory from low to high - 0/1: Node id - x: Invalid memory of a node When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES config, mm only checks the memory validity but not the node id. Because of that, Node 1 also detects the memory from node 0 as below when it scans from the start address to the end address of node 1. Node 0 address: Node 1 address: This layout could occur on any architecture. Most of them enables this config by default with CONFIG_NUMA. This patch, by default, enables CONFIG_NODES_SPAN_OTHER_NODES or uses early_pfn_in_nid() for NUMA. Signed-off-by: Hoan Tran --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d047bf7..948c1c9 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1467,7 +1467,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn) } #endif -#ifdef CONFIG_NODES_SPAN_OTHER_NODES +#ifdef CONFIG_NUMA /* Only safe to use early in boot when initialisation is single-threaded */ static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node) { -- 1.8.3.1
[PATCH v3 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/s390/Kconfig | 8 1 file changed, 8 deletions(-) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index bc88841..d86066e 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -449,14 +449,6 @@ config NR_CPUS config HOTPLUG_CPU def_bool y -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. <- They meant memory holes! -config NODES_SPAN_OTHER_NODES - def_bool NUMA - config NUMA bool "NUMA support" depends on SCHED_TOPOLOGY -- 1.8.3.1
[PATCH v3 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/sparc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index eb24cb1..6fc615e 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -292,15 +292,6 @@ config NODES_SHIFT Specify the maximum number of NUMA Nodes available on the target system. Increases memory reserved to accommodate various tables. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config ARCH_SPARSEMEM_ENABLE def_bool y if SPARC64 select SPARSEMEM_VMEMMAP_ENABLE -- 1.8.3.1
[PATCH v3 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default. It is now enabled for x86(32 bit) configurations and do not depend on X64_64_ACPI_NUMA config. Because of that, on NUMA enabled system, early_pfn_in_nid() function is called by memmap_init_zone() during boot-time. It doesn't affect the performance at run-time. Signed-off-by: Hoan Tran --- arch/x86/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5e89499..a938738 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA ---help--- Enable ACPI SRAT based node topology detection. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on X86_64_ACPI_NUMA - config NUMA_EMU bool "NUMA emulation" depends on NUMA -- 1.8.3.1
[PATCH v3 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/powerpc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e2a4121..4af2699 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -686,15 +686,6 @@ config ARCH_MEMORY_PROBE def_bool y depends on MEMORY_HOTPLUG -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config STDBINUTILS bool "Using standard binutils settings" depends on 44x -- 1.8.3.1
[PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
In NUMA layout which nodes have memory ranges that span across other nodes, the mm driver can detect the memory node id incorrectly. For example, with layout below Node 0 address: Node 1 address: Note: - Memory from low to high - 0/1: Node id - x: Invalid memory of a node When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES config, mm only checks the memory validity but not the node id. Because of that, Node 1 also detects the memory from node 0 as below when it scans from the start address to the end address of node 1. Node 0 address: Node 1 address: This layout could occur on any architecture. Most of them enables this config by default with CONFIG_NUMA. This patch, by default, enables CONFIG_NODES_SPAN_OTHER_NODES or uses early_pfn_in_nid() for NUMA. v3: * Revise the patch description V2: * Revise the patch description Hoan Tran (5): mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES arch/powerpc/Kconfig | 9 - arch/s390/Kconfig| 8 arch/sparc/Kconfig | 9 - arch/x86/Kconfig | 9 - mm/page_alloc.c | 2 +- 5 files changed, 1 insertion(+), 36 deletions(-) -- 1.8.3.1
Re: [PATCH v2 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Hi Thomas, On 7/15/19 11:43 AM, Thomas Gleixner wrote: > On Thu, 11 Jul 2019, Hoan Tran OS wrote: > >> Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled >> by default with NUMA. > > As I told you before this does not mention that the option is now enabled > even for x86(32bit) configurations which did not enable it before and does > not longer depend on X86_64_ACPI_NUMA. Agreed, let me add it into this patch description. > > And there is still no rationale why this makes sense. > As we know about the memmap_init_zone() function, it is used to initialize all pages. During initializing, early_pfn_in_nid() function makes sure the page is in the same node id. Otherwise, memmap_init_zone() only checks the page validity. It won't work with node memory spans across the others. The option CONFIG_NODES_SPAN_OTHER_NODES is only used to enable early_pfn_in_nid() function. It occurs during boot-time and won't affect the run-time performance. And I saw the majority NUMA architectures enable this option by default with NUMA. Thanks and Regards Hoan > Thanks, > > tglx >
Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Hi, On 7/30/19 1:14 AM, Michal Hocko wrote: > [Sorry for a late reply] > > On Mon 15-07-19 17:55:07, Hoan Tran OS wrote: >> Hi, >> >> On 7/12/19 10:00 PM, Michal Hocko wrote: > [...] >>> Hmm, I thought this was selectable. But I am obviously wrong here. >>> Looking more closely, it seems that this is indeed only about >>> __early_pfn_to_nid and as such not something that should add a config >>> symbol. This should have been called out in the changelog though. >> >> Yes, do you have any other comments about my patch? > > Not really. Just make sure to explicitly state that > CONFIG_NODES_SPAN_OTHER_NODES is only about __early_pfn_to_nid and that > doesn't really deserve it's own config and can be pulled under NUMA. Yes, I will add this info into the patch description. > >>> Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar >>> bucket? Do we have any NUMA architecture that doesn't enable it? >>> >> >> As I checked with arch Kconfig files, there are 2 architectures, riscv >> and microblaze, do not support NUMA but enable this config. >> >> And 1 architecture, alpha, supports NUMA but does not enable this config. > > Care to have a look and clean this up please? Sure, I'll take a look. Thanks Hoan >
Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Hi, On 7/12/19 10:00 PM, Michal Hocko wrote: > On Fri 12-07-19 15:37:30, Will Deacon wrote: >> Hi all, >> >> On Fri, Jul 12, 2019 at 02:12:23PM +0200, Michal Hocko wrote: >>> On Fri 12-07-19 10:56:47, Hoan Tran OS wrote: >>> [...] >>>> It would be good if we can enable it by-default. Otherwise, let arch >>>> enables it by them-self. Do you have any suggestions? >>> >>> I can hardly make any suggestions when it is not really clear _why_ you >>> want to remove this config option in the first place. Please explain >>> what motivated you to make this change. >> >> Sorry, I think this confusion might actually be my fault and Hoan has just >> been implementing my vague suggestion here: >> >> https://lore.kernel.org/linux-arm-kernel/20190625101245.s4vxfosoop52gl4e@willie-the-truck/ >> >> If the preference of the mm folks is to leave CONFIG_NODES_SPAN_OTHER_NODES >> as it is, then we can define it for arm64. I just find it a bit weird that >> the majority of NUMA-capable architectures have to add a symbol in the arch >> Kconfig file, for what appears to be a performance optimisation applicable >> only to ia64, mips and sh. >> >> At the very least we could make the thing selectable. > > Hmm, I thought this was selectable. But I am obviously wrong here. > Looking more closely, it seems that this is indeed only about > __early_pfn_to_nid and as such not something that should add a config > symbol. This should have been called out in the changelog though. Yes, do you have any other comments about my patch? > > Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar > bucket? Do we have any NUMA architecture that doesn't enable it? > As I checked with arch Kconfig files, there are 2 architectures, riscv and microblaze, do not support NUMA but enable this config. And 1 architecture, alpha, supports NUMA but does not enable this config. Thanks and Regards Hoan > Thanks! >
Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
Hi, On 7/12/19 2:02 PM, Michal Hocko wrote: > On Thu 11-07-19 23:25:44, Hoan Tran OS wrote: >> In NUMA layout which nodes have memory ranges that span across other nodes, >> the mm driver can detect the memory node id incorrectly. >> >> For example, with layout below >> Node 0 address: >> Node 1 address: >> >> Note: >> - Memory from low to high >> - 0/1: Node id >> - x: Invalid memory of a node >> >> When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES >> config, mm only checks the memory validity but not the node id. >> Because of that, Node 1 also detects the memory from node 0 as below >> when it scans from the start address to the end address of node 1. >> >> Node 0 address: >> Node 1 address: >> >> This layout could occur on any architecture. This patch enables >> CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue. > > Yes it can occur on any arch but most sane platforms simply do not > overlap physical ranges. So I do not really see any reason to > unconditionally enable the config for everybody. What is an advantage? > As I observed from arch folder, there are 9 arch support NUMA config. ./arch/ia64/Kconfig:387:config NUMA ./arch/powerpc/Kconfig:582:config NUMA ./arch/sparc/Kconfig:281:config NUMA ./arch/alpha/Kconfig:557:config NUMA ./arch/sh/mm/Kconfig:112:config NUMA ./arch/arm64/Kconfig:841:config NUMA ./arch/x86/Kconfig:1531:config NUMA ./arch/mips/Kconfig:2646:config NUMA ./arch/s390/Kconfig:441:config NUMA And there are 5 arch enables CONFIG_NODES_SPAN_OTHER_NODES with NUMA arch/powerpc/Kconfig:637:config NODES_SPAN_OTHER_NODES arch/sparc/Kconfig:299:config NODES_SPAN_OTHER_NODES arch/x86/Kconfig:1575:config NODES_SPAN_OTHER_NODES arch/s390/Kconfig:446:config NODES_SPAN_OTHER_NODES arch/arm64 (which I intended to enable in the original patch) It would be good if we can enable it by-default. Otherwise, let arch enables it by them-self. Do you have any suggestions? Thanks Hoan
[PATCH v2 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/s390/Kconfig | 8 1 file changed, 8 deletions(-) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 109243f..788a8e9 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -438,14 +438,6 @@ config HOTPLUG_CPU can be controlled through /sys/devices/system/cpu/cpu#. Say N if you want to disable CPU hotplug. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. <- They meant memory holes! -config NODES_SPAN_OTHER_NODES - def_bool NUMA - config NUMA bool "NUMA support" depends on SMP && SCHED_TOPOLOGY -- 2.7.4
[PATCH v2 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/sparc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 26ab6f5..13449ea 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -291,15 +291,6 @@ config NODES_SHIFT Specify the maximum number of NUMA Nodes available on the target system. Increases memory reserved to accommodate various tables. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config ARCH_SELECT_MEMORY_MODEL def_bool y if SPARC64 -- 2.7.4
[PATCH v2 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/x86/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2bbbd4d..fa9318c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA ---help--- Enable ACPI SRAT based node topology detection. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on X86_64_ACPI_NUMA - config NUMA_EMU bool "NUMA emulation" depends on NUMA -- 2.7.4
[PATCH v2 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/powerpc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8c1c636..bdde8bc 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -629,15 +629,6 @@ config ARCH_MEMORY_PROBE def_bool y depends on MEMORY_HOTPLUG -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config STDBINUTILS bool "Using standard binutils settings" depends on 44x -- 2.7.4
[PATCH v2 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
In NUMA layout which nodes have memory ranges that span across other nodes, the mm driver can detect the memory node id incorrectly. For example, with layout below Node 0 address: Node 1 address: Note: - Memory from low to high - 0/1: Node id - x: Invalid memory of a node When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES config, mm only checks the memory validity but not the node id. Because of that, Node 1 also detects the memory from node 0 as below when it scans from the start address to the end address of node 1. Node 0 address: Node 1 address: This layout could occur on any architecture. This patch enables CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue. Signed-off-by: Hoan Tran --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8a..6335505 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1413,7 +1413,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn) } #endif -#ifdef CONFIG_NODES_SPAN_OTHER_NODES +#ifdef CONFIG_NUMA /* Only safe to use early in boot when initialisation is single-threaded */ static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node) { -- 2.7.4
[PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
In NUMA layout which nodes have memory ranges that span across other nodes, the mm driver can detect the memory node id incorrectly. For example, with layout below Node 0 address: Node 1 address: Note: - Memory from low to high - 0/1: Node id - x: Invalid memory of a node When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES config, mm only checks the memory validity but not the node id. Because of that, Node 1 also detects the memory from node 0 as below when it scans from the start address to the end address of node 1. Node 0 address: Node 1 address: This layout could occur on any architecture. This patch enables CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue. V2: * Revise the patch description Hoan Tran (5): mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES arch/powerpc/Kconfig | 9 - arch/s390/Kconfig| 8 arch/sparc/Kconfig | 9 - arch/x86/Kconfig | 9 - mm/page_alloc.c | 2 +- 5 files changed, 1 insertion(+), 36 deletions(-) -- 2.7.4
Re: [PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Hi Thomas, On 7/10/19 12:58 PM, Thomas Gleixner wrote: > Hoan, > > On Wed, 10 Jul 2019, Hoan Tran OS wrote: >> On 6/25/19 3:45 PM, Thomas Gleixner wrote: >>> On Tue, 25 Jun 2019, Hoan Tran OS wrote: >>>> @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA >>>>---help--- >>>> Enable ACPI SRAT based node topology detection. >>>> >>>> -# Some NUMA nodes have memory ranges that span >>>> -# other nodes. Even though a pfn is valid and >>>> -# between a node's start and end pfns, it may not >>>> -# reside on that node. See memmap_init_zone() >>>> -# for details. >>>> -config NODES_SPAN_OTHER_NODES >>>> - def_bool y >>>> - depends on X86_64_ACPI_NUMA >>> >>> the changelog does not mention that this lifts the dependency on >>> X86_64_ACPI_NUMA and therefore enables that functionality for anything >>> which has NUMA enabled including 32bit. >>> >> >> I think this config is used for a NUMA layout which NUMA nodes addresses >> are spanned to other nodes. I think 32bit NUMA also have the same issue >> with that layout. Please correct me if I'm wrong. > > I'm not saying you're wrong, but it's your duty to provide the analysis why > this is correct for everything which has NUMA enabled. > >>> The core mm change gives no helpful information either. You just copied the >>> above comment text from some random Kconfig. >> >> Yes, as it's a correct comment and is used at multiple places. > > Well it maybe correct in terms of explaining what this is about, it still > does not explain why this is needed by default on everything which has NUMA > enabled. Let me send another patch with the detail explanation. Thanks Hoan > > Thanks, > > tglx >
Re: [PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
Hi Thomas, Thanks for you comments On 6/25/19 3:45 PM, Thomas Gleixner wrote: > Hoan, > > On Tue, 25 Jun 2019, Hoan Tran OS wrote: > > Please use 'x86/Kconfig: ' as prefix. > >> This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's >> enabled by default with NUMA. > > Please do not use 'This patch' in changelogs. It's pointless because we > already know that this is a patch. > > See also Documentation/process/submitting-patches.rst and search for 'This > patch' > > Simply say: > >Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with >NUMA. > Got it, will fix > But . > >> @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA >> ---help--- >>Enable ACPI SRAT based node topology detection. >> >> -# Some NUMA nodes have memory ranges that span >> -# other nodes. Even though a pfn is valid and >> -# between a node's start and end pfns, it may not >> -# reside on that node. See memmap_init_zone() >> -# for details. >> -config NODES_SPAN_OTHER_NODES >> -def_bool y >> -depends on X86_64_ACPI_NUMA > > the changelog does not mention that this lifts the dependency on > X86_64_ACPI_NUMA and therefore enables that functionality for anything > which has NUMA enabled including 32bit. > I think this config is used for a NUMA layout which NUMA nodes addresses are spanned to other nodes. I think 32bit NUMA also have the same issue with that layout. Please correct me if I'm wrong. > The core mm change gives no helpful information either. You just copied the > above comment text from some random Kconfig. Yes, as it's a correct comment and is used at multiple places. Thanks Hoan > > This needs a bit more data in the changelogs and the cover letter: > > - Why is it useful to enable it unconditionally > > - Why is it safe to do so, even if the architecture had constraints on > it > > - What's the potential impact > > Thanks, > > tglx >
[PATCH 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/s390/Kconfig | 8 1 file changed, 8 deletions(-) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 109243f..788a8e9 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -438,14 +438,6 @@ config HOTPLUG_CPU can be controlled through /sys/devices/system/cpu/cpu#. Say N if you want to disable CPU hotplug. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. <- They meant memory holes! -config NODES_SPAN_OTHER_NODES - def_bool NUMA - config NUMA bool "NUMA support" depends on SMP && SCHED_TOPOLOGY -- 2.7.4
[PATCH 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/sparc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig index 26ab6f5..13449ea 100644 --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -291,15 +291,6 @@ config NODES_SHIFT Specify the maximum number of NUMA Nodes available on the target system. Increases memory reserved to accommodate various tables. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config ARCH_SELECT_MEMORY_MODEL def_bool y if SPARC64 -- 2.7.4
[PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/x86/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2bbbd4d..fa9318c 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA ---help--- Enable ACPI SRAT based node topology detection. -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on X86_64_ACPI_NUMA - config NUMA_EMU bool "NUMA emulation" depends on NUMA -- 2.7.4
[PATCH 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with NUMA. Signed-off-by: Hoan Tran --- arch/powerpc/Kconfig | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8c1c636..bdde8bc 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -629,15 +629,6 @@ config ARCH_MEMORY_PROBE def_bool y depends on MEMORY_HOTPLUG -# Some NUMA nodes have memory ranges that span -# other nodes. Even though a pfn is valid and -# between a node's start and end pfns, it may not -# reside on that node. See memmap_init_zone() -# for details. -config NODES_SPAN_OTHER_NODES - def_bool y - depends on NEED_MULTIPLE_NODES - config STDBINUTILS bool "Using standard binutils settings" depends on 44x -- 2.7.4
[PATCH 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
This patch enables CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA. As some NUMA nodes have memory ranges that span other nodes. Even though a pfn is valid and between a node's start and end pfns, it may not reside on that node. Signed-off-by: Hoan Tran --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8a..6335505 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1413,7 +1413,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn) } #endif -#ifdef CONFIG_NODES_SPAN_OTHER_NODES +#ifdef CONFIG_NUMA /* Only safe to use early in boot when initialisation is single-threaded */ static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node) { -- 2.7.4
[PATCH 0/5] Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
This patch set enables CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA. The original patch [1] [1] arm64: Kconfig: Enable NODES_SPAN_OTHER_NODES config for NUMA https://www.spinics.net/lists/arm-kernel/msg737306.html Hoan Tran (5): mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES arch/powerpc/Kconfig | 9 - arch/s390/Kconfig| 8 arch/sparc/Kconfig | 9 - arch/x86/Kconfig | 9 - mm/page_alloc.c | 2 +- 5 files changed, 1 insertion(+), 36 deletions(-) -- 2.7.4