Re: [PATCH 16/21] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES

2020-05-18 Thread Hoan Tran

Hi Mike and Baoquan,

On 4/22/20 6:13 PM, Baoquan He wrote:

On 04/12/20 at 10:48pm, Mike Rapoport wrote:

From: Mike Rapoport 

The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions


This commit id should be a temporary one, will be changed when merged
into maintainer's tree and linus's tree. Only saying last patch plus the
patch subject is OK?


rather that check each PFN") made early_pfn_in_nid() obsolete and since
CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
implementation of early_pfn_in_nid() it is also not needed anymore.

Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES.

Co-developed-by: Hoan Tran 
Signed-off-by: Hoan Tran 
Signed-off-by: Mike Rapoport 
---
  arch/powerpc/Kconfig |  9 -
  arch/sparc/Kconfig   |  9 -
  arch/x86/Kconfig |  9 -
  mm/page_alloc.c  | 20 
  4 files changed, 47 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5f86b22b7d2c..74f316deeae1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -685,15 +685,6 @@ config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG
  
-# Some NUMA nodes have memory ranges that span

-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
  config STDBINUTILS
bool "Using standard binutils settings"
depends on 44x
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 795206b7b552..0e4f3891b904 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -286,15 +286,6 @@ config NODES_SHIFT
  Specify the maximum number of NUMA Nodes available on the target
  system.  Increases memory reserved to accommodate various tables.
  
-# Some NUMA nodes have memory ranges that span

-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
  config ARCH_SPARSEMEM_ENABLE
def_bool y if SPARC64
select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9d3e95b4fb85..37dac095659e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA
---help---
  Enable ACPI SRAT based node topology detection.
  
-# Some NUMA nodes have memory ranges that span

-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on X86_64_ACPI_NUMA
-
  config NUMA_EMU
bool "NUMA emulation"
depends on NUMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c43ce8709457..343d87b8697d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1541,26 +1541,6 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
  }
  #endif /* CONFIG_NEED_MULTIPLE_NODES */
  
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES

-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-   int nid;
-
-   nid = __early_pfn_to_nid(pfn, _pfnnid_cache);
-   if (nid >= 0 && nid != node)
-   return false;
-   return true;
-}
-
-#else
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-   return true;
-}
-#endif


And macro early_pfn_valid() is not needed either, we may need remove it
too.

Otherwise, removing NODES_SPAN_OTHER_NODES in this patch looks good.

Reviewed-by: Baoquan He 


I have tested this patch set on Arm64, and it worked as expected with 
the case where the node memory spans to other nodes or the old 
NODES_SPAN_OTHER_NODES config.


Hope to the whole patch set will be upstream soon.

Thanks and Regards
Hoan




-
-
  void __init memblock_free_pages(struct page *page, unsigned long pfn,
unsigned int order)
  {
--
2.25.1





Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-04-03 Thread Hoan Tran

Hi,


On 4/3/20 12:09 AM, Baoquan He wrote:

On 04/02/20 at 09:46pm, Hoan Tran wrote:

Hi All,

On 3/31/20 7:31 AM, Baoquan He wrote:

On 03/31/20 at 04:21pm, Michal Hocko wrote:

On Tue 31-03-20 22:03:32, Baoquan He wrote:

Hi Michal,

On 03/31/20 at 10:55am, Michal Hocko wrote:

On Tue 31-03-20 11:14:23, Mike Rapoport wrote:

Maybe I mis-read the code, but I don't see how this could happen. In the
HAVE_MEMBLOCK_NODE_MAP=y case, free_area_init_node() calls
calculate_node_totalpages() that ensures that node->node_zones are entirely
within the node because this is checked in zone_spanned_pages_in_node().


zone_spanned_pages_in_node does chech the zone boundaries are within the
node boundaries. But that doesn't really tell anything about other
potential zones interleaving with the physical memory range.
zone->spanned_pages simply gives the physical range for the zone
including holes. Interleaving nodes are essentially a hole
(__absent_pages_in_range is going to skip those).

That means that when free_area_init_core simply goes over the whole
physical zone range including holes and that is why we need to check
both for physical and logical holes (aka other nodes).

The life would be so much easier if the whole thing would simply iterate
over memblocks...


The memblock iterating sounds a great idea. I tried with putting the
memblock iterating in the upper layer, memmap_init(), which is used for
boot mem only anyway. Do you think it's doable and OK? It yes, I can
work out a formal patch to make this simpler as you said. The draft code
is as below. Like this it uses the existing code and involves little change.


Doing this would be a step in the right direction! I haven't checked the
code very closely though. The below sounds way too simple to be truth I
am afraid. First for_each_mem_pfn_range is available only for
CONFIG_HAVE_MEMBLOCK_NODE_MAP (which is one of the reasons why I keep
saying that I really hate that being conditional). Also I haven't really
checked the deferred initialization path - I have a very vague
recollection that it has been converted to the memblock api but I have
happilly dropped all that memory.


Thanks for your quick response and pointing out the rest suspect aspects,
I will investigate what you mentioned, see if they impact.


I would like to check if we still move on with my patch to remove
CONFIG_NODES_SPAN_OTHER_NODES and have another patch on top it?


I think we would like to replace CONFIG_NODES_SPAN_OTHER_NODES with
CONFIG_NUMA, and just let UMA return 0 as node id, as Michal replied in
another mail. Anyway, your patch 2~5 are still needed to sit on top of
the change of this new plan.


Got it. Thanks for quick response.

Regards
Hoan




Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-04-02 Thread Hoan Tran

Hi All,

On 3/31/20 7:31 AM, Baoquan He wrote:

On 03/31/20 at 04:21pm, Michal Hocko wrote:

On Tue 31-03-20 22:03:32, Baoquan He wrote:

Hi Michal,

On 03/31/20 at 10:55am, Michal Hocko wrote:

On Tue 31-03-20 11:14:23, Mike Rapoport wrote:

Maybe I mis-read the code, but I don't see how this could happen. In the
HAVE_MEMBLOCK_NODE_MAP=y case, free_area_init_node() calls
calculate_node_totalpages() that ensures that node->node_zones are entirely
within the node because this is checked in zone_spanned_pages_in_node().


zone_spanned_pages_in_node does chech the zone boundaries are within the
node boundaries. But that doesn't really tell anything about other
potential zones interleaving with the physical memory range.
zone->spanned_pages simply gives the physical range for the zone
including holes. Interleaving nodes are essentially a hole
(__absent_pages_in_range is going to skip those).

That means that when free_area_init_core simply goes over the whole
physical zone range including holes and that is why we need to check
both for physical and logical holes (aka other nodes).

The life would be so much easier if the whole thing would simply iterate
over memblocks...


The memblock iterating sounds a great idea. I tried with putting the
memblock iterating in the upper layer, memmap_init(), which is used for
boot mem only anyway. Do you think it's doable and OK? It yes, I can
work out a formal patch to make this simpler as you said. The draft code
is as below. Like this it uses the existing code and involves little change.


Doing this would be a step in the right direction! I haven't checked the
code very closely though. The below sounds way too simple to be truth I
am afraid. First for_each_mem_pfn_range is available only for
CONFIG_HAVE_MEMBLOCK_NODE_MAP (which is one of the reasons why I keep
saying that I really hate that being conditional). Also I haven't really
checked the deferred initialization path - I have a very vague
recollection that it has been converted to the memblock api but I have
happilly dropped all that memory.


Thanks for your quick response and pointing out the rest suspect aspects,
I will investigate what you mentioned, see if they impact.


I would like to check if we still move on with my patch to remove 
CONFIG_NODES_SPAN_OTHER_NODES and have another patch on top it?


Thanks
Hoan



  

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 138a56c0f48f..558d421f294b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6007,14 +6007,6 @@ void __meminit memmap_init_zone(unsigned long size, int 
nid, unsigned long zone,
 * function.  They do not exist on hotplugged memory.
 */
if (context == MEMMAP_EARLY) {
-   if (!early_pfn_valid(pfn)) {
-   pfn = next_pfn(pfn);
-   continue;
-   }
-   if (!early_pfn_in_nid(pfn, nid)) {
-   pfn++;
-   continue;
-   }
if (overlap_memmap_init(zone, ))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6130,9 +6122,17 @@ static void __meminit zone_init_free_lists(struct zone 
*zone)
  }
  
  void __meminit __weak memmap_init(unsigned long size, int nid,

- unsigned long zone, unsigned long start_pfn)
+ unsigned long zone, unsigned long 
range_start_pfn)
  {
-   memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
+   unsigned long start_pfn, end_pfn;
+   unsigned long range_end_pfn = range_start_pfn + size;
+   int i;
+   for_each_mem_pfn_range(i, nid, _pfn, _pfn, NULL) {
+   start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
+   end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
+   if (end_pfn > start_pfn)
+   memmap_init_zone(size, nid, zone, start_pfn, 
MEMMAP_EARLY, NULL);
+   }
  }
  
  static int zone_batchsize(struct zone *zone)


--
Michal Hocko
SUSE Labs





[PATCH v3 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-28 Thread Hoan Tran
In NUMA layout which nodes have memory ranges that span across other nodes,
the mm driver can detect the memory node id incorrectly.

For example, with layout below
Node 0 address:    
Node 1 address:    

Note:
 - Memory from low to high
 - 0/1: Node id
 - x: Invalid memory of a node

When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES
config, mm only checks the memory validity but not the node id.
Because of that, Node 1 also detects the memory from node 0 as below
when it scans from the start address to the end address of node 1.

Node 0 address:    
Node 1 address:    

This layout could occur on any architecture. Most of them enables
this config by default with CONFIG_NUMA. This patch, by default, enables
CONFIG_NODES_SPAN_OTHER_NODES or uses early_pfn_in_nid() for NUMA.

Signed-off-by: Hoan Tran 
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d047bf7..948c1c9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1467,7 +1467,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 }
 #endif
 
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES
+#ifdef CONFIG_NUMA
 /* Only safe to use early in boot when initialisation is single-threaded */
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
-- 
1.8.3.1



[PATCH v3 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2020-03-28 Thread Hoan Tran
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/s390/Kconfig | 8 
 1 file changed, 8 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index bc88841..d86066e 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -449,14 +449,6 @@ config NR_CPUS
 config HOTPLUG_CPU
def_bool y
 
-# Some NUMA nodes have memory ranges that span
-# other nodes. Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node. See memmap_init_zone()
-# for details. <- They meant memory holes!
-config NODES_SPAN_OTHER_NODES
-   def_bool NUMA
-
 config NUMA
bool "NUMA support"
depends on SCHED_TOPOLOGY
-- 
1.8.3.1



[PATCH v3 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2020-03-28 Thread Hoan Tran
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/sparc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index eb24cb1..6fc615e 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -292,15 +292,6 @@ config NODES_SHIFT
  Specify the maximum number of NUMA Nodes available on the target
  system.  Increases memory reserved to accommodate various tables.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config ARCH_SPARSEMEM_ENABLE
def_bool y if SPARC64
select SPARSEMEM_VMEMMAP_ENABLE
-- 
1.8.3.1



[PATCH v3 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2020-03-28 Thread Hoan Tran
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default. It is now enabled for x86(32 bit) configurations
and do not depend on X64_64_ACPI_NUMA config.
Because of that, on NUMA enabled system, early_pfn_in_nid() 
function is called by memmap_init_zone() during boot-time.
It doesn't affect the performance at run-time.

Signed-off-by: Hoan Tran 
---
 arch/x86/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e89499..a938738 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA
---help---
  Enable ACPI SRAT based node topology detection.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on X86_64_ACPI_NUMA
-
 config NUMA_EMU
bool "NUMA emulation"
depends on NUMA
-- 
1.8.3.1



[PATCH v3 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2020-03-28 Thread Hoan Tran
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by
default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/powerpc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e2a4121..4af2699 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -686,15 +686,6 @@ config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config STDBINUTILS
bool "Using standard binutils settings"
depends on 44x
-- 
1.8.3.1



[PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-28 Thread Hoan Tran
In NUMA layout which nodes have memory ranges that span across other nodes,
the mm driver can detect the memory node id incorrectly.

For example, with layout below
Node 0 address:    
Node 1 address:    

Note:
 - Memory from low to high
 - 0/1: Node id
 - x: Invalid memory of a node

When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES
config, mm only checks the memory validity but not the node id.
Because of that, Node 1 also detects the memory from node 0 as below
when it scans from the start address to the end address of node 1.

Node 0 address:    
Node 1 address:    

This layout could occur on any architecture. Most of them enables
this config by default with CONFIG_NUMA. This patch, by default, enables
CONFIG_NODES_SPAN_OTHER_NODES or uses early_pfn_in_nid() for NUMA.

v3:
 * Revise the patch description

V2:
 * Revise the patch description

Hoan Tran (5):
  mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
  powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

 arch/powerpc/Kconfig | 9 -
 arch/s390/Kconfig| 8 
 arch/sparc/Kconfig   | 9 -
 arch/x86/Kconfig | 9 -
 mm/page_alloc.c  | 2 +-
 5 files changed, 1 insertion(+), 36 deletions(-)

-- 
1.8.3.1



Re: [PATCH v2 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-08-06 Thread Hoan Tran OS
Hi Thomas,


On 7/15/19 11:43 AM, Thomas Gleixner wrote:
> On Thu, 11 Jul 2019, Hoan Tran OS wrote:
> 
>> Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
>> by default with NUMA.
> 
> As I told you before this does not mention that the option is now enabled
> even for x86(32bit) configurations which did not enable it before and does
> not longer depend on X86_64_ACPI_NUMA.

Agreed, let me add it into this patch description.

> 
> And there is still no rationale why this makes sense.
> 

As we know about the memmap_init_zone() function, it is used to 
initialize all pages. During initializing, early_pfn_in_nid() function 
makes sure the page is in the same node id. Otherwise, 
memmap_init_zone() only checks the page validity. It won't work with 
node memory spans across the others.

The option CONFIG_NODES_SPAN_OTHER_NODES is only used to enable 
early_pfn_in_nid() function.

It occurs during boot-time and won't affect the run-time performance.
And I saw the majority NUMA architectures enable this option by default 
with NUMA.

Thanks and Regards
Hoan


> Thanks,
> 
>   tglx
> 


Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-30 Thread Hoan Tran OS
Hi,

On 7/30/19 1:14 AM, Michal Hocko wrote:
> [Sorry for a late reply]
> 
> On Mon 15-07-19 17:55:07, Hoan Tran OS wrote:
>> Hi,
>>
>> On 7/12/19 10:00 PM, Michal Hocko wrote:
> [...]
>>> Hmm, I thought this was selectable. But I am obviously wrong here.
>>> Looking more closely, it seems that this is indeed only about
>>> __early_pfn_to_nid and as such not something that should add a config
>>> symbol. This should have been called out in the changelog though.
>>
>> Yes, do you have any other comments about my patch?
> 
> Not really. Just make sure to explicitly state that
> CONFIG_NODES_SPAN_OTHER_NODES is only about __early_pfn_to_nid and that
> doesn't really deserve it's own config and can be pulled under NUMA.

Yes, I will add this info into the patch description.

> 
>>> Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar
>>> bucket? Do we have any NUMA architecture that doesn't enable it?
>>>
>>
>> As I checked with arch Kconfig files, there are 2 architectures, riscv
>> and microblaze, do not support NUMA but enable this config.
>>
>> And 1 architecture, alpha, supports NUMA but does not enable this config.
> 
> Care to have a look and clean this up please?

Sure, I'll take a look.

Thanks
Hoan
> 




Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-15 Thread Hoan Tran OS
Hi,

On 7/12/19 10:00 PM, Michal Hocko wrote:
> On Fri 12-07-19 15:37:30, Will Deacon wrote:
>> Hi all,
>>
>> On Fri, Jul 12, 2019 at 02:12:23PM +0200, Michal Hocko wrote:
>>> On Fri 12-07-19 10:56:47, Hoan Tran OS wrote:
>>> [...]
>>>> It would be good if we can enable it by-default. Otherwise, let arch
>>>> enables it by them-self. Do you have any suggestions?
>>>
>>> I can hardly make any suggestions when it is not really clear _why_ you
>>> want to remove this config option in the first place. Please explain
>>> what motivated you to make this change.
>>
>> Sorry, I think this confusion might actually be my fault and Hoan has just
>> been implementing my vague suggestion here:
>>
>> https://lore.kernel.org/linux-arm-kernel/20190625101245.s4vxfosoop52gl4e@willie-the-truck/
>>
>> If the preference of the mm folks is to leave CONFIG_NODES_SPAN_OTHER_NODES
>> as it is, then we can define it for arm64. I just find it a bit weird that
>> the majority of NUMA-capable architectures have to add a symbol in the arch
>> Kconfig file, for what appears to be a performance optimisation applicable
>> only to ia64, mips and sh.
>>
>> At the very least we could make the thing selectable.
> 
> Hmm, I thought this was selectable. But I am obviously wrong here.
> Looking more closely, it seems that this is indeed only about
> __early_pfn_to_nid and as such not something that should add a config
> symbol. This should have been called out in the changelog though.

Yes, do you have any other comments about my patch?

> 
> Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar
> bucket? Do we have any NUMA architecture that doesn't enable it?
> 

As I checked with arch Kconfig files, there are 2 architectures, riscv 
and microblaze, do not support NUMA but enable this config.

And 1 architecture, alpha, supports NUMA but does not enable this config.

Thanks and Regards
Hoan

> Thanks!
> 


Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Hoan Tran OS
Hi,

On 7/12/19 2:02 PM, Michal Hocko wrote:
> On Thu 11-07-19 23:25:44, Hoan Tran OS wrote:
>> In NUMA layout which nodes have memory ranges that span across other nodes,
>> the mm driver can detect the memory node id incorrectly.
>>
>> For example, with layout below
>> Node 0 address:    
>> Node 1 address:    
>>
>> Note:
>>   - Memory from low to high
>>   - 0/1: Node id
>>   - x: Invalid memory of a node
>>
>> When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES
>> config, mm only checks the memory validity but not the node id.
>> Because of that, Node 1 also detects the memory from node 0 as below
>> when it scans from the start address to the end address of node 1.
>>
>> Node 0 address:    
>> Node 1 address:    
>>
>> This layout could occur on any architecture. This patch enables
>> CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue.
> 
> Yes it can occur on any arch but most sane platforms simply do not
> overlap physical ranges. So I do not really see any reason to
> unconditionally enable the config for everybody. What is an advantage?
> 

As I observed from arch folder, there are 9 arch support NUMA config.

./arch/ia64/Kconfig:387:config NUMA
./arch/powerpc/Kconfig:582:config NUMA
./arch/sparc/Kconfig:281:config NUMA
./arch/alpha/Kconfig:557:config NUMA
./arch/sh/mm/Kconfig:112:config NUMA
./arch/arm64/Kconfig:841:config NUMA
./arch/x86/Kconfig:1531:config NUMA
./arch/mips/Kconfig:2646:config NUMA
./arch/s390/Kconfig:441:config NUMA

And there are 5 arch enables CONFIG_NODES_SPAN_OTHER_NODES with NUMA

arch/powerpc/Kconfig:637:config NODES_SPAN_OTHER_NODES
arch/sparc/Kconfig:299:config NODES_SPAN_OTHER_NODES
arch/x86/Kconfig:1575:config NODES_SPAN_OTHER_NODES
arch/s390/Kconfig:446:config NODES_SPAN_OTHER_NODES
arch/arm64 (which I intended to enable in the original patch)

It would be good if we can enable it by-default. Otherwise, let arch 
enables it by them-self. Do you have any suggestions?

Thanks
Hoan




[PATCH v2 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-11 Thread Hoan Tran OS
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/s390/Kconfig | 8 
 1 file changed, 8 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 109243f..788a8e9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -438,14 +438,6 @@ config HOTPLUG_CPU
  can be controlled through /sys/devices/system/cpu/cpu#.
  Say N if you want to disable CPU hotplug.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes. Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node. See memmap_init_zone()
-# for details. <- They meant memory holes!
-config NODES_SPAN_OTHER_NODES
-   def_bool NUMA
-
 config NUMA
bool "NUMA support"
depends on SMP && SCHED_TOPOLOGY
-- 
2.7.4



[PATCH v2 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-11 Thread Hoan Tran OS
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/sparc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 26ab6f5..13449ea 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -291,15 +291,6 @@ config NODES_SHIFT
  Specify the maximum number of NUMA Nodes available on the target
  system.  Increases memory reserved to accommodate various tables.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config ARCH_SELECT_MEMORY_MODEL
def_bool y if SPARC64
 
-- 
2.7.4



[PATCH v2 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-11 Thread Hoan Tran OS
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled
by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/x86/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2bbbd4d..fa9318c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA
---help---
  Enable ACPI SRAT based node topology detection.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on X86_64_ACPI_NUMA
-
 config NUMA_EMU
bool "NUMA emulation"
depends on NUMA
-- 
2.7.4



[PATCH v2 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-11 Thread Hoan Tran OS
Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by
default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/powerpc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636..bdde8bc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -629,15 +629,6 @@ config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config STDBINUTILS
bool "Using standard binutils settings"
depends on 44x
-- 
2.7.4



[PATCH v2 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-11 Thread Hoan Tran OS
In NUMA layout which nodes have memory ranges that span across other nodes,
the mm driver can detect the memory node id incorrectly.

For example, with layout below
Node 0 address:    
Node 1 address:    

Note:
 - Memory from low to high
 - 0/1: Node id
 - x: Invalid memory of a node

When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES
config, mm only checks the memory validity but not the node id.
Because of that, Node 1 also detects the memory from node 0 as below
when it scans from the start address to the end address of node 1.

Node 0 address:    
Node 1 address:    

This layout could occur on any architecture. This patch enables
CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue.

Signed-off-by: Hoan Tran 
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d66bc8a..6335505 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1413,7 +1413,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 }
 #endif
 
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES
+#ifdef CONFIG_NUMA
 /* Only safe to use early in boot when initialisation is single-threaded */
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
-- 
2.7.4



[PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-11 Thread Hoan Tran OS
In NUMA layout which nodes have memory ranges that span across other nodes,
the mm driver can detect the memory node id incorrectly.

For example, with layout below
Node 0 address:    
Node 1 address:    

Note:
 - Memory from low to high
 - 0/1: Node id
 - x: Invalid memory of a node

When mm probes the memory map, without CONFIG_NODES_SPAN_OTHER_NODES
config, mm only checks the memory validity but not the node id.
Because of that, Node 1 also detects the memory from node 0 as below
when it scans from the start address to the end address of node 1.

Node 0 address:    
Node 1 address:    

This layout could occur on any architecture. This patch enables
CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA to fix this issue.

V2:
 * Revise the patch description

Hoan Tran (5):
  mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
  powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

 arch/powerpc/Kconfig | 9 -
 arch/s390/Kconfig| 8 
 arch/sparc/Kconfig   | 9 -
 arch/x86/Kconfig | 9 -
 mm/page_alloc.c  | 2 +-
 5 files changed, 1 insertion(+), 36 deletions(-)

-- 
2.7.4



Re: [PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-10 Thread Hoan Tran OS
Hi Thomas,


On 7/10/19 12:58 PM, Thomas Gleixner wrote:
> Hoan,
> 
> On Wed, 10 Jul 2019, Hoan Tran OS wrote:
>> On 6/25/19 3:45 PM, Thomas Gleixner wrote:
>>> On Tue, 25 Jun 2019, Hoan Tran OS wrote:
>>>> @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA
>>>>---help---
>>>>  Enable ACPI SRAT based node topology detection.
>>>>
>>>> -# Some NUMA nodes have memory ranges that span
>>>> -# other nodes.  Even though a pfn is valid and
>>>> -# between a node's start and end pfns, it may not
>>>> -# reside on that node.  See memmap_init_zone()
>>>> -# for details.
>>>> -config NODES_SPAN_OTHER_NODES
>>>> -  def_bool y
>>>> -  depends on X86_64_ACPI_NUMA
>>>
>>> the changelog does not mention that this lifts the dependency on
>>> X86_64_ACPI_NUMA and therefore enables that functionality for anything
>>> which has NUMA enabled including 32bit.
>>>
>>
>> I think this config is used for a NUMA layout which NUMA nodes addresses
>> are spanned to other nodes. I think 32bit NUMA also have the same issue
>> with that layout. Please correct me if I'm wrong.
> 
> I'm not saying you're wrong, but it's your duty to provide the analysis why
> this is correct for everything which has NUMA enabled.
> 
>>> The core mm change gives no helpful information either. You just copied the
>>> above comment text from some random Kconfig.
>>
>> Yes, as it's a correct comment and is used at multiple places.
> 
> Well it maybe correct in terms of explaining what this is about, it still
> does not explain why this is needed by default on everything which has NUMA
> enabled.

Let me send another patch with the detail explanation.

Thanks
Hoan

> 
> Thanks,
> 
>   tglx
> 


Re: [PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-07-09 Thread Hoan Tran OS
Hi Thomas,

Thanks for you comments

On 6/25/19 3:45 PM, Thomas Gleixner wrote:
> Hoan,
> 
> On Tue, 25 Jun 2019, Hoan Tran OS wrote:
> 
> Please use 'x86/Kconfig: ' as prefix.
> 
>> This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's
>> enabled by default with NUMA.
> 
> Please do not use 'This patch' in changelogs. It's pointless because we
> already know that this is a patch.
> 
> See also Documentation/process/submitting-patches.rst and search for 'This
> patch'
> 
> Simply say:
> 
>Remove CONFIG_NODES_SPAN_OTHER_NODES as it's enabled by default with
>NUMA.
> 

Got it, will fix

> But .
> 
>> @@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA
>>  ---help---
>>Enable ACPI SRAT based node topology detection.
>>   
>> -# Some NUMA nodes have memory ranges that span
>> -# other nodes.  Even though a pfn is valid and
>> -# between a node's start and end pfns, it may not
>> -# reside on that node.  See memmap_init_zone()
>> -# for details.
>> -config NODES_SPAN_OTHER_NODES
>> -def_bool y
>> -depends on X86_64_ACPI_NUMA
> 
> the changelog does not mention that this lifts the dependency on
> X86_64_ACPI_NUMA and therefore enables that functionality for anything
> which has NUMA enabled including 32bit.
> 

I think this config is used for a NUMA layout which NUMA nodes addresses 
are spanned to other nodes. I think 32bit NUMA also have the same issue 
with that layout. Please correct me if I'm wrong.

> The core mm change gives no helpful information either. You just copied the
> above comment text from some random Kconfig.

Yes, as it's a correct comment and is used at multiple places.

Thanks
Hoan

> 
> This needs a bit more data in the changelogs and the cover letter:
> 
>   - Why is it useful to enable it unconditionally
> 
>   - Why is it safe to do so, even if the architecture had constraints on
> it
> 
>   - What's the potential impact
> 
> Thanks,
> 
>   tglx
> 


[PATCH 5/5] s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-06-25 Thread Hoan Tran OS
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's
enabled by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/s390/Kconfig | 8 
 1 file changed, 8 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 109243f..788a8e9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -438,14 +438,6 @@ config HOTPLUG_CPU
  can be controlled through /sys/devices/system/cpu/cpu#.
  Say N if you want to disable CPU hotplug.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes. Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node. See memmap_init_zone()
-# for details. <- They meant memory holes!
-config NODES_SPAN_OTHER_NODES
-   def_bool NUMA
-
 config NUMA
bool "NUMA support"
depends on SMP && SCHED_TOPOLOGY
-- 
2.7.4



[PATCH 4/5] sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-06-25 Thread Hoan Tran OS
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's
enabled by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/sparc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 26ab6f5..13449ea 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -291,15 +291,6 @@ config NODES_SHIFT
  Specify the maximum number of NUMA Nodes available on the target
  system.  Increases memory reserved to accommodate various tables.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config ARCH_SELECT_MEMORY_MODEL
def_bool y if SPARC64
 
-- 
2.7.4



[PATCH 3/5] x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-06-25 Thread Hoan Tran OS
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's
enabled by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/x86/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2bbbd4d..fa9318c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1567,15 +1567,6 @@ config X86_64_ACPI_NUMA
---help---
  Enable ACPI SRAT based node topology detection.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on X86_64_ACPI_NUMA
-
 config NUMA_EMU
bool "NUMA emulation"
depends on NUMA
-- 
2.7.4



[PATCH 2/5] powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

2019-06-25 Thread Hoan Tran OS
This patch removes CONFIG_NODES_SPAN_OTHER_NODES as it's
enabled by default with NUMA.

Signed-off-by: Hoan Tran 
---
 arch/powerpc/Kconfig | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8c1c636..bdde8bc 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -629,15 +629,6 @@ config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-   def_bool y
-   depends on NEED_MULTIPLE_NODES
-
 config STDBINUTILS
bool "Using standard binutils settings"
depends on 44x
-- 
2.7.4



[PATCH 1/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-06-25 Thread Hoan Tran OS
This patch enables CONFIG_NODES_SPAN_OTHER_NODES by default
for NUMA. As some NUMA nodes have memory ranges that span other
nodes. Even though a pfn is valid and between a node's start and
end pfns, it may not reside on that node.

Signed-off-by: Hoan Tran 
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d66bc8a..6335505 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1413,7 +1413,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 }
 #endif
 
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES
+#ifdef CONFIG_NUMA
 /* Only safe to use early in boot when initialisation is single-threaded */
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
-- 
2.7.4



[PATCH 0/5] Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-06-25 Thread Hoan Tran OS
This patch set enables CONFIG_NODES_SPAN_OTHER_NODES by default
for NUMA.

The original patch [1]
[1] arm64: Kconfig: Enable NODES_SPAN_OTHER_NODES config for NUMA
https://www.spinics.net/lists/arm-kernel/msg737306.html

Hoan Tran (5):
  mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA
  powerpc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  x86: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  sparc: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES
  s390: Kconfig: Remove CONFIG_NODES_SPAN_OTHER_NODES

 arch/powerpc/Kconfig | 9 -
 arch/s390/Kconfig| 8 
 arch/sparc/Kconfig   | 9 -
 arch/x86/Kconfig | 9 -
 mm/page_alloc.c  | 2 +-
 5 files changed, 1 insertion(+), 36 deletions(-)

-- 
2.7.4