Re: [PATCH v1 1/3] powerpc/mm/radix: Create separate mappings for hot-plugged memory

2020-06-23 Thread Aneesh Kumar K.V
Bharata B Rao  writes:

> Memory that gets hot-plugged _during_ boot (and not the memory that
> gets plugged in after boot) is mapped with 1G mappings and will
> undergo splitting when it is unplugged. The splitting code has a few
> issues:
>
> 1. Recursive locking
> --------------------
> Memory unplug path takes cpu_hotplug_lock and calls stop_machine()
> for splitting the mappings. However stop_machine() takes
> cpu_hotplug_lock again causing deadlock.
>
> 2. BUG: sleeping function called from in_atomic() context
> ---------------------------------------------------------
> Memory unplug path (remove_pagetable) takes the init_mm.page_table_lock
> spinlock and later calls stop_machine(), which does wait_for_completion().
>
> 3. Bad unlock unbalance
> -----------------------
> Memory unplug path takes the init_mm.page_table_lock spinlock and calls
> stop_machine(). The stop_machine() thread function runs in a different
> thread context (migration thread) which tries to release and reacquire
> ptl. Releasing ptl from a thread other than the one that acquired it
> causes the bad unlock unbalance.
>
> These problems can be avoided if we avoid mapping hot-plugged memory
> with 1G mappings, thereby removing the need for splitting them during
> unplug. Hence, during radix init, identify the hot-plugged memory
> regions and create separate mappings for each LMB so that they don't
> get mapped with 1G mappings. Identification of hot-plugged memory has
> become possible since commit b6eca183e23e ("powerpc/kernel: Enables
> memory hot-remove after reboot on pseries guests").
>
> To create separate mappings for every LMB in the hot-plugged region,
> we need the LMB size, for which we use memory_block_size_bytes().
> Since this is early init time code, the machine type isn't probed yet
> and hence memory_block_size_bytes() returns the default LMB size of
> 16MB. Hence we end up issuing more mapping requests than before.

Considering we can split 1G pages correctly, can we avoid doing this?



>
> Signed-off-by: Bharata B Rao 
> ---
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 8acb96de0e48..ffccfe00ca2a 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/memory.h>
>  
>  #include 
>  #include 
> @@ -320,6 +321,8 @@ static void __init radix_init_pgtable(void)
>  {
>   unsigned long rts_field;
>   struct memblock_region *reg;
> + phys_addr_t addr;
> + u64 lmb_size = memory_block_size_bytes();
>  
>   /* We don't support slb for radix */
>   mmu_slb_size = 0;
> @@ -338,9 +341,15 @@ static void __init radix_init_pgtable(void)
>   continue;
>   }
>  
> - WARN_ON(create_physical_mapping(reg->base,
> - reg->base + reg->size,
> - -1, PAGE_KERNEL));
> + if (memblock_is_hotpluggable(reg)) {
> + for (addr = reg->base; addr < (reg->base + reg->size);
> +  addr += lmb_size)
> + WARN_ON(create_physical_mapping(addr,
> + addr + lmb_size, -1, PAGE_KERNEL));
> + } else
> + WARN_ON(create_physical_mapping(reg->base,
> + reg->base + reg->size,
> + -1, PAGE_KERNEL));
>   }
>  
>   /* Find out how many PID bits are supported */
> -- 
> 2.21.3


[PATCH v1 1/3] powerpc/mm/radix: Create separate mappings for hot-plugged memory

2020-06-23 Thread Bharata B Rao
Memory that gets hot-plugged _during_ boot (and not the memory that
gets plugged in after boot) is mapped with 1G mappings and will
undergo splitting when it is unplugged. The splitting code has a few
issues:

1. Recursive locking
--------------------
Memory unplug path takes cpu_hotplug_lock and calls stop_machine()
for splitting the mappings. However stop_machine() takes
cpu_hotplug_lock again causing deadlock.

2. BUG: sleeping function called from in_atomic() context
---------------------------------------------------------
Memory unplug path (remove_pagetable) takes the init_mm.page_table_lock
spinlock and later calls stop_machine(), which does wait_for_completion().

3. Bad unlock unbalance
-----------------------
Memory unplug path takes the init_mm.page_table_lock spinlock and calls
stop_machine(). The stop_machine() thread function runs in a different
thread context (migration thread) which tries to release and reacquire
ptl. Releasing ptl from a thread other than the one that acquired it
causes the bad unlock unbalance.

These problems can be avoided if we avoid mapping hot-plugged memory
with 1G mappings, thereby removing the need for splitting them during
unplug. Hence, during radix init, identify the hot-plugged memory
regions and create separate mappings for each LMB so that they don't
get mapped with 1G mappings. Identification of hot-plugged memory has
become possible since commit b6eca183e23e ("powerpc/kernel: Enables
memory hot-remove after reboot on pseries guests").

To create separate mappings for every LMB in the hot-plugged region,
we need the LMB size, for which we use memory_block_size_bytes().
Since this is early init time code, the machine type isn't probed yet
and hence memory_block_size_bytes() returns the default LMB size of
16MB. Hence we end up issuing more mapping requests than before.

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 8acb96de0e48..ffccfe00ca2a 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include <linux/memory.h>
 
 #include 
 #include 
@@ -320,6 +321,8 @@ static void __init radix_init_pgtable(void)
 {
unsigned long rts_field;
struct memblock_region *reg;
+   phys_addr_t addr;
+   u64 lmb_size = memory_block_size_bytes();
 
/* We don't support slb for radix */
mmu_slb_size = 0;
@@ -338,9 +341,15 @@ static void __init radix_init_pgtable(void)
continue;
}
 
-   WARN_ON(create_physical_mapping(reg->base,
-   reg->base + reg->size,
-   -1, PAGE_KERNEL));
+   if (memblock_is_hotpluggable(reg)) {
+   for (addr = reg->base; addr < (reg->base + reg->size);
+addr += lmb_size)
+   WARN_ON(create_physical_mapping(addr,
+   addr + lmb_size, -1, PAGE_KERNEL));
+   } else
+   WARN_ON(create_physical_mapping(reg->base,
+   reg->base + reg->size,
+   -1, PAGE_KERNEL));
}
 
/* Find out how many PID bits are supported */
-- 
2.21.3
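
A minimal, standalone C sketch of the per-LMB mapping walk described in
the patch above, with create_physical_mapping() reduced to a printing
stub and made-up example values for the region base, size, and LMB size
(illustrative assumptions only; the real function and values come from
the kernel code in the diff):

/* Standalone illustration of mapping a hotpluggable region one LMB at
 * a time instead of with a single region-wide (1G-backed) mapping. */
#include <stdio.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Stand-in stub for the kernel's create_physical_mapping(). */
static void create_physical_mapping(phys_addr_t start, phys_addr_t end)
{
	printf("map [0x%llx - 0x%llx)\n",
	       (unsigned long long)start, (unsigned long long)end);
}

int main(void)
{
	phys_addr_t base = 0x100000000ULL;	/* example region start */
	phys_addr_t size = 0x40000000ULL;	/* example 1G region */
	phys_addr_t lmb_size = 0x1000000ULL;	/* 16MB default LMB size */
	phys_addr_t addr;

	/* Same loop shape as the patch: one mapping request per LMB,
	 * so the region is never backed by a single 1G mapping. */
	for (addr = base; addr < base + size; addr += lmb_size)
		create_physical_mapping(addr, addr + lmb_size);

	return 0;
}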