Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread piliu



On 04/09/2020 03:26 PM, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
 In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
 blocks as removable"), the user space interface to compute whether a memory
 block can be offlined (exposed via
 /sys/devices/system/memory/memoryX/removable) has effectively been
 deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>

 When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
 we'll start by:
 1. Testing if it contains any holes, and reject if so
 2. Testing if pages belong to different zones, and reject if so
 3. Isolating the page range, checking if it contains any unmovable pages

 Using is_mem_section_removable() before trying to offline is not only racy,
 it can easily result in false positives/negatives. Let's stop manually
 checking is_mem_section_removable(), and let device_offline() handle it
 completely instead. We can remove the racy is_mem_section_removable()
 implementation next.

 We now take more locks (e.g., memory hotplug lock when offlining and the
 zone lock when isolating), but maybe we should optimize that
 implementation instead if this ever becomes a real problem (after all,
 memory unplug is already an expensive operation). We started using
 is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
 Implement memory hotplug remove in the kernel"), with the initial
 hotremove support of lmbs.

 Cc: Nathan Fontenot 
 Cc: Michael Ellerman 
 Cc: Benjamin Herrenschmidt 
 Cc: Paul Mackerras 
 Cc: Michal Hocko 
 Cc: Andrew Morton 
 Cc: Oscar Salvador 
 Cc: Baoquan He 
 Cc: Wei Yang 
 Signed-off-by: David Hildenbrand 
 ---
  .../platforms/pseries/hotplug-memory.c| 26 +++
  1 file changed, 3 insertions(+), 23 deletions(-)

 diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
 b/arch/powerpc/platforms/pseries/hotplug-memory.c
 index b2cde1732301..5ace2f9a277e 100644
 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
 +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
 @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct 
 device_node *np)
  
  static bool lmb_is_removable(struct drmem_lmb *lmb)
  {
 -  int i, scns_per_block;
 -  bool rc = true;
 -  unsigned long pfn, block_sz;
 -  u64 phys_addr;
 -
if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
return false;
  
 -  block_sz = memory_block_size_bytes();
 -  scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
 -  phys_addr = lmb->base_addr;
 -
  #ifdef CONFIG_FA_DUMP
/*
 * Don't hot-remove memory that falls in fadump boot memory area
 * and memory that is reserved for capturing old kernel memory.
 */
 -  if (is_fadump_memory_area(phys_addr, block_sz))
 +  if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
return false;
  #endif
 -
 -  for (i = 0; i < scns_per_block; i++) {
 -  pfn = PFN_DOWN(phys_addr);
 -  if (!pfn_in_present_section(pfn)) {
 -  phys_addr += MIN_MEMORY_BLOCK_SIZE;
 -  continue;
 -  }
 -
 -  rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
 -  phys_addr += MIN_MEMORY_BLOCK_SIZE;
 -  }
 -
 -  return rc;
 +  /* device_offline() will determine if we can actually remove this lmb */
 +  return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
> 
> Thanks Pingfan,
> 
> 1. "swaps the check and do sequence":
> 
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.
> 
> 2. "breaks dlpar_memory_remove_by_count()"
> 
> Can you elaborate why it "breaks" it? It will simply try to
> offline+remove lmbs, detect that it wasn't able to offline+remove as
> much as it wanted (which could happen before as well easily), and re-add
> the already offlined+removed ones.
> 
I overlooked the re-add logic. Then I think
dlpar_memory_remove_by_count() is OK with this patch.
> 3. "more effort to re-arrange the code"
> 
> What would be your suggestion?
> 
I had thought about merging the two 

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread David Hildenbrand
On 09.04.20 09:26, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
 In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
 blocks as removable"), the user space interface to compute whether a memory
 block can be offlined (exposed via
 /sys/devices/system/memory/memoryX/removable) has effectively been
 deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>

 When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
 we'll start by:
 1. Testing if it contains any holes, and reject if so
 2. Testing if pages belong to different zones, and reject if so
 3. Isolating the page range, checking if it contains any unmovable pages

 Using is_mem_section_removable() before trying to offline is not only racy,
 it can easily result in false positives/negatives. Let's stop manually
 checking is_mem_section_removable(), and let device_offline() handle it
 completely instead. We can remove the racy is_mem_section_removable()
 implementation next.

 We now take more locks (e.g., memory hotplug lock when offlining and the
 zone lock when isolating), but maybe we should optimize that
 implementation instead if this ever becomes a real problem (after all,
 memory unplug is already an expensive operation). We started using
 is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
 Implement memory hotplug remove in the kernel"), with the initial
 hotremove support of lmbs.

 Cc: Nathan Fontenot 
 Cc: Michael Ellerman 
 Cc: Benjamin Herrenschmidt 
 Cc: Paul Mackerras 
 Cc: Michal Hocko 
 Cc: Andrew Morton 
 Cc: Oscar Salvador 
 Cc: Baoquan He 
 Cc: Wei Yang 
 Signed-off-by: David Hildenbrand 
 ---
  .../platforms/pseries/hotplug-memory.c| 26 +++
  1 file changed, 3 insertions(+), 23 deletions(-)

 diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
 b/arch/powerpc/platforms/pseries/hotplug-memory.c
 index b2cde1732301..5ace2f9a277e 100644
 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
 +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
 @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct 
 device_node *np)
  
  static bool lmb_is_removable(struct drmem_lmb *lmb)
  {
 -  int i, scns_per_block;
 -  bool rc = true;
 -  unsigned long pfn, block_sz;
 -  u64 phys_addr;
 -
if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
return false;
  
 -  block_sz = memory_block_size_bytes();
 -  scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
 -  phys_addr = lmb->base_addr;
 -
  #ifdef CONFIG_FA_DUMP
/*
 * Don't hot-remove memory that falls in fadump boot memory area
 * and memory that is reserved for capturing old kernel memory.
 */
 -  if (is_fadump_memory_area(phys_addr, block_sz))
 +  if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
return false;
  #endif
 -
 -  for (i = 0; i < scns_per_block; i++) {
 -  pfn = PFN_DOWN(phys_addr);
 -  if (!pfn_in_present_section(pfn)) {
 -  phys_addr += MIN_MEMORY_BLOCK_SIZE;
 -  continue;
 -  }
 -
 -  rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
 -  phys_addr += MIN_MEMORY_BLOCK_SIZE;
 -  }
 -
 -  return rc;
 +  /* device_offline() will determine if we can actually remove this lmb */
 +  return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
> 
> Thanks Pingfan,
> 
> 1. "swaps the check and do sequence":
> 
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.

Sorry, s/dlpar_remove_lmb/lmb_is_removable/


-- 
Thanks,

David / dhildenb



Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread Michal Hocko
On Thu 09-04-20 10:12:20, David Hildenbrand wrote:
> On 09.04.20 09:59, Michal Hocko wrote:
> > On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> >> David Hildenbrand  writes:
> >>
> >>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> >>> blocks as removable"), the user space interface to compute whether a 
> >>> memory
> >>> block can be offlined (exposed via
> >>> /sys/devices/system/memory/memoryX/removable) has effectively been
> >>> deprecated. We want to remove the leftovers of the kernel implementation.
> >>>
> >>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> >>> we'll start by:
> >>> 1. Testing if it contains any holes, and reject if so
> >>> 2. Testing if pages belong to different zones, and reject if so
> >>> 3. Isolating the page range, checking if it contains any unmovable pages
> >>>
> >>> Using is_mem_section_removable() before trying to offline is not only 
> >>> racy,
> >>> it can easily result in false positives/negatives. Let's stop manually
> >>> checking is_mem_section_removable(), and let device_offline() handle it
> >>> completely instead. We can remove the racy is_mem_section_removable()
> >>> implementation next.
> >>>
> >>> We now take more locks (e.g., memory hotplug lock when offlining and the
> >>> zone lock when isolating), but maybe we should optimize that
> >>> implementation instead if this ever becomes a real problem (after all,
> >>> memory unplug is already an expensive operation). We started using
> >>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> >>> Implement memory hotplug remove in the kernel"), with the initial
> >>> hotremove support of lmbs.
> >>
> >> It's also not very pretty in dmesg.
> >>
> >> Before:
> >>
> >>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> >>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> >>   dlpar: Could not handle DLPAR request "memory add count 10"
> > 
> > Yeah, there is more output but isn't that useful? Or put it differently
> > what is the actual problem from having those messages in the kernel log?
> > 
> > From the below you can clearly tell that there are kernel allocations
> > which prevent hot remove from happening.
> > 
> > If the overall size of the debugging output is a concern then we can
> > think of a way to reduce it. E.g. once you have a couple of pages
> > reported then all others from the same block are likely not interesting
> > much.
> > 
> 
> IIRC, we only report one page per block already. (and stop, as we
> detected something unmovable)

You are right.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread David Hildenbrand
On 09.04.20 09:59, Michal Hocko wrote:
> On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
>> David Hildenbrand  writes:
>>
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>
>> It's also not very pretty in dmesg.
>>
>> Before:
>>
>>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>>   dlpar: Could not handle DLPAR request "memory add count 10"
> 
> Yeah, there is more output but isn't that useful? Or put it differently
> what is the actual problem from having those messages in the kernel log?
> 
> From the below you can clearly tell that there are kernel allocations
> which prevent hot remove from happening.
> 
> If the overall size of the debugging output is a concern then we can
> think of a way to reduce it. E.g. once you have a couple of pages
> reported then all others from the same block are likely not interesting
> much.
> 

IIRC, we only report one page per block already. (and stop, as we
detected something unmovable)

-- 
Thanks,

David / dhildenb



Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread Michal Hocko
On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> David Hildenbrand  writes:
> 
> > In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> > blocks as removable"), the user space interface to compute whether a memory
> > block can be offlined (exposed via
> > /sys/devices/system/memory/memoryX/removable) has effectively been
> > deprecated. We want to remove the leftovers of the kernel implementation.
> >
> > When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> > we'll start by:
> > 1. Testing if it contains any holes, and reject if so
> > 2. Testing if pages belong to different zones, and reject if so
> > 3. Isolating the page range, checking if it contains any unmovable pages
> >
> > Using is_mem_section_removable() before trying to offline is not only racy,
> > it can easily result in false positives/negatives. Let's stop manually
> > checking is_mem_section_removable(), and let device_offline() handle it
> > completely instead. We can remove the racy is_mem_section_removable()
> > implementation next.
> >
> > We now take more locks (e.g., memory hotplug lock when offlining and the
> > zone lock when isolating), but maybe we should optimize that
> > implementation instead if this ever becomes a real problem (after all,
> > memory unplug is already an expensive operation). We started using
> > is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> > Implement memory hotplug remove in the kernel"), with the initial
> > hotremove support of lmbs.
> 
> It's also not very pretty in dmesg.
> 
> Before:
> 
>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>   dlpar: Could not handle DLPAR request "memory add count 10"

Yeah, there is more output but isn't that useful? Or put it differently
what is the actual problem from having those messages in the kernel log?

>From the below you can clearly tell that there are kernel allocations
which prevent hot remove from happening.

If the overall size of the debugging output is a concern then we can
think of a way to reduce it. E.g. once you have a couple of pages
reported then all others from the same block are likely not interesting
much.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread David Hildenbrand
> It's also not very pretty in dmesg.
> 
> Before:
> 
>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>   dlpar: Could not handle DLPAR request "memory add count 10"
> 

Thanks for running it through the mill.

Here you test "hotadd", below you test "hot-remove". But yeah, there is
a change in the amount of dmesg.

> After:
> 
>   pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
>   page:c00c01ca8200 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc0072a080180
>   flags: 0x700200(slab)
>   raw: 00700200 c00c01cffd48 c00781c51410 c00793327580
>   raw: c0072a080180 01550001 0001 
>   page dumped because: unmovable page
>   page:c00c01cc4a80 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc0079b110080
>   flags: 0x70()
>   raw: 0070 5deadbeef100 5deadbeef122 
>   raw: c0079b110080  0001 
>   page dumped because: unmovable page
>   page:c00c01d08200 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc0074208ff00
>   flags: 0x700200(slab)
>   raw: 00700200 c00781c5f150 c00c01d37f88 c00798a24880
>   raw: c0074208ff00 01550002 0001 
>   page dumped because: unmovable page
>   page:c00c01d40140 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc00750059c00
>   flags: 0x700200(slab)
>   raw: 00700200 c00c01dfcfc8 c00c01d3c288 c007851c2d00
>   raw: c00750059c00 0103 0001 
>   page dumped because: unmovable page
>   page:c00c01d9c000 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc2370080
>   flags: 0x70()
>   raw: 0070 5deadbeef100 5deadbeef122 
>   raw: c2370080  0001 
>   page dumped because: unmovable page
>   page:c00c01dc0200 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc2370080
>   flags: 0x70()
>   raw: 0070 5deadbeef100 5deadbeef122 
>   raw: c2370080  0001 
>   page dumped because: unmovable page
>   page:c00c01e0 refcount:1 mapcount:0 mapping:18c4a547 
> index:0x0
>   flags: 0x700200(slab)
>   raw: 00700200 5deadbeef100 5deadbeef122 c007a801f500
>   raw:  08000800 0001 
>   page dumped because: unmovable page
>   page:c00c01e40440 refcount:1 mapcount:0 mapping:18c4a547 
> index:0x0
>   flags: 0x700200(slab)
>   raw: 00700200 5deadbeef100 5deadbeef122 c007a801e380
>   raw:  00400040 0001 
>   page dumped because: unmovable page
>   page:c00c01e8 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc007a640
>   flags: 0x700200(slab)
>   raw: 00700200 c00c01e5af48 c00c01e80408 c00f42d00a00
>   raw: c007a640 066600a2 0001 
>   page dumped because: unmovable page
>   page:c00c03c89d40 refcount:2 mapcount:1 mapping:18c4a547 
> index:0x10a41
>   anon flags: 0x1780024(uptodate|active|swapbacked)
>   raw: 01780024 5deadbeef100 5deadbeef122 c007986b03c9
>   raw: 00010a41  0002 c340b000
>   page dumped because: unmovable page
>   page->mem_cgroup:c340b000
>   page:c00c03cc refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc00f3000fd38
>   flags: 0x1700200(slab)
>   raw: 01700200 c00f3c911890 c00f3c911890 c0079fffd980
>   raw: c00f3000fd38 0073 0001 
>   page dumped because: unmovable page
>   page:c00c03d0 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc007a2ec
>   flags: 0x170()
>   raw: 0170 5deadbeef100 5deadbeef122 
>   raw: c007a2ec  0001 
>   page dumped because: unmovable page
>   page:c00c03e2c000 refcount:1 mapcount:0 mapping:18c4a547 
> index:0xc00f8b008400
>   flags: 0x2700200(slab)
>   raw: 02700200 c00f8e000190 c00f8e000190 c007a801e380
>   raw: c00f8b008400 00400038 0001 
>   page dumped because: unmovable page
>   page:c00c03fec000 refcount:1 mapcount:0 mapping:18c4a547 
> index:0x0
>   flags: 0x370()
>   raw: 0370 5deadbeef100 5deadbeef122 
>   raw:   

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread David Hildenbrand
On 09.04.20 04:59, piliu wrote:
> 
> 
> On 04/08/2020 10:46 AM, Baoquan He wrote:
>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>
>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>
>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>> give comments if any concern, or offer ack if it's OK to you.
>>
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>>
>>> Cc: Nathan Fontenot 
>>> Cc: Michael Ellerman 
>>> Cc: Benjamin Herrenschmidt 
>>> Cc: Paul Mackerras 
>>> Cc: Michal Hocko 
>>> Cc: Andrew Morton 
>>> Cc: Oscar Salvador 
>>> Cc: Baoquan He 
>>> Cc: Wei Yang 
>>> Signed-off-by: David Hildenbrand 
>>> ---
>>>  .../platforms/pseries/hotplug-memory.c| 26 +++
>>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
>>> b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index b2cde1732301..5ace2f9a277e 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node 
>>> *np)
>>>  
>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>  {
>>> -   int i, scns_per_block;
>>> -   bool rc = true;
>>> -   unsigned long pfn, block_sz;
>>> -   u64 phys_addr;
>>> -
>>> if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>> return false;
>>>  
>>> -   block_sz = memory_block_size_bytes();
>>> -   scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>> -   phys_addr = lmb->base_addr;
>>> -
>>>  #ifdef CONFIG_FA_DUMP
>>> /*
>>>  * Don't hot-remove memory that falls in fadump boot memory area
>>>  * and memory that is reserved for capturing old kernel memory.
>>>  */
>>> -   if (is_fadump_memory_area(phys_addr, block_sz))
>>> +   if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>> return false;
>>>  #endif
>>> -
>>> -   for (i = 0; i < scns_per_block; i++) {
>>> -   pfn = PFN_DOWN(phys_addr);
>>> -   if (!pfn_in_present_section(pfn)) {
>>> -   phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -   continue;
>>> -   }
>>> -
>>> -   rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>> -   phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -   }
>>> -
>>> -   return rc;
>>> +   /* device_offline() will determine if we can actually remove this lmb */
>>> +   return true;
> So I think here swaps the check and do sequence. At least it breaks
> dlpar_memory_remove_by_count(). It is doable to remove
> is_mem_section_removable(), but here should be more effort to re-arrange
> the code.
> 

Thanks Pingfan,

1. "swaps the check and do sequence":

Partially. Any caller of dlpar_remove_lmb() already has to deal with
false positives. device_offline() can easily fail after
dlpar_remove_lmb() == true. It's inherently racy.

2. "breaks dlpar_memory_remove_by_count()"

Can you elaborate why it "breaks" it? It will simply try to
offline+remove lmbs, detect that it wasn't able to offline+remove as
much as it wanted (which could happen before as well easily), and re-add
the already offlined+removed ones.

3. "more effort to re-arrange the code"

What would be your suggestion?

We would rip out that racy check if we can remove as much memory as
requested in dlpar_memory_remove_by_count() and simply always try to
remove + recover.


-- 
Thanks,

David / dhildenb



Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread Michael Ellerman
David Hildenbrand  writes:

> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
>
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
>
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
>
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

It's also not very pretty in dmesg.

Before:

  pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
  pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
  dlpar: Could not handle DLPAR request "memory add count 10"

After:

  pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
  page:c00c01ca8200 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc0072a080180
  flags: 0x700200(slab)
  raw: 00700200 c00c01cffd48 c00781c51410 c00793327580
  raw: c0072a080180 01550001 0001 
  page dumped because: unmovable page
  page:c00c01cc4a80 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc0079b110080
  flags: 0x70()
  raw: 0070 5deadbeef100 5deadbeef122 
  raw: c0079b110080  0001 
  page dumped because: unmovable page
  page:c00c01d08200 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc0074208ff00
  flags: 0x700200(slab)
  raw: 00700200 c00781c5f150 c00c01d37f88 c00798a24880
  raw: c0074208ff00 01550002 0001 
  page dumped because: unmovable page
  page:c00c01d40140 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc00750059c00
  flags: 0x700200(slab)
  raw: 00700200 c00c01dfcfc8 c00c01d3c288 c007851c2d00
  raw: c00750059c00 0103 0001 
  page dumped because: unmovable page
  page:c00c01d9c000 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc2370080
  flags: 0x70()
  raw: 0070 5deadbeef100 5deadbeef122 
  raw: c2370080  0001 
  page dumped because: unmovable page
  page:c00c01dc0200 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc2370080
  flags: 0x70()
  raw: 0070 5deadbeef100 5deadbeef122 
  raw: c2370080  0001 
  page dumped because: unmovable page
  page:c00c01e0 refcount:1 mapcount:0 mapping:18c4a547 index:0x0
  flags: 0x700200(slab)
  raw: 00700200 5deadbeef100 5deadbeef122 c007a801f500
  raw:  08000800 0001 
  page dumped because: unmovable page
  page:c00c01e40440 refcount:1 mapcount:0 mapping:18c4a547 index:0x0
  flags: 0x700200(slab)
  raw: 00700200 5deadbeef100 5deadbeef122 c007a801e380
  raw:  00400040 0001 
  page dumped because: unmovable page
  page:c00c01e8 refcount:1 mapcount:0 mapping:18c4a547 
index:0xc007a640
  flags: 0x700200(slab)
  raw: 00700200 c00c01e5af48 c00c01e80408 c00f42d00a00
  raw: c007a640 066600a2 0001 
  page dumped because: unmovable page
  page:c00c03c89d40 refcount:2 mapcount:1 mapping:18c4a547 
index:0x10a41
  anon flags: 0x1780024(uptodate|active|swapbacked)
  raw: 01780024 5deadbeef100 5deadbeef122 c007986b03c9
  raw: 00010a41  0002 c340b000
  page dumped because: unmovable page
  page->mem_cgroup:c340b000
  page:c00c03cc refcount:1 mapcount:0 mapping:18c4a547 
index:0xc00f3000fd38
  

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-08 Thread piliu



On 04/08/2020 10:46 AM, Baoquan He wrote:
> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
> 
> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>> blocks as removable"), the user space interface to compute whether a memory
>> block can be offlined (exposed via
>> /sys/devices/system/memory/memoryX/removable) has effectively been
>> deprecated. We want to remove the leftovers of the kernel implementation.
> 
> Pingfan, can you have a look at this change on PPC?  Please feel free to
> give comments if any concern, or offer ack if it's OK to you.
> 
>>
>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>> we'll start by:
>> 1. Testing if it contains any holes, and reject if so
>> 2. Testing if pages belong to different zones, and reject if so
>> 3. Isolating the page range, checking if it contains any unmovable pages
>>
>> Using is_mem_section_removable() before trying to offline is not only racy,
>> it can easily result in false positives/negatives. Let's stop manually
>> checking is_mem_section_removable(), and let device_offline() handle it
>> completely instead. We can remove the racy is_mem_section_removable()
>> implementation next.
>>
>> We now take more locks (e.g., memory hotplug lock when offlining and the
>> zone lock when isolating), but maybe we should optimize that
>> implementation instead if this ever becomes a real problem (after all,
>> memory unplug is already an expensive operation). We started using
>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>> Implement memory hotplug remove in the kernel"), with the initial
>> hotremove support of lmbs.
>>
>> Cc: Nathan Fontenot 
>> Cc: Michael Ellerman 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Cc: Michal Hocko 
>> Cc: Andrew Morton 
>> Cc: Oscar Salvador 
>> Cc: Baoquan He 
>> Cc: Wei Yang 
>> Signed-off-by: David Hildenbrand 
>> ---
>>  .../platforms/pseries/hotplug-memory.c| 26 +++
>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
>> b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index b2cde1732301..5ace2f9a277e 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node 
>> *np)
>>  
>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>  {
>> -int i, scns_per_block;
>> -bool rc = true;
>> -unsigned long pfn, block_sz;
>> -u64 phys_addr;
>> -
>>  if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>  return false;
>>  
>> -block_sz = memory_block_size_bytes();
>> -scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>> -phys_addr = lmb->base_addr;
>> -
>>  #ifdef CONFIG_FA_DUMP
>>  /*
>>   * Don't hot-remove memory that falls in fadump boot memory area
>>   * and memory that is reserved for capturing old kernel memory.
>>   */
>> -if (is_fadump_memory_area(phys_addr, block_sz))
>> +if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>  return false;
>>  #endif
>> -
>> -for (i = 0; i < scns_per_block; i++) {
>> -pfn = PFN_DOWN(phys_addr);
>> -if (!pfn_in_present_section(pfn)) {
>> -phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> -continue;
>> -}
>> -
>> -rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>> -phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> -}
>> -
>> -return rc;
>> +/* device_offline() will determine if we can actually remove this lmb */
>> +return true;
So I think here swaps the check and do sequence. At least it breaks
dlpar_memory_remove_by_count(). It is doable to remove
is_mem_section_removable(), but here should be more effort to re-arrange
the code.

Thanks,
Pingfan
>>  }
>>  
>>  static int dlpar_add_lmb(struct drmem_lmb *);
>> -- 
>> 2.25.1
>>



Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-07 Thread Baoquan He
Add Pingfan to CC since he usually handles ppc related bugs for RHEL.

On 04/07/20 at 03:54pm, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.

Pingfan, can you have a look at this change on PPC?  Please feel free to
give comments if any concern, or offer ack if it's OK to you.

> 
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
> 
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
> 
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.
> 
> Cc: Nathan Fontenot 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michal Hocko 
> Cc: Andrew Morton 
> Cc: Oscar Salvador 
> Cc: Baoquan He 
> Cc: Wei Yang 
> Signed-off-by: David Hildenbrand 
> ---
>  .../platforms/pseries/hotplug-memory.c| 26 +++
>  1 file changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node 
> *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> - int i, scns_per_block;
> - bool rc = true;
> - unsigned long pfn, block_sz;
> - u64 phys_addr;
> -
>   if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>   return false;
>  
> - block_sz = memory_block_size_bytes();
> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> - phys_addr = lmb->base_addr;
> -
>  #ifdef CONFIG_FA_DUMP
>   /*
>* Don't hot-remove memory that falls in fadump boot memory area
>* and memory that is reserved for capturing old kernel memory.
>*/
> - if (is_fadump_memory_area(phys_addr, block_sz))
> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>   return false;
>  #endif
> -
> - for (i = 0; i < scns_per_block; i++) {
> - pfn = PFN_DOWN(phys_addr);
> - if (!pfn_in_present_section(pfn)) {
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - continue;
> - }
> -
> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - }
> -
> - return rc;
> + /* device_offline() will determine if we can actually remove this lmb */
> + return true;
>  }
>  
>  static int dlpar_add_lmb(struct drmem_lmb *);
> -- 
> 2.25.1
> 



Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-07 Thread Michal Hocko
On Tue 07-04-20 15:54:15, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
> 
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
> 
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
> 
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

I am not familiar with this code but it makes sense to make it sync with
the global behavior.

> Cc: Nathan Fontenot 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michal Hocko 
> Cc: Andrew Morton 
> Cc: Oscar Salvador 
> Cc: Baoquan He 
> Cc: Wei Yang 
> Signed-off-by: David Hildenbrand 
> ---
>  .../platforms/pseries/hotplug-memory.c| 26 +++
>  1 file changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node 
> *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> - int i, scns_per_block;
> - bool rc = true;
> - unsigned long pfn, block_sz;
> - u64 phys_addr;
> -
>   if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>   return false;
>  
> - block_sz = memory_block_size_bytes();
> - scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> - phys_addr = lmb->base_addr;
> -
>  #ifdef CONFIG_FA_DUMP
>   /*
>* Don't hot-remove memory that falls in fadump boot memory area
>* and memory that is reserved for capturing old kernel memory.
>*/
> - if (is_fadump_memory_area(phys_addr, block_sz))
> + if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>   return false;
>  #endif
> -
> - for (i = 0; i < scns_per_block; i++) {
> - pfn = PFN_DOWN(phys_addr);
> - if (!pfn_in_present_section(pfn)) {
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - continue;
> - }
> -
> - rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> - phys_addr += MIN_MEMORY_BLOCK_SIZE;
> - }
> -
> - return rc;
> + /* device_offline() will determine if we can actually remove this lmb */
> + return true;
>  }
>  
>  static int dlpar_add_lmb(struct drmem_lmb *);
> -- 
> 2.25.1

-- 
Michal Hocko
SUSE Labs