Re: [PATCHv4 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents

2020-08-04 Thread Pingfan Liu
On Tue, Aug 4, 2020 at 12:29 AM Laurent Dufour  wrote:
>
[...]
> >   lmb_set_nid(lmb);
> >   lmb->flags |= DRCONF_MEM_ASSIGNED;
> > + if (dt_update) {
> > + ret = drmem_update_dt();
> > + if (ret)
> > + pr_warn("%s fail to update dt, but continue\n", 
> > __func__);
> > + }
> >
> >   block_sz = memory_block_size_bytes();
>
> In the case the call to __add_memory is failing, the flag DRCONF_MEM_ASSIGNED
> should be cleared as I mentioned in your previous patch. In addition to this,
Yes.
> the DT should be updated, or the caller should manage that but that will be 
> hard
> since it doesn't know where the error happened in this function.
Yeah, it is hard to manage it by caller, so just updating dt  is a
easier method.
>
> >
> > @@ -625,7 +653,11 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> >   invalidate_lmb_associativity_index(lmb);
> >   lmb_clear_nid(lmb);
> >   lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> > -
> > + if (dt_update) {
> > + ret = drmem_update_dt();
> > + if (ret)
> > + pr_warn("%s fail to update dt during 
> > rollback, but continue\n", __func__);
> > + }
> >   __remove_memory(nid, base_addr, block_sz);
> >   }
> >
> > @@ -638,6 +670,7 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add)
> >   int lmbs_available = 0;
> >   int lmbs_added = 0;
> >   int rc;
> > + bool dt_update = false;
> >
> >   pr_info("Attempting to hot-add %d LMB(s)\n", lmbs_to_add);
> >
> > @@ -664,7 +697,7 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add)
> >   if (rc)
> >   continue;
> >
> > - rc = dlpar_add_lmb(lmb);
> > + rc = dlpar_add_lmb(lmb, dt_update);
> >   if (rc) {
> >   dlpar_release_drc(lmb->drc_index);
> >   continue;
> > @@ -678,16 +711,23 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add)
> >   lmbs_added++;
> >   if (lmbs_added == lmbs_to_add)
> >   break;
> > + else if (lmbs_added == lmbs_to_add - 1)
> > + dt_update = true;
>
> In the case the number of LMB to add is 1, dt_update is never set to true, and
> the device tree is never updated. You need to set dt_update to true earlier in
> the loop block.
Oh, I will fix it in V5
>
> >   }
> >
> >   if (lmbs_added != lmbs_to_add) {
> > + bool rollback_dt_update = false;
> > +
> >   pr_err("Memory hot-add failed, removing any added LMBs\n");
> >
> >   for_each_drmem_lmb(lmb) {
> >   if (!drmem_lmb_reserved(lmb))
> >   continue;
> >
> > - rc = dlpar_remove_lmb(lmb);
> > + if (--lmbs_added == 0 && dt_update)
> > + rollback_dt_update = true;
>
> That test may have to be review to deal with error during the last LMB 
> addition,
> the DT may have been updated before __add_memory() is failing in
> dlpar_add_lmb(). In that case, lmbs_added == 0 and that branch is not covered.
> That's not an issue if dlpar_add_lmb() is handling that case (see my comment 
> above).
I will fix it in next version.

Thanks for your review.

Regards,
Pingfan


Re: [PATCHv4 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents

2020-08-03 Thread Laurent Dufour

Le 30/07/2020 à 15:33, Pingfan Liu a écrit :

A bug is observed on pseries by taking the following steps on rhel:
-1. drmgr -c mem -r -q 5
-2. echo c > /proc/sysrq-trigger

And then, the failure looks like:
kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
  Checking for memory holes : [  0.0 %] /   
Checking for memory holes : [100.0 %] | 
  Excluding unnecessary pages   : [100.0 %] \   
Copying data  : [  0.3 %] - 
 eta: 38s[   44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 
access=0x8004 current=makedumpfile
[   44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 
psize 2 pte=0xc0005504
[   44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 
access=0x8004 current=makedumpfile
[   44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 
psize 2 pte=0xc0005504
[   44.337708] makedumpfile[469]: unhandled signal 7 at 7fffba40 nip 
7fffbbc4d7fc lr 00011356ca3c code 2
[   44.338548] Core dump to |/bin/false pipe failed
/lib/kdump-lib-initramfs.sh: line 98:   469 Bus error   
$CORE_COLLECTOR /proc/vmcore 
$_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
kdump: saving vmcore failed

* Root cause *
   After analyzing, it turns out that in the current implementation,
when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
the code __remove_memory() comes before drmem_update_dt().
So in kdump kernel, when read_from_oldmem() resorts to
pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
can be observed "Bus error"

 From a viewpoint of listener and publisher, the publisher notifies the
listener before data is ready.  This introduces a problem where udev
launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
updating. And in capture kernel, makedumpfile will access the memory based
on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.

* Fix *
This bug is introduced by commit 063b8b1251fd
("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
request"), which tried to combine all the dt updating into one.

To fix this issue, meanwhile not to introduce a quadratic runtime
complexity by the model:
   dlpar_memory_add_by_count
 for_each_drmem_lmb <--
   dlpar_add_lmb
 drmem_update_dt(_v1|_v2)
   for_each_drmem_lmb   <--
The dt should still be only updated once, and just before the last memory
online/offline event is ejected to user space. Achieve this by tracing the
num of lmb added or removed.

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Hari Bathini 
Cc: Nathan Lynch 
Cc: Nathan Fontenot 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
---
v3 -> v4: resolve a quadratic runtime complexity issue.
   This series is applied on next-test branch
  arch/powerpc/platforms/pseries/hotplug-memory.c | 88 ++---
  1 file changed, 66 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 1a3ac3b..e07d5b1 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -350,13 +350,13 @@ static bool lmb_is_removable(struct drmem_lmb *lmb)
return true;
  }
  
-static int dlpar_add_lmb(struct drmem_lmb *);

+static int dlpar_add_lmb(struct drmem_lmb *lmb, bool dt_update);
  
-static int dlpar_remove_lmb(struct drmem_lmb *lmb)

+static int dlpar_remove_lmb(struct drmem_lmb *lmb, bool dt_update)
  {
unsigned long block_sz;
phys_addr_t base_addr;
-   int rc, nid;
+   int rc, ret, nid;
  
  	if (!lmb_is_removable(lmb))

return -EINVAL;
@@ -372,6 +372,11 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
lmb->flags &= ~DRCONF_MEM_ASSIGNED;
+   if (dt_update) {
+   ret = drmem_update_dt();
+   if (ret)
+   pr_warn("%s fail to update dt, but continue\n", 
__func__);
+   }
  
  	__remove_memory(nid, base_addr, block_sz);
  
@@ -387,6 +392,7 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)

int lmbs_removed = 0;
int lmbs_available = 0;
int rc;
+   bool dt_update = false;
  
  	pr_info("Attempting to hot-remove %d LMB(s)\n", lmbs_to_remove);
  
@@ -409,7 +415,7 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)

}
  
  	for_each_drmem_lmb(lmb) {

-   rc = dlpar_remove_lmb(lmb);
+   

[PATCHv4 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents

2020-07-30 Thread Pingfan Liu
A bug is observed on pseries by taking the following steps on rhel:
-1. drmgr -c mem -r -q 5
-2. echo c > /proc/sysrq-trigger

And then, the failure looks like:
kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
 Checking for memory holes : [  0.0 %] /
   Checking for memory holes : [100.0 %] |  
 Excluding unnecessary pages   : [100.0 %] \
   Copying data  : [  0.3 %] -  
eta: 38s[   44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 
access=0x8004 current=makedumpfile
[   44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 
psize 2 pte=0xc0005504
[   44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba40 
access=0x8004 current=makedumpfile
[   44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 
psize 2 pte=0xc0005504
[   44.337708] makedumpfile[469]: unhandled signal 7 at 7fffba40 nip 
7fffbbc4d7fc lr 00011356ca3c code 2
[   44.338548] Core dump to |/bin/false pipe failed
/lib/kdump-lib-initramfs.sh: line 98:   469 Bus error   
$CORE_COLLECTOR /proc/vmcore 
$_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
kdump: saving vmcore failed

* Root cause *
  After analyzing, it turns out that in the current implementation,
when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
the code __remove_memory() comes before drmem_update_dt().
So in kdump kernel, when read_from_oldmem() resorts to
pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
can be observed "Bus error"

>From a viewpoint of listener and publisher, the publisher notifies the
listener before data is ready.  This introduces a problem where udev
launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
updating. And in capture kernel, makedumpfile will access the memory based
on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.

* Fix *
This bug is introduced by commit 063b8b1251fd
("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
request"), which tried to combine all the dt updating into one.

To fix this issue, meanwhile not to introduce a quadratic runtime
complexity by the model:
  dlpar_memory_add_by_count
for_each_drmem_lmb <--
  dlpar_add_lmb
drmem_update_dt(_v1|_v2)
  for_each_drmem_lmb   <--
The dt should still be only updated once, and just before the last memory
online/offline event is ejected to user space. Achieve this by tracing the
num of lmb added or removed.

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Hari Bathini 
Cc: Nathan Lynch 
Cc: Nathan Fontenot 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
---
v3 -> v4: resolve a quadratic runtime complexity issue.
  This series is applied on next-test branch
 arch/powerpc/platforms/pseries/hotplug-memory.c | 88 ++---
 1 file changed, 66 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 1a3ac3b..e07d5b1 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -350,13 +350,13 @@ static bool lmb_is_removable(struct drmem_lmb *lmb)
return true;
 }
 
-static int dlpar_add_lmb(struct drmem_lmb *);
+static int dlpar_add_lmb(struct drmem_lmb *lmb, bool dt_update);
 
-static int dlpar_remove_lmb(struct drmem_lmb *lmb)
+static int dlpar_remove_lmb(struct drmem_lmb *lmb, bool dt_update)
 {
unsigned long block_sz;
phys_addr_t base_addr;
-   int rc, nid;
+   int rc, ret, nid;
 
if (!lmb_is_removable(lmb))
return -EINVAL;
@@ -372,6 +372,11 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
lmb->flags &= ~DRCONF_MEM_ASSIGNED;
+   if (dt_update) {
+   ret = drmem_update_dt();
+   if (ret)
+   pr_warn("%s fail to update dt, but continue\n", 
__func__);
+   }
 
__remove_memory(nid, base_addr, block_sz);
 
@@ -387,6 +392,7 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)
int lmbs_removed = 0;
int lmbs_available = 0;
int rc;
+   bool dt_update = false;
 
pr_info("Attempting to hot-remove %d LMB(s)\n", lmbs_to_remove);
 
@@ -409,7 +415,7 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)
}
 
for_each_drmem_lmb(lmb) {
-   rc = dlpar_remove_lmb(lmb);
+   rc = dlpar_remove_lmb(lmb, dt_update);