Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-12 Thread David Gibson
On Wed, May 12, 2021 at 05:35:39PM -0300, Daniel Henrique Barboza wrote:
> 
> On 5/3/21 10:02 PM, David Gibson wrote:
> > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:
> > > dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
> > > by checking !DRCONF_MEM_RESERVED, and in the following loop before
> > > dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
> > > removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
> > > !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
> > > removed.  The function will end up not removing all 'lmbs_to_remove'
> > > LMBs while also not reporting any errors.
> > > 
> > > Comparing it to dlpar_memory_remove_by_count(), the validation is done
> > > via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
> > > constraints. No additional check is made afterwards, and
> > > DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
> > > function doesn't have the same 'check A for validation, then B for
> > > removal' issue as remove_by_ic(), but it's not checking if the LMB is
> > > reserved.
> > > 
> > > There is no reason for these functions to validate the same operation in
> > > two different manners.
> > 
> > Actually, I think there is: remove_by_ic() is handling a request to
> > remove a specific range of LMBs.  If any are reserved, they can't be
> > removed and so this needs to fail.  But if they are !ASSIGNED, that
> > essentially means they're *already* removed (or never added), so
> > "removing" them is, correctly, a no-op.
> > 
> > remove_by_count(), in contrast, is being asked to remove a fixed
> > number of LMBs from wherever they can be found, and for that it needs
> > to find LMBs that haven't already been removed.
> > 
> > Basically remove_by_ic() is an absolute request: "make this set of
> > LMBs be not-plugged", whereas remove_by_count() is a relative request
> > "make N less LMBs be plugged".
> > 
> > 
> > So I think remove_by_ic()s existing handling is correct.  I'm less
> > sure if remove_by_count() ignoring RESERVED is correct - I couldn't
> > quickly find under what circumstances RESERVED gets set.
> 
> RESERVED is never set by the kernel. It is written in the DT by the
> firmware/hypervisor and the kernel just checks its value. QEMU sets it in
> spapr_dt_dynamic_memory() with the following comment:
> 
> 
> /*
>  * LMB information for RMA, boot time RAM and gap b/n RAM and
>  * device memory region -- all these are marked as reserved
>  * and as having no valid DRC.
>  */
> dynamic_memory[0] = cpu_to_be32(addr >> 32);
> dynamic_memory[1] = cpu_to_be32(addr & 0x);
> dynamic_memory[2] = cpu_to_be32(0);
> dynamic_memory[3] = cpu_to_be32(0); /* reserved */
> dynamic_memory[4] = cpu_to_be32(-1);
> dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED |
> SPAPR_LMB_FLAGS_DRC_INVALID);
> 
> 
> The flag is formally described in LOPAR section 4.2.8, "Reserved Memory":
> 
> "Memory nodes marked with the special value of the “status” property of
> “reserved” is not to be used or altered by the base OS."
> 
> 
> This makes me confident that we should check DRCONF_MEM_RESERVED in
> remove_by_count() as well, since phyp needs do adhere to these semantics and
> shouldn't be able to remove a LMB marked as RESERVED.

Right.  I doubt it would have caused a problem in practice, because
I'm pretty sure we should never get an LMB which is RESERVED &&
ASSIGNED, but it's probably safer to make it explicit.


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-12 Thread Daniel Henrique Barboza



On 5/3/21 10:02 PM, David Gibson wrote:

On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:

dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
by checking !DRCONF_MEM_RESERVED, and in the following loop before
dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
!DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
removed.  The function will end up not removing all 'lmbs_to_remove'
LMBs while also not reporting any errors.

Comparing it to dlpar_memory_remove_by_count(), the validation is done
via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
constraints. No additional check is made afterwards, and
DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
function doesn't have the same 'check A for validation, then B for
removal' issue as remove_by_ic(), but it's not checking if the LMB is
reserved.

There is no reason for these functions to validate the same operation in
two different manners.


Actually, I think there is: remove_by_ic() is handling a request to
remove a specific range of LMBs.  If any are reserved, they can't be
removed and so this needs to fail.  But if they are !ASSIGNED, that
essentially means they're *already* removed (or never added), so
"removing" them is, correctly, a no-op.

remove_by_count(), in contrast, is being asked to remove a fixed
number of LMBs from wherever they can be found, and for that it needs
to find LMBs that haven't already been removed.

Basically remove_by_ic() is an absolute request: "make this set of
LMBs be not-plugged", whereas remove_by_count() is a relative request
"make N less LMBs be plugged".


So I think remove_by_ic()s existing handling is correct.  I'm less
sure if remove_by_count() ignoring RESERVED is correct - I couldn't
quickly find under what circumstances RESERVED gets set.


RESERVED is never set by the kernel. It is written in the DT by the
firmware/hypervisor and the kernel just checks its value. QEMU sets it in
spapr_dt_dynamic_memory() with the following comment:


/*
 * LMB information for RMA, boot time RAM and gap b/n RAM and
 * device memory region -- all these are marked as reserved
 * and as having no valid DRC.
 */
dynamic_memory[0] = cpu_to_be32(addr >> 32);
dynamic_memory[1] = cpu_to_be32(addr & 0x);
dynamic_memory[2] = cpu_to_be32(0);
dynamic_memory[3] = cpu_to_be32(0); /* reserved */
dynamic_memory[4] = cpu_to_be32(-1);
dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED |
SPAPR_LMB_FLAGS_DRC_INVALID);


The flag is formally described in LOPAR section 4.2.8, "Reserved Memory":

"Memory nodes marked with the special value of the “status” property of
“reserved” is not to be used or altered by the base OS."


This makes me confident that we should check DRCONF_MEM_RESERVED in
remove_by_count() as well, since phyp needs do adhere to these semantics and
shouldn't be able to remove a LMB marked as RESERVED.



Thanks,



Daniel








This patch addresses that by changing
lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
lmb is removable, making dlpar_memory_remove_by_count() take the
reservation state into account when counting the LMBs.
lmb_is_removable() is then used in the validation step of
dlpar_memory_remove_by_ic(), which is already checking for both states
but in different stages, to avoid counting a LMB that is not assigned as
eligible for removal. We can then skip the check before
dlpar_remove_lmb() since we're validating all LMBs beforehand.

Signed-off-by: Daniel Henrique Barboza 
---
  arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index bb98574a84a2..4e6d162c3f1a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np)
  
  static bool lmb_is_removable(struct drmem_lmb *lmb)

  {
-   if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
+   if ((lmb->flags & DRCONF_MEM_RESERVED) ||
+   !(lmb->flags & DRCONF_MEM_ASSIGNED))
return false;
  
  #ifdef CONFIG_FA_DUMP

@@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
u32 drc_index)
  
  	/* Validate that there are enough LMBs to satisfy the request */

for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
-   if (lmb->flags & DRCONF_MEM_RESERVED)
+   if (!lmb_is_removable(lmb))
break;
  
  		lmbs_available++;

@@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 

Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-09 Thread David Gibson
On Fri, May 07, 2021 at 01:36:06PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 5/3/21 10:02 PM, David Gibson wrote:
> > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:
> > > dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
> > > by checking !DRCONF_MEM_RESERVED, and in the following loop before
> > > dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
> > > removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
> > > !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
> > > removed.  The function will end up not removing all 'lmbs_to_remove'
> > > LMBs while also not reporting any errors.
> > > 
> > > Comparing it to dlpar_memory_remove_by_count(), the validation is done
> > > via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
> > > constraints. No additional check is made afterwards, and
> > > DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
> > > function doesn't have the same 'check A for validation, then B for
> > > removal' issue as remove_by_ic(), but it's not checking if the LMB is
> > > reserved.
> > > 
> > > There is no reason for these functions to validate the same operation in
> > > two different manners.
> > 
> > Actually, I think there is: remove_by_ic() is handling a request to
> > remove a specific range of LMBs.  If any are reserved, they can't be
> > removed and so this needs to fail.  But if they are !ASSIGNED, that
> > essentially means they're *already* removed (or never added), so
> > "removing" them is, correctly, a no-op.
> 
> I guess that makes sense. Although I am not aware of any situation, at least
> thinking about how QEMU adds/removes LMBs, where some LMBs would be removed
> 'ad-hoc' in the middle of a LMB range that maps to a QEMU DIMM, I can't say
> that this wouldn't never happen either.

Right.  I believe a user could explicitly offline LMBs in the middle
of a DIMM. There's not much reason to do so, but it's possible.  There
might also be situations involving memory errors where individual LMBs
could get offlined.

> It is sensible to make remove_by_ic()
> resilient to this situation.
> 
> I'll re-send this patch just with the remove_by_count() change.
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> > 
> > remove_by_count(), in contrast, is being asked to remove a fixed
> > number of LMBs from wherever they can be found, and for that it needs
> > to find LMBs that haven't already been removed.
> > 
> > Basically remove_by_ic() is an absolute request: "make this set of
> > LMBs be not-plugged", whereas remove_by_count() is a relative request
> > "make N less LMBs be plugged".
> > 
> > 
> > So I think remove_by_ic()s existing handling is correct.  I'm less
> > sure if remove_by_count() ignoring RESERVED is correct - I couldn't
> > quickly find under what circumstances RESERVED gets set.
> > 
> > 
> > > This patch addresses that by changing
> > > lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
> > > lmb is removable, making dlpar_memory_remove_by_count() take the
> > > reservation state into account when counting the LMBs.
> > > lmb_is_removable() is then used in the validation step of
> > > dlpar_memory_remove_by_ic(), which is already checking for both states
> > > but in different stages, to avoid counting a LMB that is not assigned as
> > > eligible for removal. We can then skip the check before
> > > dlpar_remove_lmb() since we're validating all LMBs beforehand.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza 
> > > ---
> > >   arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
> > >   1 file changed, 3 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> > > b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > > index bb98574a84a2..4e6d162c3f1a 100644
> > > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> > > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > > @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node 
> > > *np)
> > >   static bool lmb_is_removable(struct drmem_lmb *lmb)
> > >   {
> > > - if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> > > + if ((lmb->flags & DRCONF_MEM_RESERVED) ||
> > > + !(lmb->flags & DRCONF_MEM_ASSIGNED))
> > >   return false;
> > >   #ifdef CONFIG_FA_DUMP
> > > @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 
> > > lmbs_to_remove, u32 drc_index)
> > >   /* Validate that there are enough LMBs to satisfy the request */
> > >   for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> > > - if (lmb->flags & DRCONF_MEM_RESERVED)
> > > + if (!lmb_is_removable(lmb))
> > >   break;
> > >   lmbs_available++;
> > > @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 
> > > lmbs_to_remove, u32 drc_index)
> > >   return -EINVAL;
> > >   

Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-07 Thread Daniel Henrique Barboza




On 5/3/21 10:02 PM, David Gibson wrote:

On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:

dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
by checking !DRCONF_MEM_RESERVED, and in the following loop before
dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
!DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
removed.  The function will end up not removing all 'lmbs_to_remove'
LMBs while also not reporting any errors.

Comparing it to dlpar_memory_remove_by_count(), the validation is done
via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
constraints. No additional check is made afterwards, and
DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
function doesn't have the same 'check A for validation, then B for
removal' issue as remove_by_ic(), but it's not checking if the LMB is
reserved.

There is no reason for these functions to validate the same operation in
two different manners.


Actually, I think there is: remove_by_ic() is handling a request to
remove a specific range of LMBs.  If any are reserved, they can't be
removed and so this needs to fail.  But if they are !ASSIGNED, that
essentially means they're *already* removed (or never added), so
"removing" them is, correctly, a no-op.


I guess that makes sense. Although I am not aware of any situation, at least
thinking about how QEMU adds/removes LMBs, where some LMBs would be removed
'ad-hoc' in the middle of a LMB range that maps to a QEMU DIMM, I can't say
that this wouldn't never happen either. It is sensible to make remove_by_ic()
resilient to this situation.

I'll re-send this patch just with the remove_by_count() change.


Thanks,


Daniel



remove_by_count(), in contrast, is being asked to remove a fixed
number of LMBs from wherever they can be found, and for that it needs
to find LMBs that haven't already been removed.

Basically remove_by_ic() is an absolute request: "make this set of
LMBs be not-plugged", whereas remove_by_count() is a relative request
"make N less LMBs be plugged".


So I think remove_by_ic()s existing handling is correct.  I'm less
sure if remove_by_count() ignoring RESERVED is correct - I couldn't
quickly find under what circumstances RESERVED gets set.



This patch addresses that by changing
lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
lmb is removable, making dlpar_memory_remove_by_count() take the
reservation state into account when counting the LMBs.
lmb_is_removable() is then used in the validation step of
dlpar_memory_remove_by_ic(), which is already checking for both states
but in different stages, to avoid counting a LMB that is not assigned as
eligible for removal. We can then skip the check before
dlpar_remove_lmb() since we're validating all LMBs beforehand.

Signed-off-by: Daniel Henrique Barboza 
---
  arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index bb98574a84a2..4e6d162c3f1a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np)
  
  static bool lmb_is_removable(struct drmem_lmb *lmb)

  {
-   if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
+   if ((lmb->flags & DRCONF_MEM_RESERVED) ||
+   !(lmb->flags & DRCONF_MEM_ASSIGNED))
return false;
  
  #ifdef CONFIG_FA_DUMP

@@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
u32 drc_index)
  
  	/* Validate that there are enough LMBs to satisfy the request */

for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
-   if (lmb->flags & DRCONF_MEM_RESERVED)
+   if (!lmb_is_removable(lmb))
break;
  
  		lmbs_available++;

@@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
u32 drc_index)
return -EINVAL;
  
  	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {

-   if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
-   continue;
-
rc = dlpar_remove_lmb(lmb);
if (rc)
break;




Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-03 Thread David Gibson
On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:
> dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
> by checking !DRCONF_MEM_RESERVED, and in the following loop before
> dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
> removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
> !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
> removed.  The function will end up not removing all 'lmbs_to_remove'
> LMBs while also not reporting any errors.
> 
> Comparing it to dlpar_memory_remove_by_count(), the validation is done
> via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
> constraints. No additional check is made afterwards, and
> DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
> function doesn't have the same 'check A for validation, then B for
> removal' issue as remove_by_ic(), but it's not checking if the LMB is
> reserved.
> 
> There is no reason for these functions to validate the same operation in
> two different manners.

Actually, I think there is: remove_by_ic() is handling a request to
remove a specific range of LMBs.  If any are reserved, they can't be
removed and so this needs to fail.  But if they are !ASSIGNED, that
essentially means they're *already* removed (or never added), so
"removing" them is, correctly, a no-op.

remove_by_count(), in contrast, is being asked to remove a fixed
number of LMBs from wherever they can be found, and for that it needs
to find LMBs that haven't already been removed.

Basically remove_by_ic() is an absolute request: "make this set of
LMBs be not-plugged", whereas remove_by_count() is a relative request
"make N less LMBs be plugged".


So I think remove_by_ic()s existing handling is correct.  I'm less
sure if remove_by_count() ignoring RESERVED is correct - I couldn't
quickly find under what circumstances RESERVED gets set.


> This patch addresses that by changing
> lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
> lmb is removable, making dlpar_memory_remove_by_count() take the
> reservation state into account when counting the LMBs.
> lmb_is_removable() is then used in the validation step of
> dlpar_memory_remove_by_ic(), which is already checking for both states
> but in different stages, to avoid counting a LMB that is not assigned as
> eligible for removal. We can then skip the check before
> dlpar_remove_lmb() since we're validating all LMBs beforehand.
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index bb98574a84a2..4e6d162c3f1a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> + if ((lmb->flags & DRCONF_MEM_RESERVED) ||
> + !(lmb->flags & DRCONF_MEM_ASSIGNED))
>   return false;
>  
>  #ifdef CONFIG_FA_DUMP
> @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
> u32 drc_index)
>  
>   /* Validate that there are enough LMBs to satisfy the request */
>   for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> - if (lmb->flags & DRCONF_MEM_RESERVED)
> + if (!lmb_is_removable(lmb))
>   break;
>  
>   lmbs_available++;
> @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, 
> u32 drc_index)
>   return -EINVAL;
>  
>   for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> - continue;
> -
>   rc = dlpar_remove_lmb(lmb);
>   if (rc)
>   break;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature