[PATCH v4] powerpc: setup_64: set up PACA before parsing device tree

2020-03-15 Thread Daniel Axtens
Currently, we set up the PACA after parsing the device tree for CPU
features. Before that, r13 contains random data, which means there is
random data in r13 while we're running the generic dt parsing code.

This random data varies depending on whether we boot through a vmlinux or a
zImage: for the vmlinux case it's usually around zero, but for zImages we
see random values like 912a72603d420015.

This is poor practice, and can also lead to difficult-to-debug crashes. For
example, when kcov is enabled, the kcov instrumentation attempts to read
preempt_count out of the current task, which goes via the PACA. This then
crashes in the zImage case.
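
For context, here is a minimal standalone sketch of the dependency chain that
kcov's instrumentation relies on. The struct and field names below are
simplified stand-ins, not the kernel's real definitions; only the shape of the
access chain (r13 -> PACA -> current task -> preempt_count) is the point:

#include <stdio.h>

struct thread_info { int preempt_count; };
struct task_struct { struct thread_info thread_info; };
struct paca_struct { struct task_struct *current_task; };

/* Stand-in for register r13 (local_paca) in the real kernel. */
static struct paca_struct *local_paca;

static int preempt_count(void)
{
	/* If local_paca (r13) holds random data, this dereference faults,
	 * which is the zImage crash described above. */
	return local_paca->current_task->thread_info.preempt_count;
}

int main(void)
{
	struct task_struct task = { .thread_info = { .preempt_count = 1 } };
	struct paca_struct paca = { .current_task = &task };

	local_paca = &paca;	/* roughly what setup_paca() arranges via r13 */
	printf("preempt_count = %d\n", preempt_count());
	return 0;
}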

To resolve this:

 - move the PACA setup to before the cpu feature parsing.

 - because we no longer have access to cpu feature flags in PACA setup,
   change the HV feature test in the PACA setup path to consider the actual
   value of the MSR rather than the CPU feature.

Translations get switched on once we leave early_setup, so I think we'd
already catch any other cases where the PACA or task aren't set up.

Boot tested on a P9 guest and host.

Fixes: fb0b0a73b223 ("powerpc: Enable kcov")
Cc: Andrew Donnellan 
Cc: sta...@vger.kernel.org
Reviewed-by: Andrew Donnellan 
Suggested-by: Michael Ellerman 
Signed-off-by: Daniel Axtens 

---

Regarding moving the comment about printk()-safety:
I am about 75% sure that the thing that makes printk() safe is the PACA,
not the CPU features. That's what commit 24d9649574fb ("[POWERPC] Document
when printk is useable") seems to indicate, but as someone wise recently
told me, "bootstrapping is hard", so I may be totally wrong.

v4: Update commit message and clarify that the mfmsr() approach is not
 for general use. Thanks Nick Piggin.

v3: Update comment, thanks Christophe Leroy.
Remove a comment in dt_cpu_ftrs.c that is no longer accurate - thanks
  Andrew. I think we want to retain all the code still, but I'm open to
  being told otherwise.
---
 arch/powerpc/kernel/dt_cpu_ftrs.c |  1 -
 arch/powerpc/kernel/paca.c        | 10 +++++++---
 arch/powerpc/kernel/setup_64.c    | 20 +++++++++++++-------
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 182b4047c1ef..36bc0d5c4f3a 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -139,7 +139,6 @@ static void __init cpufeatures_setup_cpu(void)
/* Initialize the base environment -- clear FSCR/HFSCR.  */
hv_mode = !!(mfmsr() & MSR_HV);
if (hv_mode) {
-   /* CPU_FTR_HVMODE is used early in PACA setup */
cur_cpu_spec->cpu_features |= CPU_FTR_HVMODE;
mtspr(SPRN_HFSCR, 0);
}
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 949eceb254d8..0ee6308541b1 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -214,11 +214,15 @@ void setup_paca(struct paca_struct *new_paca)
/* On Book3E, initialize the TLB miss exception frames */
mtspr(SPRN_SPRG_TLB_EXFRAME, local_paca->extlb);
 #else
-   /* In HV mode, we setup both HPACA and PACA to avoid problems
+   /*
+* In HV mode, we setup both HPACA and PACA to avoid problems
 * if we do a GET_PACA() before the feature fixups have been
-* applied
+* applied.
+*
+* Normally you should test against CPU_FTR_HVMODE, but CPU features
+* are not yet set up when we first reach here.
 */
-   if (early_cpu_has_feature(CPU_FTR_HVMODE))
+   if (mfmsr() & MSR_HV)
mtspr(SPRN_SPRG_HPACA, local_paca);
 #endif
mtspr(SPRN_SPRG_PACA, local_paca);
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index e05e6dd67ae6..2259da8e8685 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -285,18 +285,28 @@ void __init early_setup(unsigned long dt_ptr)
 
/* -------- printk is _NOT_ safe to use here ! ------- */
 
-   /* Try new device tree based feature discovery ... */
-   if (!dt_cpu_ftrs_init(__va(dt_ptr)))
-   /* Otherwise use the old style CPU table */
-   identify_cpu(0, mfspr(SPRN_PVR));
+   /*
+* Assume we're on cpu 0 for now.
+*
+* We need to load a PACA very early if we are using kcov. kcov will
+* call in_task() in its instrumentation, which relies on the current
+* task from the PACA. dt_cpu_ftrs_init is coverage-enabled and also
+* calls into the coverage-enabled generic dt library.
+*
+* Set up a temporary paca. It is going to be replaced below.
+*/
 
-   /* Assume we're on cpu 0 for now. Don't write to the paca yet! */
initialise_paca(&boot_paca, 0);
setup_paca(&boot_paca);
fixup_boot_paca();
 
/* -------- printk is now safe to use ------ */
 
+   /* Try new device tree based feature discovery ... */
+   if (!dt_cpu_ftrs_init(__va(dt_ptr)))
+   /* Otherwise use the old style CPU table */
+   identify_cpu(0, mfspr(SPRN_PVR));

Re: [PATCHv3 2/2] pseries/scm: buffer pmem's bound addr in dt for kexec kernel

2020-03-15 Thread Aneesh Kumar K.V

On 3/4/20 2:17 PM, Pingfan Liu wrote:

At present, plpar_hcall(H_SCM_BIND_MEM, ...) takes a very long time, so
dumping to fsdax also takes a very long time.

that should be fixed by

faa6d21153fd11e139dd880044521389b34a24f2
Author:   Aneesh Kumar K.V 
AuthorDate:   Tue Sep 3 18:04:52 2019 +0530
Commit:   Michael Ellerman 
CommitDate:   Wed Sep 25 08:32:59 2019 +1000

powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error

Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error.
This really slows down operations like kexec where we get the H_OVERLAP
error because we don't go through a full hypervisor re init.

H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at
drc index is already bound. Since we don't specify a logical memory
address for bind hcall, we can use the H_SCM_QUERY hcall to query
the already bound logical address.
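
For illustration, a minimal standalone sketch of the bind path that commit
introduces, with stubbed hcall wrappers and made-up return values (H_OVERLAP's
value and the queried address are placeholders, not PAPR's or the driver's
actual code); only the control flow mirrors the fix:

#include <stdio.h>

#define H_SUCCESS 0
#define H_OVERLAP (-68)			/* illustrative value only */

struct papr_scm_priv { unsigned long long bound_addr; };

/* Stand-ins for the driver's hcall wrappers. */
static long drc_pmem_bind(struct papr_scm_priv *p)
{
	(void)p;
	return H_OVERLAP;		/* pretend the region is already bound (kexec case) */
}

static long drc_pmem_query_n_bind(struct papr_scm_priv *p)
{
	p->bound_addr = 0x100000000ULL;	/* pretend H_SCM_QUERY returned this */
	return H_SUCCESS;
}

int main(void)
{
	struct papr_scm_priv p = { 0 };
	long rc = drc_pmem_bind(&p);

	if (rc == H_OVERLAP)		/* already bound: query instead of unbind + rebind */
		rc = drc_pmem_query_n_bind(&p);

	if (rc != H_SUCCESS)
		return 1;

	printf("bound_addr = 0x%llx\n", p.bound_addr);
	return 0;
}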

Taking a closer look: during the papr_scm initialization, the only
configuration step is drc_pmem_bind() -> plpar_hcall(H_SCM_BIND_MEM,
...), which sets up the bound address.

On pseries, for a kexec -l/-p kernel, there is no hardware reset, so this
step can be skipped to save time. The pmem bound address can therefore be
passed to the 2nd kernel through a dynamically added property "bound-addr"
in the dt node 'ibm,pmemory'.

-aneesh



Re: [PATCHv3 2/2] pseries/scm: buffer pmem's bound addr in dt for kexec kernel

2020-03-15 Thread Pingfan Liu
Thanks for your kind review. I have some comments below.

On Fri, Mar 13, 2020 at 11:18 AM Oliver O'Halloran  wrote:
>
> On Wed, Mar 4, 2020 at 7:50 PM Pingfan Liu  wrote:
> >
> > At present, plpar_hcall(H_SCM_BIND_MEM, ...) takes a very long time, so
> > dumping to fsdax also takes a very long time.
> >
> > Taking a closer look: during the papr_scm initialization, the only
> > configuration step is drc_pmem_bind() -> plpar_hcall(H_SCM_BIND_MEM,
> > ...), which sets up the bound address.
> >
> > On pseries, for a kexec -l/-p kernel, there is no hardware reset, so this
> > step can be skipped to save time. The pmem bound address can therefore be
> > passed to the 2nd kernel through a dynamically added property "bound-addr"
> > in the dt node 'ibm,pmemory'.
> >
> > Signed-off-by: Pingfan Liu 
> > To: linuxppc-dev@lists.ozlabs.org
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: Michael Ellerman 
> > Cc: Hari Bathini 
> > Cc: Aneesh Kumar K.V 
> > Cc: Oliver O'Halloran 
> > Cc: Dan Williams 
> > Cc: Andrew Donnellan 
> > Cc: Christophe Leroy 
> > Cc: Rob Herring 
> > Cc: Frank Rowand 
> > Cc: ke...@lists.infradead.org
> > ---
> > note: This patch has not been tested since I cannot get such a pseries
> > with pmem. Please kindly give some suggestions, thanks.
>
> There were some qemu patches to implement the Hcall interface floating
> around a while ago. I'm not sure they ever made it into upstream qemu
> though.
Unfortunately, they do not appear in the latest qemu code. I think
virt-pmem probably provides the same functionality.
>
> > ---
> >  arch/powerpc/platforms/pseries/of_helpers.c |  1 +
> >  arch/powerpc/platforms/pseries/papr_scm.c   | 33 +++++++++++++++++++++++----------
> >  drivers/of/base.c                           |  1 +
> >  3 files changed, 25 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/pseries/of_helpers.c b/arch/powerpc/platforms/pseries/of_helpers.c
> > index 1022e0f..2c7bab4 100644
> > --- a/arch/powerpc/platforms/pseries/of_helpers.c
> > +++ b/arch/powerpc/platforms/pseries/of_helpers.c
> > @@ -34,6 +34,7 @@ struct property *new_property(const char *name, const int length,
> > kfree(new);
> > return NULL;
> >  }
> > +EXPORT_SYMBOL(new_property);
> >
> >  /**
> >   * pseries_of_derive_parent - basically like dirname(1)
> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> > index 0b4467e..54ae903 100644
> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >
> >  #include 
> > +#include "of_helpers.h"
> >
> >  #define BIND_ANY_ADDR (~0ul)
> >
> > @@ -383,7 +384,7 @@ static int papr_scm_probe(struct platform_device *pdev)
> >  {
> > struct device_node *dn = pdev->dev.of_node;
> > u32 drc_index, metadata_size;
> > -   u64 blocks, block_size;
> > +   u64 blocks, block_size, bound_addr = 0;
> > struct papr_scm_priv *p;
> > const char *uuid_str;
> > u64 uuid[2];
> > @@ -440,17 +441,29 @@ static int papr_scm_probe(struct platform_device *pdev)
> > p->metadata_size = metadata_size;
> > p->pdev = pdev;
> >
> > -   /* request the hypervisor to bind this region to somewhere in memory */
> > -   rc = drc_pmem_bind(p);
> > +   of_property_read_u64(dn, "bound-addr", &bound_addr);
> > +   if (bound_addr) {
> > +   p->bound_addr = bound_addr;
> > +   } else {
> > +   struct property *property;
> > +   u64 big;
> >
> > -   /* If phyp says drc memory still bound then force unbound and retry */
> > -   if (rc == H_OVERLAP)
> > -   rc = drc_pmem_query_n_bind(p);
> > +   /* request the hypervisor to bind this region to somewhere in memory */
> > +   rc = drc_pmem_bind(p);
> >
> > -   if (rc != H_SUCCESS) {
> > -   dev_err(&p->pdev->dev, "bind err: %d\n", rc);
> > -   rc = -ENXIO;
> > -   goto err;
> > +   /* If phyp says drc memory still bound then force unbound and retry */
> > +   if (rc == H_OVERLAP)
> > +   rc = drc_pmem_query_n_bind(p);
> > +
> > +   if (rc != H_SUCCESS) {
> > +   dev_err(&p->pdev->dev, "bind err: %d\n", rc);
> > +   rc = -ENXIO;
> > +   goto err;
> > +   }
> > +   big = cpu_to_be64(p->bound_addr);
> > +   property = new_property("bound-addr", sizeof(u64),
> > +   (const unsigned char *)&big, NULL);
>
> That should probably be "linux,bound-addr"
OK, thanks for the suggestion.
>
> The other thing that stands out to me is that you aren't removing the
> property when the region is unbound. As a general rule I'd prefer we
> didn't 
Yes, you are right. I will fix it in V2.

[PATCH AUTOSEL 4.4 2/7] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 876ac9d52afcb..9b1e297be6730 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -255,6 +255,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
*(.opd)
}
-- 
2.20.1



[PATCH AUTOSEL 4.9 2/7] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 50d3650608558..c20510497c49d 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -315,6 +315,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
*(.opd)
}
-- 
2.20.1



[PATCH AUTOSEL 4.14 03/15] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index b0cf4af7ba840..e4da937d6cf91 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -317,6 +317,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
*(.opd)
}
-- 
2.20.1



[PATCH AUTOSEL 4.19 04/20] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index fd35eddf32669..d081d726ca8ea 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -322,6 +322,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
__start_opd = .;
KEEP(*(.opd))
-- 
2.20.1



[PATCH AUTOSEL 5.4 10/35] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 060a1acd7c6d7..4638d28633888 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -326,6 +326,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
__start_opd = .;
KEEP(*(.opd))
-- 
2.20.1



[PATCH AUTOSEL 5.5 11/41] powerpc: Include .BTF section

2020-03-15 Thread Sasha Levin
From: "Naveen N. Rao" 

[ Upstream commit cb0cc635c7a9fa8a3a0f75d4d896721819c63add ]

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in 
section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200220113132.857132-1-naveen.n@linux.vnet.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/vmlinux.lds.S | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 8834220036a51..857ab49750f18 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -303,6 +303,12 @@ SECTIONS
*(.branch_lt)
}
 
+#ifdef CONFIG_DEBUG_INFO_BTF
+   .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) {
+   *(.BTF)
+   }
+#endif
+
.opd : AT(ADDR(.opd) - LOAD_OFFSET) {
__start_opd = .;
KEEP(*(.opd))
-- 
2.20.1



[RFC PATCH v1] powerpc/XIVE: SVM: share the event-queue page with the Hypervisor.

2020-03-15 Thread Ram Pai
The XIVE interrupt controller maintains an Event Queue (EQ) page. This page
is used to communicate events with the Hypervisor/Qemu. In a Secure VM,
unless a page is shared with the Hypervisor, the Hypervisor will not be
able to read or write to that page.

Explicitly share the EQ page with the Hypervisor, and unshare it
during cleanup.  This enables SVM to use XIVE.

(NOTE: If the Hypervisor/Ultravisor is unable to target interrupts
 directly to Secure VM, use "kernel_irqchip=off" on the qemu command
 line).

Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Michael Ellerman 
Cc: Thiago Jung Bauermann 
Cc: Michael Anderson 
Cc: Sukadev Bhattiprolu 
Cc: Alexey Kardashevskiy 
Cc: Paul Mackerras 
Cc: Greg Kurz 
Cc: Cedric Le Goater 
Cc: David Gibson 
Signed-off-by: Ram Pai 
---
 arch/powerpc/sysdev/xive/spapr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 55dc61c..608b52f 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -26,6 +26,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "xive-internal.h"
 
@@ -501,6 +503,9 @@ static int xive_spapr_configure_queue(u32 target, struct xive_q *q, u8 prio,
rc = -EIO;
} else {
q->qpage = qpage;
+   if (is_secure_guest())
+   uv_share_page(PHYS_PFN(qpage_phys),
+   1 << xive_alloc_order(order));
}
 fail:
return rc;
@@ -534,6 +539,8 @@ static void xive_spapr_cleanup_queue(unsigned int cpu, struct xive_cpu *xc,
   hw_cpu, prio);
 
alloc_order = xive_alloc_order(xive_queue_shift);
+   if (is_secure_guest())
+   uv_unshare_page(PHYS_PFN(__pa(q->qpage)), 1 << alloc_order);
free_pages((unsigned long)q->qpage, alloc_order);
q->qpage = NULL;
 }
-- 
1.8.3.1



Re: [PATCH 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-03-15 Thread Christopher Lameter
On Wed, 11 Mar 2020, Srikar Dronamraju wrote:

> Currently Linux kernel with CONFIG_NUMA on a system with multiple
> possible nodes, marks node 0 as online at boot.  However in practice,
> there are systems which have node 0 as memoryless and cpuless.

Would it not be better and simpler to require that node 0 always has
memory (and processors)? A minimum operational set?

We can dynamically number the nodes, right? So just make sure that the
firmware properly creates memory on node 0?