Re: [PATCH v3 03/16] pseries/fadump: move out platform specific support from generic code

2019-07-02 Thread Oliver O'Halloran
On Wed, 2019-06-26 at 02:16 +0530, Hari Bathini wrote:
> Introduce callbacks for platform specific operations like register,
> unregister, invalidate & such, and move pseries specific code into
> platform code.

Please don't move around large blocks of code *and* change the code in
a single patch. It makes reviewing the changes extremely tedious since
the changes are mixed in with hundreds of lines of nothing.

> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/include/asm/fadump.h|   75 
>  arch/powerpc/kernel/fadump-common.h  |   38 ++
>  arch/powerpc/kernel/fadump.c |  500 ++---
>  arch/powerpc/platforms/pseries/Makefile  |1 
>  arch/powerpc/platforms/pseries/rtas-fadump.c |  529 ++
>  arch/powerpc/platforms/pseries/rtas-fadump.h |   96 +
>  6 files changed, 700 insertions(+), 539 deletions(-)
>  create mode 100644 arch/powerpc/platforms/pseries/rtas-fadump.c
>  create mode 100644 arch/powerpc/platforms/pseries/rtas-fadump.h
> 

> +static struct fadump_ops pseries_fadump_ops = {
> + .init_fadump_mem_struct = pseries_init_fadump_mem_struct,
> + .register_fadump= pseries_register_fadump,

I realise you are just translating the existing interface, but why is
init_fadump_mem_struct() done as a separate step and not as a part of
the registration function? The struct doesn't seem to be necessary
until the actual registration happens.

> + .unregister_fadump  = pseries_unregister_fadump,
> + .invalidate_fadump  = pseries_invalidate_fadump,
> + .process_fadump = pseries_process_fadump,
> + .fadump_region_show = pseries_fadump_region_show,

> + .crash_fadump   = pseries_crash_fadump,

Rename this to fadump_trigger or something, it's not clear what it
does.
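
To illustrate the init_fadump_mem_struct() question above, here is a minimal sketch of an ops table where the register callback fills in the memory structure itself, so generic code never needs a separate init hook. Every name here (demo_fadump_ops, demo_mem_struct, demo_register and so on) is hypothetical and only stands in for the real interface, which this sketch does not claim to reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the firmware-visible dump descriptor. */
struct demo_mem_struct {
	unsigned long base;
	int initialised;
};

/* Hypothetical ops table: note there is no separate init hook. */
struct demo_fadump_ops {
	int (*register_fadump)(struct demo_mem_struct *fdm, unsigned long base);
	int (*unregister_fadump)(struct demo_mem_struct *fdm);
};

/* The descriptor is filled in only when registration actually happens. */
static int demo_register(struct demo_mem_struct *fdm, unsigned long base)
{
	if (base == 0)
		return -1;	/* nothing reserved, nothing to register */
	fdm->base = base;
	fdm->initialised = 1;
	return 0;
}

static int demo_unregister(struct demo_mem_struct *fdm)
{
	if (!fdm->initialised)
		return -1;
	fdm->initialised = 0;
	return 0;
}

static const struct demo_fadump_ops demo_ops = {
	.register_fadump	= demo_register,
	.unregister_fadump	= demo_unregister,
};

/* Generic code only ever calls through the ops table. */
static int demo_generic_register(const struct demo_fadump_ops *ops,
				 struct demo_mem_struct *fdm,
				 unsigned long base)
{
	assert(ops != NULL && ops->register_fadump != NULL);
	return ops->register_fadump(fdm, base);
}
```

Whether this fits depends on whether anything reads the structure before registration, which is exactly the open question in the review comment.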





Re: [PATCH v3 01/16] powerpc/fadump: move internal fadump code to a new file

2019-07-02 Thread Oliver O'Halloran
On Wed, 2019-06-26 at 02:15 +0530, Hari Bathini wrote:
> Refactoring fadump code means internal fadump code is referenced from
> different places. For ease, move internal code to a new file.

Can you elaborate a bit? I don't really get what the difference between
fadump and fadump-internal code is supposed to be. Why can't all this
just live in fadump.c?




Re: [RFC PATCH v2 11/12] powerpc/ptrace: create ppc_gethwdinfo()

2019-07-02 Thread Ravi Bangoria



On 6/28/19 9:18 PM, Christophe Leroy wrote:
> Create ppc_gethwdinfo() to handle PPC_PTRACE_GETHWDBGINFO and
> reduce ifdef mess
> 
> Signed-off-by: Christophe Leroy 
> ---

Reviewed-by: Ravi Bangoria 



Re: [RFC PATCH v2 12/12] powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c

2019-07-02 Thread Ravi Bangoria



On 6/28/19 9:18 PM, Christophe Leroy wrote:
> ptrace_triggered() is declared in asm/hw_breakpoint.h and
> only needed when CONFIG_HW_BREAKPOINT is set, so move it
> into hw_breakpoint.c
> 
> Signed-off-by: Christophe Leroy 

Reviewed-by: Ravi Bangoria 



Re: [RFC PATCH v2 10/12] powerpc/ptrace: create ptrace_get_debugreg()

2019-07-02 Thread Ravi Bangoria



On 6/28/19 9:17 PM, Christophe Leroy wrote:
> Create ptrace_get_debugreg() to handle PTRACE_GET_DEBUGREG and
> reduce ifdef mess
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/ptrace/ptrace-adv.c   |  9 +
>  arch/powerpc/kernel/ptrace/ptrace-decl.h  |  2 ++
>  arch/powerpc/kernel/ptrace/ptrace-noadv.c | 13 +
>  arch/powerpc/kernel/ptrace/ptrace.c   | 18 ++
>  4 files changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-adv.c 
> b/arch/powerpc/kernel/ptrace/ptrace-adv.c
> index 86e71fa6c5c8..dcc765940344 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-adv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-adv.c
> @@ -83,6 +83,15 @@ void user_disable_single_step(struct task_struct *task)
>   clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>  }
>  
> +int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
> + unsigned long __user *datalp)
> +{
> + /* We only support one DABR and no IABRS at the moment */

There is no DABR / IABR in ptrace-adv.c, so this comment doesn't apply here.

> + if (addr > 0)
> + return -EINVAL;
> + return put_user(child->thread.debug.dac1, datalp);
> +}
> +
>  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, 
> unsigned long data)
>  {
>   /* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-decl.h 
> b/arch/powerpc/kernel/ptrace/ptrace-decl.h
> index bdba09a87aea..4b4b6a1d508a 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-decl.h
> +++ b/arch/powerpc/kernel/ptrace/ptrace-decl.h
> @@ -176,6 +176,8 @@ int tm_cgpr32_set(struct task_struct *target, const 
> struct user_regset *regset,
>  extern const struct user_regset_view user_ppc_native_view;
>  
>  /* ptrace-(no)adv */
> +int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
> + unsigned long __user *datalp);
>  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, 
> unsigned long data);
>  long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint 
> *bp_info);
>  long ppc_del_hwdebug(struct task_struct *child, long data);
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
> b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> index 7db330c94538..985cca136f85 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
> @@ -64,6 +64,19 @@ void user_disable_single_step(struct task_struct *task)
>   clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>  }
>  
> +int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
> + unsigned long __user *datalp)
> +{
> + unsigned long dabr_fake;
> +
> + /* We only support one DABR and no IABRS at the moment */
> + if (addr > 0)
> + return -EINVAL;
> + dabr_fake = ((child->thread.hw_brk.address & (~HW_BRK_TYPE_DABR)) |
> +  (child->thread.hw_brk.type & HW_BRK_TYPE_DABR));
> + return put_user(dabr_fake, datalp);
> +}
> +
>  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, 
> unsigned long data)
>  {
>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
> diff --git a/arch/powerpc/kernel/ptrace/ptrace.c 
> b/arch/powerpc/kernel/ptrace/ptrace.c
> index 377e0e541d5f..e789afae6f56 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace.c
> @@ -211,23 +211,9 @@ long arch_ptrace(struct task_struct *child, long request,
>   break;
>   }
>  
> - case PTRACE_GET_DEBUGREG: {
> -#ifndef CONFIG_PPC_ADV_DEBUG_REGS
> - unsigned long dabr_fake;
> -#endif
> - ret = -EINVAL;
> - /* We only support one DABR and no IABRS at the moment */
> - if (addr > 0)
> - break;
> -#ifdef CONFIG_PPC_ADV_DEBUG_REGS
> - ret = put_user(child->thread.debug.dac1, datalp);
> -#else
> - dabr_fake = ((child->thread.hw_brk.address & 
> (~HW_BRK_TYPE_DABR)) |
> -  (child->thread.hw_brk.type & HW_BRK_TYPE_DABR));
> - ret = put_user(dabr_fake, datalp);
> -#endif
> + case PTRACE_GET_DEBUGREG:
> + ret = ptrace_get_debugreg(child, addr, datalp);
>   break;
> - }
>  
>   case PTRACE_SET_DEBUGREG:
>   ret = ptrace_set_debugreg(child, addr, data);
> 

Otherwise,

Reviewed-by: Ravi Bangoria 



Re: [RFC PATCH v2 09/12] powerpc/ptrace: split out ADV_DEBUG_REGS related functions.

2019-07-02 Thread Ravi Bangoria



On 6/28/19 9:17 PM, Christophe Leroy wrote:
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-adv.c 
> b/arch/powerpc/kernel/ptrace/ptrace-adv.c
> new file mode 100644
> index ..86e71fa6c5c8
> --- /dev/null
> +++ b/arch/powerpc/kernel/ptrace/ptrace-adv.c
> @@ -0,0 +1,487 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +void user_enable_single_step(struct task_struct *task)
> +{
> + struct pt_regs *regs = task->thread.regs;
> +
> + if (regs != NULL) {
> + task->thread.debug.dbcr0 &= ~DBCR0_BT;
> + task->thread.debug.dbcr0 |= DBCR0_IDM | DBCR0_IC;
> + regs->msr |= MSR_DE;
> + }
> + set_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> +
> +void user_enable_block_step(struct task_struct *task)
> +{
> + struct pt_regs *regs = task->thread.regs;
> +
> + if (regs != NULL) {
> + task->thread.debug.dbcr0 &= ~DBCR0_IC;
> + task->thread.debug.dbcr0 = DBCR0_IDM | DBCR0_BT;
> + regs->msr |= MSR_DE;
> + }
> + set_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> +
> +void user_disable_single_step(struct task_struct *task)
> +{
> + struct pt_regs *regs = task->thread.regs;
> +
> + if (regs != NULL) {
> + /*
> +  * The logic to disable single stepping should be as
> +  * simple as turning off the Instruction Complete flag.
> +  * And, after doing so, if all debug flags are off, turn
> +  * off DBCR0(IDM) and MSR(DE)  Torez
> +  */
> + task->thread.debug.dbcr0 &= ~(DBCR0_IC|DBCR0_BT);
> + /*
> +  * Test to see if any of the DBCR_ACTIVE_EVENTS bits are set.
> +  */
> + if (!DBCR_ACTIVE_EVENTS(task->thread.debug.dbcr0,
> + task->thread.debug.dbcr1)) {
> + /*
> +  * All debug events were off.
> +  */
> + task->thread.debug.dbcr0 &= ~DBCR0_IDM;
> + regs->msr &= ~MSR_DE;
> + }
> + }
> + clear_tsk_thread_flag(task, TIF_SINGLESTEP);
> +}
> +
> +int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, 
> unsigned long data)
> +{
> + /* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
> +  *  For embedded processors we support one DAC and no IAC's at the
> +  *  moment.
> +  */

I guess mentioning DABR and IABR doesn't make sense in ptrace-adv.c?

> + if (addr > 0)
> + return -EINVAL;
> +
> + /* The bottom 3 bits in dabr are flags */

Same here.

> + if ((data & ~0x7UL) >= TASK_SIZE)
> + return -EIO;
> +
> + /* As described above, it was assumed 3 bits were passed with the data
> +  *  address, but we will assume only the mode bits will be passed
> +  *  as to not cause alignment restrictions for DAC-based processors.
> +  */
> +
> + /* DAC's hold the whole address without any mode flags */
> + task->thread.debug.dac1 = data & ~0x3UL;
> +
> + if (task->thread.debug.dac1 == 0) {
> + dbcr_dac(task) &= ~(DBCR_DAC1R | DBCR_DAC1W);
> + if (!DBCR_ACTIVE_EVENTS(task->thread.debug.dbcr0,
> + task->thread.debug.dbcr1)) {
> + task->thread.regs->msr &= ~MSR_DE;
> + task->thread.debug.dbcr0 &= ~DBCR0_IDM;
> + }
> + return 0;
> + }
> +
> + /* Read or Write bits must be set */
> +
> + if (!(data & 0x3UL))
> + return -EINVAL;
> +
> + /* Set the Internal Debugging flag (IDM bit 1) for the DBCR0
> +register */
> + task->thread.debug.dbcr0 |= DBCR0_IDM;
> +
> + /* Check for write and read flags and set DBCR0
> +accordingly */
> + dbcr_dac(task) &= ~(DBCR_DAC1R|DBCR_DAC1W);
> + if (data & 0x1UL)
> + dbcr_dac(task) |= DBCR_DAC1R;
> + if (data & 0x2UL)
> + dbcr_dac(task) |= DBCR_DAC1W;
> + task->thread.regs->msr |= MSR_DE;
> + return 0;
> +}
> +
> +static long set_instruction_bp(struct task_struct *child,
> +   struct ppc_hw_breakpoint *bp_info)
> +{
> + int slot;
> + int slot1_in_use = ((child->thread.debug.dbcr0 & DBCR0_IAC1) != 0);
> + int slot2_in_use = ((child->thread.debug.dbcr0 & DBCR0_IAC2) != 0);
> + int slot3_in_use = ((child->thread.debug.dbcr0 & DBCR0_IAC3) != 0);
> + int slot4_in_use = ((child->thread.debug.dbcr0 & DBCR0_IAC4) != 0);
> +
> + if (dbcr_iac_range(child) 

[PATCH AUTOSEL 5.1 09/39] selftests/powerpc: Add test of fork with mapping above 512TB

2019-07-02 Thread Sasha Levin
From: Michael Ellerman 

[ Upstream commit 16391bfc862342f285195013b73c1394fab28b97 ]

This tests that when a process with a mapping above 512TB forks we
correctly separate the parent and child address spaces. This exercises
the bug in the context id handling fixed in the previous commit.

Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
---
 tools/testing/selftests/powerpc/mm/.gitignore |  3 +-
 tools/testing/selftests/powerpc/mm/Makefile   |  4 +-
 .../powerpc/mm/large_vm_fork_separation.c | 87 +++
 3 files changed, 92 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/mm/large_vm_fork_separation.c

diff --git a/tools/testing/selftests/powerpc/mm/.gitignore 
b/tools/testing/selftests/powerpc/mm/.gitignore
index ba919308fe30..d503b8764a8e 100644
--- a/tools/testing/selftests/powerpc/mm/.gitignore
+++ b/tools/testing/selftests/powerpc/mm/.gitignore
@@ -3,4 +3,5 @@ subpage_prot
 tempfile
 prot_sao
 segv_errors
-wild_bctr
\ No newline at end of file
+wild_bctr
+large_vm_fork_separation
\ No newline at end of file
diff --git a/tools/testing/selftests/powerpc/mm/Makefile 
b/tools/testing/selftests/powerpc/mm/Makefile
index 43d68420e363..f1fbc15800c4 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -2,7 +2,8 @@
 noarg:
$(MAKE) -C ../
 
-TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors wild_bctr
+TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors wild_bctr \
+ large_vm_fork_separation
 TEST_GEN_FILES := tempfile
 
 top_srcdir = ../../../../..
@@ -13,6 +14,7 @@ $(TEST_GEN_PROGS): ../harness.c
 $(OUTPUT)/prot_sao: ../utils.c
 
 $(OUTPUT)/wild_bctr: CFLAGS += -m64
+$(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64
 
 $(OUTPUT)/tempfile:
dd if=/dev/zero of=$@ bs=64k count=1
diff --git a/tools/testing/selftests/powerpc/mm/large_vm_fork_separation.c 
b/tools/testing/selftests/powerpc/mm/large_vm_fork_separation.c
new file mode 100644
index ..2363a7f3ab0d
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/large_vm_fork_separation.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// Copyright 2019, Michael Ellerman, IBM Corp.
+//
+// Test that allocating memory beyond the memory limit and then forking is
+// handled correctly, ie. the child is able to access the mappings beyond the
+// memory limit and the child's writes are not visible to the parent.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "utils.h"
+
+
+#ifndef MAP_FIXED_NOREPLACE
+#define MAP_FIXED_NOREPLACE MAP_FIXED   // "Should be safe" above 512TB
+#endif
+
+
+static int test(void)
+{
+   int p2c[2], c2p[2], rc, status, c, *p;
+   unsigned long page_size;
+   pid_t pid;
+
+   page_size = sysconf(_SC_PAGESIZE);
+   SKIP_IF(page_size != 65536);
+
+   // Create a mapping at 512TB to allocate an extended_id
+   p = mmap((void *)(512ul << 40), page_size, PROT_READ | PROT_WRITE,
+   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
+   if (p == MAP_FAILED) {
+   perror("mmap");
+   printf("Error: couldn't mmap(), confirm kernel has 4TB support?\n");
+   return 1;
+   }
+
+   printf("parent writing %p = 1\n", p);
+   *p = 1;
+
+   FAIL_IF(pipe(p2c) == -1 || pipe(c2p) == -1);
+
+   pid = fork();
+   if (pid == 0) {
+   FAIL_IF(read(p2c[0], , 1) != 1);
+
+   pid = getpid();
+   printf("child writing  %p = %d\n", p, pid);
+   *p = pid;
+
+   FAIL_IF(write(c2p[1], , 1) != 1);
+   FAIL_IF(read(p2c[0], , 1) != 1);
+   exit(0);
+   }
+
+   c = 0;
+   FAIL_IF(write(p2c[1], , 1) != 1);
+   FAIL_IF(read(c2p[0], , 1) != 1);
+
+   // Prevent compiler optimisation
+   barrier();
+
+   rc = 0;
+   printf("parent reading %p = %d\n", p, *p);
+   if (*p != 1) {
+   printf("Error: BUG! parent saw child's write! *p = %d\n", *p);
+   rc = 1;
+   }
+
+   FAIL_IF(write(p2c[1], , 1) != 1);
+   FAIL_IF(waitpid(pid, , 0) == -1);
+   FAIL_IF(!WIFEXITED(status) || WEXITSTATUS(status));
+
+   if (rc == 0)
+   printf("success: test completed OK\n");
+
+   return rc;
+}
+
+int main(void)
+{
+   return test_harness(test, "large_vm_fork_separation");
+}
-- 
2.20.1



Re: [PATCH v2] powerpc/mm/nvdimm: Add an informative message if we fail to allocate altmap block

2019-07-02 Thread Oliver O'Halloran
On Tue, Jul 2, 2019 at 12:33 AM Aneesh Kumar K.V
 wrote:
>
> Allocation from altmap area can fail based on vmemmap page size used. Add
> kernel info message to indicate the failure. That allows the user to identify
> whether they are really using persistent memory reserved space for per-page
> metadata.
>
> The message looks like:
> [  136.587212] altmap block allocation failed, falling back to system memory
>
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/mm/init_64.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index a4e17a979e45..f3b64f49082b 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -194,8 +194,12 @@ int __meminit vmemmap_populate(unsigned long start, 
> unsigned long end, int node,
>  * fail due to alignment issues when using 16MB hugepages, so
>  * fall back to system memory if the altmap allocation fail.
>  */
> -   if (altmap)
> +   if (altmap) {
> p = altmap_alloc_block_buf(page_size, altmap);
> +   if (!p)
> +   pr_debug("altmap block allocation failed, " \
> +   "falling back to system memory");
> +   }
> if (!p)
> p = vmemmap_alloc_block_buf(page_size, node);
> if (!p)
> --
> 2.21.0
>

I'll let mpe decide if he cares about the split line thing :)

Reviewed-by: Oliver O'Halloran 


[PATCH 3/3] KVM: PPC: Book3S HV: Save and restore guest visible PSSCR bits on pseries

2019-07-02 Thread Suraj Jitindar Singh
The performance stop status and control register (PSSCR) is used to
control the power saving facilities of the processor. This register has
various fields, some of which can be modified only in hypervisor state,
and others which can be modified in both hypervisor and privileged
non-hypervisor state. The bits which can be modified in privileged
non-hypervisor state are referred to as guest visible.

Currently the L0 hypervisor saves and restores both its own host value
as well as the guest value of the psscr when context switching between
the hypervisor and guest. However a nested hypervisor running its own
nested guests (as indicated by kvmhv_on_pseries()) doesn't context
switch the psscr register. This means that if a nested (L2) guest
modifies the psscr, the L1 guest hypervisor will run with this
value, and if the L1 guest hypervisor modifies this value and then goes
to run the nested (L2) guest again, the L2 psscr value will be lost.

Fix this by having the (L1) nested hypervisor save and restore both its
host and the guest psscr value when entering and exiting a nested (L2)
guest. Note that only the guest visible parts of the psscr are context
switched since this is all the L1 nested hypervisor can access. This is
fine, however, as these are the only fields the L0 hypervisor provides
guest control of anyway and so all other fields are ignored.

This could also have been implemented by adding the psscr register to
the hv_regs passed to the L0 hypervisor as input to the H_ENTER_NESTED
hcall, however this would have meant updating the structure layout and
thus required modifications to both the L0 and L1 kernels. Whereas the
approach used doesn't require L0 kernel modifications while achieving
the same result.

Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on 
P9 for radix guests"

Signed-off-by: Suraj Jitindar Singh 
---
 arch/powerpc/kvm/book3s_hv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b682a429f3ef..cde3f5a4b3e4 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3569,9 +3569,18 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 
time_limit,
mtspr(SPRN_DEC, vcpu->arch.dec_expires - mftb());
 
if (kvmhv_on_pseries()) {
+   /*
+* We need to save and restore the guest visible part of the
+* psscr (i.e. using SPRN_PSSCR_PR) since the hypervisor
+* doesn't do this for us. Note only required if pseries since
+* this is done in kvmhv_load_hv_regs_and_go() below otherwise.
+*/
+   unsigned long host_psscr;
/* call our hypervisor to load up HV regs and go */
struct hv_guest_state hvregs;
 
+   host_psscr = mfspr(SPRN_PSSCR_PR);
+   mtspr(SPRN_PSSCR_PR, vcpu->arch.psscr);
kvmhv_save_hv_regs(vcpu, );
hvregs.lpcr = lpcr;
vcpu->arch.regs.msr = vcpu->arch.shregs.msr;
@@ -3590,6 +3599,8 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 
time_limit,
vcpu->arch.shregs.msr = vcpu->arch.regs.msr;
vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
+   vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
+   mtspr(SPRN_PSSCR_PR, host_psscr);
 
/* H_CEDE has to be handled now, not later */
if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
-- 
2.13.6



[PATCH 2/3] PPC: PMC: Set pmcregs_in_use in paca when running as LPAR

2019-07-02 Thread Suraj Jitindar Singh
The ability to run nested guests under KVM means that a guest can also
act as a hypervisor for its own nested guest. Currently
ppc_set_pmu_inuse() assumes that either FW_FEATURE_LPAR is set,
indicating a guest environment, and so sets the pmcregs_in_use flag in
the lppaca, or that it isn't set, indicating a hypervisor environment,
and so sets the pmcregs_in_use flag in the paca.

The pmcregs_in_use flag in the lppaca is used to communicate this
information to a hypervisor and so must be set in a guest environment.
The pmcregs_in_use flag in the paca is used by KVM code to determine
whether the host state of the performance monitoring unit (PMU) must be
saved and restored when running a guest.

Thus when a guest also acts as a hypervisor it must set this bit in both
places since it needs to ensure both that the real hypervisor saves its
pmu registers when it runs (requires pmcregs_in_use flag in lppaca), and
that it saves its own pmu registers when running a nested guest
(requires pmcregs_in_use flag in paca).

Modify ppc_set_pmu_inuse() so that the pmcregs_in_use bit is set in both
the lppaca and the paca when a guest (LPAR) is running with the
capability of running its own guests (CONFIG_KVM_BOOK3S_HV_POSSIBLE).

Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on 
P9 for radix guests"

Signed-off-by: Suraj Jitindar Singh 
---
 arch/powerpc/include/asm/pmc.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/pmc.h b/arch/powerpc/include/asm/pmc.h
index dc9a1ca70edf..c6bbe9778d3c 100644
--- a/arch/powerpc/include/asm/pmc.h
+++ b/arch/powerpc/include/asm/pmc.h
@@ -27,11 +27,10 @@ static inline void ppc_set_pmu_inuse(int inuse)
 #ifdef CONFIG_PPC_PSERIES
get_lppaca()->pmcregs_in_use = inuse;
 #endif
-   } else {
+   }
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
-   get_paca()->pmcregs_in_use = inuse;
+   get_paca()->pmcregs_in_use = inuse;
 #endif
-   }
 #endif
 }
 
-- 
2.13.6



[PATCH 1/3] KVM: PPC: Book3S HV: Always save guest pmu for guest capable of nesting

2019-07-02 Thread Suraj Jitindar Singh
The performance monitoring unit (PMU) registers are saved on guest exit
when the guest has set the pmcregs_in_use flag in its lppaca, if it
exists, or unconditionally if it doesn't. If a nested guest is being
run then the hypervisor doesn't, and in most cases can't, know if the
pmu registers are in use since it doesn't know the location of the lppaca
for the nested guest, although it may have one for its immediate guest.
This results in the values of these registers being lost across nested
guest entry and exit in the case where the nested guest was making use
of the performance monitoring facility while its nested guest hypervisor
wasn't.

Furthermore, the hypervisor could interrupt a guest hypervisor between
when it has loaded up the pmu registers and it calling H_ENTER_NESTED or
between returning from the nested guest to the guest hypervisor and the
guest hypervisor reading the pmu registers, in kvmhv_p9_guest_entry().
This means that it isn't sufficient to just save the pmu registers when
entering or exiting a nested guest, but that it is necessary to always
save the pmu registers whenever a guest is capable of running nested guests
to ensure the register values aren't lost in the context switch.

Ensure the pmu register values are preserved by always saving their
value into the vcpu struct when a guest is capable of running nested
guests.

This should have minimal performance impact; however, any impact can be
avoided by booting a guest with "-machine pseries,cap-nested-hv=false"
on the qemu command line.

Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on 
P9 for radix guests"

Signed-off-by: Suraj Jitindar Singh 
---
 arch/powerpc/kvm/book3s_hv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ec1804f822af..b682a429f3ef 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3654,6 +3654,8 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 
time_limit,
vcpu->arch.vpa.dirty = 1;
save_pmu = lp->pmcregs_in_use;
}
+   /* Must save pmu if this guest is capable of running nested guests */
+   save_pmu |= nesting_enabled(vcpu->kvm);
 
kvmhv_save_guest_pmu(vcpu, save_pmu);
 
-- 
2.13.6



Re: [PATCH] powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.

2019-07-02 Thread Nicholas Piggin
Madhavan Srinivasan's on July 2, 2019 8:58 pm:
> From: Athira Rajeev 
> 
> commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
> reimplemented the book3s idle code in platforms/powernv/idle.c, but in
> doing so missed the per-thread LDBAR update in the core_woken path of
> power9_idle_stop(). This patch fixes that.
> 
> Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
> Signed-off-by: Athira Rajeev 
> Signed-off-by: Madhavan Srinivasan 
> ---
>  arch/powerpc/platforms/powernv/idle.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/idle.c 
> b/arch/powerpc/platforms/powernv/idle.c
> index 2f4479b94ac3..fd14a6237954 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -758,7 +758,6 @@ static unsigned long power9_idle_stop(unsigned long 
> psscr, bool mmu_on)
>   mtspr(SPRN_PTCR,sprs.ptcr);
>   mtspr(SPRN_RPR, sprs.rpr);
>   mtspr(SPRN_TSCR,sprs.tscr);
> - mtspr(SPRN_LDBAR,   sprs.ldbar);
>  
>   if (pls >= pnv_first_tb_loss_level) {
>   /* TB loss */
> @@ -790,6 +789,7 @@ static unsigned long power9_idle_stop(unsigned long 
> psscr, bool mmu_on)
>   mtspr(SPRN_MMCR0,   sprs.mmcr0);
>   mtspr(SPRN_MMCR1,   sprs.mmcr1);
>   mtspr(SPRN_MMCR2,   sprs.mmcr2);
> + mtspr(SPRN_LDBAR,   sprs.ldbar);

Oh that's another one I messed up, thanks for the fix. I must have
confused myself with the SPR table in the UM :(

Reviewed-by: Nicholas Piggin 



Re: [PATCH V2] mm/ioremap: Probe platform for p4d huge map support

2019-07-02 Thread Andrew Morton
On Fri, 28 Jun 2019 10:50:31 +0530 Anshuman Khandual 
 wrote:

> Finishing up what the commit c2febafc67734a ("mm: convert generic code to
> 5-level paging") started out while levelling up P4D huge mapping support
> at par with PUD and PMD. A new arch call back arch_ioremap_p4d_supported()
> is being added which just maintains status quo (P4D huge map not supported)
> on x86, arm64 and powerpc.

Does this have any runtime effects?  If so, what are they and why?  If
not, what's the actual point?



Re: [PATCH net] net/ibmvnic: Report last valid speed and duplex values to ethtool

2019-07-02 Thread David Miller
From: Thomas Falcon 
Date: Thu, 27 Jun 2019 12:09:13 -0500

> This patch resolves an issue with sensitive bonding modes
> that require valid speed and duplex settings to function
> properly. Currently, the adapter will report that device
> speed and duplex is unknown if the communication link
> with firmware is unavailable. This decision can break LACP
> configurations if the timing is right.
> 
> For example, if invalid speeds are reported, the slave
> device's link state is set to a transitional "fail" state
> and the LACP port is disabled. However, if valid speeds
> are reported later but the link state has not been altered,
> the LACP port will remain disabled. If the link state then
> transitions back to "up" from "fail," it results in a state
> such that the slave reports valid speed/duplex and is up,
> but the LACP port will remain disabled.
> 
> Work around this by reporting the last recorded speed
> and duplex settings unless the device has never been
> activated. In that case or when the hypervisor gives
> invalid values, continue to report unknown speed or
> duplex to ethtool.
> 
> Signed-off-by: Thomas Falcon 

Like Andrew, I have my concerns about this.

If the firmware is unavailable, the link is effectively down.  So
you should report link down and unknown link parameters.

Bonding and LACP should do the right thing when the firmware is
reachable again after the migration and the link goes back up.

If bonding/LACP isn't doing that, then the bug is there.

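
For reference, the workaround in the quoted patch (which the reply above argues against) amounts to caching the last values firmware reported. A minimal sketch of one reading of that description follows; demo_adapter, demo_report_speed and DEMO_SPEED_UNKNOWN are hypothetical names, not the driver's, and only speed is modelled.

```c
#include <stdbool.h>

#define DEMO_SPEED_UNKNOWN	(-1)

/* Hypothetical per-adapter cache of the last valid report. */
struct demo_adapter {
	bool ever_activated;	/* has firmware ever reported a valid speed? */
	int last_speed;		/* last valid speed, in Mb/s */
};

/* Value that would be handed to ethtool under the proposed workaround. */
static int demo_report_speed(struct demo_adapter *a, bool fw_reachable,
			     int fw_speed)
{
	if (fw_reachable) {
		if (fw_speed <= 0)
			return DEMO_SPEED_UNKNOWN;	/* invalid value from hypervisor */
		a->last_speed = fw_speed;
		a->ever_activated = true;
		return fw_speed;
	}
	/* Firmware link down: fall back to the cache, if there is one. */
	return a->ever_activated ? a->last_speed : DEMO_SPEED_UNKNOWN;
}
```

The objection raised in the reply is to the fallback branch: while firmware is unreachable the link is effectively down, so reporting a cached speed hides that state from bonding.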

Re: [RFC PATCH] Replaces long number representation by BIT() macro

2019-07-02 Thread Segher Boessenkool
On Tue, Jul 02, 2019 at 11:16:35AM -0500, Segher Boessenkool wrote:
> On Wed, Jul 03, 2019 at 01:19:34AM +1000, Michael Ellerman wrote:
> > What we could do is switch to the `UL` macro from include/linux/const.h,
> > rather than using our own ASM_CONST.
> 
> You need gas 2.28 or later for that though.

Oh, but apparently I cannot read.  That macro should work fine.


Segher


Re: [RFC PATCH] Replaces long number representation by BIT() macro

2019-07-02 Thread Segher Boessenkool
On Wed, Jul 03, 2019 at 01:19:34AM +1000, Michael Ellerman wrote:
> What we could do is switch to the `UL` macro from include/linux/const.h,
> rather than using our own ASM_CONST.

You need gas 2.28 or later for that though.

https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=86b80085
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=e140100a

What is the minimum required (for powerpc) now?


Segher


Re: ["RFC PATCH" 1/2] powerpc/mm: Fix node look up with numa=off boot

2019-07-02 Thread Nathan Lynch
"Aneesh Kumar K.V"  writes:
>> Just checking: do people still need numa=off? Seems like it's a
>> maintenance burden :-)
>> 
>
> That is used in kdump kernel.

I see, thanks.


[PATCH] powerpc: Enable CONFIG_IPV6 in ppc64_defconfig

2019-07-02 Thread sathnaga
From: Satheesh Rajendran 

Enable CONFIG_IPV6 in ppc64_defconfig to enable
certain network functionalities required for tests.

Signed-off-by: Michael Ellerman 
Signed-off-by: Satheesh Rajendran 
---
 arch/powerpc/configs/ppc64_defconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/configs/ppc64_defconfig 
b/arch/powerpc/configs/ppc64_defconfig
index 91fdb619b484..93fd9792d030 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -89,7 +89,7 @@ CONFIG_SYN_COOKIES=y
 CONFIG_INET_AH=m
 CONFIG_INET_ESP=m
 CONFIG_INET_IPCOMP=m
-# CONFIG_IPV6 is not set
+CONFIG_IPV6=y
 CONFIG_NETFILTER=y
 # CONFIG_NETFILTER_ADVANCED is not set
 CONFIG_BRIDGE=m
-- 
2.21.0



Re: [RFC PATCH] Replaces long number representation by BIT() macro

2019-07-02 Thread Michael Ellerman
Hi Leonardo,

Leonardo Bras  writes:
> The main reason for this change is to make these bitmasks more readable.
>
> The macro ASM_CONST() just appends a UL to its parameter, so it can be
> easily replaced by BIT_MASK, which already uses a UL representation.
>
> ASM_CONST() in this file may behave differently if __ASSEMBLY__ is defined,
> as it is used in .S files, just leaving the parameter as is.

Thanks for the patch, but I don't consider this an improvement in
readability.

At boot we print the firmware features, eg:

  firmware_features = 0x0001c45ffc5f

And it's much easier to match that up to the full constants, than to the
bit numbers.

Similarly in memory or register dumps.
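
To make the trade-off concrete: both spellings denote the same value, but a printed mask is in hex, so with bit numbers the reader has to convert back by hand. A minimal sketch follows; the DEMO_ constants and demo_mask_to_bit() are hypothetical and their values are not taken from firmware.h.

```c
/* Illustrative constants only; values are not taken from firmware.h. */
#define DEMO_FEATURE_A	0x0000000000000001UL
#define DEMO_FEATURE_B	0x0000000000000040UL
#define DEMO_FEATURE_C	0x0000000000010000UL

/* Same idea as a BIT()-style macro: one bit set at position n. */
#define DEMO_BIT(n)	(1UL << (n))

/* Return the bit number of a single-bit mask, or -1 if not exactly one bit. */
static int demo_mask_to_bit(unsigned long mask)
{
	int bit = 0;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;
	while ((mask & 1UL) == 0) {
		mask >>= 1;
		bit++;
	}
	return bit;
}
```

Here DEMO_BIT(6) equals DEMO_FEATURE_B, but only the hex form can be compared directly against a dumped value.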

What we could do is switch to the `UL` macro from include/linux/const.h,
rather than using our own ASM_CONST.

cheers

> diff --git a/arch/powerpc/include/asm/firmware.h 
> b/arch/powerpc/include/asm/firmware.h
> index 00bc42d95679..7a5b0cc0bc85 100644
> --- a/arch/powerpc/include/asm/firmware.h
> +++ b/arch/powerpc/include/asm/firmware.h
> @@ -14,46 +14,45 @@
>  
>  #ifdef __KERNEL__
>  
> -#include 
> -
> +#include 
>  /* firmware feature bitmask values */
>  
> -#define FW_FEATURE_PFT   ASM_CONST(0x0001)
> -#define FW_FEATURE_TCE   ASM_CONST(0x0002)
> -#define FW_FEATURE_SPRG0 ASM_CONST(0x0004)
> -#define FW_FEATURE_DABR  ASM_CONST(0x0008)
> -#define FW_FEATURE_COPY  ASM_CONST(0x0010)
> -#define FW_FEATURE_ASR   ASM_CONST(0x0020)
> -#define FW_FEATURE_DEBUG ASM_CONST(0x0040)
> -#define FW_FEATURE_TERM  ASM_CONST(0x0080)
> -#define FW_FEATURE_PERF  ASM_CONST(0x0100)
> -#define FW_FEATURE_DUMP  ASM_CONST(0x0200)
> -#define FW_FEATURE_INTERRUPT ASM_CONST(0x0400)
> -#define FW_FEATURE_MIGRATE   ASM_CONST(0x0800)
> -#define FW_FEATURE_PERFMON   ASM_CONST(0x1000)
> -#define FW_FEATURE_CRQ   ASM_CONST(0x2000)
> -#define FW_FEATURE_VIO   ASM_CONST(0x4000)
> -#define FW_FEATURE_RDMA  ASM_CONST(0x8000)
> -#define FW_FEATURE_LLAN  ASM_CONST(0x0001)
> -#define FW_FEATURE_BULK_REMOVE   ASM_CONST(0x0002)
> -#define FW_FEATURE_XDABR ASM_CONST(0x0004)
> -#define FW_FEATURE_MULTITCE  ASM_CONST(0x0008)
> -#define FW_FEATURE_SPLPARASM_CONST(0x0010)
> -#define FW_FEATURE_LPAR  ASM_CONST(0x0040)
> -#define FW_FEATURE_PS3_LV1   ASM_CONST(0x0080)
> -#define FW_FEATURE_HPT_RESIZEASM_CONST(0x0100)
> -#define FW_FEATURE_CMO   ASM_CONST(0x0200)
> -#define FW_FEATURE_VPHN  ASM_CONST(0x0400)
> -#define FW_FEATURE_XCMO  ASM_CONST(0x0800)
> -#define FW_FEATURE_OPAL  ASM_CONST(0x1000)
> -#define FW_FEATURE_SET_MODE  ASM_CONST(0x4000)
> -#define FW_FEATURE_BEST_ENERGY   ASM_CONST(0x8000)
> -#define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001)
> -#define FW_FEATURE_PRRN  ASM_CONST(0x0002)
> -#define FW_FEATURE_DRMEM_V2  ASM_CONST(0x0004)
> -#define FW_FEATURE_DRC_INFO  ASM_CONST(0x0008)
> -#define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0010)
> -#define FW_FEATURE_PAPR_SCM  ASM_CONST(0x0020)
> +#define FW_FEATURE_PFT   BIT(0)
> +#define FW_FEATURE_TCE   BIT(1)
> +#define FW_FEATURE_SPRG0 BIT(2)
> +#define FW_FEATURE_DABR  BIT(3)
> +#define FW_FEATURE_COPY  BIT(4)
> +#define FW_FEATURE_ASR   BIT(5)
> +#define FW_FEATURE_DEBUG BIT(6)
> +#define FW_FEATURE_TERM  BIT(7)
> +#define FW_FEATURE_PERF  BIT(8)
> +#define FW_FEATURE_DUMP  BIT(9)
> +#define FW_FEATURE_INTERRUPT BIT(10)
> +#define FW_FEATURE_MIGRATE   BIT(11)
> +#define FW_FEATURE_PERFMON   BIT(12)
> +#define FW_FEATURE_CRQ   BIT(13)
> +#define FW_FEATURE_VIO   BIT(14)
> +#define FW_FEATURE_RDMA  BIT(15)
> +#define FW_FEATURE_LLAN  BIT(16)
> +#define FW_FEATURE_BULK_REMOVE   BIT(17)
> +#define FW_FEATURE_XDABR BIT(18)
> +#define FW_FEATURE_MULTITCE  BIT(19)
> +#define FW_FEATURE_SPLPARBIT(20)
> +#define FW_FEATURE_LPAR  BIT(22)
> +#define FW_FEATURE_PS3_LV1   BIT(23)
> +#define FW_FEATURE_HPT_RESIZEBIT(24)
> +#define FW_FEATURE_CMO   BIT(25)
> +#define FW_FEATURE_VPHN  BIT(26)
> +#define FW_FEATURE_XCMO  BIT(27)
> +#define FW_FEATURE_OPAL  BIT(28)
> +#define FW_FEATURE_SET_MODE  BIT(30)
> +#define FW_FEATURE_BEST_ENERGY   BIT(31)
> +#define FW_FEATURE_TYPE1_AFFINITY BIT(32)
> +#define 

Re: [v2 03/12] powerpc/mce: Add MCE notification chain

2019-07-02 Thread Reza Arbab

On Tue, Jul 02, 2019 at 10:49:23AM +0530, Santosh Sivaraj wrote:

+static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);


Mahesh suggested using an atomic notifier chain instead of blocking, 
since we are in an interrupt.


--
Reza Arbab



[PATCH -next] powerpc/powernv: Make some symbols static

2019-07-02 Thread YueHaibing
Fix sparse warnings:

arch/powerpc/platforms/powernv/opal-psr.c:20:1:
 warning: symbol 'psr_mutex' was not declared. Should it be static?
arch/powerpc/platforms/powernv/opal-psr.c:27:3:
 warning: symbol 'psr_attrs' was not declared. Should it be static?
arch/powerpc/platforms/powernv/opal-powercap.c:20:1:
 warning: symbol 'powercap_mutex' was not declared. Should it be static?
arch/powerpc/platforms/powernv/opal-sensor-groups.c:20:1:
 warning: symbol 'sg_mutex' was not declared. Should it be static?

Reported-by: Hulk Robot 
Signed-off-by: YueHaibing 
---
 arch/powerpc/platforms/powernv/opal-powercap.c  | 2 +-
 arch/powerpc/platforms/powernv/opal-psr.c   | 4 ++--
 arch/powerpc/platforms/powernv/opal-sensor-groups.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-powercap.c 
b/arch/powerpc/platforms/powernv/opal-powercap.c
index dc599e7..c16d44f 100644
--- a/arch/powerpc/platforms/powernv/opal-powercap.c
+++ b/arch/powerpc/platforms/powernv/opal-powercap.c
@@ -13,7 +13,7 @@
 
 #include 
 
-DEFINE_MUTEX(powercap_mutex);
+static DEFINE_MUTEX(powercap_mutex);
 
 static struct kobject *powercap_kobj;
 
diff --git a/arch/powerpc/platforms/powernv/opal-psr.c 
b/arch/powerpc/platforms/powernv/opal-psr.c
index b6ccb30..69d7e75 100644
--- a/arch/powerpc/platforms/powernv/opal-psr.c
+++ b/arch/powerpc/platforms/powernv/opal-psr.c
@@ -13,11 +13,11 @@
 
 #include 
 
-DEFINE_MUTEX(psr_mutex);
+static DEFINE_MUTEX(psr_mutex);
 
 static struct kobject *psr_kobj;
 
-struct psr_attr {
+static struct psr_attr {
u32 handle;
struct kobj_attribute attr;
 } *psr_attrs;
diff --git a/arch/powerpc/platforms/powernv/opal-sensor-groups.c 
b/arch/powerpc/platforms/powernv/opal-sensor-groups.c
index 31f13c1..f8ae1fb 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor-groups.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor-groups.c
@@ -13,7 +13,7 @@
 
 #include 
 
-DEFINE_MUTEX(sg_mutex);
+static DEFINE_MUTEX(sg_mutex);
 
 static struct kobject *sg_kobj;
 
-- 
2.7.4




[PATCH] powerpc/setup: Adjust six seq_printf() calls in show_cpuinfo()

2019-07-02 Thread Markus Elfring
From: Markus Elfring 
Date: Tue, 2 Jul 2019 14:41:42 +0200

A bit of information should be put into a sequence.
Thus improve the execution speed for this data output by better usage
of corresponding functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 arch/powerpc/kernel/setup-common.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 1f8db666468d..a381723b11bd 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -239,18 +239,17 @@ static int show_cpuinfo(struct seq_file *m, void *v)
maj = (pvr >> 8) & 0xFF;
min = pvr & 0xFF;

-   seq_printf(m, "processor\t: %lu\n", cpu_id);
-   seq_printf(m, "cpu\t\t: ");
+   seq_printf(m, "processor\t: %lu\ncpu\t\t: ", cpu_id);

if (cur_cpu_spec->pvr_mask && cur_cpu_spec->cpu_name)
-   seq_printf(m, "%s", cur_cpu_spec->cpu_name);
+   seq_puts(m, cur_cpu_spec->cpu_name);
else
seq_printf(m, "unknown (%08x)", pvr);

if (cpu_has_feature(CPU_FTR_ALTIVEC))
-   seq_printf(m, ", altivec supported");
+   seq_puts(m, ", altivec supported");

-   seq_printf(m, "\n");
+   seq_putc(m, '\n');

 #ifdef CONFIG_TAU
if (cpu_has_feature(CPU_FTR_TAU)) {
@@ -332,7 +331,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
seq_printf(m, "bogomips\t: %lu.%02lu\n", loops_per_jiffy / 
(50 / HZ),
   (loops_per_jiffy / (5000 / HZ)) % 100);

-   seq_printf(m, "\n");
+   seq_putc(m, '\n');

/* If this is the last cpu, print the summary */
if (cpumask_next(cpu_id, cpu_online_mask) >= nr_cpu_ids)
--
2.22.0
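
The substitution made in the patch above can be modelled in userspace. This
is only a sketch under stated assumptions: struct out_buf and the buf_*()
helpers are hypothetical stand-ins for a seq_file and seq_printf(),
seq_puts() and seq_putc(), and the cpu name is passed in rather than read
from cur_cpu_spec. It shows that the old and new call sequences produce the
same bytes while the new one skips format parsing for constant strings.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a seq_file: a fixed, NUL-terminated buffer. */
struct out_buf {
	char data[128];
	size_t len;
};

/* Formatted append, playing the role of seq_printf(). */
static void buf_printf(struct out_buf *b, const char *fmt, ...)
{
	size_t room = sizeof(b->data) - b->len;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);

	if (n >= 0 && (size_t)n < room)
		b->len += (size_t)n;
	else
		b->data[b->len] = '\0';	/* drop a truncated append */
}

/* Plain string append, playing the role of seq_puts(): no format parsing. */
static void buf_puts(struct out_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n >= sizeof(b->data))
		return;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Single character append, playing the role of seq_putc(). */
static void buf_putc(struct out_buf *b, char c)
{
	if (b->len + 1 >= sizeof(b->data))
		return;
	b->data[b->len++] = c;
	b->data[b->len] = '\0';
}

/* Call sequence before the patch: five formatted appends. */
static void show_old(struct out_buf *b, unsigned long cpu_id, const char *name)
{
	buf_printf(b, "processor\t: %lu\n", cpu_id);
	buf_printf(b, "cpu\t\t: ");
	buf_printf(b, "%s", name);
	buf_printf(b, ", altivec supported");
	buf_printf(b, "\n");
}

/* Call sequence after the patch: one formatted append, then plain copies. */
static void show_new(struct out_buf *b, unsigned long cpu_id, const char *name)
{
	buf_printf(b, "processor\t: %lu\ncpu\t\t: ", cpu_id);
	buf_puts(b, name);
	buf_puts(b, ", altivec supported");
	buf_putc(b, '\n');
}
```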



[PATCH] powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.

2019-07-02 Thread Madhavan Srinivasan
From: Athira Rajeev 

commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
reimplemented the book3s idle code in platforms/powernv/idle.c, but in doing
so missed adding the per-thread LDBAR update in the core_woken path of
power9_idle_stop(). This patch fixes that.

Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
Signed-off-by: Athira Rajeev 
Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/platforms/powernv/idle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c 
b/arch/powerpc/platforms/powernv/idle.c
index 2f4479b94ac3..fd14a6237954 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -758,7 +758,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, 
bool mmu_on)
mtspr(SPRN_PTCR,sprs.ptcr);
mtspr(SPRN_RPR, sprs.rpr);
mtspr(SPRN_TSCR,sprs.tscr);
-   mtspr(SPRN_LDBAR,   sprs.ldbar);
 
if (pls >= pnv_first_tb_loss_level) {
/* TB loss */
@@ -790,6 +789,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, 
bool mmu_on)
mtspr(SPRN_MMCR0,   sprs.mmcr0);
mtspr(SPRN_MMCR1,   sprs.mmcr1);
mtspr(SPRN_MMCR2,   sprs.mmcr2);
+   mtspr(SPRN_LDBAR,   sprs.ldbar);
 
mtspr(SPRN_SPRG3,   local_paca->sprg_vdso);
 
-- 
2.20.1



Re: Re: [PATCH 1/3] arm64: mm: Add p?d_large() definitions

2019-07-02 Thread Will Deacon
On Tue, Jul 02, 2019 at 01:07:11PM +1000, Nicholas Piggin wrote:
> Will Deacon's on July 1, 2019 8:15 pm:
> > On Mon, Jul 01, 2019 at 11:03:51AM +0100, Steven Price wrote:
> >> On 01/07/2019 10:27, Will Deacon wrote:
> >> > On Sun, Jun 23, 2019 at 07:44:44PM +1000, Nicholas Piggin wrote:
> >> >> walk_page_range() is going to be allowed to walk page tables other than
> >> >> those of user space. For this it needs to know when it has reached a
> >> >> 'leaf' entry in the page tables. This information will be provided by 
> >> >> the
> >> >> p?d_large() functions/macros.
> >> > 
> >> > I can't remember whether or not I asked this before, but why not call
> >> > this macro p?d_leaf() if that's what it's identifying? "Large" and "huge"
> >> > are usually synonymous, so I find this naming needlessly confusing based
> >> > on this patch in isolation.
> 
> Those page table macro names are horrible. Large, huge, leaf, wtf?
> They could do with a sensible renaming. But this series just follows
> naming that's already there on x86.

I realise that, and I wasn't meaning to have a go at you. Just wanted to
make my opinion clear by having a moan :)

Will


[PATCH] powerpc: Use nid as fallback for chip_id

2019-07-02 Thread Srikar Dronamraju
One of the uses of chip_id is to find out all cores that are part of the same
chip. However, the ibm,chip_id property is not present in the device-tree of
PowerVM LPARs. Hence lscpu output shows one core per socket and multiple sockets.

Before the patch.
# lscpu
Architecture:ppc64le
Byte Order:  Little Endian
CPU(s):  128
On-line CPU(s) list: 0-127
Thread(s) per core:  8
Core(s) per socket:  1
Socket(s):   16
NUMA node(s):2
Model:   2.2 (pvr 004e 0202)
Model name:  POWER9 (architected), altivec supported
Hypervisor vendor:   pHyp
Virtualization type: para
L1d cache:   32K
L1i cache:   32K
L2 cache:512K
L3 cache:10240K
NUMA node0 CPU(s):   0-63
NUMA node1 CPU(s):   64-127

# cat /sys/devices/system/cpu/cpu0/topology/physical_package_id
-1

Signed-off-by: Srikar Dronamraju 
---
 arch/powerpc/kernel/prom.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 7159e791a70d..0b8918b43580 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -867,18 +867,24 @@ EXPORT_SYMBOL(of_get_ibm_chip_id);
  * @cpu: The logical cpu number.
  *
  * Return the value of the ibm,chip-id property corresponding to the given
- * logical cpu number. If the chip-id can not be found, returns -1.
+ * logical cpu number. If the chip-id can not be found, return nid.
+ *
  */
 int cpu_to_chip_id(int cpu)
 {
struct device_node *np;
+   int chip_id = -1;
 
np = of_get_cpu_node(cpu, NULL);
if (!np)
return -1;
 
+   chip_id = of_get_ibm_chip_id(np);
+   if (chip_id == -1)
+   chip_id = of_node_to_nid(np);
+
of_node_put(np);
-   return of_get_ibm_chip_id(np);
+   return chip_id;
 }
 EXPORT_SYMBOL(cpu_to_chip_id);
 
-- 
2.18.1
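
The fallback in the patch above can be sketched as a small userspace model.
The struct fake_node type and model_cpu_to_chip_id() are hypothetical and
stand in for struct device_node and the of_*() helpers; the refcount field
only mimics the get/put pairing. Note that, as in the patched function, the
node is read before its reference is dropped.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device-tree cpu node. */
struct fake_node {
	int refcount;	/* mimics of_get_cpu_node()/of_node_put() pairing */
	int chip_id;	/* -1 when the chip-id property is absent */
	int nid;	/* NUMA node id of the cpu */
};

/*
 * Return the chip id for a node, falling back to the NUMA node id when
 * no chip id is available. Returns -1 if there is no node at all.
 */
static int model_cpu_to_chip_id(struct fake_node *np)
{
	int chip_id;

	if (!np)
		return -1;

	np->refcount++;			/* take a reference */

	chip_id = np->chip_id;
	if (chip_id == -1)
		chip_id = np->nid;	/* fallback used on PowerVM LPARs */

	np->refcount--;			/* drop it only after the last use */
	return chip_id;
}
```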



Re: [v2 09/12] powerpc/mce: Enable MCE notifiers in external modules

2019-07-02 Thread Mahesh Jagannath Salgaonkar
On 7/2/19 11:47 AM, Nicholas Piggin wrote:
> Santosh Sivaraj's on July 2, 2019 3:19 pm:
>> From: Reza Arbab 
>>
>> Signed-off-by: Reza Arbab 
>> ---
>>  arch/powerpc/kernel/exceptions-64s.S | 6 ++
>>  arch/powerpc/kernel/mce.c| 2 ++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
>> b/arch/powerpc/kernel/exceptions-64s.S
>> index c83e38a403fd..311f1392a2ec 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>  bl  machine_check_early
>>  std r3,RESULT(r1)   /* Save result */
>>  
>> +/* Notifiers may be in a module, so enable virtual addressing. */
>> +mfmsr   r11
>> +ori r11,r11,MSR_IR
>> +ori r11,r11,MSR_DR
>> +mtmsr   r11
> 
> Can't do this, we could take a machine check somewhere the MMU is
> not sane (in fact the guest early mce handling that was added recently
> should not be enabling virtual mode either, which needs to be fixed).

Looks like they need this to be able to run notifier chain which may
fail in real mode.

> 
> Thanks,
> Nick
> 



Re: [PATCH v5 0/4] Additional fixes on Talitos driver

2019-07-02 Thread Christophe Leroy

Hi Herbert,

Le 24/06/2019 à 09:21, Christophe Leroy a écrit :

This series is the last set of fixes for the Talitos driver.


Do you plan to apply this series, or are you expecting anything from
me?


Thanks
Christophe



We now get a fully clean boot on both SEC1 (SEC1.2 on mpc885) and
SEC2 (SEC2.2 on mpc8321E) with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS:

[3.385197] bus: 'platform': really_probe: probing driver talitos with 
device ff02.crypto
[3.450982] random: fast init done
[   12.252548] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos-hsna)
[   12.262226] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos-hsna)
[   43.310737] Bug in SEC1, padding ourself
[   45.603318] random: crng init done
[   54.612333] talitos ff02.crypto: fsl,sec1.2 algorithms registered in 
/proc/crypto
[   54.620232] driver: 'talitos': driver_bound: bound to device 
'ff02.crypto'

[1.193721] bus: 'platform': really_probe: probing driver talitos with 
device b003.crypto
[1.229197] random: fast init done
[2.714920] alg: No test for authenc(hmac(sha224),cbc(aes)) 
(authenc-hmac-sha224-cbc-aes-talitos)
[2.724312] alg: No test for authenc(hmac(sha224),cbc(aes)) 
(authenc-hmac-sha224-cbc-aes-talitos-hsna)
[4.482045] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos)
[4.490940] alg: No test for authenc(hmac(md5),cbc(aes)) 
(authenc-hmac-md5-cbc-aes-talitos-hsna)
[4.500280] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos)
[4.509727] alg: No test for authenc(hmac(md5),cbc(des3_ede)) 
(authenc-hmac-md5-cbc-3des-talitos-hsna)
[6.631781] random: crng init done
[   11.521795] talitos b003.crypto: fsl,sec2.2 algorithms registered in 
/proc/crypto
[   11.529803] driver: 'talitos': driver_bound: bound to device 
'b003.crypto'

v2: dropped patch 1 which was irrelevant due to a rebase weirdness. Added Cc to 
stable on the 2 first patches.

v3:
  - removed stable reference in patch 1
  - reworded patch 1 to include name of patch 2 for the dependency.
  - mentioned this dependency in patch 2 as well.
  - corrected the Fixes: sha1 in patch 4
  
v4:

  - using scatterwalk_ffwd() instead of open-coding SG list forwarding.
  - Added a patch to fix sg_copy_to_buffer() when sg->offset() is greater than 
PAGE_SIZE,
  otherwise sg_copy_to_buffer() fails when the list has been forwarded with 
scatterwalk_ffwd().
  - taken the patch "crypto: talitos - eliminate unneeded 'done' functions at build 
time"
  out of the series because it is independent.
  - added a helper to find the header field associated to a request in 
flush_channe()
  
v5:

  - Replacing the while loop by a direct shift/mask operation, as suggested by 
Herbert in patch 1.

Christophe Leroy (4):
   lib/scatterlist: Fix mapping iterator when sg->offset is greater than
 PAGE_SIZE
   crypto: talitos - move struct talitos_edesc into talitos.h
   crypto: talitos - fix hash on SEC1.
   crypto: talitos - drop icv_ool

  drivers/crypto/talitos.c | 102 +++
  drivers/crypto/talitos.h |  28 +
  lib/scatterlist.c|   9 +++--
  3 files changed, 74 insertions(+), 65 deletions(-)



[PATCH v2] powerpc/imc: Dont create debugfs files for cpu-less nodes

2019-07-02 Thread Madhavan Srinivasan
Commit <684d984038aa> ('powerpc/powernv: Add debugfs interface for imc-mode
and imc') added a debugfs interface for the nest imc pmu devices to support
changing between different ucode modes, primarily for debugging. But in doing
so, the code did not consider the case of cpu-less nodes, so reading the
_cmd_ or _mode_ file of a cpu-less node causes this crash.

[ 1139.415461][ T5301] Faulting instruction address: 0xc00d0d58
[ 1139.415492][ T5301] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1139.415509][ T5301] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=256
DEBUG_PAGEALLOC NUMA PowerNV
[ 1139.415542][ T5301] Modules linked in: i2c_opal i2c_core ip_tables x_tables
xfs sd_mod bnx2x mdio ahci libahci tg3 libphy libata firmware_class dm_mirror
dm_region_hash dm_log dm_mod
[ 1139.415595][ T5301] CPU: 67 PID: 5301 Comm: cat Not tainted 5.2.0-rc6-next-
20190627+ #19
[ 1139.415634][ T5301] NIP:  c00d0d58 LR: c049aa18 
CTR:c00d0d50
[ 1139.415675][ T5301] REGS: c00020194548f9e0 TRAP: 0300   Not tainted  
(5.2.0-rc6-next-20190627+)
[ 1139.415705][ T5301] MSR:  90009033   
CR:28022822  XER: 
[ 1139.415777][ T5301] CFAR: c049aa14 DAR: 0003fc08 
DSISR:4000 IRQMASK: 0
[ 1139.415777][ T5301] GPR00: c049aa18 c00020194548fc70 
c16f8b03fc08
[ 1139.415777][ T5301] GPR04: c00020194548fcd0  
14884e7300011eaa
[ 1139.415777][ T5301] GPR08: 7eea5a52 c00d0d50 

[ 1139.415777][ T5301] GPR12: c00d0d50 c000201fff7f8c00 

[ 1139.415777][ T5301] GPR16: 000d 7fffeb0c3368 

[ 1139.415777][ T5301] GPR20:   
0002
[ 1139.415777][ T5301] GPR24:   
000200010ec9
[ 1139.415777][ T5301] GPR28: c00020194548fdf0 c00020049a584ef8 
c00020049a584ea8
[ 1139.416116][ T5301] NIP [c00d0d58] imc_mem_get+0x8/0x20
[ 1139.416143][ T5301] LR [c049aa18] simple_attr_read+0x118/0x170
[ 1139.416158][ T5301] Call Trace:
[ 1139.416182][ T5301] [c00020194548fc70] 
[c049a970]simple_attr_read+0x70/0x170 (unreliable)
[ 1139.416255][ T5301] [c00020194548fd10] 
[c054385c]debugfs_attr_read+0x6c/0xb0
[ 1139.416305][ T5301] [c00020194548fd60] [c0454c1c]__vfs_read+0x3c/0x70
[ 1139.416363][ T5301] [c00020194548fd80] [c0454d0c] vfs_read+0xbc/0x1a0
[ 1139.416392][ T5301] [c00020194548fdd0] [c045519c]ksys_read+0x7c/0x140
[ 1139.416434][ T5301] [c00020194548fe20] 
[c000b108]system_call+0x5c/0x70
[ 1139.416473][ T5301] Instruction dump:
[ 1139.416511][ T5301] 4e800020 6000 7c0802a6 6000 7c801d28 3860 
4e800020 6000
[ 1139.416572][ T5301] 6000 6000 7c0802a6 6000 <7d201c28> 3860 
f924 4e800020
[ 1139.416636][ T5301] ---[ end trace c44d1fb4ace04784 ]---
[ 1139.520686][ T5301]
[ 1140.520820][ T5301] Kernel panic - not syncing: Fatal exception

This patch fixes the issue with a more robust check of vbase against NULL.

Before patch, ls output for the debugfs imc directory

# ls /sys/kernel/debug/powerpc/imc/
imc_cmd_0imc_cmd_251  imc_cmd_253  imc_cmd_255  imc_mode_0imc_mode_251  
imc_mode_253  imc_mode_255
imc_cmd_250  imc_cmd_252  imc_cmd_254  imc_cmd_8imc_mode_250  imc_mode_252  
imc_mode_254  imc_mode_8

After patch, ls output for the debugfs imc directory

# ls /sys/kernel/debug/powerpc/imc/
imc_cmd_0  imc_cmd_8  imc_mode_0  imc_mode_8
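
The before/after listings can be reproduced with a small userspace model of
the loop. It assumes, as the patched loop condition suggests, that the
mem_info array ends with an entry whose vbase is NULL. The struct layout,
the id field and list_mode_files() are hypothetical names for this sketch
and are not taken from the driver.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-chip record; the array ends with vbase == NULL. */
struct mem_info {
	void *vbase;		/* mapped counter memory, NULL on the sentinel */
	unsigned int id;	/* chip id used in the debugfs file name */
};

/*
 * Walk the array until the NULL sentinel and build one "imc_mode_<id>"
 * name per populated entry. Returns the number of names written.
 */
static size_t list_mode_files(const struct mem_info *ptr,
			      char names[][16], size_t max)
{
	size_t n = 0;

	while (ptr->vbase != NULL && n < max) {
		snprintf(names[n], 16, "imc_mode_%u", ptr->id);
		n++;
		ptr++;
	}
	return n;
}
```

Only entries that actually have memory behind them produce a file, so a
walk over chips 0 and 8 yields imc_mode_0 and imc_mode_8 and nothing for
cpu-less nodes.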

Fixes: 684d984038aa ('powerpc/powernv: Add debugfs interface for imc-mode and 
imc')
Reported-by: Qian Cai 
Suggested-by: Michael Ellerman 
Signed-off-by: Madhavan Srinivasan 
---
Changelog v1:
- Modified the cpumask check.
  
 arch/powerpc/platforms/powernv/opal-imc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/opal-imc.c 
b/arch/powerpc/platforms/powernv/opal-imc.c
index 186109bdd41b..e04b20625cb9 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -53,9 +53,9 @@ static void export_imc_mode_and_cmd(struct device_node *node,
struct imc_pmu *pmu_ptr)
 {
static u64 loc, *imc_mode_addr, *imc_cmd_addr;
-   int chip = 0, nid;
char mode[16], cmd[16];
u32 cb_offset;
+   struct imc_mem_info *ptr = pmu_ptr->mem_info;
 
imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root);
 
@@ -69,20 +69,20 @@ static void export_imc_mode_and_cmd(struct device_node 
*node,
if (of_property_read_u32(node, "cb_offset", _offset))
cb_offset = IMC_CNTL_BLK_OFFSET;
 
-   for_each_node(nid) {
-   loc = (u64)(pmu_ptr->mem_info[chip].vbase) + cb_offset;
+   while (ptr->vbase != NULL) {
+   loc = 

[PATCH] powerpc/64s/exception: Remove unused SOFTEN_VALUE_0x980

2019-07-02 Thread Michael Ellerman
Remove SOFTEN_VALUE_0x980, it's been unused since commit
dabe859ec636 ("powerpc: Give hypervisor decrementer interrupts their
own handler") (Sep 2012).

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64s.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index b590765f6e45..b4f8b745ba01 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -583,7 +583,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 /* This associate vector numbers with bits in paca->irq_happened */
 #define SOFTEN_VALUE_0x500 PACA_IRQ_EE
 #define SOFTEN_VALUE_0x900 PACA_IRQ_DEC
-#define SOFTEN_VALUE_0x980 PACA_IRQ_DEC
 #define SOFTEN_VALUE_0xa00 PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe80 PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe60 PACA_IRQ_HMI
-- 
2.20.1



Re: [RFC 09/11] pci/hotplug/pnv-php: Relax check when disabling slot

2019-07-02 Thread Andrew Donnellan

On 19/6/19 11:28 pm, Frederic Barrat wrote:

The driver only allows disabling a slot in the POPULATED
state. However, if an error occurs while enabling the slot, say
because the link couldn't be trained, then the POPULATED state may not
be reached, yet the power state of the slot is on. So allow disabling
a slot in the REGISTERED state as well. Removing the devices will do nothing
since it's not populated, and we'll set the power state of the slot
back to off.

Signed-off-by: Frederic Barrat 


Reviewed-by: Andrew Donnellan 


---
  drivers/pci/hotplug/pnv_php.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index f9c624334ef7..74b62a8e11e7 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -523,7 +523,13 @@ static int pnv_php_disable_slot(struct hotplug_slot *slot)
struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
int ret;
  
-	if (php_slot->state != PNV_PHP_STATE_POPULATED)

+   /*
+* Allow to disable a slot already in the registered state to
+* cover cases where the slot couldn't be enabled and never
+* reached the populated state
+*/
+   if (php_slot->state != PNV_PHP_STATE_POPULATED &&
+   php_slot->state != PNV_PHP_STATE_REGISTERED)
return 0;
  
  	/* Remove all devices behind the slot */




--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re: [RFC 11/11] ocxl: Add PCI hotplug dependency to Kconfig

2019-07-02 Thread Andrew Donnellan

On 19/6/19 11:28 pm, Frederic Barrat wrote:

The PCI hotplug framework is used to update the devices when a new
image is written to the FPGA.

Signed-off-by: Frederic Barrat 


Acked-by: Andrew Donnellan 


---
  drivers/misc/ocxl/Kconfig | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/misc/ocxl/Kconfig b/drivers/misc/ocxl/Kconfig
index 7fb6d39d4c5a..13a5d9f30369 100644
--- a/drivers/misc/ocxl/Kconfig
+++ b/drivers/misc/ocxl/Kconfig
@@ -12,6 +12,7 @@ config OCXL
tristate "OpenCAPI coherent accelerator support"
depends on PPC_POWERNV && PCI && EEH
select OCXL_BASE
+   select HOTPLUG_PCI_POWERNV
default m
help
  Select this option to enable the ocxl driver for Open



--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re: [PATCH] powerpc/configs: Disable /dev/port in skiroot defconfig

2019-07-02 Thread Daniel Axtens
Michael Ellerman  writes:

> Daniel Axtens  writes:
>> While reviewing lockdown patches, I discovered that we still enable
>> /dev/port (CONFIG_DEVPORT) in skiroot.
>>
>> We don't need it. Deselect CONFIG_DEVPORT for skiroot.
>
> Why don't we need it? :)

I should have explained this better :)

/dev/port is used for old x86 style IO accesses.

It's set up in drivers/char/mem.c, and is only created if
arch_has_dev_port() returns true. Per arch/powerpc/include/asm/io.h, on
PPC64 with PCI, this is only true if there's a legacy ISA bridge.

Even if a system has a legacy ISA bridge installed, we have no business
accessing it in skiroot.

Regards,
Daniel
>
> cheers
>
>> diff --git a/arch/powerpc/configs/skiroot_defconfig 
>> b/arch/powerpc/configs/skiroot_defconfig
>> index 5ba131c30f6b..b2e8f37156eb 100644
>> --- a/arch/powerpc/configs/skiroot_defconfig
>> +++ b/arch/powerpc/configs/skiroot_defconfig
>> @@ -212,6 +212,7 @@ CONFIG_IPMI_WATCHDOG=y
>>  CONFIG_HW_RANDOM=y
>>  CONFIG_TCG_TPM=y
>>  CONFIG_TCG_TIS_I2C_NUVOTON=y
>> +# CONFIG_DEVPORT is not set
>>  CONFIG_I2C=y
>>  # CONFIG_I2C_COMPAT is not set
>>  CONFIG_I2C_CHARDEV=y
>> -- 
>> 2.20.1


Re: vmlinux.o(.text+0x40e): Section mismatch in reference from the variable start_here_multiplatform to the function

2019-07-02 Thread Christophe Leroy




Le 02/07/2019 à 08:23, Christian Zigotzky a écrit :

Hi All,

I get the following error messages after compiling the RC7 of kernel 5.2:

WARNING: vmlinux.o(.text+0x40e): Section mismatch in reference from the 
variable start_here_multiplatform to the function .init.text:.early_setup()

The function start_here_multiplatform() references
the function __init .early_setup().
This is often because start_here_multiplatform lacks a __init
annotation or the annotation of .early_setup is wrong.


Harmless warning.

Fix at 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/kernel/head_64.S?h=next-20190701=9c4e4c90ec24652921e31e9551fcaedc26eec86d


Will be cherry-picked by stable once merged into 5.3 I guess.

Christophe



FATAL: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
scripts/Makefile.modpost:97: recipe for target 'vmlinux.o' failed
make[1]: *** [vmlinux.o] Error 1
Makefile:1052: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 2

Please find attached the kernel config.

Any hints?

Thanks,
Christian


[v2 04/12] powerpc/mce: Move machine_check_ue_event() call

2019-07-02 Thread Santosh Sivaraj
From: Reza Arbab 

Move the call site of machine_check_ue_event() slightly later in the MCE
codepath. No functional change intended--this is prep for a later patch
to conditionally skip the call.

Signed-off-by: Reza Arbab 
---
 arch/powerpc/kernel/mce.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 24d350a934e4..0ab171b41ede 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -156,7 +156,6 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (phys_addr != ULONG_MAX) {
mce->u.ue_error.physical_address_provided = true;
mce->u.ue_error.physical_address = phys_addr;
-   machine_check_ue_event(mce);
}
}
return;
@@ -656,4 +655,8 @@ void machine_check_notify(struct pt_regs *regs)
return;
 
blocking_notifier_call_chain(_notifier_list, 0, );
+
+   if (evt.error_type == MCE_ERROR_TYPE_UE &&
+   evt.u.ue_error.physical_address_provided)
+   machine_check_ue_event();
 }
-- 
2.20.1



[v2 03/12] powerpc/mce: Add MCE notification chain

2019-07-02 Thread Santosh Sivaraj
From: Reza Arbab 

Signed-off-by: Reza Arbab 
---
 arch/powerpc/include/asm/asm-prototypes.h |  1 +
 arch/powerpc/include/asm/mce.h|  4 
 arch/powerpc/kernel/exceptions-64s.S  |  4 
 arch/powerpc/kernel/mce.c | 22 ++
 4 files changed, 31 insertions(+)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index ec1c97a8e8cb..f66f26ef3ce0 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -72,6 +72,7 @@ void machine_check_exception(struct pt_regs *regs);
 void emulation_assist_interrupt(struct pt_regs *regs);
 long do_slb_fault(struct pt_regs *regs, unsigned long ea);
 void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
+void machine_check_notify(struct pt_regs *regs);
 
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 94888a7025b3..948bef579086 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -214,4 +214,8 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned 
long addr,
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
+
+int mce_register_notifier(struct notifier_block *nb);
+int mce_unregister_notifier(struct notifier_block *nb);
+
 #endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 6b86055e5251..2e56014fca21 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -457,6 +457,10 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
addir3,r1,STACK_FRAME_OVERHEAD
bl  machine_check_early
std r3,RESULT(r1)   /* Save result */
+
+   addir3,r1,STACK_FRAME_OVERHEAD
+   bl  machine_check_notify
+
ld  r12,_MSR(r1)
 BEGIN_FTR_SECTION
b   4f
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index e78c4f18ea0a..24d350a934e4 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -42,6 +42,18 @@ static struct irq_work mce_event_process_work = {
 
 DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
 
+static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
+
+int mce_register_notifier(struct notifier_block *nb)
+{
+   return blocking_notifier_chain_register(_notifier_list, nb);
+}
+
+int mce_unregister_notifier(struct notifier_block *nb)
+{
+   return blocking_notifier_chain_unregister(_notifier_list, nb);
+}
+
 static void mce_set_error_info(struct machine_check_event *mce,
   struct mce_error_info *mce_err)
 {
@@ -635,3 +647,13 @@ long hmi_exception_realmode(struct pt_regs *regs)
 
return 1;
 }
+
+void machine_check_notify(struct pt_regs *regs)
+{
+   struct machine_check_event evt;
+
+   if (!get_mce_event(, MCE_EVENT_DONTRELEASE))
+   return;
+
+   blocking_notifier_call_chain(_notifier_list, 0, );
+}
-- 
2.20.1



[v2 02/12] powerpc/mce: Bug fixes for MCE handling in kernel space

2019-07-02 Thread Santosh Sivaraj
From: Balbir Singh 

The code currently assumes PAGE_SHIFT as the shift value of
the pfn. This works correctly (mostly) for user space pages,
but the correct thing to do is:

1. Extract the shift value returned via the pte-walk APIs.
2. Use the shift value to access the instruction address.

Note that the final physical address still uses PAGE_SHIFT for
computation. handle_ierror() is not modified, and handle_derror()
is modified just for extracting the correct instruction
address.

This is largely due to __find_linux_pte() returning pfn's
shifted by pdshift. The code is much more generic and can
handle shift values returned.
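
The two steps listed above can be worked through with plain integers. This
is a sketch under stated assumptions: MODEL_PAGE_SHIFT and MODEL_RPN_MASK
are illustrative values, and model_pte_to_pfn() and model_instr_addr() are
hypothetical helpers, not the kernel functions. In particular, the address
reconstruction in model_instr_addr() is an assumption about how the shift
is meant to be used, since that part of the diff is not shown here.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u			/* assumed base page shift */
#define MODEL_RPN_MASK   0x01fffffffffff000ULL	/* assumed real-page-number mask */

/*
 * Step 1: turn a raw PTE value into a frame number using the shift
 * reported by the page-table walk. A shift of 0 means the walk found a
 * base page, so fall back to MODEL_PAGE_SHIFT and report it back.
 */
static uint64_t model_pte_to_pfn(uint64_t pte_val, unsigned int *shift)
{
	if (!*shift)
		*shift = MODEL_PAGE_SHIFT;
	return (pte_val & MODEL_RPN_MASK) >> *shift;
}

/*
 * Step 2: rebuild the physical address of the instruction from the
 * frame number and the offset of 'nip' within a mapping of that size.
 */
static uint64_t model_instr_addr(uint64_t pfn, unsigned int shift, uint64_t nip)
{
	return (pfn << shift) + (nip & ((1ULL << shift) - 1));
}
```

For a mapping with shift 21, using PAGE_SHIFT instead would keep only the
low 12 bits of the instruction offset, which is the mismatch the commit
message describes for kernel space addresses.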

Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")

Signed-off-by: Balbir Singh 
[ar...@linux.ibm.com: Fixup pseries_do_memory_failure()]
---
 arch/powerpc/include/asm/mce.h   |  3 ++-
 arch/powerpc/kernel/mce_power.c  | 26 --
 arch/powerpc/platforms/pseries/ras.c |  6 --
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index a4c6a74ad2fb..94888a7025b3 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -209,7 +209,8 @@ extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
   bool user_mode, bool in_guest);
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
+ unsigned int *shift);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index e39536aad30d..04666c0b40a8 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -23,7 +23,8 @@
  * Convert an address related to an mm to a PFN. NOTE: we are in real
  * mode, we could potentially race with page table updates.
  */
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
+unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr,
+ unsigned int *shift)
 {
pte_t *ptep;
unsigned long flags;
@@ -36,13 +37,15 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
 
local_irq_save(flags);
if (mm == current->mm)
-   ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
+   ptep = find_current_mm_pte(mm->pgd, addr, NULL, shift);
else
-   ptep = find_init_mm_pte(addr, NULL);
+   ptep = find_init_mm_pte(addr, shift);
local_irq_restore(flags);
if (!ptep || pte_special(*ptep))
return ULONG_MAX;
-   return pte_pfn(*ptep);
+   if (!*shift)
+   *shift = PAGE_SHIFT;
+   return (pte_val(*ptep) & PTE_RPN_MASK) >> *shift;
 }
 
 /* flush SLBs and reload */
@@ -358,15 +361,16 @@ static int mce_find_instr_ea_and_pfn(struct pt_regs *regs, uint64_t *addr,
unsigned long pfn, instr_addr;
struct instruction_op op;
struct pt_regs tmp = *regs;
+   unsigned int shift;
 
-   pfn = addr_to_pfn(regs, regs->nip);
+   pfn = addr_to_pfn(regs, regs->nip, &shift);
if (pfn != ULONG_MAX) {
-   instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
+   instr_addr = (pfn << shift) + (regs->nip & ((1 << shift) - 1));
instr = *(unsigned int *)(instr_addr);
if (!analyse_instr(&op, &tmp, instr)) {
-   pfn = addr_to_pfn(regs, op.ea);
+   pfn = addr_to_pfn(regs, op.ea, &shift);
*addr = op.ea;
-   *phys_addr = (pfn << PAGE_SHIFT);
+   *phys_addr = (pfn << shift);
return 0;
}
/*
@@ -442,12 +446,14 @@ static int mce_handle_ierror(struct pt_regs *regs,
if (mce_err->sync_error &&
table[i].error_type == MCE_ERROR_TYPE_UE) {
unsigned long pfn;
+   unsigned int shift;
 
if (get_paca()->in_mce < MAX_MCE_DEPTH) {
-   pfn = addr_to_pfn(regs, regs->nip);
+   pfn = addr_to_pfn(regs, regs->nip,
+ &shift);
if (pfn != ULONG_MAX) {
*phys_addr =
-   (pfn << PAGE_SHIFT);
+   (pfn << shift);
}
}
}
diff --git 

[v2 00/12] powerpc: implement machine check safe memcpy

2019-07-02 Thread Santosh Sivaraj
During a memcpy from a pmem device, if a machine check exception is
generated we end up in a panic. In case of fsdax read, this should
only result in a -EIO. Avoid MCE by implementing memcpy_mcsafe.
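
As a rough user-space model of the behaviour described above: a machine-check-safe copy reports how much it could not copy, and the caller turns a short copy into -EIO instead of taking the system down. Everything named demo_* is hypothetical, and the poison_off parameter merely simulates the offset at which hardware would raise the MCE; the real routine lives in assembly and recovers through the MCE handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of a machine-check-safe copy. `poison_off` is a simulated fault
 * offset; pass `len` (or more) for a clean source buffer.
 * Returns the number of bytes NOT copied, so 0 means success.
 */
static size_t demo_copy_mcsafe(void *dst, const void *src, size_t len,
			       size_t poison_off)
{
	size_t ok = poison_off < len ? poison_off : len;

	memcpy(dst, src, ok);	/* copy only the bytes before the fault */
	return len - ok;
}

/* Caller side: a short copy becomes an I/O error rather than a panic. */
static int demo_read_block(void *dst, const void *src, size_t len,
			   size_t poison_off)
{
	return demo_copy_mcsafe(dst, src, len, poison_off) ? -EIO : 0;
}
```

Returning the remaining byte count, rather than a bare error code, lets a caller know how much of the destination is valid, which matches the intent of the "return remaining bytes" patch listed below.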

Before this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[ 7621.714094] Disabling lock debugging due to kernel taint
[ 7621.714099] MCE: CPU0: machine check (Severe) Host UE Load/Store [Not recovered]
[ 7621.714104] MCE: CPU0: NIP: [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714107] MCE: CPU0: Hardware error
[ 7621.714112] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 7621.714118] CPU: 0 PID: 1368 Comm: mount Tainted: G   M  5.2.0-rc5-00239-g241e39004581 #50
[ 7621.714123] NIP:  c0088978 LR: c08e16f8 CTR: 01de
[ 7621.714129] REGS: c000fffbfd70 TRAP: 0200   Tainted: G   M   (5.2.0-rc5-00239-g241e39004581)
[ 7621.714131] MSR:  92209033   CR: 24428840  XER: 0004
[ 7621.714160] CFAR: c00889a8 DAR: deadbeefdeadbeef DSISR: 8000 IRQMASK: 0
[ 7621.714171] GPR00: 0e00 c000f0b8b1e0 c12cf100 c000ed8e1100
[ 7621.714186] GPR04: c2001100 0001 0200 03fff1272000
[ 7621.714201] GPR08: 8000 0010 0020 0030
[ 7621.714216] GPR12: 0040 7fffb8c6d390 0050 0060
[ 7621.714232] GPR16: 0070  0001 c000f0b8b960
[ 7621.714247] GPR20: 0001 c000f0b8b940 0001 0001
[ 7621.714262] GPR24: c1382560 c00c003b6380 c00c003b6380 0001
[ 7621.714277] GPR28:  0001 c200 0001
[ 7621.714294] NIP [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714298] LR [c08e16f8] pmem_do_bvec+0xf8/0x430
...  ...
```

After this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[25302.883978] Buffer I/O error on dev pmem0, logical block 0, async page read
[25303.020816] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.021236] EXT4-fs (pmem0): Can't read superblock on 2nd try
[25303.152515] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.284031] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25304.084100] UDF-fs: bad mount option "dax" or missing value
mount: /mnt/pmem: wrong fs type, bad option, bad superblock on /dev/pmem0, missing codepage or helper program, or other error.
```

The MCE is injected on a pmem address using mambo. The last patch, which
restores r13, is only for testing on mambo, where r13 is not restored upon
hitting vector 200.

The memcpy code can be optimised by adding VMX optimizations, and GAS macros
can be used to enable code reusability; I will send these as another series.

--
Balbir Singh (2):
  powerpc/mce: Bug fixes for MCE handling in kernel space
  powerpc/memcpy: Add memcpy_mcsafe for pmem

Reza Arbab (8):
  powerpc/mce: Make machine_check_ue_event() static
  powerpc/mce: Add MCE notification chain
  powerpc/mce: Move machine_check_ue_event() call
  powerpc/mce: Allow notifier callback to handle MCE
  powerpc/mce: Add fixup address to UE events
  powerpc/mce: Handle memcpy_mcsafe()
  powerpc/mce: Enable MCE notifiers in external modules
  powerpc/64s: Save r13 in machine_check_common_early

Santosh Sivaraj (2):
  powerpc/memcpy_mcsafe: return remaining bytes
  powerpc: add machine check safe copy_to_user

 arch/powerpc/Kconfig  |   1 +
 arch/powerpc/include/asm/asm-prototypes.h |   1 +
 arch/powerpc/include/asm/mce.h|  13 +-
 arch/powerpc/include/asm/string.h |   2 +
 arch/powerpc/include/asm/uaccess.h|  12 ++
 arch/powerpc/kernel/exceptions-64s.S  |  14 ++
 arch/powerpc/kernel/mce.c | 102 +-
 arch/powerpc/kernel/mce_power.c   |  26 ++-
 arch/powerpc/lib/Makefile |   2 +-
 arch/powerpc/lib/memcpy_mcsafe_64.S   | 226 ++
 arch/powerpc/platforms/pseries/ras.c  |   6 +-
 11 files changed, 386 insertions(+), 19 deletions(-)
 create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S

-- 
2.20.1



Re: [RFC 02/11] powerpc/powernv/ioda: Protect PE list

2019-07-02 Thread Andrew Donnellan

On 19/6/19 11:28 pm, Frederic Barrat wrote:

Protect the PHB's list of PE. Probably not needed as long as it was
populated during PHB creation, but it feels right and will become
required once we can add/remove opencapi devices on hotplug.

Signed-off-by: Frederic Barrat 


Reviewed-by: Andrew Donnellan 


---
  arch/powerpc/platforms/powernv/pci-ioda.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3082912e2600..2c063b05bb64 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1078,8 +1078,9 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
}
  
  	/* Put PE to the list */
+   mutex_lock(&phb->ioda.pe_list_mutex);
list_add_tail(&pe->list, &phb->ioda.pe_list);
-
+   mutex_unlock(&phb->ioda.pe_list_mutex);
return pe;
  }
  
@@ -3501,7 +3502,10 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe *pe)
struct pnv_phb *phb = pe->phb;
struct pnv_ioda_pe *slave, *tmp;
  
+   mutex_lock(&phb->ioda.pe_list_mutex);
list_del(&pe->list);
+   mutex_unlock(&phb->ioda.pe_list_mutex);
+
switch (phb->type) {
case PNV_PHB_IODA1:
pnv_pci_ioda1_release_pe_dma(pe);
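
The locking pattern in the quoted patch, where every add to or removal from the per-PHB list happens under one mutex, can be modelled in user space roughly as below. This is a sketch with hypothetical demo_* types built on a pthread mutex and a hand-rolled circular list; the kernel code uses struct mutex and the list_head helpers instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pnv_ioda_pe. */
struct demo_pe {
	int id;
	struct demo_pe *prev, *next;
};

/* Hypothetical stand-in for the PHB: a list head plus the mutex guarding it. */
struct demo_phb {
	pthread_mutex_t pe_list_mutex;
	struct demo_pe head;	/* sentinel of a circular doubly linked list */
};

static void demo_phb_init(struct demo_phb *phb)
{
	pthread_mutex_init(&phb->pe_list_mutex, NULL);
	phb->head.prev = phb->head.next = &phb->head;
}

/* Append at the tail, holding the mutex across the pointer updates. */
static void demo_pe_add(struct demo_phb *phb, struct demo_pe *pe)
{
	pthread_mutex_lock(&phb->pe_list_mutex);
	pe->prev = phb->head.prev;
	pe->next = &phb->head;
	phb->head.prev->next = pe;
	phb->head.prev = pe;
	pthread_mutex_unlock(&phb->pe_list_mutex);
}

/* Unlink under the same mutex so walkers never see a half-removed node. */
static void demo_pe_del(struct demo_phb *phb, struct demo_pe *pe)
{
	pthread_mutex_lock(&phb->pe_list_mutex);
	pe->prev->next = pe->next;
	pe->next->prev = pe->prev;
	pe->prev = pe->next = NULL;
	pthread_mutex_unlock(&phb->pe_list_mutex);
}

/* Readers take the mutex too, otherwise the protection is one-sided. */
static int demo_pe_count(struct demo_phb *phb)
{
	struct demo_pe *p;
	int n = 0;

	pthread_mutex_lock(&phb->pe_list_mutex);
	for (p = phb->head.next; p != &phb->head; p = p->next)
		n++;
	pthread_mutex_unlock(&phb->pe_list_mutex);
	return n;
}
```

The point of the sketch is that writers and readers share the same lock; protecting only the add and delete paths helps once hotplug can modify the list, but any traversal that skips the mutex would still race.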



--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Re: [v2 12/12] powerpc/64s: Save r13 in machine_check_common_early

2019-07-02 Thread Nicholas Piggin
Santosh Sivaraj's on July 2, 2019 3:19 pm:
> From: Reza Arbab 
> 
> Testing my memcpy_mcsafe() work in progress with an injected UE, I get
> an error like this immediately after the function returns:
> 
> BUG: Unable to handle kernel data access at 0x7fff84dec8f8
> Faulting instruction address: 0xc008009c00b0
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> Modules linked in: mce(O+) vmx_crypto crc32c_vpmsum
> CPU: 0 PID: 1375 Comm: modprobe Tainted: G   O  5.1.0-rc6 #267
> NIP:  c008009c00b0 LR: c008009c00a8 CTR: c0095f90
> REGS: c000ee197790 TRAP: 0300   Tainted: G   O   (5.1.0-rc6)
> MSR:  9280b033   CR: 88002826  
> XER: 0004
> CFAR: c0095f8c DAR: 7fff84dec8f8 DSISR: 4000 IRQMASK: 0
> GPR00: 6c6c6568 c000ee197a20 c008009c8400 fff2
> GPR04: c008009c02e0 0006  c3c834c8
> GPR08: 0080 776a6681b7fb5100  c008009c01c8
> GPR12: c0095f90 7fff84debc00 4d071440 
> GPR16: 00010601 c008009e c0c98dd8 c0c98d98
> GPR20: c3bba970 c008009c04d0 c008009c0618 c01e5820
> GPR24:  0100 0001 c3bba958
> GPR28: c008009c02e8 c008009c0318 c008009c02e0 
> NIP [c008009c00b0] cause_ue+0xa8/0xe8 [mce]
> LR [c008009c00a8] cause_ue+0xa0/0xe8 [mce]
> 
> To fix, ensure that r13 is properly restored after an MCE.
> 
> This commit is needed for testing this series, this is a possible simulator
> bug.

This introduces a bug, of course -- MCE occurring when r13 != PACA
will corrupt r13.

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index 311f1392a2ec..932d8d05892c 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -265,6 +265,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>  EXC_REAL_END(machine_check, 0x200, 0x100)
>  EXC_VIRT_NONE(0x4200, 0x100)
>  TRAMP_REAL_BEGIN(machine_check_common_early)
> + SET_SCRATCH0(r13)   /* save r13 */
>   EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
>   /*
>* Register contents:
> -- 
> 2.20.1
> 
> 


Re: [v2 09/12] powerpc/mce: Enable MCE notifiers in external modules

2019-07-02 Thread Nicholas Piggin
Santosh Sivaraj's on July 2, 2019 3:19 pm:
> From: Reza Arbab 
> 
> Signed-off-by: Reza Arbab 
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 6 ++
>  arch/powerpc/kernel/mce.c| 2 ++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index c83e38a403fd..311f1392a2ec 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>   bl  machine_check_early
>   std r3,RESULT(r1)   /* Save result */
>  
> + /* Notifiers may be in a module, so enable virtual addressing. */
> + mfmsr   r11
> + ori r11,r11,MSR_IR
> + ori r11,r11,MSR_DR
> + mtmsr   r11

Can't do this, we could take a machine check somewhere the MMU is
not sane (in fact the guest early mce handling that was added recently
should not be enabling virtual mode either, which needs to be fixed).

Thanks,
Nick