Re: [PATCH 1/2] powerpc/perf: Infrastructure to support checking of attr.config*

2021-02-24 Thread Madhavan Srinivasan



On 2/24/21 8:17 PM, Paul A. Clarke wrote:

On Wed, Feb 24, 2021 at 07:58:39PM +0530, Madhavan Srinivasan wrote:

Introduce code to support the checking of attr.config* for
values which are reserved for a given platform.
Performance Monitoring Unit (PMU) configuration registers
have fileds that are reserved and specific values to bit fields

s/fileds/fields/


as reserved. Writing a none zero values in these fields

Should the previous sentences say something like "required values
for specific bit fields" or "specific bit fields that are reserved"?

s/none zero/non-zero/


or writing invalid value to bit fields will have unknown
behaviours.

Patch here add a generic call-back function "check_attr_config"

s/add/adds/ or "This patch adds ..." or just "Add ...".



Thanks for the review. Will fix it.





in "struct power_pmu", to be called in event_init to
check for attr.config* values for a given platform.
"check_attr_config" is valid only for raw event type.

Suggested-by: Alexey Kardashevskiy 
Signed-off-by: Madhavan Srinivasan 
---
  arch/powerpc/include/asm/perf_event_server.h |  6 ++
  arch/powerpc/perf/core-book3s.c  | 12 
  2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event_server.h 
b/arch/powerpc/include/asm/perf_event_server.h
index 00e7e671bb4b..dde97d7d9253 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -67,6 +67,12 @@ struct power_pmu {
 * the pmu supports extended perf regs capability
 */
int capabilities;
+   /*
+* Function to check event code for values which are
+* reserved. Function takes struct perf_event as input,
+* since event code could be spread in attr.config*
+*/
+   int (*check_attr_config)(struct perf_event *ev);
  };

  /*
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6817331e22ff..679d67506299 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1958,6 +1958,18 @@ static int power_pmu_event_init(struct perf_event *event)

if (ppmu->blacklist_ev && is_event_blacklisted(ev))
return -EINVAL;
+   /*
+* PMU config registers have fileds that are
+* reserved and spacific values to bit fileds be reserved.

s/spacific/specific/
s/fileds/fields/
Same comment about "specific values to bit fields be reserved", and
rewording that to be more clear.


+* This call-back will check the event code for same.
+*
+* Event type hardware and hw_cache will not value
+* invalid values in the event code which is not true
+* for raw event type.

I confess I don't understand what this means. (But it could be just me!)



My bad. What I wanted to say was: this check is needed only for the raw
event type, since tools like the fuzzer use it to provide randomized event
code values for testing. Will fix the comment.

Thanks for the review comments.





+*/
+   if (ppmu->check_attr_config &&
+   ppmu->check_attr_config(event))
+   return -EINVAL;
break;
default:
return -ENOENT;
--

PC
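For context on what the new callback guards against: a raw event passes an
arbitrary attr.config value from userspace through event_init and into the PMU
configuration registers. A minimal illustrative sketch (the helper name and the
event code are made up) of how such an event is opened:

/* Illustrative only: open a PERF_TYPE_RAW event.  Its attr.config bits end
 * up in the PMU configuration registers, which is what a platform's
 * check_attr_config() callback gets a chance to reject. */
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_raw_event(unsigned long long code)
{
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_RAW;      /* only raw events hit check_attr_config() */
        attr.config = code;             /* hypothetical raw event code */

        /* Fails with EINVAL if the platform callback flags reserved bits. */
        return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}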


[PATCH] selftests/powerpc: Add uaccess flush test

2021-02-24 Thread Daniel Axtens
From: Thadeu Lima de Souza Cascardo 

Also based on the RFI and entry flush tests, it counts the L1D misses
by doing a syscall that does user access: uname, in this case.

Signed-off-by: Thadeu Lima de Souza Cascardo 
[dja: forward port, rename function]
Signed-off-by: Daniel Axtens 

---

This applies on top of Russell's change to use better constants:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210223070227.2916871-1-rus...@russell.cc/

It's possible that we could share some more code between the tests, but
it hardly seems worth it.
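For readers not familiar with the existing rfi/entry flush tests, the
measurement idea is roughly the following. This is a self-contained sketch,
not the selftest's actual helpers; the real test toggles the
powerpc/uaccess_flush debugfs switch and compares the miss counts against an
expected value.

/* Rough sketch: count L1D read misses around a loop of uname() calls,
 * each of which performs a user access copy inside the kernel. */
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/utsname.h>
#include <unistd.h>

int main(void)
{
        struct perf_event_attr attr;
        struct utsname un;
        uint64_t misses;
        int fd, i;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_HW_CACHE;
        attr.config = PERF_COUNT_HW_CACHE_L1D |
                      (PERF_COUNT_HW_CACHE_OP_READ << 8) |
                      (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);

        fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
        if (fd < 0) {
                perror("perf_event_open");
                return 1;
        }

        /* With uaccess flush enabled, each uname() should add roughly one
         * L1D miss per cache line touched after the flush. */
        for (i = 0; i < 10000; i++)
                uname(&un);

        if (read(fd, &misses, sizeof(misses)) != sizeof(misses))
                return 1;
        printf("L1D read misses: %llu\n", (unsigned long long)misses);
        return 0;
}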
---
 .../selftests/powerpc/security/Makefile   |   3 +-
 .../selftests/powerpc/security/flush_utils.c  |  13 ++
 .../selftests/powerpc/security/flush_utils.h  |   3 +
 .../powerpc/security/uaccess_flush.c  | 158 ++
 4 files changed, 176 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/security/uaccess_flush.c

diff --git a/tools/testing/selftests/powerpc/security/Makefile 
b/tools/testing/selftests/powerpc/security/Makefile
index f25e854fe370..844d18cd5f93 100644
--- a/tools/testing/selftests/powerpc/security/Makefile
+++ b/tools/testing/selftests/powerpc/security/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0+
 
-TEST_GEN_PROGS := rfi_flush entry_flush spectre_v2
+TEST_GEN_PROGS := rfi_flush entry_flush uaccess_flush spectre_v2
 top_srcdir = ../../../../..
 
 CFLAGS += -I../../../../../usr/include
@@ -13,3 +13,4 @@ $(OUTPUT)/spectre_v2: CFLAGS += -m64
 $(OUTPUT)/spectre_v2: ../pmu/event.c branch_loops.S
 $(OUTPUT)/rfi_flush: flush_utils.c
 $(OUTPUT)/entry_flush: flush_utils.c
+$(OUTPUT)/uaccess_flush: flush_utils.c
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.c 
b/tools/testing/selftests/powerpc/security/flush_utils.c
index 0c3c4c40c7fb..4d95965cb751 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.c
+++ b/tools/testing/selftests/powerpc/security/flush_utils.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "utils.h"
 #include "flush_utils.h"
 
@@ -35,6 +36,18 @@ void syscall_loop(char *p, unsigned long iterations,
}
 }
 
+void syscall_loop_uaccess(char *p, unsigned long iterations,
+ unsigned long zero_size)
+{
+   struct utsname utsname;
+
+   for (unsigned long i = 0; i < iterations; i++) {
+   for (unsigned long j = 0; j < zero_size; j += CACHELINE_SIZE)
+   load(p + j);
+   uname(&utsname);
+   }
+}
+
 static void sigill_handler(int signr, siginfo_t *info, void *unused)
 {
static int warned;
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h 
b/tools/testing/selftests/powerpc/security/flush_utils.h
index 7a3d60292916..e1e68281f7ac 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.h
+++ b/tools/testing/selftests/powerpc/security/flush_utils.h
@@ -16,6 +16,9 @@
 void syscall_loop(char *p, unsigned long iterations,
  unsigned long zero_size);
 
+void syscall_loop_uaccess(char *p, unsigned long iterations,
+ unsigned long zero_size);
+
 void set_dscr(unsigned long val);
 
 #endif /* _SELFTESTS_POWERPC_SECURITY_FLUSH_UTILS_H */
diff --git a/tools/testing/selftests/powerpc/security/uaccess_flush.c 
b/tools/testing/selftests/powerpc/security/uaccess_flush.c
new file mode 100644
index ..cf80f960e38a
--- /dev/null
+++ b/tools/testing/selftests/powerpc/security/uaccess_flush.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Copyright 2018 IBM Corporation.
+ * Copyright 2020 Canonical Ltd.
+ */
+
+#define __SANE_USERSPACE_TYPES__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "utils.h"
+#include "flush_utils.h"
+
+int uaccess_flush_test(void)
+{
+   char *p;
+   int repetitions = 10;
+   int fd, passes = 0, iter, rc = 0;
+   struct perf_event_read v;
+   __u64 l1d_misses_total = 0;
+   unsigned long iterations = 10, zero_size = 24 * 1024;
+   unsigned long l1d_misses_expected;
+   int rfi_flush_orig;
+   int entry_flush_orig;
+   int uaccess_flush, uaccess_flush_orig;
+
+   SKIP_IF(geteuid() != 0);
+
+   // The PMU event we use only works on Power7 or later
+   SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
+
+   if (read_debugfs_file("powerpc/rfi_flush", _flush_orig) < 0) {
+   perror("Unable to read powerpc/rfi_flush debugfs file");
+   SKIP_IF(1);
+   }
+
+   if (read_debugfs_file("powerpc/entry_flush", _flush_orig) < 0) {
+   perror("Unable to read powerpc/entry_flush debugfs file");
+   SKIP_IF(1);
+   }
+
+   if (read_debugfs_file("powerpc/uaccess_flush", _flush_orig) < 
0) {
+   perror("Unable to read powerpc/entry_flush debugfs file");
+   SKIP_IF(1);
+   }
+
+   if (rfi_flush_orig != 0) {
+   if 

[PATCH] docs: powerpc: Fix tables in syscall64-abi.rst

2021-02-24 Thread Andrew Donnellan
Commit 209b44c804c ("docs: powerpc: syscall64-abi.rst: fix a malformed
table") attempted to fix the formatting of tables in syscall64-abi.rst, but
inadvertently changed some register names.

Redo the tables with the correct register names, and while we're here,
clean things up to separate the registers into different rows and add
headings.

Fixes: 209b44c804c ("docs: powerpc: syscall64-abi.rst: fix a malformed table")
Signed-off-by: Andrew Donnellan 
---
 Documentation/powerpc/syscall64-abi.rst | 51 -
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/Documentation/powerpc/syscall64-abi.rst 
b/Documentation/powerpc/syscall64-abi.rst
index cf9b2857c72a..dabee3729e5a 100644
--- a/Documentation/powerpc/syscall64-abi.rst
+++ b/Documentation/powerpc/syscall64-abi.rst
@@ -46,25 +46,38 @@ stack frame LR and CR save fields are not used.
 
 Register preservation rules
 ---
-Register preservation rules match the ELF ABI calling sequence with the
-following differences:
-
-++
-|For the sc instruction, differences with the ELF ABI   |
-+--+--+--+
-| r0   | Volatile | (System call number.)   |
-| rr3  | Volatile | (Parameter 1, and return value.)|
-| rr4-r8   | Volatile | (Parameters 2-6.)   |
-| rcr0 | Volatile | (cr0.SO is the return error condition.)
 |
-| rcr1, cr5-7  | Nonvolatile  |
 |
-| rlr  | Nonvolatile  |
 |
-+--+--+--+
-|  For the scv 0 instruction, differences with the ELF ABI  |
-+--+--+--+
-| r0   | Volatile | (System call number.)   |
-| r3   | Volatile | (Parameter 1, and return value.)|
-| r4-r8| Volatile | (Parameters 2-6.)   |
-+--+--+--+
+Register preservation rules match the ELF ABI calling sequence with some
+differences.
+
+For the sc instruction, the differences from the ELF ABI are as follows:
+
++--------------+--------------------+------------------------------------------+
+| Register     | Preservation Rules | Purpose                                  |
++==============+====================+==========================================+
+| r0           | Volatile           | (System call number.)                    |
++--------------+--------------------+------------------------------------------+
+| r3           | Volatile           | (Parameter 1, and return value.)         |
++--------------+--------------------+------------------------------------------+
+| r4-r8        | Volatile           | (Parameters 2-6.)                        |
++--------------+--------------------+------------------------------------------+
+| cr0          | Volatile           | (cr0.SO is the return error condition.)  |
++--------------+--------------------+------------------------------------------+
+| cr1, cr5-7   | Nonvolatile        |                                          |
++--------------+--------------------+------------------------------------------+
+| lr           | Nonvolatile        |                                          |
++--------------+--------------------+------------------------------------------+
+
+For the scv 0 instruction, the differences from the ELF ABI are as follows:
+
++--------------+--------------------+------------------------------------------+
+| Register     | Preservation Rules | Purpose                                  |
++==============+====================+==========================================+
+| r0           | Volatile           | (System call number.)                    |
++--------------+--------------------+------------------------------------------+
+| r3           | Volatile           | (Parameter 1, and return value.)         |
++--------------+--------------------+------------------------------------------+
+| r4-r8        | Volatile           | (Parameters 2-6.)                        |
++--------------+--------------------+------------------------------------------+
 
 All floating point and vector data registers as well as control and status
 registers are nonvolatile.
-- 
2.20.1
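To make the sc table above concrete, here is an illustrative (not
authoritative) inline-assembly sketch of a one-argument system call that
follows those rules; the helper name is made up:

/* Sketch only: issue 'sc' per the table above.  r0 holds the syscall
 * number, r3 the first argument and the return value, cr0.SO the error
 * flag; r4-r8 and the usual ELF ABI volatiles are treated as clobbered,
 * while lr and cr1/cr5-7 are assumed preserved. */
static inline long sc_syscall1(long nr, long arg1)
{
        register long r0 asm("r0") = nr;
        register long r3 asm("r3") = arg1;

        asm volatile("sc\n\t"
                     "bns+ 1f\n\t"      /* cr0.SO clear: success */
                     "neg %1,%1\n"      /* cr0.SO set: return -errno */
                     "1:"
                     : "+r"(r0), "+r"(r3)
                     :
                     : "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12",
                       "ctr", "xer", "cr0", "memory");
        return r3;
}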



Re: [PATCH v6 07/10] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()

2021-02-24 Thread Christopher M. Riedl
On Tue Feb 23, 2021 at 11:36 AM CST, Christophe Leroy wrote:
>
>
> Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> > Previously restore_sigcontext() performed a costly KUAP switch on every
> > uaccess operation. These repeated uaccess switches cause a significant
> > drop in signal handling performance.
> > 
> > Rewrite restore_sigcontext() to assume that a userspace read access
> > window is open by replacing all uaccess functions with their 'unsafe'
> > versions. Modify the callers to first open, call
> > unsafe_restore_sigcontext(), and then close the uaccess window.
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   arch/powerpc/kernel/signal_64.c | 68 -
> >   1 file changed, 41 insertions(+), 27 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index 3faaa736ed62..76b525261f61 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext 
> > __user *sc,
> >   /*
> >* Restore the sigcontext from the signal frame.
> >*/
> > -
> > -static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int 
> > sig,
> > - struct sigcontext __user *sc)
> > +#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
> > +   unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)
>
> unsafe_op_wrap() was not initially meant to be used outside of uaccess.h
>
> In the begining, it has been copied from include/linux/uaccess.h and was
> used
> for unsafe_put_user(), unsafe_get_user() and unsafe_copy_to_user().
> After other changes, only
> unsafe_get_user() is still using it and I'm going to drop
> unsafe_op_wrap() soon.
>
> I'd prefer if you can do the same as unsafe_save_general_regs() and
> others in signal_32.c

Sounds good, will change this in the next version (and also the wrapper
around unsafe_setup_sigcontext()).
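For reference, the signal_32.c style being suggested looks roughly like this
(a sketch; the final naming is up to the patch author):

/* Sketch: wrapper without unsafe_op_wrap(), matching the
 * unsafe_save_general_regs() pattern in signal_32.c. */
#define unsafe_restore_sigcontext(tsk, set, sig, sc, label) do {	\
	if (__unsafe_restore_sigcontext(tsk, set, sig, sc))		\
		goto label;						\
} while (0)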

>
> > +static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, 
> > sigset_t *set,
> > +   int sig, struct sigcontext 
> > __user *sc)
> >   {
> >   #ifdef CONFIG_ALTIVEC
> > elf_vrreg_t __user *v_regs;
> >   #endif
> > -   unsigned long err = 0;
> > unsigned long save_r13 = 0;
> > unsigned long msr;
> > struct pt_regs *regs = tsk->thread.regs;
> > @@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct 
> > *tsk, sigset_t *set, int sig,
> > save_r13 = regs->gpr[13];
> >   
> > /* copy the GPRs */
> > -   err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
> > -   err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
> > +   unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
> > + efault_out);
>
> I think it would be better to keep the above on a single line for
> readability.
> Nowadays we tolerate 100 chars lines for cases like this one.

Ok, changed this (and the line you mention further below) in the next
version.

>
> > +   unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
> > /* get MSR separately, transfer the LE bit if doing signal return */
> > -   err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
> > +   unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
> > if (sig)
> > regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
> > -   err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
> > -   err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
> > -   err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
> > -   err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
> > -   err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
> > +   unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
> > +   unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
> > +   unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
> > +   unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
> > +   unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
> > /* Don't allow userspace to set SOFTE */
> > set_trap_norestart(regs);
> > -   err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
> > -   err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
> > -   err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
> > +   unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
> > +   unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
> > +   unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
> >   
> > if (!sig)
> > regs->gpr[13] = save_r13;
> > if (set != NULL)
> > -   err |=  __get_user(set->sig[0], &sc->oldmask);
> > +   unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
> >   
> > /*
> >  * Force reload of FP/VEC.
> > @@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct 
> > *tsk, sigset_t *set, int sig,
> > regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
> >   
> >   

[PATCH 3/3] powerpc/sstep: Always test lmw and stmw

2021-02-24 Thread Jordan Niethe
Load Multiple Word (lmw) and Store Multiple Word (stmw) will raise an
Alignment Exception:
  - Little Endian mode: always
  - Big Endian mode: address not word aligned

These conditions do not depend on cache inhibited memory. Test the
alignment handler emulation of these instructions regardless of if there
is cache inhibited memory available or not.

Signed-off-by: Jordan Niethe 
---
 .../powerpc/alignment/alignment_handler.c | 96 ++-
 1 file changed, 94 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c 
b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index f5eb5b85a2cf..c3003f95e043 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "utils.h"
 #include "instructions.h"
@@ -434,7 +435,6 @@ int test_alignment_handler_integer(void)
LOAD_DFORM_TEST(ldu);
LOAD_XFORM_TEST(ldx);
LOAD_XFORM_TEST(ldux);
-   LOAD_DFORM_TEST(lmw);
STORE_DFORM_TEST(stb);
STORE_XFORM_TEST(stbx);
STORE_DFORM_TEST(stbu);
@@ -453,7 +453,6 @@ int test_alignment_handler_integer(void)
STORE_XFORM_TEST(stdx);
STORE_DFORM_TEST(stdu);
STORE_XFORM_TEST(stdux);
-   STORE_DFORM_TEST(stmw);
 
return rc;
 }
@@ -599,6 +598,97 @@ int test_alignment_handler_fp_prefix(void)
return rc;
 }
 
+int test_alignment_handler_multiple(void)
+{
+   int offset, width, r, rc = 0;
+   void *src1, *dst1, *src2, *dst2;
+
+   rc = posix_memalign(&src1, bufsize, bufsize);
+   if (rc) {
+   printf("\n");
+   return rc;
+   }
+
+   rc = posix_memalign(&dst1, bufsize, bufsize);
+   if (rc) {
+   printf("\n");
+   free(src1);
+   return rc;
+   }
+
+   src2 = malloc(bufsize);
+   if (!src2) {
+   printf("\n");
+   free(src1);
+   free(dst1);
+   return -ENOMEM;
+   }
+
+   dst2 = malloc(bufsize);
+   if (!dst2) {
+   printf("\n");
+   free(src1);
+   free(dst1);
+   free(src2);
+   return -ENOMEM;
+   }
+
+   /* lmw */
+   width = 4;
+   printf("\tDoing lmw:\t");
+   for (offset = 0; offset < width; offset++) {
+   preload_data(src1, offset, width);
+   preload_data(src2, offset, width);
+
+   asm volatile("lmw  31, 0(%0) ; std 31, 0(%1)"
+:: "r"(src1 + offset), "r"(dst1 + offset), "r"(0)
+: "memory", "r31");
+
+   memcpy(dst2 + offset, src1 + offset, width);
+
+   r = test_memcmp(dst1, dst2, width, offset, "test_lmw");
+   if (r && !debug) {
+   printf("FAILED: Wrong Data\n");
+   break;
+   }
+   }
+
+   if (!r)
+   printf("PASSED\n");
+   else
+   rc |= 1;
+
+   /* stmw */
+   width = 4;
+   printf("\tDoing stmw:\t");
+   for (offset = 0; offset < width; offset++) {
+   preload_data(src1, offset, width);
+   preload_data(src2, offset, width);
+
+   asm volatile("ld  31, 0(%0) ; stmw 31, 0(%1)"
+:: "r"(src1 + offset), "r"(dst1 + offset), "r"(0)
+: "memory", "r31");
+
+   memcpy(dst2 + offset, src1 + offset, width);
+
+   r = test_memcmp(dst1, dst2, width, offset, "test_stmw");
+   if (r && !debug) {
+   printf("FAILED: Wrong Data\n");
+   break;
+   }
+   }
+   if (!r)
+   printf("PASSED\n");
+   else
+   rc |= 1;
+
+   free(src1);
+   free(src2);
+   free(dst1);
+   free(dst2);
+   return rc;
+}
+
 void usage(char *prog)
 {
printf("Usage: %s [options] [path [offset]]\n", prog);
@@ -673,5 +763,7 @@ int main(int argc, char *argv[])
   "test_alignment_handler_fp_206");
rc |= test_harness(test_alignment_handler_fp_prefix,
   "test_alignment_handler_fp_prefix");
+   rc |= test_harness(test_alignment_handler_multiple,
+  "test_alignment_handler_multiple");
return rc;
 }
-- 
2.25.1



[PATCH 2/3] selftests/powerpc: Suggest memtrace instead of /dev/mem for ci memory

2021-02-24 Thread Jordan Niethe
The suggested alternative for getting cache-inhibited memory with 'mem='
and /dev/mem is pretty hacky. Also, PAPR guests do not allow system
memory to be mapped cache-inhibited so despite /dev/mem being available
this will not work which can cause confusion.  Instead recommend using
the memtrace buffers. memtrace is only available on powernv so there
will not be any chance of trying to do this in a guest.

Signed-off-by: Jordan Niethe 
---
 .../selftests/powerpc/alignment/alignment_handler.c   | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c 
b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index cb53a8b777e6..f5eb5b85a2cf 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -10,16 +10,7 @@
  *
  * We create two sets of source and destination buffers, one in regular memory,
  * the other cache-inhibited (by default we use /dev/fb0 for this, but an
- * alterative path for cache-inhibited memory may be provided).
- *
- * One way to get cache-inhibited memory is to use the "mem" kernel parameter
- * to limit the kernel to less memory than actually exists.  Addresses above
- * the limit may still be accessed but will be treated as cache-inhibited. For
- * example, if there is actually 4GB of memory and the parameter "mem=3GB" is
- * used, memory from address 0xC0000000 onwards is treated as cache-inhibited.
- * To access this region /dev/mem is used. The kernel should be configured
- * without CONFIG_STRICT_DEVMEM. In this case use:
- * ./alignment_handler /dev/mem 0xc0000000
+ * alterative path for cache-inhibited memory may be provided, e.g. memtrace).
  *
  * We initialise the source buffers, then use whichever set of load/store
  * instructions is under test to copy bytes from the source buffers to the
-- 
2.25.1



[PATCH 1/3] powernv/memtrace: Allow mmaping trace buffers

2021-02-24 Thread Jordan Niethe
Allow the memory that is removed from the linear mapping for use as the
trace buffers to be mmapped. This is a useful way of providing
cache-inhibited memory for the alignment_handler selftest.

Signed-off-by: Jordan Niethe 
---
 arch/powerpc/platforms/powernv/memtrace.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c 
b/arch/powerpc/platforms/powernv/memtrace.c
index 5fc9408bb0b3..8a1df39305e9 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -45,10 +45,26 @@ static ssize_t memtrace_read(struct file *filp, char __user 
*ubuf,
return simple_read_from_buffer(ubuf, count, ppos, ent->mem, ent->size);
 }
 
+int memtrace_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+   struct memtrace_entry *ent = filp->private_data;
+
+   if (ent->size < vma->vm_end - vma->vm_start)
+   return -EINVAL;
+
+   if (vma->vm_pgoff << PAGE_SHIFT >= ent->size)
+   return -EINVAL;
+
+   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+   return remap_pfn_range(vma, vma->vm_start, PHYS_PFN(ent->start) + 
vma->vm_pgoff,
+  vma->vm_end - vma->vm_start, vma->vm_page_prot);
+}
+
 static const struct file_operations memtrace_fops = {
.llseek = default_llseek,
.read   = memtrace_read,
.open   = simple_open,
+   .mmap   = memtrace_mmap,
 };
 
 static void memtrace_clear_range(unsigned long start_pfn,
@@ -158,7 +174,7 @@ static int memtrace_init_debugfs(void)
dir = debugfs_create_dir(ent->name, memtrace_debugfs_dir);
 
ent->dir = dir;
-   debugfs_create_file("trace", 0400, dir, ent, _fops);
+   debugfs_create_file_unsafe("trace", 0600, dir, ent, 
_fops);
debugfs_create_x64("start", 0400, dir, >start);
debugfs_create_x64("size", 0400, dir, >size);
}
-- 
2.25.1
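For completeness, a rough sketch of how a test could consume this from
userspace; the per-chip directory name under powerpc/memtrace is illustrative:

/* Sketch only: map a memtrace buffer.  Because the kernel side uses
 * pgprot_noncached() + remap_pfn_range(), the resulting mapping is
 * cache-inhibited, which is what the alignment_handler test wants. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static void *map_memtrace(const char *path, size_t size)
{
        /* e.g. path = "/sys/kernel/debug/powerpc/memtrace/00000000/trace" */
        int fd = open(path, O_RDWR);
        void *p;

        if (fd < 0)
                return NULL;
        p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);
        return p == MAP_FAILED ? NULL : p;
}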



[PATCH] powerpc/sstep: Fix VSX instruction emulation

2021-02-24 Thread Jordan Niethe
Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage
access instructions") added loading and storing 32 word long data into
adjacent VSRs. However the calculation used to determine if two VSRs
needed to be loaded/stored inadvertently prevented the load/storing
taking place for instructions with a data length less than 16 words.

This causes the emulation to not function correctly, which can be seen
by the alignment_handler selftest:

$ ./alignment_handler
[snip]
test: test_alignment_handler_vsx_207
tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
VSX: 2.07B
Doing lxsspx:   PASSED
Doing lxsiwax:  FAILED: Wrong Data
Doing lxsiwzx:  PASSED
Doing stxsspx:  PASSED
Doing stxsiwx:  PASSED
failure: test_alignment_handler_vsx_207
test: test_alignment_handler_vsx_300
tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
VSX: 3.00B
Doing lxsd: PASSED
Doing lxsibzx:  PASSED
Doing lxsihzx:  PASSED
Doing lxssp:FAILED: Wrong Data
Doing lxv:  PASSED
Doing lxvb16x:  PASSED
Doing lxvh8x:   PASSED
Doing lxvx: PASSED
Doing lxvwsx:   FAILED: Wrong Data
Doing lxvl: PASSED
Doing lxvll:PASSED
Doing stxsd:PASSED
Doing stxsibx:  PASSED
Doing stxsihx:  PASSED
Doing stxssp:   PASSED
Doing stxv: PASSED
Doing stxvb16x: PASSED
Doing stxvh8x:  PASSED
Doing stxvx:PASSED
Doing stxvl:PASSED
Doing stxvll:   PASSED
failure: test_alignment_handler_vsx_300
[snip]

Fix this by making sure all VSX instruction emulation correctly
load/store from the VSRs.

Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access 
instructions")
Signed-off-by: Jordan Niethe 
---
 arch/powerpc/lib/sstep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 683f7c20f74b..3953e63bbba5 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -902,7 +902,7 @@ static nokprobe_inline int do_vsx_load(struct 
instruction_op *op,
if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs))
return -EFAULT;
 
-   nr_vsx_regs = size / sizeof(__vector128);
+   nr_vsx_regs = max(1ul, size / sizeof(__vector128));
emulate_vsx_load(op, buf, mem, cross_endian);
preempt_disable();
if (reg < 32) {
@@ -949,7 +949,7 @@ static nokprobe_inline int do_vsx_store(struct 
instruction_op *op,
if (!address_ok(regs, ea, size))
return -EFAULT;
 
-   nr_vsx_regs = size / sizeof(__vector128);
+   nr_vsx_regs = max(1ul, size / sizeof(__vector128));
preempt_disable();
if (reg < 32) {
/* FP regs + extensions */
-- 
2.25.1
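To spell out the arithmetic behind the one-line change (illustrative values):

/* A 4-byte access such as lxssp gives size / sizeof(__vector128) == 4 / 16 == 0,
 * so no VSR was copied at all.  Clamping restores the single-register case,
 * while vector paired accesses still get 32 / 16 == 2. */
nr_vsx_regs = size / sizeof(__vector128);            /* lxssp: 0 (bug)    */
nr_vsx_regs = max(1ul, size / sizeof(__vector128));  /* lxssp: 1, lxvp: 2 */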



[RFC PATCH 8/8] powerpc/64/asm: don't reassign labels

2021-02-24 Thread Daniel Axtens
The assembler really does not like us reassigning things to the same
label:

:7:9: error: invalid reassignment of non-absolute variable 
'fs_label'

This happens across a bunch of platforms:
https://github.com/ClangBuiltLinux/linux/issues/1043
https://github.com/ClangBuiltLinux/linux/issues/1008
https://github.com/ClangBuiltLinux/linux/issues/920
https://github.com/ClangBuiltLinux/linux/issues/1050

There is no hope of getting this fixed in LLVM, so if we want to build
with LLVM_IAS, we need to hack around it ourselves.

For us the big problem comes from this:

\#define USE_FIXED_SECTION(sname)   \
fs_label = start_##sname;   \
fs_start = sname##_start;   \
use_ftsec sname;

\#define USE_TEXT_SECTION()
fs_label = start_text;  \
fs_start = text_start;  \
.text

and in particular fs_label.

I have tried to work around it by not setting those 'variables', and
requiring that users of the variables instead track for themselves
what section they are in. This isn't amazing, by any stretch, but it
gets us further in the compilation.

I'm still stuck with the following from head_64.S:

.balign 8
p_end: .8byte _end - copy_to_here

4:
/*
 * Now copy the rest of the kernel up to _end, add
 * _end - copy_to_here to the copy limit and run again.
 */
addis   r8,r26,(ABS_ADDR(p_end, text))@ha
ld  r8,(ABS_ADDR(p_end, text))@l(r8)
add r5,r5,r8
5:  bl  copy_and_flush  /* copy the rest */

9:  b   start_here_multiplatform

Clang does not like this code - in particular it complains about the addis, 
saying

:0: error: expected relocatable expression

I don't know what's special about p_end, because just above we do an
ABS_ADDR(4f, text) and that seems to work just fine.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/include/asm/head-64.h   | 12 +--
 arch/powerpc/kernel/exceptions-64s.S | 31 ++--
 arch/powerpc/kernel/head_64.S| 16 +++---
 3 files changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/head-64.h 
b/arch/powerpc/include/asm/head-64.h
index 7d8ccab47e86..43849a777f91 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -98,13 +98,9 @@ linker_stub_catch:   
\
. = sname##_len;
 
 #define USE_FIXED_SECTION(sname)   \
-   fs_label = start_##sname;   \
-   fs_start = sname##_start;   \
use_ftsec sname;
 
 #define USE_TEXT_SECTION() \
-   fs_label = start_text;  \
-   fs_start = text_start;  \
.text
 
 #define CLOSE_FIXED_SECTION(sname) \
@@ -161,13 +157,15 @@ end_##sname:
  * - ABS_ADDR is used to find the absolute address of any symbol, from within
  *   a fixed section.
  */
-#define DEFINE_FIXED_SYMBOL(label) \
-   label##_absolute = (label - fs_label + fs_start)
+// define label as being _in_ sname
+#define DEFINE_FIXED_SYMBOL(label, sname) \
+   label##_absolute = (label - start_ ## sname + sname ## _start)
 
 #define FIXED_SYMBOL_ABS_ADDR(label)   \
(label##_absolute)
 
-#define ABS_ADDR(label) (label - fs_label + fs_start)
+// find label from _within_ sname
+#define ABS_ADDR(label, sname) (label - start_ ## sname + sname ## _start)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 720fb9892745..295d90202665 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -64,7 +64,7 @@
.balign IFETCH_ALIGN_BYTES; \
.global name;   \
_ASM_NOKPROBE_SYMBOL(name); \
-   DEFINE_FIXED_SYMBOL(name);  \
+   DEFINE_FIXED_SYMBOL(name, text);\
 name:
 
 #define TRAMP_REAL_BEGIN(name) \
@@ -92,18 +92,18 @@ name:
ld  reg,PACAKBASE(r13); /* get high part of  */   \
ori reg,reg,FIXED_SYMBOL_ABS_ADDR(label)
 
-#define __LOAD_HANDLER(reg, label) \
+#define __LOAD_HANDLER(reg, label, section)
\
ld  reg,PACAKBASE(r13); \
-   ori reg,reg,(ABS_ADDR(label))@l
+   ori reg,reg,(ABS_ADDR(label, section))@l
 
 /*
  * Branches from unrelocated code (e.g., interrupts) to labels outside
  * head-y require >64K offsets.
  

[RFC PATCH 7/8] powerpc/purgatory: drop .machine specifier

2021-02-24 Thread Daniel Axtens
It's ignored by future versions of llvm's integrated assembler (but not by -11).
I'm not sure what it does for us in gas.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/purgatory/trampoline_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/purgatory/trampoline_64.S 
b/arch/powerpc/purgatory/trampoline_64.S
index d956b8a35fd1..e6a2740a5da0 100644
--- a/arch/powerpc/purgatory/trampoline_64.S
+++ b/arch/powerpc/purgatory/trampoline_64.S
@@ -12,7 +12,7 @@
 #include 
 #include 
 
-   .machine ppc64
+//upgrade clang, gets ignored  .machine ppc64
.balign 256
.globl purgatory_start
 purgatory_start:
-- 
2.27.0



[RFC PATCH 6/8] powerpc/mm/book3s64/hash: drop pre 2.06 tlbiel for clang

2021-02-24 Thread Daniel Axtens
The llvm integrated assembler does not recognise the ISA 2.05 tlbiel
version. Eventually do this more smartly.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/mm/book3s64/hash_native.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/hash_native.c 
b/arch/powerpc/mm/book3s64/hash_native.c
index 52e170bd95ae..c5937f69a452 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -267,9 +267,14 @@ static inline void __tlbiel(unsigned long vpn, int psize, 
int apsize, int ssize)
va |= ssize << 8;
sllp = get_sllp_encoding(apsize);
va |= sllp << 5;
+#if 0
asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0,0", %1)
 : : "r" (va), "i" (CPU_FTR_ARCH_206)
 : "memory");
+#endif
+   asm volatile("tlbiel %0"
+: : "r" (va)
+: "memory");
break;
default:
/* We need 14 to 14 + i bits of va */
@@ -286,9 +291,14 @@ static inline void __tlbiel(unsigned long vpn, int psize, 
int apsize, int ssize)
 */
va |= (vpn & 0xfe);
va |= 1; /* L */
+#if 0
asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0,1", %1)
 : : "r" (va), "i" (CPU_FTR_ARCH_206)
 : "memory");
+#endif
+   asm volatile("tlbiel %0"
+: : "r" (va)
+: "memory");
break;
}
trace_tlbie(0, 1, va, 0, 0, 0, 0);
-- 
2.27.0



[RFC PATCH 5/8] poweprc/lib/quad: Provide macros for lq/stq

2021-02-24 Thread Daniel Axtens
For some reason the integrated assembler in clang-11 doesn't recognise
them. Eventually we should fix it there too.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/include/asm/ppc-opcode.h | 4 
 arch/powerpc/lib/quad.S   | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index ed161ef2b3ca..a5249631cb83 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -339,11 +339,13 @@
 #define PPC_RAW_DARN(t, l) (0x7c0005e6 | ___PPC_RT(t) | (((l) & 
0x3) << 16))
 #define PPC_RAW_DCBAL(a, b)(0x7c2005ec | __PPC_RA(a) | __PPC_RB(b))
 #define PPC_RAW_DCBZL(a, b)(0x7c2007ec | __PPC_RA(a) | __PPC_RB(b))
+#define PPC_RAW_LQ(t, a, dq)   (0xe0000000 | ___PPC_RT(t) | 
___PPC_RA(a) | (((dq) & 0xfff) << 3))
 #define PPC_RAW_LQARX(t, a, b, eh) (0x7c000228 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | __PPC_EH(eh))
#define PPC_RAW_LDARX(t, a, b, eh) (0x7c0000a8 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | __PPC_EH(eh))
#define PPC_RAW_LWARX(t, a, b, eh) (0x7c000028 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | __PPC_EH(eh))
 #define PPC_RAW_PHWSYNC(0x7c8004ac)
 #define PPC_RAW_PLWSYNC(0x7ca004ac)
+#define PPC_RAW_STQ(t, a, ds)  (0xf8000002 | ___PPC_RT(t) | 
___PPC_RA(a) | (((ds) & 0xfff) << 3))
 #define PPC_RAW_STQCX(t, a, b) (0x7c00016d | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_MADDHD(t, a, b, c) (0x1030 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | ___PPC_RC(c))
 #define PPC_RAW_MADDHDU(t, a, b, c)(0x1031 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | ___PPC_RC(c))
@@ -530,9 +532,11 @@
 #definePPC_DCBZL(a, b) stringify_in_c(.long PPC_RAW_DCBZL(a, 
b))
 #definePPC_DIVDE(t, a, b)  stringify_in_c(.long PPC_RAW_DIVDE(t, 
a, b))
 #definePPC_DIVDEU(t, a, b) stringify_in_c(.long PPC_RAW_DIVDEU(t, 
a, b))
+#define PPC_LQ(t, a, dq)   stringify_in_c(.long PPC_RAW_LQ(t, a, dq))
 #define PPC_LQARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LQARX(t, a, b, eh))
 #define PPC_LDARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LDARX(t, a, b, eh))
 #define PPC_LWARX(t, a, b, eh) stringify_in_c(.long PPC_RAW_LWARX(t, a, b, eh))
+#define PPC_STQ(t, a, ds)  stringify_in_c(.long PPC_RAW_STQ(t, a, ds))
 #define PPC_STQCX(t, a, b) stringify_in_c(.long PPC_RAW_STQCX(t, a, b))
 #define PPC_MADDHD(t, a, b, c) stringify_in_c(.long PPC_RAW_MADDHD(t, a, b, c))
 #define PPC_MADDHDU(t, a, b, c)stringify_in_c(.long PPC_RAW_MADDHDU(t, 
a, b, c))
diff --git a/arch/powerpc/lib/quad.S b/arch/powerpc/lib/quad.S
index da71760e50b5..de802a817992 100644
--- a/arch/powerpc/lib/quad.S
+++ b/arch/powerpc/lib/quad.S
@@ -15,7 +15,7 @@
 
 /* do_lq(unsigned long ea, unsigned long *regs) */
 _GLOBAL(do_lq)
-1: lq  r6, 0(r3)
+1: PPC_LQ(6, 3, 0)
std r6, 0(r4)
std r7, 8(r4)
li  r3, 0
@@ -26,7 +26,7 @@ _GLOBAL(do_lq)
 
 /* do_stq(unsigned long ea, unsigned long val0, unsigned long val1) */
 _GLOBAL(do_stq)
-1: stq r4, 0(r3)
+1: PPC_STQ(4, 3, 0)
li  r3, 0
blr
 2: li  r3, -EFAULT
-- 
2.27.0



[RFC PATCH 4/8] powerpc/ppc_asm: use plain numbers for registers

2021-02-24 Thread Daniel Axtens
This is dumb but makes the llvm integrated assembler happy.
https://github.com/ClangBuiltLinux/linux/issues/764

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/include/asm/ppc_asm.h | 64 +++---
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index 3dceb64fc9af..49da2cf4c2d5 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -509,38 +509,38 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, 
CPU_FTR_CELL_TB_BUG, 96)
  * Use R0-31 only when really nessesary.
  */
 
-#definer0  %r0
-#definer1  %r1
-#definer2  %r2
-#definer3  %r3
-#definer4  %r4
-#definer5  %r5
-#definer6  %r6
-#definer7  %r7
-#definer8  %r8
-#definer9  %r9
-#definer10 %r10
-#definer11 %r11
-#definer12 %r12
-#definer13 %r13
-#definer14 %r14
-#definer15 %r15
-#definer16 %r16
-#definer17 %r17
-#definer18 %r18
-#definer19 %r19
-#definer20 %r20
-#definer21 %r21
-#definer22 %r22
-#definer23 %r23
-#definer24 %r24
-#definer25 %r25
-#definer26 %r26
-#definer27 %r27
-#definer28 %r28
-#definer29 %r29
-#definer30 %r30
-#definer31 %r31
+#definer0  0
+#definer1  1
+#definer2  2
+#definer3  3
+#definer4  4
+#definer5  5
+#definer6  6
+#definer7  7
+#definer8  8
+#definer9  9
+#definer10 10
+#definer11 11
+#definer12 12
+#definer13 13
+#definer14 14
+#definer15 15
+#definer16 16
+#definer17 17
+#definer18 18
+#definer19 19
+#definer20 20
+#definer21 21
+#definer22 22
+#definer23 23
+#definer24 24
+#definer25 25
+#definer26 26
+#definer27 27
+#definer28 28
+#definer29 29
+#definer30 30
+#definer31 31
 
 
 /* Floating Point Registers (FPRs) */
-- 
2.27.0



[RFC PATCH 3/8] powerpc/head-64: do less gas-specific stuff with sections

2021-02-24 Thread Daniel Axtens
Reopening the section without specifying the same flags breaks
the llvm integrated assembler. Don't do it: just specify all the
flags all the time.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/include/asm/head-64.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/head-64.h 
b/arch/powerpc/include/asm/head-64.h
index 4cb9efa2eb21..7d8ccab47e86 100644
--- a/arch/powerpc/include/asm/head-64.h
+++ b/arch/powerpc/include/asm/head-64.h
@@ -15,10 +15,10 @@
 .macro define_data_ftsec name
.section ".head.data.\name\()","a",@progbits
 .endm
-.macro use_ftsec name
-   .section ".head.text.\name\()"
-.endm
-
+//.macro use_ftsec name
+// .section ".head.text.\name\()"
+//.endm
+#define use_ftsec define_ftsec
 /*
  * Fixed (location) sections are used by opening fixed sections and emitting
  * fixed section entries into them before closing them. Multiple fixed sections
-- 
2.27.0



[RFC PATCH 2/8] powerpc: check for support for -Wa,-m{power4,any}

2021-02-24 Thread Daniel Axtens
LLVM's integrated assembler does not like either -Wa,-mpower4
or -Wa,-many. So just don't pass them if they're not supported.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 08cf0eade56a..3e2c72d20bb8 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -252,7 +252,9 @@ cpu-as-$(CONFIG_E500)   += -Wa,-me500
 # When using '-many -mpower4' gas will first try and find a matching power4
 # mnemonic and failing that it will allow any valid mnemonic that GAS knows
 # about. GCC will pass -many to GAS when assembling, clang does not.
-cpu-as-$(CONFIG_PPC_BOOK3S_64) += -Wa,-mpower4 -Wa,-many
+# LLVM IAS doesn't understand either flag: 
https://github.com/ClangBuiltLinux/linux/issues/675
+# but LLVM IAS only supports ISA >= 2.06 for Book3S 64 anyway...
+cpu-as-$(CONFIG_PPC_BOOK3S_64) += $(call as-option,-Wa$(comma)-mpower4) $(call 
as-option,-Wa$(comma)-many)
 cpu-as-$(CONFIG_PPC_E500MC)+= $(call as-option,-Wa$(comma)-me500mc)
 
 KBUILD_AFLAGS += $(cpu-as-y)
-- 
2.27.0



[PATCH 1/8] powerpc/64s/exception: Clean up a missed SRR specifier

2021-02-24 Thread Daniel Axtens
Nick's patch cleaning up the SRR specifiers in exception-64s.S
missed a single instance of EXC_HV_OR_STD. Clean that up.

Caught by clang's integrated assembler.

Fixes: 3f7fbd97d07d ("powerpc/64s/exception: Clean up SRR specifiers")
Acked-by: Nicholas Piggin 
Signed-off-by: Daniel Axtens 
---
 arch/powerpc/kernel/exceptions-64s.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index a4bd3c114a0a..720fb9892745 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -466,7 +466,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
 
ld  r10,PACAKMSR(r13)   /* get MSR value for kernel */
/* MSR[RI] is clear iff using SRR regs */
-   .if IHSRR == EXC_HV_OR_STD
+   .if IHSRR_IF_HVMODE
BEGIN_FTR_SECTION
xorir10,r10,MSR_RI
END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
-- 
2.27.0



[RFC PATCH 0/8] WIP support for the LLVM integrated assembler

2021-02-24 Thread Daniel Axtens
To support Clang's CFI we need LTO. For LTO, we need to be able to compile
with the LLVM integrated assembler.

Currently, we can't.

This series gets us a bit closer, but I'm still stuck and I'm hoping
someone can point me in the right direction.

Patch 1 is a fix that can be merged at any time.

The rest of this series is pretty rough, but with it, building like this:

make CC=clang-11 LD=ld.lld-11 AR=llvm-ar-11 NM=llvm-nm-11 STRIP=llvm-strip-11 \
 OBJCOPY=llvm-objcopy-11 OBJDUMP=llvm-objdump-11 READELF=llvm-readelf-11 \
 HOSTCC=clang-11 HOSTCXX=clang++-11 HOSTAR=llvm-ar-11 HOSTLD=ld.lld-11 \
 LLVM_IAS=1  vmlinux

on a pseries_le_defconfig without Werror works except for head-64.S,
which still fails as described in the final patch. Help would be
appreciated because it's deep magic all around.

Apart from the very very dodgy change to drop the tlbiel feature
section, none of the de-gas-ing changed the compiled binary for me
under gcc-10.2.0-13ubuntu1.

Daniel Axtens (8):
  powerpc/64s/exception: Clean up a missed SRR specifier
  powerpc: check for support for -Wa,-m{power4,any}
  powerpc/head-64: do less gas-specific stuff with sections
  powerpc/ppc_asm: use plain numbers for registers
  poweprc/lib/quad: Provide macros for lq/stq
  powerpc/mm/book3s64/hash: drop pre 2.06 tlbiel for clang
  powerpc/purgatory: drop .machine specifier
  powerpc/64/asm: don't reassign labels

 arch/powerpc/Makefile  |  4 +-
 arch/powerpc/include/asm/head-64.h | 20 
 arch/powerpc/include/asm/ppc-opcode.h  |  4 ++
 arch/powerpc/include/asm/ppc_asm.h | 64 +-
 arch/powerpc/kernel/exceptions-64s.S   | 33 ++---
 arch/powerpc/kernel/head_64.S  | 16 +++
 arch/powerpc/lib/quad.S|  4 +-
 arch/powerpc/mm/book3s64/hash_native.c | 10 
 arch/powerpc/purgatory/trampoline_64.S |  2 +-
 9 files changed, 86 insertions(+), 71 deletions(-)

-- 
2.27.0



Re: [PATCH v6 06/10] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext()

2021-02-24 Thread Christopher M. Riedl
On Tue Feb 23, 2021 at 11:12 AM CST, Christophe Leroy wrote:
>
>
> Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> > Previously setup_sigcontext() performed a costly KUAP switch on every
> > uaccess operation. These repeated uaccess switches cause a significant
> > drop in signal handling performance.
> > 
> > Rewrite setup_sigcontext() to assume that a userspace write access window
> > is open by replacing all uaccess functions with their 'unsafe' versions.
> > Modify the callers to first open, call unsafe_setup_sigcontext() and
> > then close the uaccess window.
>
> Do you plan to also convert setup_tm_sigcontexts() ?
> It would allow to then remove copy_fpr_to_user() and
> copy_ckfpr_to_user() and maybe other functions too.

I don't intend to convert the TM functions as part of this series.
Partially because I've been "threatened" with TM ownership for touching
the code :) and also because TM enhancements are a pretty low priority I
think.

>
> Christophe
>
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   arch/powerpc/kernel/signal_64.c | 71 -
> >   1 file changed, 44 insertions(+), 27 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index bd8d210c9115..3faaa736ed62 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -101,9 +101,13 @@ static void prepare_setup_sigcontext(struct 
> > task_struct *tsk)
> >* Set up the sigcontext for the signal frame.
> >*/
> >   
> > -static long setup_sigcontext(struct sigcontext __user *sc,
> > -   struct task_struct *tsk, int signr, sigset_t *set,
> > -   unsigned long handler, int ctx_has_vsx_region)
> > +#define unsafe_setup_sigcontext(sc, tsk, signr, set, handler,  
> > \
> > +   ctx_has_vsx_region, e)  \
> > +   unsafe_op_wrap(__unsafe_setup_sigcontext(sc, tsk, signr, set,   \
> > +   handler, ctx_has_vsx_region), e)
> > +static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
> > +   struct task_struct *tsk, int signr, 
> > sigset_t *set,
> > +   unsigned long handler, int 
> > ctx_has_vsx_region)
> >   {
> > /* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
> >  * process never used altivec yet (MSR_VEC is zero in pt_regs of
> > @@ -118,20 +122,19 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> >   #endif
> > struct pt_regs *regs = tsk->thread.regs;
> > unsigned long msr = regs->msr;
> > -   long err = 0;
> > /* Force usr to alway see softe as 1 (interrupts enabled) */
> > unsigned long softe = 0x1;
> >   
> > BUG_ON(tsk != current);
> >   
> >   #ifdef CONFIG_ALTIVEC
> > -   err |= __put_user(v_regs, &sc->v_regs);
> > +   unsafe_put_user(v_regs, >v_regs, efault_out);
> >   
> > /* save altivec registers */
> > if (tsk->thread.used_vr) {
> > /* Copy 33 vec registers (vr0..31 and vscr) to the stack */
> > -   err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
> > - 33 * sizeof(vector128));
> > +   unsafe_copy_to_user(v_regs, &tsk->thread.vr_state,
> > +   33 * sizeof(vector128), efault_out);
> > /* set MSR_VEC in the MSR value in the frame to indicate that 
> > sc->v_reg)
> >  * contains valid data.
> >  */
> > @@ -140,12 +143,12 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> > /* We always copy to/from vrsave, it's 0 if we don't have or don't
> >  * use altivec.
> >  */
> > -   err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> > +   unsafe_put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33], 
> > efault_out);
> >   #else /* CONFIG_ALTIVEC */
> > -   err |= __put_user(0, &sc->v_regs);
> > +   unsafe_put_user(0, &sc->v_regs, efault_out);
> >   #endif /* CONFIG_ALTIVEC */
> > /* copy fpr regs and fpscr */
> > -   err |= copy_fpr_to_user(&sc->fp_regs, tsk);
> > +   unsafe_copy_fpr_to_user(&sc->fp_regs, tsk, efault_out);
> >   
> > /*
> >  * Clear the MSR VSX bit to indicate there is no valid state attached
> > @@ -160,24 +163,27 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> >  */
> > if (tsk->thread.used_vsr && ctx_has_vsx_region) {
> > v_regs += ELF_NVRREG;
> > -   err |= copy_vsx_to_user(v_regs, tsk);
> > +   unsafe_copy_vsx_to_user(v_regs, tsk, efault_out);
> > /* set MSR_VSX in the MSR value in the frame to
> >  * indicate that sc->vs_reg) contains valid data.
> >  */
> > msr |= MSR_VSX;
> > }
> >   #endif /* CONFIG_VSX */
> > -   err |= __put_user(&sc->gp_regs, &sc->regs);
> > +   unsafe_put_user(&sc->gp_regs, &sc->regs, efault_out);
> > WARN_ON(!FULL_REGS(regs));
> > -   err |= __copy_to_user(&sc->gp_regs, 

Re: [PATCH 12/13] KVM: PPC: Book3S HV: Move radix MMU switching together in the P9 path

2021-02-24 Thread Fabiano Rosas
Nicholas Piggin  writes:

> Switching the MMU from radix<->radix mode is tricky particularly as the
> MMU can remain enabled and requires a certain sequence of SPR updates.
> Move these together into their own functions.
>
> This also includes the radix TLB check / flush because it's tied in to
> MMU switching due to tlbiel getting LPID from LPIDR.
>
> Signed-off-by: Nicholas Piggin 
> ---



> @@ -4117,7 +4138,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
> time_limit,
>  {
>   struct kvm_run *run = vcpu->run;
>   int trap, r, pcpu;
> - int srcu_idx, lpid;
> + int srcu_idx;
>   struct kvmppc_vcore *vc;
>   struct kvm *kvm = vcpu->kvm;
>   struct kvm_nested_guest *nested = vcpu->arch.nested;
> @@ -4191,13 +4212,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
> time_limit,
>   vc->vcore_state = VCORE_RUNNING;
>   trace_kvmppc_run_core(vc, 0);
>
> - if (cpu_has_feature(CPU_FTR_HVMODE)) {
> - lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
> - mtspr(SPRN_LPID, lpid);
> - isync();
> - kvmppc_check_need_tlb_flush(kvm, pcpu, nested);
> - }
> -

What about the counterpart to this^ down below?

if (cpu_has_feature(CPU_FTR_HVMODE)) {
mtspr(SPRN_LPID, kvm->arch.host_lpid);
isync();
}

>   guest_enter_irqoff();
>
> srcu_idx = srcu_read_lock(&kvm->srcu);


Re: [PATCH v6 01/10] powerpc/uaccess: Add unsafe_copy_from_user

2021-02-24 Thread Christopher M. Riedl
On Tue Feb 23, 2021 at 11:15 AM CST, Christophe Leroy wrote:
>
>
> Le 21/02/2021 à 02:23, Christopher M. Riedl a écrit :
> > Just wrap __copy_tofrom_user() for the usual 'unsafe' pattern which
> > accepts a label to goto on error.
> > 
> > Signed-off-by: Christopher M. Riedl 
> > Reviewed-by: Daniel Axtens 
> > ---
> >   arch/powerpc/include/asm/uaccess.h | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/uaccess.h 
> > b/arch/powerpc/include/asm/uaccess.h
> > index 78e2a3990eab..33b2de642120 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -487,6 +487,9 @@ user_write_access_begin(const void __user *ptr, size_t 
> > len)
> >   #define unsafe_put_user(x, p, e) \
> > __unsafe_put_user_goto((__typeof__(*(p)))(x), (p), sizeof(*(p)), e)
> >   
> > +#define unsafe_copy_from_user(d, s, l, e) \
> > +   unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> > +
>
> Could we perform same as unsafe_copy_to_user() instead of calling an
> external function which is
> banned in principle inside uaccess blocks ?

Yup, with your patch to move the barrier_nospec() into the allowance
helpers this makes sense now. I just tried it and performance does not
change significantly w/ either radix or hash translation. I will include
this change in the next spin - thanks!
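Something like the following, mirroring the chunking that unsafe_copy_to_user()
already does but with unsafe_get_user(); a sketch only, the real definition is
up to the next revision:

/* Sketch: open-coded unsafe_copy_from_user() built on unsafe_get_user(),
 * so no out-of-line call happens inside the user access window. */
#define unsafe_copy_from_user(d, s, l, e)					\
do {										\
	u8 *_dst = (u8 *)(d);							\
	const u8 __user *_src = (const u8 __user *)(s);				\
	size_t _len = (l);							\
	int _i;									\
										\
	for (_i = 0; _i < (_len & ~(sizeof(u64) - 1)); _i += sizeof(u64))	\
		unsafe_get_user(*(u64 *)(_dst + _i),				\
				(u64 __user *)(_src + _i), e);			\
	if (_len & 4) {								\
		unsafe_get_user(*(u32 *)(_dst + _i),				\
				(u32 __user *)(_src + _i), e);			\
		_i += 4;							\
	}									\
	if (_len & 2) {								\
		unsafe_get_user(*(u16 *)(_dst + _i),				\
				(u16 __user *)(_src + _i), e);			\
		_i += 2;							\
	}									\
	if (_len & 1)								\
		unsafe_get_user(*(u8 *)(_dst + _i),				\
				(u8 __user *)(_src + _i), e);			\
} while (0)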

>
>
> >   #define unsafe_copy_to_user(d, s, l, e) \
> >   do {  
> > \
> > u8 __user *_dst = (u8 __user *)(d); \
> > 



Re: [PATCH] arch: powerpc: kernel: Change droping to dropping in the file traps.c

2021-02-24 Thread Randy Dunlap
On 2/23/21 11:55 PM, Bhaskar Chowdhury wrote:
> 
> s/droping/dropping/
> 
> Signed-off-by: Bhaskar Chowdhury 

Acked-by: Randy Dunlap 

> ---
>  arch/powerpc/kernel/traps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 1583fd1c6010..83a53b67412a 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -405,7 +405,7 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
>* Now test if the interrupt has hit a range that may be using
>* HSPRG1 without having RI=0 (i.e., an HSRR interrupt). The
>* problem ranges all run un-relocated. Test real and virt modes
> -  * at the same time by droping the high bit of the nip (virt mode
> +  * at the same time by dropping the high bit of the nip (virt mode
>* entry points still have the +0x4000 offset).
>*/
>   nip &= ~0xc000ULL;
> --
> 2.30.1
> 


-- 
~Randy



Re: [PATCH v5 2/3] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE

2021-02-24 Thread Fabiano Rosas
Bharata B Rao  writes:

> Implement H_RPT_INVALIDATE hcall and add KVM capability
> KVM_CAP_PPC_RPT_INVALIDATE to indicate the support for the same.
>
> This hcall does two types of TLB invalidations:
>
> 1. Process-scoped invalidations for guests with LPCR[GTSE]=0.
>This is currently not used in KVM as GTSE is not usually
>disabled in KVM.
> 2. Partition-scoped invalidations that an L1 hypervisor does on
>behalf of an L2 guest. This replaces the uses of the existing
>hcall H_TLB_INVALIDATE.
>
> In order to handle process scoped invalidations of L2, we
> intercept the nested exit handling code in L0 only to handle
> H_TLB_INVALIDATE hcall.
>
> Process scoped tlbie invalidations from L1 and nested guests
> need RS register for TLBIE instruction to contain both PID and
> LPID.  This patch introduces primitives that execute tlbie
> instruction with both PID and LPID set in prepartion for
> H_RPT_INVALIDATE hcall.
>
> Signed-off-by: Bharata B Rao 
> ---
>  Documentation/virt/kvm/api.rst|  18 +++
>  .../include/asm/book3s/64/tlbflush-radix.h|   4 +
>  arch/powerpc/include/asm/kvm_book3s.h |   3 +
>  arch/powerpc/include/asm/mmu_context.h|  11 ++
>  arch/powerpc/kvm/book3s_hv.c  |  90 +++
>  arch/powerpc/kvm/book3s_hv_nested.c   |  77 +
>  arch/powerpc/kvm/powerpc.c|   3 +
>  arch/powerpc/mm/book3s64/radix_tlb.c  | 147 +-
>  include/uapi/linux/kvm.h  |   1 +
>  9 files changed, 350 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 45fd862ac128..38ce3f21b21f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6225,6 +6225,24 @@ KVM_RUN_BUS_LOCK flag is used to distinguish between 
> them.
>  This capability can be used to check / enable 2nd DAWR feature provided
>  by POWER10 processor.
>  
> +7.23 KVM_CAP_PPC_RPT_INVALIDATE
> +--
> +
> +:Capability: KVM_CAP_PPC_RPT_INVALIDATE
> +:Architectures: ppc
> +:Type: vm
> +
> +This capability indicates that the kernel is capable of handling
> +H_RPT_INVALIDATE hcall.
> +
> +In order to enable the use of H_RPT_INVALIDATE in the guest,
> +user space might have to advertise it for the guest. For example,
> +IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
> +present in the "ibm,hypertas-functions" device-tree property.
> +
> +This capability is enabled for hypervisors on platforms like POWER9
> +that support radix MMU.
> +
>  8. Other capabilities.
>  ==
>  
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
> b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
> index 8b33601cdb9d..a46fd37ad552 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
> @@ -4,6 +4,10 @@
>  
>  #include 
>  
> +#define RIC_FLUSH_TLB 0
> +#define RIC_FLUSH_PWC 1
> +#define RIC_FLUSH_ALL 2
> +
>  struct vm_area_struct;
>  struct mm_struct;
>  struct mmu_gather;
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
> b/arch/powerpc/include/asm/kvm_book3s.h
> index 2f5f919f6cd3..a1515f94400e 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -305,6 +305,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, u64 
> dw1);
>  void kvmhv_release_all_nested(struct kvm *kvm);
>  long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
>  long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
> +long kvmhv_h_rpti_nested(struct kvm_vcpu *vcpu, unsigned long lpid,
> +  unsigned long type, unsigned long pg_sizes,
> +  unsigned long start, unsigned long end);
>  int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
> u64 time_limit, unsigned long lpcr);
>  void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr);
> diff --git a/arch/powerpc/include/asm/mmu_context.h 
> b/arch/powerpc/include/asm/mmu_context.h
> index 652ce85f9410..820caf4e01b7 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -124,8 +124,19 @@ static inline bool need_extra_context(struct mm_struct 
> *mm, unsigned long ea)
>  
>  #if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
>  extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
> +void do_h_rpt_invalidate(unsigned long pid, unsigned long lpid,
> +  unsigned long type, unsigned long page_size,
> +  unsigned long psize, unsigned long start,
> +  unsigned long end);
>  #else
>  static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { }
> +static inline void do_h_rpt_invalidate(unsigned long pid,
> +unsigned long lpid,
> + 

Re: [PATCH 1/2] powerpc/perf: Infrastructure to support checking of attr.config*

2021-02-24 Thread Paul A. Clarke
On Wed, Feb 24, 2021 at 07:58:39PM +0530, Madhavan Srinivasan wrote:
> Introduce code to support the checking of attr.config* for
> values which are reserved for a given platform.
> Performance Monitoring Unit (PMU) configuration registers
> have fileds that are reserved and specific values to bit fields

s/fileds/fields/

> as reserved. Writing a none zero values in these fields

Should the previous sentences say something like "required values
for specific bit fields" or "specific bit fields that are reserved"?

s/none zero/non-zero/

> or writing invalid value to bit fields will have unknown
> behaviours.
> 
> Patch here add a generic call-back function "check_attr_config"

s/add/adds/ or "This patch adds ..." or just "Add ...".

> in "struct power_pmu", to be called in event_init to
> check for attr.config* values for a given platform.
> "check_attr_config" is valid only for raw event type.
> 
> Suggested-by: Alexey Kardashevskiy 
> Signed-off-by: Madhavan Srinivasan 
> ---
>  arch/powerpc/include/asm/perf_event_server.h |  6 ++
>  arch/powerpc/perf/core-book3s.c  | 12 
>  2 files changed, 18 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/perf_event_server.h 
> b/arch/powerpc/include/asm/perf_event_server.h
> index 00e7e671bb4b..dde97d7d9253 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -67,6 +67,12 @@ struct power_pmu {
>* the pmu supports extended perf regs capability
>*/
>   int capabilities;
> + /*
> +  * Function to check event code for values which are
> +  * reserved. Function takes struct perf_event as input,
> +  * since event code could be spread in attr.config*
> +  */
> + int (*check_attr_config)(struct perf_event *ev);
>  };
> 
>  /*
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 6817331e22ff..679d67506299 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1958,6 +1958,18 @@ static int power_pmu_event_init(struct perf_event 
> *event)
> 
>   if (ppmu->blacklist_ev && is_event_blacklisted(ev))
>   return -EINVAL;
> + /*
> +  * PMU config registers have fileds that are
> +  * reserved and spacific values to bit fileds be reserved.

s/spacific/specific/
s/fileds/fields/
Same comment about "specific values to bit fields be reserved", and
rewording that to be more clear.

> +  * This call-back will check the event code for same.
> +  *
> +  * Event type hardware and hw_cache will not value
> +  * invalid values in the event code which is not true
> +  * for raw event type.

I confess I don't understand what this means. (But it could be just me!)

> +  */
> + if (ppmu->check_attr_config &&
> + ppmu->check_attr_config(event))
> + return -EINVAL;
>   break;
>   default:
>   return -ENOENT;
> -- 

PC


[PATCH 2/2] powerpc/perf: Add platform specific check_attr_config

2021-02-24 Thread Madhavan Srinivasan
Add platform specific attr.config value checks. The patch
includes checks for the power9 and power10 platforms.
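
As a rough illustration (not part of the patch itself): once a platform
wires up the call-back, a raw event whose code lands on a reserved
encoding is refused at perf_event_open() time with EINVAL. A minimal
user-space sketch, where the config value is only a placeholder and not
a real POWER9/POWER10 event code:

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
            struct perf_event_attr attr;
            int fd;

            memset(&attr, 0, sizeof(attr));
            attr.size = sizeof(attr);
            attr.type = PERF_TYPE_RAW;
            attr.config = 0xdeadbeefULL;    /* placeholder raw event code */
            attr.disabled = 1;
            attr.exclude_kernel = 1;
            attr.exclude_hv = 1;

            fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
            if (fd < 0 && errno == EINVAL)
                    printf("raw event rejected (reserved/invalid code)\n");
            else if (fd >= 0)
                    printf("raw event accepted\n");
            return 0;
    }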

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/perf/isa207-common.c | 41 +++
 arch/powerpc/perf/isa207-common.h |  2 ++
 arch/powerpc/perf/power10-pmu.c   | 13 ++
 arch/powerpc/perf/power9-pmu.c| 13 ++
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/perf/isa207-common.c 
b/arch/powerpc/perf/isa207-common.c
index e4f577da33d8..b255799f5b51 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -694,3 +694,44 @@ int isa207_get_alternatives(u64 event, u64 alt[], int 
size, unsigned int flags,
 
return num_alt;
 }
+
+int isa3_X_check_attr_config(struct perf_event *ev)
+{
+   u64 val, sample_mode;
+   u64 event = ev->attr.config;
+
+   val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+   sample_mode = val & 0x3;
+
+   /*
+* MMCRA[61:62] is Random Sampling Mode (SM).
+* value of 0b11 is reserved.
+*/
+   if (sample_mode == 0x3)
+   return -1;
+
+   /*
+* Check for all reserved values
+*/
+   switch (val) {
+   case 0x5:
+   case 0x9:
+   case 0xD:
+   case 0x19:
+   case 0x1D:
+   case 0x1A:
+   case 0x1E:
+   return -1;
+   }
+
+   /*
+* MMCRA[48:51]/[52:55] Threshold Start/Stop
+* Events Selection.
+* 0b11110000/0b00001111 is reserved.
+*/
+   val = (event >> EVENT_THR_CTL_SHIFT) & EVENT_THR_CTL_MASK;
+   if (((val & 0xF0) == 0xF0) || ((val & 0xF) == 0xF))
+   return -1;
+
+   return 0;
+}
diff --git a/arch/powerpc/perf/isa207-common.h 
b/arch/powerpc/perf/isa207-common.h
index 1af0e8c97ac7..ae8eaf05efd1 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -280,4 +280,6 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, 
u32 flags,
struct pt_regs *regs);
 void isa207_get_mem_weight(u64 *weight);
 
+int isa3_X_check_attr_config(struct perf_event *ev);
+
 #endif
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index a901c1348cad..bc64354cab6a 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -106,6 +106,18 @@ static int power10_get_alternatives(u64 event, unsigned 
int flags, u64 alt[])
return num_alt;
 }
 
+static int power10_check_attr_config(struct perf_event *ev)
+{
+   u64 val;
+   u64 event = ev->attr.config;
+
+   val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+   if (val == 0x10 || isa3_X_check_attr_config(ev))
+   return -1;
+
+   return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles, PM_RUN_CYC);
 GENERIC_EVENT_ATTR(instructions,   PM_RUN_INST_CMPL);
 GENERIC_EVENT_ATTR(branch-instructions,PM_BR_CMPL);
@@ -559,6 +571,7 @@ static struct power_pmu power10_pmu = {
.attr_groups= power10_pmu_attr_groups,
.bhrb_nr= 32,
.capabilities   = PERF_PMU_CAP_EXTENDED_REGS,
+   .check_attr_config  = power10_check_attr_config,
 };
 
 int init_power10_pmu(void)
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 2a57e93a79dc..b3b9b226d053 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -151,6 +151,18 @@ static int power9_get_alternatives(u64 event, unsigned int 
flags, u64 alt[])
return num_alt;
 }
 
+static int power9_check_attr_config(struct perf_event *ev)
+{
+   u64 val;
+   u64 event = ev->attr.config;
+
+   val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+   if (val == 0xC || isa3_X_check_attr_config(ev))
+   return -1;
+
+   return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles, PM_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-frontend,PM_ICT_NOSLOT_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-backend, PM_CMPLU_STALL);
@@ -437,6 +449,7 @@ static struct power_pmu power9_pmu = {
.attr_groups= power9_pmu_attr_groups,
.bhrb_nr= 32,
.capabilities   = PERF_PMU_CAP_EXTENDED_REGS,
+   .check_attr_config  = power9_check_attr_config,
 };
 
 int init_power9_pmu(void)
-- 
2.26.2



[PATCH 1/2] powerpc/perf: Infrastructure to support checking of attr.config*

2021-02-24 Thread Madhavan Srinivasan
Introduce code to support the checking of attr.config* for
values which are reserved for a given platform.
Performance Monitoring Unit (PMU) configuration registers
have fields that are reserved, and some bit fields only
accept specific values. Writing a non-zero value to a
reserved field, or an invalid value to a bit field, leads
to undefined behaviour.

Add a generic call-back function "check_attr_config" to
"struct power_pmu", to be called in event_init to
check the attr.config* values for a given platform.
"check_attr_config" is used only for the raw event type.

Suggested-by: Alexey Kardashevskiy 
Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/perf_event_server.h |  6 ++
 arch/powerpc/perf/core-book3s.c  | 12 
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event_server.h 
b/arch/powerpc/include/asm/perf_event_server.h
index 00e7e671bb4b..dde97d7d9253 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -67,6 +67,12 @@ struct power_pmu {
 * the pmu supports extended perf regs capability
 */
int capabilities;
+   /*
+* Function to check event code for values which are
+* reserved. Function takes struct perf_event as input,
+* since event code could be spread in attr.config*
+*/
+   int (*check_attr_config)(struct perf_event *ev);
 };
 
 /*
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6817331e22ff..679d67506299 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1958,6 +1958,18 @@ static int power_pmu_event_init(struct perf_event *event)
 
if (ppmu->blacklist_ev && is_event_blacklisted(ev))
return -EINVAL;
+   /*
+* PMU config registers have fields that are
+* reserved, and some bit fields only accept specific
+* values. This call-back checks the event code against
+* those restrictions.
+*
+* The hardware and hw_cache event types are built from
+* known event codes, so only the raw event type can
+* carry reserved or invalid values and needs this check.
+*/
+   if (ppmu->check_attr_config &&
+   ppmu->check_attr_config(event))
+   return -EINVAL;
break;
default:
return -ENOENT;
-- 
2.26.2



[PATCH] powerpc/perf: prevent mixed EBB and non-EBB events

2021-02-24 Thread Thadeu Lima de Souza Cascardo
EBB events must be under exclusive groups, so there is no mix of EBB and
non-EBB events on the same PMU. This requirement worked fine as perf core
would not allow other pinned events to be scheduled together with exclusive
events.

This assumption was broken by commit 1908dc911792 ("perf: Tweak
perf_event_attr::exclusive semantics").

After that, the test cpu_event_pinned_vs_ebb_test started succeeding after
read_events, but worse, the task would not have been given access to PMC1,
so when it tried to write to it, it was killed with "illegal instruction".

Preventing mixed EBB and non-EBB events from being added to the same PMU
just reverts to the previous behavior, and the test succeeds again.
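
For reference, an EBB event is requested from user space roughly as in the
sketch below (modelled on the powerpc PMU selftests; the EBB marker being
config bit 63 and the 0x1001e event code are assumptions here, used only
for illustration):

    /* needs <linux/perf_event.h> and <string.h> */
    static void init_ebb_attr(struct perf_event_attr *attr)
    {
            memset(attr, 0, sizeof(*attr));
            attr->size = sizeof(*attr);
            attr->type = PERF_TYPE_RAW;
            /* placeholder event code with the assumed EBB marker bit set */
            attr->config = 0x1001e | (1ULL << 63);
            attr->exclusive = 1;       /* EBB events must form an exclusive group */
            attr->pinned = 1;
            attr->exclude_kernel = 1;
            attr->exclude_hv = 1;
            attr->exclude_idle = 1;
    }

Mixing such an event with a plain (non-EBB) event on the same PMU is what
this patch makes check_excludes() refuse with -EAGAIN.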

Fixes: 1908dc911792 ("perf: Tweak perf_event_attr::exclusive semantics")
Signed-off-by: Thadeu Lima de Souza Cascardo 
---
 arch/powerpc/perf/core-book3s.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 43599e671d38..d767f7944f85 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1010,9 +1010,25 @@ static int check_excludes(struct perf_event **ctrs, 
unsigned int cflags[],
  int n_prev, int n_new)
 {
int eu = 0, ek = 0, eh = 0;
+   bool ebb = false;
int i, n, first;
struct perf_event *event;
 
+   n = n_prev + n_new;
+   if (n <= 1)
+   return 0;
+
+   first = 1;
+   for (i = 0; i < n; ++i) {
+   event = ctrs[i];
+   if (first) {
+   ebb = is_ebb_event(event);
+   first = 0;
+   } else if (is_ebb_event(event) != ebb) {
+   return -EAGAIN;
+   }
+   }
+
/*
 * If the PMU we're on supports per event exclude settings then we
 * don't need to do any of this logic. NB. This assumes no PMU has both
@@ -1021,10 +1037,6 @@ static int check_excludes(struct perf_event **ctrs, 
unsigned int cflags[],
if (ppmu->flags & PPMU_ARCH_207S)
return 0;
 
-   n = n_prev + n_new;
-   if (n <= 1)
-   return 0;
-
first = 1;
for (i = 0; i < n; ++i) {
if (cflags[i] & PPMU_LIMITED_PMC_OK) {
-- 
2.27.0



Re: [PASEMI] Nemo board doesn't boot anymore because of moving pas_pci_init

2021-02-24 Thread Christian Zigotzky

On 24 February 21 at 03:17am, Oliver O'Halloran wrote:

On Wed, Feb 24, 2021 at 11:55 AM Michael Ellerman  wrote:

Olof Johansson  writes:

Hi,

On Tue, Feb 23, 2021 at 1:43 PM Christian Zigotzky
 wrote:

Hello,

The Nemo board [1] with a P.A. Semi PA6T SoC doesn't boot anymore
because of moving "pas_pci_init" to the device tree adoption [2] in the
latest PowerPC updates 5.12-1 [3].

Unfortunately the Nemo board doesn't have it in its device tree. I
reverted this commit and after that the Nemo board boots without any
problems.

What do you think about this ifdef?

#ifdef CONFIG_PPC_PASEMI_NEMO
  /*
   * Check for the Nemo motherboard here, if we are running on one
   * then pas_pci_init()
   */
  if (of_machine_is_compatible("pasemi,nemo")) {
  pas_pci_init();
  }
#endif

This is not a proper fix for the problem. Someone will need to debug
what on the pas_pci_init() codepath still needs to happen early in the
boot, even if the main PCI setup happens later.

I looked but don't see anything 100% obvious.

Possibly it's the call to isa_bridge_find_early()?

Looks like it. I think the problem stems from the use of the PIO
helpers (mainly outb()) in i8259_init() which is called from
nemo_init_IRQ(). The PIO helpers require the ISA space to be mapped
and io_isa_base to be set since they take a PIO register address
rather than an MMIO address. It looks like there's a few other legacy
embedded platforms that might have the same problem.

I guess the real fix would be to decouple the ISA base address
discovery from the PHB discovery. That should be doable since it's all
discovered via DT anyway and we only support one ISA address range,
but it's a bit of work.
Sorry for the misleading statement about the boot issue. It was too late 
yesterday. If I understand it correctly, the PCIe device scan now happens 
at a different point during boot, and that is why it no longer works. It 
has nothing to do with the device tree adoption itself. We will use the 
following patch to revert this commit for further testing of the new 
kernels.


--- a/arch/powerpc/platforms/pasemi/setup.c 2021-02-23 
21:40:04.835999708 +0100
+++ b/arch/powerpc/platforms/pasemi/setup.c 2021-02-23 
21:46:04.560667045 +0100

@@ -144,6 +144,7 @@ static void __init pas_setup_arch(void)
    /* Setup SMP callback */
    smp_ops = &pas_smp_ops;
 #endif
+   pas_pci_init();

    /* Remap SDC register for doing reset */
    /* XXXOJN This should maybe come out of the device tree */
@@ -444,7 +445,6 @@ define_machine(pasemi) {
    .name   = "PA Semi PWRficient",
    .probe  = pas_probe,
    .setup_arch = pas_setup_arch,
-   .discover_phbs  = pas_pci_init,
    .init_IRQ   = pas_init_IRQ,
    .get_irq    = mpic_get_irq,
    .restart    = pas_restart,


Re: [PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context

2021-02-24 Thread Athira Rajeev



> On 23-Feb-2021, at 6:24 PM, Michael Ellerman  wrote:
> 
> Peter Zijlstra  writes:
>> On Tue, Feb 23, 2021 at 01:31:49AM -0500, Athira Rajeev wrote:
>>> Running "perf mem record" in powerpc platforms with selinux enabled
>>> resulted in soft lockup's. Below call-trace was seen in the logs:
> ...
>>> 
>>> Since the purpose of this security hook is to control access to
>>> perf_event_open, it is not right to call this in interrupt context.
>>> But in case of powerpc PMU, we need the privilege checks for specific
>>> samples from branch history ring buffer and sampling register values.
>> 
>> I'm confused... why would you need those checks at event time? Either
>> the event has perf_event_attr::exclude_kernel and it then isn't allowed
>> to expose kernel addresses, or it doesn't and it is.
> 
> Well one of us is confused that's for sure ^_^
> 
> I missed/forgot that we had that logic in open.
> 
> I think the reason we got here is that in the past we didn't have the
> event in the low-level routines where we want to check,
> power_pmu_bhrb_read() and perf_get_data_addr(), so we hacked in a
> perf_paranoid_kernel() check. Which was wrong.
> 
> Then Joel's patch plumbed the event through and switched those paranoid
> checks to perf_allow_kernel().
> 
> Anyway, we'll just switch those to exclude_kernel checks.
> 
>> There should never be an event-time question of permission like this. If
>> you allow creation of an event, you're allowing the data it generates.
> 
> Ack.

Thanks for all the reviews. I will send a V2 using 
'event->attr.exclude_kernel' in the checks.
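
Roughly, the idea is to replace the perf_allow_kernel() calls in
power_pmu_bhrb_read() and perf_get_data_addr() with a plain attribute
check, along these lines (sketch only, inside the existing BHRB loop,
not the final V2):

    /* Ignore kernel addresses if the event excludes kernel samples */
    if (is_kernel_addr(addr) && event->attr.exclude_kernel)
            continue;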

Athira 
> 
> cheers


Re: [PATCH v2] vio: make remove callback return void

2021-02-24 Thread Anatoly Pugachev
On Wed, Feb 24, 2021 at 11:17 AM Uwe Kleine-König  
wrote:
>
> The driver core ignores the return value of struct bus_type::remove()
> because there is only little that can be done. To simplify the quest to
> make this function return void, let struct vio_driver::remove() return
> void, too. All users already unconditionally return 0, this commit makes
> it obvious that returning an error code is a bad idea and makes it
> obvious for future driver authors that returning an error code isn't
> intended.
>
> Note there are two nominally different implementations for a vio bus:
> one in arch/sparc/kernel/vio.c and the other in
> arch/powerpc/platforms/pseries/vio.c. I didn't care to check which
> driver is using which of these busses (or if even some of them can be
> used with both) and simply adapt all drivers and the two bus codes in
> one go.

Applied over current git kernel, boots on my sparc64 LDOM (sunvdc
block driver which uses vio).
Linux ttip 5.11.0-10201-gc03c21ba6f4e-dirty #189 SMP Wed Feb 24
13:48:37 MSK 2021 sparc64 GNU/Linux
Boot logs (and kernel config) are at [1] for "5.11.0-10201-gc03c21ba6f4e-dirty".
Up to you whether to add a "Tested-by".
Thanks.

1. https://github.com/mator/sparc64-dmesg

PS: going to check with ppc64 later as well on LPAR (uses vio).


[PATCH v5 3/3] KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM

2021-02-24 Thread Bharata B Rao
In the nested KVM case, replace H_TLB_INVALIDATE with the new hcall
H_RPT_INVALIDATE if it is available. The availability of this hcall
is determined from the "hcall-rpt-invalidate" string in the
ibm,hypertas-functions DT property.
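
For reference, that firmware feature is expected to be populated from the
DT property through the pseries hypertas table, roughly like this (sketch
of the relevant arch/powerpc/platforms/pseries/firmware.c entry; the exact
table layout may differ):

    static __initdata struct hypertas_fw_feature hypertas_fw_features_table[] = {
            /* ... existing entries ... */
            {FW_FEATURE_RPT_INVALIDATE,     "hcall-rpt-invalidate"},
    };

firmware_has_feature(FW_FEATURE_RPT_INVALIDATE) below then keys off that bit.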

Signed-off-by: Bharata B Rao 
Reviewed-by: Fabiano Rosas 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 27 +-
 arch/powerpc/kvm/book3s_hv_nested.c| 12 ++--
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index bb35490400e9..7ea5459022cb 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Supported radix tree geometry.
@@ -318,9 +319,19 @@ void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned 
long addr,
}
 
psi = shift_to_mmu_psize(pshift);
-   rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
-   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
-   lpid, rb);
+
+   if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE)) {
+   rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
+   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 
1),
+   lpid, rb);
+   } else {
+   rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+   H_RPTI_TYPE_NESTED |
+   H_RPTI_TYPE_TLB,
+   psize_to_rpti_pgsize(psi),
+   addr, addr + psize);
+   }
+
if (rc)
pr_err("KVM: TLB page invalidation hcall failed, rc=%ld\n", rc);
 }
@@ -334,8 +345,14 @@ static void kvmppc_radix_flush_pwc(struct kvm *kvm, 
unsigned int lpid)
return;
}
 
-   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
-   lpid, TLBIEL_INVAL_SET_LPID);
+   if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 
1),
+   lpid, TLBIEL_INVAL_SET_LPID);
+   else
+   rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+   H_RPTI_TYPE_NESTED |
+   H_RPTI_TYPE_PWC, H_RPTI_PAGE_ALL,
+   0, -1UL);
if (rc)
pr_err("KVM: TLB PWC invalidation hcall failed, rc=%ld\n", rc);
 }
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c 
b/arch/powerpc/kvm/book3s_hv_nested.c
index ca43b2d38dce..2a6570e6c2c4 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static struct patb_entry *pseries_partition_tb;
 
@@ -444,8 +445,15 @@ static void kvmhv_flush_lpid(unsigned int lpid)
return;
}
 
-   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
-   lpid, TLBIEL_INVAL_SET_LPID);
+   if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+   rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 
1),
+   lpid, TLBIEL_INVAL_SET_LPID);
+   else
+   rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+   H_RPTI_TYPE_NESTED |
+   H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
+   H_RPTI_TYPE_PAT,
+   H_RPTI_PAGE_ALL, 0, -1UL);
if (rc)
pr_err("KVM: TLB LPID invalidation hcall failed, rc=%ld\n", rc);
 }
-- 
2.26.2



[PATCH v5 1/3] powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to mmu_psize_def

2021-02-24 Thread Bharata B Rao
Add a field to mmu_psize_def to store the page size encodings
of the H_RPT_INVALIDATE hcall. Initialize this while scanning the
radix AP encodings. It will be used to pass the required page size
encoding to the hcall when invalidating.
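
For context, psize_to_rpti_pgsize() converts a Linux mmu_psize index into
the H_RPTI_PAGE_* encoding consumed by the hcall; it looks roughly like
the sketch below (the in-tree helper in asm/hvcall.h may differ in detail):

    static inline unsigned long psize_to_rpti_pgsize(unsigned long psize)
    {
            if (psize == MMU_PAGE_4K)
                    return H_RPTI_PAGE_4K;
            if (psize == MMU_PAGE_64K)
                    return H_RPTI_PAGE_64K;
            if (psize == MMU_PAGE_2M)
                    return H_RPTI_PAGE_2M;
            if (psize == MMU_PAGE_1G)
                    return H_RPTI_PAGE_1G;
            return H_RPTI_PAGE_ALL;
    }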

Signed-off-by: Bharata B Rao 
---
 arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
 arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index eace8c3f7b0a..c02f42d1031e 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -19,6 +19,7 @@ struct mmu_psize_def {
int penc[MMU_PAGE_COUNT];   /* HPTE encoding */
unsigned inttlbiel; /* tlbiel supported for that page size */
unsigned long   avpnm;  /* bits to mask out in AVPN in the HPTE */
+   unsigned long   h_rpt_pgsize; /* H_RPT_INVALIDATE page size encoding */
union {
unsigned long   sllp;   /* SLB L||LP (exact mask to use in 
slbmte) */
unsigned long ap;   /* Ap encoding used by PowerISA 3.0 */
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 98f0b243c1ab..1b749899016b 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -486,6 +486,7 @@ static int __init radix_dt_scan_page_sizes(unsigned long 
node,
		def = &mmu_psize_defs[idx];
def->shift = shift;
def->ap  = ap;
+   def->h_rpt_pgsize = psize_to_rpti_pgsize(idx);
}
 
/* needed ? */
@@ -560,9 +561,13 @@ void __init radix__early_init_devtree(void)
 */
mmu_psize_defs[MMU_PAGE_4K].shift = 12;
mmu_psize_defs[MMU_PAGE_4K].ap = 0x0;
+   mmu_psize_defs[MMU_PAGE_4K].h_rpt_pgsize =
+   psize_to_rpti_pgsize(MMU_PAGE_4K);
 
mmu_psize_defs[MMU_PAGE_64K].shift = 16;
mmu_psize_defs[MMU_PAGE_64K].ap = 0x5;
+   mmu_psize_defs[MMU_PAGE_64K].h_rpt_pgsize =
+   psize_to_rpti_pgsize(MMU_PAGE_64K);
}
 
/*
-- 
2.26.2



[PATCH v5 2/3] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE

2021-02-24 Thread Bharata B Rao
Implement H_RPT_INVALIDATE hcall and add KVM capability
KVM_CAP_PPC_RPT_INVALIDATE to indicate the support for the same.

This hcall does two types of TLB invalidations:

1. Process-scoped invalidations for guests with LPCR[GTSE]=0.
   This is currently not used in KVM as GTSE is not usually
   disabled in KVM.
2. Partition-scoped invalidations that an L1 hypervisor does on
   behalf of an L2 guest. This replaces the uses of the existing
   hcall H_TLB_INVALIDATE.

In order to handle process scoped invalidations of L2, we
intercept the nested exit handling code in L0 only to handle
H_TLB_INVALIDATE hcall.

Process scoped tlbie invalidations from L1 and nested guests
need the RS register of the TLBIE instruction to contain both
the PID and the LPID. This patch introduces primitives that
execute the tlbie instruction with both PID and LPID set, in
preparation for the H_RPT_INVALIDATE hcall.
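
For completeness, user space can probe for the new capability with
KVM_CHECK_EXTENSION before advertising "hcall-rpt-invalidate" to the
guest, e.g. (sketch):

    /* vm_fd is the fd returned by KVM_CREATE_VM */
    if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_RPT_INVALIDATE) > 0) {
            /* OK to put "hcall-rpt-invalidate" into ibm,hypertas-functions */
    }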

Signed-off-by: Bharata B Rao 
---
 Documentation/virt/kvm/api.rst|  18 +++
 .../include/asm/book3s/64/tlbflush-radix.h|   4 +
 arch/powerpc/include/asm/kvm_book3s.h |   3 +
 arch/powerpc/include/asm/mmu_context.h|  11 ++
 arch/powerpc/kvm/book3s_hv.c  |  90 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  77 +
 arch/powerpc/kvm/powerpc.c|   3 +
 arch/powerpc/mm/book3s64/radix_tlb.c  | 147 +-
 include/uapi/linux/kvm.h  |   1 +
 9 files changed, 350 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 45fd862ac128..38ce3f21b21f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6225,6 +6225,24 @@ KVM_RUN_BUS_LOCK flag is used to distinguish between 
them.
 This capability can be used to check / enable 2nd DAWR feature provided
 by POWER10 processor.
 
+7.23 KVM_CAP_PPC_RPT_INVALIDATE
+-------------------------------
+
+:Capability: KVM_CAP_PPC_RPT_INVALIDATE
+:Architectures: ppc
+:Type: vm
+
+This capability indicates that the kernel is capable of handling
+H_RPT_INVALIDATE hcall.
+
+In order to enable the use of H_RPT_INVALIDATE in the guest,
+user space might have to advertise it for the guest. For example,
+IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
+present in the "ibm,hypertas-functions" device-tree property.
+
+This capability is enabled for hypervisors on platforms like POWER9
+that support radix MMU.
+
 8. Other capabilities.
 ==
 
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 8b33601cdb9d..a46fd37ad552 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -4,6 +4,10 @@
 
 #include 
 
+#define RIC_FLUSH_TLB 0
+#define RIC_FLUSH_PWC 1
+#define RIC_FLUSH_ALL 2
+
 struct vm_area_struct;
 struct mm_struct;
 struct mmu_gather;
diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 2f5f919f6cd3..a1515f94400e 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -305,6 +305,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, u64 
dw1);
 void kvmhv_release_all_nested(struct kvm *kvm);
 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
 long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
+long kvmhv_h_rpti_nested(struct kvm_vcpu *vcpu, unsigned long lpid,
+unsigned long type, unsigned long pg_sizes,
+unsigned long start, unsigned long end);
 int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
  u64 time_limit, unsigned long lpcr);
 void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr);
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 652ce85f9410..820caf4e01b7 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -124,8 +124,19 @@ static inline bool need_extra_context(struct mm_struct 
*mm, unsigned long ea)
 
 #if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
 extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
+void do_h_rpt_invalidate(unsigned long pid, unsigned long lpid,
+unsigned long type, unsigned long page_size,
+unsigned long psize, unsigned long start,
+unsigned long end);
 #else
 static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { }
+static inline void do_h_rpt_invalidate(unsigned long pid,
+  unsigned long lpid,
+  unsigned long type,
+  unsigned long page_size,
+  unsigned long psize,
+  unsigned long start,
+  

[PATCH v5 0/3] Support for H_RPT_INVALIDATE in PowerPC KVM

2021-02-24 Thread Bharata B Rao
This patchset adds support for the new hcall H_RPT_INVALIDATE
and replaces the nested tlb flush calls with this new hcall
if support for the same exists.

Changes in v5:
-
- Included the h_rpt_invalidate page size information within
  mmu_psize_defs[] as per David Gibson's suggestion.
- Redid nested exit changes as per Paul Mackerras' suggestion.
- Folded the patch that added tlbie primitives into the
  hcall implementation patch.

v4: 
https://lore.kernel.org/linuxppc-dev/20210215063542.3642366-1-bhar...@linux.ibm.com/T/#t

Bharata B Rao (3):
  powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to
mmu_psize_def
  KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE
  KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM

 Documentation/virt/kvm/api.rst|  18 +++
 arch/powerpc/include/asm/book3s/64/mmu.h  |   1 +
 .../include/asm/book3s/64/tlbflush-radix.h|   4 +
 arch/powerpc/include/asm/kvm_book3s.h |   3 +
 arch/powerpc/include/asm/mmu_context.h|  11 ++
 arch/powerpc/kvm/book3s_64_mmu_radix.c|  27 +++-
 arch/powerpc/kvm/book3s_hv.c  |  90 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  89 ++-
 arch/powerpc/kvm/powerpc.c|   3 +
 arch/powerpc/mm/book3s64/radix_pgtable.c  |   5 +
 arch/powerpc/mm/book3s64/radix_tlb.c  | 147 +-
 include/uapi/linux/kvm.h  |   1 +
 12 files changed, 388 insertions(+), 11 deletions(-)

-- 
2.26.2



Is unrecoverable_exception() really an interrupt handler ?

2021-02-24 Thread Christophe Leroy

Hi Nick,

You defined unrecoverable_exception() as an interrupt handler in interrupt.h

I think there are several issues around that:

- do_bad_slb_fault(), which is also an interrupt handler, calls 
unrecoverable_exception()
- in exception-64s.S, unrecoverable_exception() is called after 
machine_check_exception()
- interrupt_exit_kernel_prepare() calls unrecoverable_exception()

So in those cases interrupt_enter_prepare() gets called twice, which means 
things like account_cpu_user_entry() also get called twice.
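
In other words, the pattern is roughly this (simplified sketch of what the
interrupt.h wrappers expand to; the real do_bad_slb_fault() body differs):

    /* both symbols are declared as interrupt handlers, so each call goes
     * through a wrapper that runs the enter/exit accounting */
    DEFINE_INTERRUPT_HANDLER(do_bad_slb_fault)  /* accounting runs here (1st time) */
    {
            if (user_mode(regs))
                    _exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
            else
                    unrecoverable_exception(regs);  /* accounting runs again (2nd time) */
    }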


Christophe