linux-next: build failure after merge of the powerpc tree

2018-01-18 Thread Stephen Rothwell
Hi all,

After merging the powerpc tree, today's linux-next build (powerpc64
allnoconfig) failed like this:

arch/powerpc/kernel/mce_power.o: In function `.mce_handle_error':
mce_power.c:(.text+0x5a8): undefined reference to `.hash__tlbiel_all'
mce_power.c:(.text+0x6b8): undefined reference to `.hash__tlbiel_all'
arch/powerpc/mm/hash_utils_64.o: In function `.hash__early_init_mmu':
hash_utils_64.c:(.init.text+0x9d0): undefined reference to `.hash__tlbiel_all'

Caused by commit

  d4748276ae14 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")

The definition of hash__tlbiel_all() is in
arch/powerpc/mm/hash_native_64.c which is only built if CONFIG_PPC_NATIVE
is set, which it is not for this build.

I applied a supplied fix patch.

-- 
Cheers,
Stephen Rothwell


[PATCH v10 27/27] mm: display pkey in smaps if arch_pkeys_enabled() is true

2018-01-18 Thread Ram Pai
Currently the architecture-specific code is expected to
display the protection keys in smaps for a given vma.
This can lead to redundant code and possibly to divergent
formats in which the key gets displayed.

This patch changes the implementation. It displays the
pkey only if the architecture supports pkeys, i.e. only if
arch_pkeys_enabled() returns true.

The x86 arch_show_smap() function is no longer needed.
Delete it.
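
For illustration, the new output can be inspected from userspace with
a trivial reader; a minimal sketch (not part of the patch, and the
line is only present on a pkey-capable kernel):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/self/smaps", "r");

	if (!f)
		return 1;
	/* print every "ProtectionKey:" line this patch emits */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "ProtectionKey:", 14))
			fputs(line, stdout);
	fclose(f);
	return 0;
}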

Signed-off-by: Ram Pai 
---
 arch/x86/kernel/setup.c |8 --------
 fs/proc/task_mmu.c  |   11 ++-
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 8af2e8d..ddf945a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1326,11 +1326,3 @@ static int __init register_kernel_offset_dumper(void)
return 0;
 }
 __initcall(register_kernel_offset_dumper);
-
-void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-   if (!boot_cpu_has(X86_FEATURE_OSPKE))
-   return;
-
-   seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 0edd4da..4b39a94 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -728,10 +729,6 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 }
 #endif /* HUGETLB_PAGE */
 
-void __weak arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-}
-
 static int show_smap(struct seq_file *m, void *v, int is_pid)
 {
struct proc_maps_private *priv = m->private;
@@ -851,9 +848,13 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
   (unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
 
if (!rollup_mode) {
-   arch_show_smap(m, vma);
+#ifdef CONFIG_ARCH_HAS_PKEYS
+   if (arch_pkeys_enabled())
+   seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
+#endif
show_smap_vma_flags(m, vma);
}
+
m_cache_vma(m, vma);
return ret;
 }
-- 
1.7.1



[PATCH v10 26/27] mm, x86 : introduce arch_pkeys_enabled()

2018-01-18 Thread Ram Pai
Arch-neutral code needs to know whether the architecture supports
protection keys in order to display the protection key in smaps.
Hence introduce arch_pkeys_enabled().

This patch also provides the x86 implementation of
arch_pkeys_enabled().

Signed-off-by: Ram Pai 
---
 arch/x86/include/asm/pkeys.h |1 +
 arch/x86/kernel/fpu/xstate.c |5 +
 include/linux/pkeys.h|5 +
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index a0ba1ff..f6c287b 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -6,6 +6,7 @@
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
+extern bool arch_pkeys_enabled(void);
 
 /*
  * Try to dedicate one of the protection keys to be used as an
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 87a57b7..4f566e9 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -945,6 +945,11 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
return 0;
 }
+
+bool arch_pkeys_enabled(void)
+{
+   return boot_cpu_has(X86_FEATURE_OSPKE);
+}
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 /*
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 0794ca7..3ca2e44 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -35,6 +35,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
return 0;
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+   return false;
+}
+
 static inline void copy_init_pkru_to_fpregs(void)
 {
 }
-- 
1.7.1



[PATCH v10 25/27] powerpc: sys_pkey_mprotect() system call

2018-01-18 Thread Ram Pai
This patch provides the ability for a process to
associate a pkey with an address range.
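
For illustration, a minimal userspace sketch of the new call (raw
syscall(2) is used in case libc has no wrapper yet; the powerpc
syscall number comes from the uapi hunk below):

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

#ifndef __NR_pkey_mprotect
#define __NR_pkey_mprotect 386	/* powerpc, added by this patch */
#endif

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	/* key 0 is the default key; a real caller would pass a key
	 * returned by pkey_alloc() (wired up in patch 24 of this
	 * series) */
	if (syscall(__NR_pkey_mprotect, p, page, PROT_READ, 0) < 0)
		perror("pkey_mprotect");
	return 0;
}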

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/systbl.h  |1 +
 arch/powerpc/include/asm/unistd.h  |4 +---
 arch/powerpc/include/uapi/asm/unistd.h |1 +
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index dea4a95..d61f9c9 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -391,3 +391,4 @@
 SYSCALL(statx)
 SYSCALL(pkey_alloc)
 SYSCALL(pkey_free)
+SYSCALL(pkey_mprotect)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index e0273bc..daf1ba9 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,12 +12,10 @@
 #include 
 
 
-#define NR_syscalls   386
+#define NR_syscalls   387
 
 #define __NR__exit __NR_exit
 
-#define __IGNORE_pkey_mprotect
-
 #ifndef __ASSEMBLY__
 
 #include 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 5db4385..389c36f 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -397,5 +397,6 @@
 #define __NR_statx 383
 #define __NR_pkey_alloc   384
 #define __NR_pkey_free 385
+#define __NR_pkey_mprotect 386
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1



[PATCH v10 24/27] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls

2018-01-18 Thread Ram Pai
Finally, this patch provides the ability for a process to
allocate and free a protection key.
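
A minimal userspace sketch of the pair (raw syscalls; the numbers are
the powerpc values from the uapi hunk below):

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_pkey_alloc
#define __NR_pkey_alloc	384	/* powerpc, added by this patch */
#endif
#ifndef __NR_pkey_free
#define __NR_pkey_free	385
#endif

int main(void)
{
	/* args: flags (must be 0), initial access rights */
	long pkey = syscall(__NR_pkey_alloc, 0, 0);

	if (pkey < 0) {
		perror("pkey_alloc");	/* e.g. ENOSPC: no free key */
		return 1;
	}
	printf("allocated pkey %ld\n", pkey);
	return (int)syscall(__NR_pkey_free, pkey);
}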

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/systbl.h  |2 ++
 arch/powerpc/include/asm/unistd.h  |4 +---
 arch/powerpc/include/uapi/asm/unistd.h |2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 449912f..dea4a95 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -389,3 +389,5 @@
 COMPAT_SYS_SPU(pwritev2)
 SYSCALL(kexec_file_load)
 SYSCALL(statx)
+SYSCALL(pkey_alloc)
+SYSCALL(pkey_free)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index 9ba11db..e0273bc 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,13 +12,11 @@
 #include 
 
 
-#define NR_syscalls   384
+#define NR_syscalls   386
 
 #define __NR__exit __NR_exit
 
 #define __IGNORE_pkey_mprotect
-#define __IGNORE_pkey_alloc
-#define __IGNORE_pkey_free
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index df8684f..5db4385 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -395,5 +395,7 @@
 #define __NR_pwritev2  381
 #define __NR_kexec_file_load   382
 #define __NR_statx 383
+#define __NR_pkey_alloc   384
+#define __NR_pkey_free 385
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1



[PATCH v10 23/27] powerpc: Enable pkey subsystem

2018-01-18 Thread Ram Pai
PAPR defines the 'ibm,processor-storage-keys' property. It exports two
values: the first holds the number of data-access keys and the second
holds the number of instruction-access keys. Due to a bug in the
firmware, the number of instruction-access keys is always reported as
zero. However, any key can be configured to disable data access and/or
disable execution access. The unavailability of the second value is not
a big handicap, though it could have been used to determine whether the
platform supports disabling execution access.

Non-PAPR platforms do not define this property in the device tree yet.
Fortunately power8 is the only released non-PAPR platform that is
supported. There, we hardcode the number of supported pkeys to 32, per
PowerISA 3.0.

This patch calculates the number of keys supported by the platform.
It also determines whether the platform supports read/write/execute
access control for pkeys.
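
For illustration, a schematic sketch of how such a property can be
consumed (hypothetical helper, not the code of this patch; the names
are made up):

#include <linux/of.h>

static u32 example_scan_storage_keys(void)
{
	struct device_node *cpu = of_find_node_by_type(NULL, "cpu");
	u32 vals[2] = { 0, 0 };

	if (!cpu)
		return 0;
	/* two cells: data-access keys, then instruction-access keys */
	of_property_read_u32_array(cpu, "ibm,processor-storage-keys",
				   vals, 2);
	of_node_put(cpu);
	/* vals[1] is unreliable: the firmware bug reports it as zero */
	return vals[0];	/* 0 => property absent (non-PAPR platform) */
}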

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/cputable.h |   16 ++---
 arch/powerpc/include/asm/pkeys.h|3 ++
 arch/powerpc/mm/pkeys.c |   61 +-
 3 files changed, 65 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 0546663..5b54337 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -207,7 +207,7 @@ enum {
#define CPU_FTR_STCX_CHECKS_ADDRESS   LONG_ASM_CONST(0x0004)
#define CPU_FTR_POPCNTB   LONG_ASM_CONST(0x0008)
#define CPU_FTR_POPCNTD   LONG_ASM_CONST(0x0010)
-/* Free   LONG_ASM_CONST(0x0020) */
+#define CPU_FTR_PKEY   LONG_ASM_CONST(0x0020)
 #define CPU_FTR_VMX_COPY   LONG_ASM_CONST(0x0040)
 #define CPU_FTR_TM LONG_ASM_CONST(0x0080)
 #define CPU_FTR_CFAR   LONG_ASM_CONST(0x0100)
@@ -215,6 +215,7 @@ enum {
 #define CPU_FTR_DAWR   LONG_ASM_CONST(0x0400)
 #define CPU_FTR_DABRX  LONG_ASM_CONST(0x0800)
 #define CPU_FTR_PMAO_BUG   LONG_ASM_CONST(0x1000)
+#define CPU_FTR_PKEY_EXECUTE   LONG_ASM_CONST(0x2000)
 #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000)
 #define CPU_FTR_POWER9_DD2_1   LONG_ASM_CONST(0x8000)
 
@@ -437,7 +438,8 @@ enum {
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | CPU_FTR_PURR | \
-   CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX)
+   CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX | \
+   CPU_FTR_PKEY)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -445,7 +447,7 @@ enum {
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_CFAR | \
-   CPU_FTR_DABRX)
+   CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -454,7 +456,7 @@ enum {
CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | \
-   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX)
+   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER8 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -464,7 +466,8 @@ enum {
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP)
+   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY |\
+   CPU_FTR_PKEY_EXECUTE)
 #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
 #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
 #define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
@@ -476,7 +479,8 @@ enum {
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300)
+   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \
+   C

[PATCH v10 22/27] powerpc/ptrace: Add memory protection key regset

2018-01-18 Thread Ram Pai
From: Thiago Jung Bauermann 

The AMR/IAMR/UAMOR are part of the program context.
Allow them to be accessed via ptrace and through core files.
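
For illustration, a debugger-side sketch that reads the new regset
from a stopped tracee (the NT_PPC_PKEY value below is an assumption
here; see the elf.h hunk further down):

#include <stdio.h>
#include <stdint.h>
#include <elf.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef NT_PPC_PKEY
#define NT_PPC_PKEY 0x110	/* assumed value of the new note type */
#endif

static int dump_pkey_regs(pid_t pid)
{
	uint64_t regs[3];	/* amr, iamr, uamor: ELF_NPKEY == 3 */
	struct iovec iov = { .iov_base = regs, .iov_len = sizeof(regs) };

	if (ptrace(PTRACE_GETREGSET, pid, (void *)NT_PPC_PKEY, &iov) < 0)
		return -1;	/* fails with ENODEV if pkeys are off */
	printf("AMR=%#llx IAMR=%#llx UAMOR=%#llx\n",
	       (unsigned long long)regs[0], (unsigned long long)regs[1],
	       (unsigned long long)regs[2]);
	return 0;
}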

Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/include/asm/pkeys.h|5 +++
 arch/powerpc/include/uapi/asm/elf.h |1 +
 arch/powerpc/kernel/ptrace.c|   66 +++
 arch/powerpc/kernel/traps.c |7 
 include/uapi/linux/elf.h|1 +
 5 files changed, 80 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 7c45a40..0c480b2 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -213,6 +213,11 @@ static inline int arch_set_user_pkey_access(struct 
task_struct *tsk, int pkey,
return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+   return !static_branch_likely(&pkey_disabled);
+}
+
 extern void pkey_mm_init(struct mm_struct *mm);
 extern void thread_pkey_regs_save(struct thread_struct *thread);
 extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
diff --git a/arch/powerpc/include/uapi/asm/elf.h b/arch/powerpc/include/uapi/asm/elf.h
index 5f201d4..860c592 100644
--- a/arch/powerpc/include/uapi/asm/elf.h
+++ b/arch/powerpc/include/uapi/asm/elf.h
@@ -97,6 +97,7 @@
 #define ELF_NTMSPRREG  3   /* include tfhar, tfiar, texasr */
 #define ELF_NEBB   3   /* includes ebbrr, ebbhr, bescr */
 #define ELF_NPMU   5   /* includes siar, sdar, sier, mmcr2, mmcr0 */
+#define ELF_NPKEY  3   /* includes amr, iamr, uamor */
 
 typedef unsigned long elf_greg_t64;
 typedef elf_greg_t64 elf_gregset_t64[ELF_NGREG];
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f52ad5b..3718a04 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -35,6 +35,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1775,6 +1776,61 @@ static int pmu_set(struct task_struct *target,
return ret;
 }
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+static int pkey_active(struct task_struct *target,
+  const struct user_regset *regset)
+{
+   if (!arch_pkeys_enabled())
+   return -ENODEV;
+
+   return regset->n;
+}
+
+static int pkey_get(struct task_struct *target,
+   const struct user_regset *regset,
+   unsigned int pos, unsigned int count,
+   void *kbuf, void __user *ubuf)
+{
+   BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
+   BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
+
+   if (!arch_pkeys_enabled())
+   return -ENODEV;
+
+   return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+  &target->thread.amr, 0,
+  ELF_NPKEY * sizeof(unsigned long));
+}
+
+static int pkey_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+   u64 new_amr;
+   int ret;
+
+   if (!arch_pkeys_enabled())
+   return -ENODEV;
+
+   /* Only the AMR can be set from userspace */
+   if (pos != 0 || count != sizeof(new_amr))
+   return -EINVAL;
+
+   ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+&new_amr, 0, sizeof(new_amr));
+   if (ret)
+   return ret;
+
+   /* UAMOR determines which bits of the AMR can be set from userspace. */
+   target->thread.amr = (new_amr & target->thread.uamor) |
+   (target->thread.amr & ~target->thread.uamor);
+
+   return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 /*
  * These are our native regset flavors.
  */
@@ -1809,6 +1865,9 @@ enum powerpc_regset {
REGSET_EBB, /* EBB registers */
REGSET_PMR, /* Performance Monitor Registers */
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+   REGSET_PKEY,/* AMR register */
+#endif
 };
 
 static const struct user_regset native_regsets[] = {
@@ -1914,6 +1973,13 @@ enum powerpc_regset {
.active = pmu_active, .get = pmu_get, .set = pmu_set
},
 #endif
+#ifdef CONFIG_PPC_MEM_KEYS
+   [REGSET_PKEY] = {
+   .core_note_type = NT_PPC_PKEY, .n = ELF_NPKEY,
+   .size = sizeof(u64), .align = sizeof(u64),
+   .active = pkey_active, .get = pkey_get, .set = pkey_set
+   },
+#endif
 };
 
 static const struct user_regset_view user_ppc_native_view = {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 3753498..c42ca5b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -292,6 +292,13 @@ void _exception_pkey(int signr, struct pt_regs *regs, int 
co

[PATCH v10 21/27] powerpc: Deliver SEGV signal on pkey violation

2018-01-18 Thread Ram Pai
The value of the pkey whose protection got violated
is made available in the si_pkey field of the siginfo structure.
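
For illustration, a userspace handler that consumes the new field; a
sketch that assumes a libc exposing si_pkey and SEGV_PKUERR (recent
glibc does, older headers may not):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	if (info->si_code == SEGV_PKUERR)
		fprintf(stderr, "pkey %d violated at address %p\n",
			info->si_pkey, info->si_addr);
	_exit(1);
}

int main(void)
{
	struct sigaction sa = {
		.sa_sigaction = segv_handler,
		.sa_flags = SA_SIGINFO,
	};

	sigaction(SIGSEGV, &sa, NULL);
	/* ... touch a mapping whose pkey forbids the access ... */
	return 0;
}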

Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/include/asm/bug.h |1 +
 arch/powerpc/kernel/traps.c|   12 +++-
 arch/powerpc/mm/fault.c|   39 +++
 3 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 3c04249..97c3847 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -133,6 +133,7 @@
 extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
+extern void _exception_pkey(int, struct pt_regs *, int, unsigned long, int);
 extern void die(const char *, struct pt_regs *, long);
 extern bool die_will_crash(void);
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f3eb61b..3753498 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -266,7 +267,9 @@ void user_single_step_siginfo(struct task_struct *tsk,
info->si_addr = (void __user *)regs->nip;
 }
 
-void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
+
+void _exception_pkey(int signr, struct pt_regs *regs, int code,
+   unsigned long addr, int key)
 {
siginfo_t info;
const char fmt32[] = KERN_INFO "%s[%d]: unhandled signal %d " \
@@ -293,9 +296,16 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
info.si_signo = signr;
info.si_code = code;
info.si_addr = (void __user *) addr;
+   info.si_pkey = key;
+
force_sig_info(signr, &info, current);
 }
 
+void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
+{
+   _exception_pkey(signr, regs, code, addr, 0);
+}
+
 void system_reset_exception(struct pt_regs *regs)
 {
/*
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 943a91e..65f8b04 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -107,7 +107,8 @@ static bool store_updates_sp(struct pt_regs *regs)
  */
 
 static int
-__bad_area_nosemaphore(struct pt_regs *regs, unsigned long address, int si_code)
+__bad_area_nosemaphore(struct pt_regs *regs, unsigned long address, int si_code,
+   int pkey)
 {
/*
 * If we are in kernel mode, bail out with a SEGV, this will
@@ -117,17 +118,18 @@ static bool store_updates_sp(struct pt_regs *regs)
if (!user_mode(regs))
return SIGSEGV;
 
-   _exception(SIGSEGV, regs, si_code, address);
+   _exception_pkey(SIGSEGV, regs, si_code, address, pkey);
 
return 0;
 }
 
 static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long address)
 {
-   return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
+   return __bad_area_nosemaphore(regs, address, SEGV_MAPERR, 0);
 }
 
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+   int pkey)
 {
struct mm_struct *mm = current->mm;
 
@@ -137,12 +139,18 @@ static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
 */
up_read(&mm->mmap_sem);
 
-   return __bad_area_nosemaphore(regs, address, si_code);
+   return __bad_area_nosemaphore(regs, address, si_code, pkey);
 }
 
 static noinline int bad_area(struct pt_regs *regs, unsigned long address)
 {
-   return __bad_area(regs, address, SEGV_MAPERR);
+   return __bad_area(regs, address, SEGV_MAPERR, 0);
+}
+
+static int bad_key_fault_exception(struct pt_regs *regs, unsigned long address,
+   int pkey)
+{
+   return __bad_area_nosemaphore(regs, address, SEGV_PKUERR, pkey);
 }
 
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -427,10 +435,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
-   if (error_code & DSISR_KEYFAULT) {
-   _exception(SIGSEGV, regs, SEGV_PKUERR, address);
-   return 0;
-   }
+   if (error_code & DSISR_KEYFAULT)
+   return bad_key_fault_exception(regs, address,
+  get_mm_addr_key(mm, address));
 
/*
 * We want to do this outside mmap_sem, because reading code around nip
@@ -513,10 +520,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
if (unlikely(fault & VM_FAULT_SIGSEGV) &&
!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRI

[PATCH v10 20/27] powerpc: introduce get_mm_addr_key() helper

2018-01-18 Thread Ram Pai
The get_mm_addr_key() helper returns the pkey associated with
an address in a given mm_struct.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu.h  |9 +
 arch/powerpc/mm/hash_utils_64.c |   24 
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 6364f5c..bb38312 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -260,6 +260,15 @@ static inline bool early_radix_enabled(void)
 }
 #endif
 
+#ifdef CONFIG_PPC_MEM_KEYS
+extern u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address);
+#else
+static inline u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
+{
+   return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #endif /* !__ASSEMBLY__ */
 
 /* The kernel use the constants below to index in the page sizes array.
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index dc0f76e..1e1c03b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1573,6 +1573,30 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
local_irq_restore(flags);
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+/*
+ * Return the protection key associated with the given address and the
+ * mm_struct.
+ */
+u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
+{
+   pte_t *ptep;
+   u16 pkey = 0;
+   unsigned long flags;
+
+   if (!mm || !mm->pgd)
+   return 0;
+
+   local_irq_save(flags);
+   ptep = find_linux_pte(mm->pgd, address, NULL, NULL);
+   if (ptep)
+   pkey = pte_to_pkey_bits(pte_val(READ_ONCE(*ptep)));
+   local_irq_restore(flags);
+
+   return pkey;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 static inline void tm_flush_hash_page(int local)
 {
-- 
1.7.1



[PATCH v10 19/27] powerpc: Handle exceptions caused by pkey violation

2018-01-18 Thread Ram Pai
Handle data and instruction exceptions caused by memory
protection keys.

The CPU will detect the key fault if the HPTE is already
programmed with the key.

However, if the HPTE has not been hashed in yet, a key fault will
not be detected by the hardware. The software detects the
pkey violation in that case.

Signed-off-by: Ram Pai 
Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/include/asm/reg.h   |1 -
 arch/powerpc/kernel/exceptions-64s.S |2 +-
 arch/powerpc/mm/fault.c  |   22 ++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index b779f3c..ffc9990 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -312,7 +312,6 @@
 DSISR_BAD_EXT_CTRL)
 #define  DSISR_BAD_FAULT_64S   (DSISR_BAD_FAULT_32S| \
 DSISR_ATTR_CONFLICT| \
-DSISR_KEYFAULT | \
 DSISR_UNSUPP_MMU   | \
 DSISR_PRTABLE_FAULT| \
 DSISR_ICSWX_NO_CT  | \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index e441b46..804e804 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1521,7 +1521,7 @@ USE_TEXT_SECTION()
.balign IFETCH_ALIGN_BYTES
 do_hash_page:
 #ifdef CONFIG_PPC_BOOK3S_64
-   lis r0,(DSISR_BAD_FAULT_64S|DSISR_DABRMATCH)@h
+   lis r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
ori r0,r0,DSISR_BAD_FAULT_64S@l
and.r0,r4,r0/* weird error? */
bne-handle_page_fault   /* if not, try to insert a HPTE */
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4797d08..943a91e 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -427,6 +427,11 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
+   if (error_code & DSISR_KEYFAULT) {
+   _exception(SIGSEGV, regs, SEGV_PKUERR, address);
+   return 0;
+   }
+
/*
 * We want to do this outside mmap_sem, because reading code around nip
 * can result in fault, which will cause a deadlock when called with
@@ -498,6 +503,23 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 * the fault.
 */
fault = handle_mm_fault(vma, address, flags);
+
+#ifdef CONFIG_PPC_MEM_KEYS
+   /*
+* if the HPTE is not hashed, hardware will not detect
+* a key fault. Lets check if we failed because of a
+* software detected key fault.
+*/
+   if (unlikely(fault & VM_FAULT_SIGSEGV) &&
+   !arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+   is_exec, 0)) {
+   int pkey = vma_pkey(vma);
+
+   if (likely(pkey))
+   return __bad_area(regs, address, SEGV_PKUERR);
+   }
+#endif /* CONFIG_PPC_MEM_KEYS */
+
major |= fault & VM_FAULT_MAJOR;
 
/*
-- 
1.7.1



[PATCH v10 18/27] powerpc: implementation for arch_vma_access_permitted()

2018-01-18 Thread Ram Pai
This patch provides the implementation of
arch_vma_access_permitted(). It returns true if the
requested access is allowed by the pkey associated with the
vma.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu_context.h |5 +++-
 arch/powerpc/mm/pkeys.c|   34 
 2 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 209f127..cd2bd73 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -186,6 +186,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
+  bool execute, bool foreign);
+#else /* CONFIG_PPC_MEM_KEYS */
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
 {
@@ -193,7 +197,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
return true;
 }
 
-#ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_mm_init(mm)
 #define thread_pkey_regs_save(thread)
 #define thread_pkey_regs_restore(new_thread, old_thread)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 0e044ea..0701aa3 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -390,3 +390,37 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
 
return pkey_access_permitted(pte_to_pkey_bits(pte), write, execute);
 }
+
+/*
+ * We only want to enforce protection keys on the current thread because we
+ * effectively have no access to AMR/IAMR for other threads or any way to tell
+ * which AMR/IAMR in a threaded process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current mm, or if we are
+ * in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+   if (!current->mm)
+   return true;
+
+   /* if it is not our ->mm, it has to be foreign */
+   if (current->mm != vma->vm_mm)
+   return true;
+
+   return false;
+}
+
+bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
+  bool execute, bool foreign)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return true;
+   /*
+* Do not enforce our key-permissions on a foreign vma.
+*/
+   if (foreign || vma_is_foreign(vma))
+   return true;
+
+   return pkey_access_permitted(vma_pkey(vma), write, execute);
+}
-- 
1.7.1



[PATCH v10 17/27] powerpc: check key protection for user page access

2018-01-18 Thread Ram Pai
Make sure that the kernel does not access user pages without
checking their key protection.
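
For context, this is the shape of the check a generic page-table
walker performs before the kernel touches a user page (a simplified
sketch, not taken verbatim from mm/gup.c):

#include <linux/mm.h>

static bool example_can_kernel_access(pte_t pte, bool write)
{
	/* false means: take the slow path, fault, and deliver
	 * SEGV_PKUERR if the key really forbids the access */
	return pte_access_permitted(pte, write);
}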

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   19 +++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index e785c68..3d8186e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -464,6 +464,25 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 
 #ifdef CONFIG_PPC_MEM_KEYS
 extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+
+#define pte_access_permitted(pte, write) \
+   (pte_present(pte) && \
+((!(write) || pte_write(pte)) && \
+ arch_pte_access_permitted(pte_val(pte), !!write, 0)))
+
+/*
+ * We store key in pmd/pud for huge pages. Need to check for key protection.
+ */
+#define pmd_access_permitted(pmd, write) \
+   (pmd_present(pmd) && \
+((!(write) || pmd_write(pmd)) && \
+ arch_pte_access_permitted(pmd_val(pmd), !!write, 0)))
+
+#define pud_access_permitted(pud, write) \
+   (pud_present(pud) && \
+((!(write) || pud_write(pud)) && \
+ arch_pte_access_permitted(pud_val(pud), !!write, 0)))
+
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-- 
1.7.1



[PATCH v10 16/27] powerpc: helper to validate key-access permissions of a pte

2018-01-18 Thread Ram Pai
Helper function that checks whether read/write/execute access
is allowed on the pte.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |4 +++
 arch/powerpc/include/asm/pkeys.h |9 
 arch/powerpc/mm/pkeys.c  |   28 ++
 3 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index e1a8bb6..e785c68 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -462,6 +462,10 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
 }
 
+#ifdef CONFIG_PPC_MEM_KEYS
+extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 523d66c..7c45a40 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -82,6 +82,15 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
 }
 
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+   return (((pteflags & H_PTE_PKEY_BIT0) ? 0x10 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT1) ? 0x8 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT3) ? 0x2 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT4) ? 0x1 : 0x0UL));
+}
+
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
 #define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 7630c2f..0e044ea 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -362,3 +362,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
/* Nothing to override. */
return vma_pkey(vma);
 }
+
+static bool pkey_access_permitted(int pkey, bool write, bool execute)
+{
+   int pkey_shift;
+   u64 amr;
+
+   if (!pkey)
+   return true;
+
+   if (!is_pkey_enabled(pkey))
+   return true;
+
+   pkey_shift = pkeyshift(pkey);
+   if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
+   return true;
+
+   amr = read_amr(); /* Delay reading amr until absolutely needed */
+   return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
+   (write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
+}
+
+bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return true;
+
+   return pkey_access_permitted(pte_to_pkey_bits(pte), write, execute);
+}
-- 
1.7.1



[PATCH v10 15/27] powerpc: Program HPTE key protection bits

2018-01-18 Thread Ram Pai
Map the PTE protection key bits to the HPTE key protection bits
while creating HPTE entries.

Acked-by: Balbir Singh 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |5 +
 arch/powerpc/include/asm/mmu_context.h|6 ++
 arch/powerpc/include/asm/pkeys.h  |9 +
 arch/powerpc/mm/hash_utils_64.c   |1 +
 4 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index e91e115..50ed64f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -90,6 +90,8 @@
 #define HPTE_R_PP0 ASM_CONST(0x8000000000000000)
 #define HPTE_R_TS  ASM_CONST(0x4000000000000000)
 #define HPTE_R_KEY_HI  ASM_CONST(0x3000000000000000)
+#define HPTE_R_KEY_BIT0    ASM_CONST(0x2000000000000000)
+#define HPTE_R_KEY_BIT1    ASM_CONST(0x1000000000000000)
 #define HPTE_R_RPN_SHIFT   12
 #define HPTE_R_RPN ASM_CONST(0x0ffffffffffff000)
 #define HPTE_R_RPN_3_0 ASM_CONST(0x01fffffffffff000)
@@ -104,6 +106,9 @@
 #define HPTE_R_C   ASM_CONST(0x0000000000000080)
 #define HPTE_R_R   ASM_CONST(0x0000000000000100)
 #define HPTE_R_KEY_LO  ASM_CONST(0x0000000000000e00)
+#define HPTE_R_KEY_BIT2    ASM_CONST(0x0000000000000800)
+#define HPTE_R_KEY_BIT3    ASM_CONST(0x0000000000000400)
+#define HPTE_R_KEY_BIT4    ASM_CONST(0x0000000000000200)
 #define HPTE_R_KEY (HPTE_R_KEY_LO | HPTE_R_KEY_HI)
 
 #define HPTE_V_1TB_SEG ASM_CONST(0x4000000000000000)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 3ba571d..209f127 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -203,6 +203,12 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 {
return 0;
 }
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+   return 0x0UL;
+}
+
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index f65dedd..523d66c 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -73,6 +73,15 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 #define arch_max_pkey() pkeys_total
 
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+   return (((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
+}
+
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
 #define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 8bd841a..dc0f76e 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -233,6 +233,7 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 */
rflags |= HPTE_R_M;
 
+   rflags |= pte_to_hpte_pkey_bits(pteflags);
return rflags;
 }
 
-- 
1.7.1



[PATCH v10 14/27] powerpc: map vma key-protection bits to pte key bits.

2018-01-18 Thread Ram Pai
Map the key protection bits of the vma to the pkey bits in
the PTE.

The PTE bits used for pkeys are 3, 4, 5, 6 and 57. The first
four are the same four bits that were freed up initially in
this patch series; without those four bits this patch would
not be possible.

However, on a 4K kernel, bits 3 and 4 could not be freed up,
so there we have to be satisfied with bits 5, 6 and 57.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   25 -
 arch/powerpc/include/asm/mman.h  |6 ++
 arch/powerpc/include/asm/pkeys.h |   12 
 3 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 4469781..e1a8bb6 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -39,6 +39,7 @@
 #define _RPAGE_RSV2    0x0800000000000000UL
 #define _RPAGE_RSV3    0x0400000000000000UL
 #define _RPAGE_RSV4    0x0200000000000000UL
+#define _RPAGE_RSV5    0x00040UL
 
 #define _PAGE_PTE  0x4000000000000000UL    /* distinguishes PTEs from pointers */
 #define _PAGE_PRESENT  0x8000000000000000UL    /* pte contains a translation */
@@ -58,6 +59,25 @@
 /* Max physical address bit as per radix table */
 #define _RPAGE_PA_MAX  57
 
+#ifdef CONFIG_PPC_MEM_KEYS
+#ifdef CONFIG_PPC_64K_PAGES
+#define H_PTE_PKEY_BIT0    _RPAGE_RSV1
+#define H_PTE_PKEY_BIT1    _RPAGE_RSV2
+#else /* CONFIG_PPC_64K_PAGES */
+#define H_PTE_PKEY_BIT0    0 /* _RPAGE_RSV1 is not available */
+#define H_PTE_PKEY_BIT1    0 /* _RPAGE_RSV2 is not available */
+#endif /* CONFIG_PPC_64K_PAGES */
+#define H_PTE_PKEY_BIT2    _RPAGE_RSV3
+#define H_PTE_PKEY_BIT3    _RPAGE_RSV4
+#define H_PTE_PKEY_BIT4    _RPAGE_RSV5
+#else /*  CONFIG_PPC_MEM_KEYS */
+#define H_PTE_PKEY_BIT0    0
+#define H_PTE_PKEY_BIT1    0
+#define H_PTE_PKEY_BIT2    0
+#define H_PTE_PKEY_BIT3    0
+#define H_PTE_PKEY_BIT4    0
+#endif /*  CONFIG_PPC_MEM_KEYS */
+
 /*
  * Max physical address bit we will use for now.
  *
@@ -121,13 +141,16 @@
 #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |   \
 _PAGE_SOFT_DIRTY)
+
+#define H_PTE_PKEY  (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \
+H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4)
 /*
  * Mask of bits returned by pte_pgprot()
  */
 #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
-_PAGE_SOFT_DIRTY)
+_PAGE_SOFT_DIRTY | H_PTE_PKEY)
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 2999478..07e3f54 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -33,7 +33,13 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
+#ifdef CONFIG_PPC_MEM_KEYS
+   return (vm_flags & VM_SAO) ?
+   __pgprot(_PAGE_SAO | vmflag_to_pte_pkey_bits(vm_flags)) :
+   __pgprot(0 | vmflag_to_pte_pkey_bits(vm_flags));
+#else
return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
+#endif
 }
 #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0a643b8..f65dedd 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -52,6 +52,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
+static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return 0x0UL;
+
+   return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT4 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT1 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT0 : 0x0UL));
+}
+
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
if (static_branch_likely(&pkey_disabled))
-- 
1.7.1



[PATCH v10 13/27] powerpc: implementation for arch_override_mprotect_pkey()

2018-01-18 Thread Ram Pai
Arch-independent code calls arch_override_mprotect_pkey()
to get a pkey that best matches the requested protection.

This patch provides the implementation.
This patch provides the implementation.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu_context.h |5 
 arch/powerpc/include/asm/pkeys.h   |   21 +-
 arch/powerpc/mm/pkeys.c|   36 
 3 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 4d69223..3ba571d 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -198,6 +198,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #define thread_pkey_regs_save(thread)
 #define thread_pkey_regs_restore(new_thread, old_thread)
 #define thread_pkey_regs_init(thread)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+   return 0;
+}
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index c7cc433..0a643b8 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -52,6 +52,13 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return 0;
+   return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
 #define arch_max_pkey() pkeys_total
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
@@ -148,10 +155,22 @@ static inline int execute_only_pkey(struct mm_struct *mm)
return __execute_only_pkey(mm);
 }
 
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
  int prot, int pkey)
 {
-   return 0;
+   if (static_branch_likely(&pkey_disabled))
+   return 0;
+
+   /*
+* Is this an mprotect_pkey() call? If so, never override the value that
+* came from the user.
+*/
+   if (pkey != -1)
+   return pkey;
+
+   return __arch_override_mprotect_pkey(vma, prot, pkey);
 }
 
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index ee31ab5..7630c2f 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -326,3 +326,39 @@ int __execute_only_pkey(struct mm_struct *mm)
mm->context.execute_only_pkey = execute_only_pkey;
return execute_only_pkey;
 }
+
+static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
+{
+   /* Do this check first since the vm_flags should be hot */
+   if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+   return false;
+
+   return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
+}
+
+/*
+ * This should only be called for *plain* mprotect calls.
+ */
+int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
+ int pkey)
+{
+   /*
+* If the currently associated pkey is execute-only, but the requested
+* protection requires read or write, move it back to the default pkey.
+*/
+   if (vma_is_pkey_exec_only(vma) && (prot & (PROT_READ | PROT_WRITE)))
+   return 0;
+
+   /*
+* The requested protection is execute-only. Hence let's use an
+* execute-only pkey.
+*/
+   if (prot == PROT_EXEC) {
+   pkey = execute_only_pkey(vma->vm_mm);
+   if (pkey > 0)
+   return pkey;
+   }
+
+   /* Nothing to override. */
+   return vma_pkey(vma);
+}
-- 
1.7.1



[PATCH v10 12/27] powerpc: ability to associate pkey to a vma

2018-01-18 Thread Ram Pai
Arch-independent code expects the arch to map
a pkey into the vma's protection bit setting.
This patch provides that ability.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mman.h  |7 ++-
 arch/powerpc/include/asm/pkeys.h |   11 +++
 arch/powerpc/mm/pkeys.c  |8 
 3 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 30922f6..2999478 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -22,7 +23,11 @@
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey)
 {
-   return (prot & PROT_SAO) ? VM_SAO : 0;
+#ifdef CONFIG_PPC_MEM_KEYS
+   return (((prot & PROT_SAO) ? VM_SAO : 0) | pkey_to_vmflag_bits(pkey));
+#else
+   return ((prot & PROT_SAO) ? VM_SAO : 0);
+#endif
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 2b5bb35..c7cc433 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -41,6 +41,17 @@
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
VM_PKEY_BIT3 | VM_PKEY_BIT4)
 
+/* Override any generic PKEY permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#define PKEY_ACCESS_MASK   (PKEY_DISABLE_ACCESS | \
+   PKEY_DISABLE_WRITE  | \
+   PKEY_DISABLE_EXECUTE)
+
+static inline u64 pkey_to_vmflag_bits(u16 pkey)
+{
+   return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
+}
+
 #define arch_max_pkey() pkeys_total
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index b466a2c..ee31ab5 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -39,6 +39,14 @@ int pkey_initialize(void)
 (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
 
/*
+* pkey_to_vmflag_bits() assumes that the pkey bits are contiguous
+* in the vmaflag. Make sure that is really the case.
+*/
+   BUILD_BUG_ON(__builtin_clzl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) +
+__builtin_popcountl(ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)
+   != (sizeof(u64) * BITS_PER_BYTE));
+
+   /*
 * Disable the pkey system till everything is in place. A subsequent
 * patch will enable it.
 */
-- 
1.7.1



[PATCH v10 11/27] powerpc: introduce execute-only pkey

2018-01-18 Thread Ram Pai
This patch provides the implementation of execute-only pkeys.
The architecture-independent layer expects the arch-dependent
layer to support the ability to create and enable a special
key which has execute-only permission.
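
Seen from userspace, no new API is involved: a plain
mprotect(PROT_EXEC) lets the kernel transparently pick the
execute-only key. A minimal sketch (illustrative only):

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	/* ... copy code into p here ... */
	if (mprotect(p, page, PROT_EXEC) == 0)
		puts("mapping is now execute-only, enforced via a pkey");
	return 0;
}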

Acked-by: Balbir Singh 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |1 +
 arch/powerpc/include/asm/pkeys.h |6 +++-
 arch/powerpc/mm/pkeys.c  |   58 ++
 3 files changed, 64 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 37ef23c..0abeb0e 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -116,6 +116,7 @@ struct patb_entry {
 * bit unset -> key available for allocation
 */
u32 pkey_allocation_map;
+   s16 execute_only_pkey; /* key holding execute-only protection */
 #endif
 } mm_context_t;
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 3def5af..2b5bb35 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -128,9 +128,13 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
  * Try to dedicate one of the protection keys to be used as an
  * execute-only protection key.
  */
+extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-   return 0;
+   if (static_branch_likely(&pkey_disabled))
+   return -1;
+
+   return __execute_only_pkey(mm);
 }
 
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 7dfcf2d..b466a2c 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -96,6 +96,8 @@ void pkey_mm_init(struct mm_struct *mm)
if (static_branch_likely(&pkey_disabled))
return;
mm_pkey_allocation_map(mm) = initial_allocation_mask;
+   /* -1 means unallocated or invalid */
+   mm->context.execute_only_pkey = -1;
 }
 
 static inline u64 read_amr(void)
@@ -260,3 +262,59 @@ void thread_pkey_regs_init(struct thread_struct *thread)
write_iamr(read_iamr() & pkey_iamr_mask);
write_uamor(read_uamor() & pkey_amr_uamor_mask);
 }
+
+static inline bool pkey_allows_readwrite(int pkey)
+{
+   int pkey_shift = pkeyshift(pkey);
+
+   if (!is_pkey_enabled(pkey))
+   return true;
+
+   return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
+}
+
+int __execute_only_pkey(struct mm_struct *mm)
+{
+   bool need_to_set_mm_pkey = false;
+   int execute_only_pkey = mm->context.execute_only_pkey;
+   int ret;
+
+   /* Do we need to assign a pkey for mm's execute-only maps? */
+   if (execute_only_pkey == -1) {
+   /* Go allocate one to use, which might fail */
+   execute_only_pkey = mm_pkey_alloc(mm);
+   if (execute_only_pkey < 0)
+   return -1;
+   need_to_set_mm_pkey = true;
+   }
+
+   /*
+* We do not want to go through the relatively costly dance to set AMR
+* if we do not need to. Check it first and assume that if the
+* execute-only pkey is readwrite-disabled than we do not have to set it
+* ourselves.
+*/
+   if (!need_to_set_mm_pkey && !pkey_allows_readwrite(execute_only_pkey))
+   return execute_only_pkey;
+
+   /*
+* Set up AMR so that it denies access for everything other than
+* execution.
+*/
+   ret = __arch_set_user_pkey_access(current, execute_only_pkey,
+ PKEY_DISABLE_ACCESS |
+ PKEY_DISABLE_WRITE);
+   /*
+* If the AMR-set operation failed somehow, just return 0 and
+* effectively disable execute-only support.
+*/
+   if (ret) {
+   mm_pkey_free(mm, execute_only_pkey);
+   return -1;
+   }
+
+   /* We got one, store it and use it from here on out */
+   if (need_to_set_mm_pkey)
+   mm->context.execute_only_pkey = execute_only_pkey;
+   return execute_only_pkey;
+}
-- 
1.7.1



[PATCH v10 10/27] powerpc: store and restore the pkey state across context switches

2018-01-18 Thread Ram Pai
Store and restore the AMR, IAMR and UAMOR register state of the task
before scheduling out and after scheduling in, respectively.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu_context.h |3 ++
 arch/powerpc/include/asm/pkeys.h   |4 ++
 arch/powerpc/include/asm/processor.h   |5 +++
 arch/powerpc/kernel/process.c  |7 +++++++
 arch/powerpc/mm/pkeys.c|   52 +++-
 5 files changed, 70 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 7d0f2d0..4d69223 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -195,6 +195,9 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 #ifndef CONFIG_PPC_MEM_KEYS
 #define pkey_mm_init(mm)
+#define thread_pkey_regs_save(thread)
+#define thread_pkey_regs_restore(new_thread, old_thread)
+#define thread_pkey_regs_init(thread)
 #endif /* CONFIG_PPC_MEM_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 2500a90..3def5af 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -150,4 +150,8 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 }
 
 extern void pkey_mm_init(struct mm_struct *mm);
+extern void thread_pkey_regs_save(struct thread_struct *thread);
+extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
+struct thread_struct *old_thread);
+extern void thread_pkey_regs_init(struct thread_struct *thread);
 #endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index bdab3b7..01299cd 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -309,6 +309,11 @@ struct thread_struct {
struct thread_vr_state ckvr_state; /* Checkpointed VR state */
unsigned long   ckvrsave; /* Checkpointed VRSAVE */
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC_MEM_KEYS
+   unsigned long   amr;
+   unsigned long   iamr;
+   unsigned long   uamor;
+#endif
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
void*   kvm_shadow_vcpu; /* KVM internal data */
 #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 5acb5a1..6447f80 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1102,6 +1103,8 @@ static inline void save_sprs(struct thread_struct *t)
t->tar = mfspr(SPRN_TAR);
}
 #endif
+
+   thread_pkey_regs_save(t);
 }
 
 static inline void restore_sprs(struct thread_struct *old_thread,
@@ -1141,6 +1144,8 @@ static inline void restore_sprs(struct thread_struct *old_thread,
old_thread->tidr != new_thread->tidr)
mtspr(SPRN_TIDR, new_thread->tidr);
 #endif
+
+   thread_pkey_regs_restore(new_thread, old_thread);
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -1865,6 +1870,8 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
current->thread.tm_tfiar = 0;
current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+
+   thread_pkey_regs_init(&current->thread);
 }
 EXPORT_SYMBOL(start_thread);
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 39e9814..7dfcf2d 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -16,6 +16,8 @@
 bool pkey_execute_disable_supported;
 int  pkeys_total;  /* Total pkeys as per device tree */
 u32  initial_allocation_mask;  /* Bits set for reserved keys */
+u64  pkey_amr_uamor_mask;  /* Bits in AMR/UMOR not to be touched */
+u64  pkey_iamr_mask;   /* Bits in AMR not to be touched */
 
 #define AMR_BITS_PER_PKEY 2
 #define AMR_RD_BIT 0x1UL
@@ -74,8 +76,16 @@ int pkey_initialize(void)
 * programming note.
 */
initial_allocation_mask = ~0x0;
-   for (i = 2; i < (pkeys_total - os_reserved); i++)
+
+   /* register mask is in BE format */
+   pkey_amr_uamor_mask = ~0x0ul;
+   pkey_iamr_mask = ~0x0ul;
+
+   for (i = 2; i < (pkeys_total - os_reserved); i++) {
initial_allocation_mask &= ~(0x1 << i);
+   pkey_amr_uamor_mask &= ~(0x3ul << pkeyshift(i));
+   pkey_iamr_mask &= ~(0x1ul << pkeyshift(i));
+   }
return 0;
 }
 
@@ -210,3 +220,43 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
init_amr(pkey, new_amr_bits);
return 0;
 }
+
+void thread_pkey_regs_save(struct thread_struct *thread)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return;
+
+   /*
+* TODO: Skip saving registers if @thread hasn't used any keys yet.
+   

[PATCH v10 09/27] powerpc: ability to create execute-disabled pkeys

2018-01-18 Thread Ram Pai
powerpc has hardware support to disable execution on a pkey.
This patch adds the ability to create execute-disabled
keys.
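
A minimal userspace sketch, assuming the pkey_alloc syscall from
patch 24 of this series (raw syscall, since libc may lack a wrapper):

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_pkey_alloc
#define __NR_pkey_alloc	384	/* powerpc */
#endif
#ifndef PKEY_DISABLE_EXECUTE
#define PKEY_DISABLE_EXECUTE	0x4	/* powerpc-specific, per this patch */
#endif

int main(void)
{
	long pkey = syscall(__NR_pkey_alloc, 0, PKEY_DISABLE_EXECUTE);

	if (pkey < 0) {
		perror("pkey_alloc");	/* EINVAL if execute-disable
					 * is unsupported */
		return 1;
	}
	printf("execute-disabled pkey = %ld\n", pkey);
	return 0;
}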

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/uapi/asm/mman.h |6 ++
 arch/powerpc/mm/pkeys.c  |   16 
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index e63bc37..65065ce 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -30,4 +30,10 @@
 #define MAP_STACK  0x20000 /* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB    0x40000 /* create a huge page mapping */
 
+/* Override any generic PKEY permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#undef PKEY_ACCESS_MASK
+#define PKEY_ACCESS_MASK   (PKEY_DISABLE_ACCESS |\
+   PKEY_DISABLE_WRITE  |\
+   PKEY_DISABLE_EXECUTE)
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index eca04cd..39e9814 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -29,6 +29,14 @@ int pkey_initialize(void)
int os_reserved, i;
 
/*
+* We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
+* generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
+* Ensure that the bits are distinct.
+*/
+   BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
+(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
+   /*
 * Disable the pkey system till everything is in place. A subsequent
 * patch will enable it.
 */
@@ -181,10 +189,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val)
 {
u64 new_amr_bits = 0x0ul;
+   u64 new_iamr_bits = 0x0ul;
 
if (!is_pkey_enabled(pkey))
return -EINVAL;
 
+   if (init_val & PKEY_DISABLE_EXECUTE) {
+   if (!pkey_execute_disable_supported)
+   return -EINVAL;
+   new_iamr_bits |= IAMR_EX_BIT;
+   }
+   init_iamr(pkey, new_iamr_bits);
+
/* Set the bits we need in AMR: */
if (init_val & PKEY_DISABLE_ACCESS)
new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
-- 
1.7.1



[PATCH v10 08/27] powerpc: implementation for arch_set_user_pkey_access()

2018-01-18 Thread Ram Pai
This patch provides the detailed implementation for
a user to allocate a key and enable it in the hardware.

It provides the plumbing, but it cannot be used till
the system call is implemented. The next patch will
do so.

Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |6 -
 arch/powerpc/mm/pkeys.c  |   40 ++
 2 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 9964b46..2500a90 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -139,10 +139,14 @@ static inline int arch_override_mprotect_pkey(struct 
vm_area_struct *vma,
return 0;
 }
 
+extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+  unsigned long init_val);
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val)
 {
-   return 0;
+   if (static_branch_likely(&pkey_disabled))
+   return -EINVAL;
+   return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
 extern void pkey_mm_init(struct mm_struct *mm);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index e1dc45b..eca04cd 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -9,6 +9,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
@@ -17,6 +18,9 @@
 u32  initial_allocation_mask;  /* Bits set for reserved keys */
 
 #define AMR_BITS_PER_PKEY 2
+#define AMR_RD_BIT 0x1UL
+#define AMR_WR_BIT 0x2UL
+#define IAMR_EX_BIT 0x1UL
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
@@ -112,6 +116,20 @@ static inline void write_uamor(u64 value)
mtspr(SPRN_UAMOR, value);
 }
 
+static bool is_pkey_enabled(int pkey)
+{
+   u64 uamor = read_uamor();
+   u64 pkey_bits = 0x3ul << pkeyshift(pkey);
+   u64 uamor_pkey_bits = (uamor & pkey_bits);
+
+   /*
+* Both the bits in UAMOR corresponding to the key should be set or
+* reset.
+*/
+   WARN_ON(uamor_pkey_bits && (uamor_pkey_bits != pkey_bits));
+   return !!(uamor_pkey_bits);
+}
+
 static inline void init_amr(int pkey, u8 init_bits)
 {
u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
@@ -154,3 +172,25 @@ void __arch_deactivate_pkey(int pkey)
 {
pkey_status_change(pkey, false);
 }
+
+/*
+ * Set the access rights in AMR IAMR and UAMOR registers for @pkey to that
+ * specified in @init_val.
+ */
+int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val)
+{
+   u64 new_amr_bits = 0x0ul;
+
+   if (!is_pkey_enabled(pkey))
+   return -EINVAL;
+
+   /* Set the bits we need in AMR: */
+   if (init_val & PKEY_DISABLE_ACCESS)
+   new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
+   else if (init_val & PKEY_DISABLE_WRITE)
+   new_amr_bits |= AMR_WR_BIT;
+
+   init_amr(pkey, new_amr_bits);
+   return 0;
+}
-- 
1.7.1
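
To make the bit arithmetic concrete (an editor's sketch in plain C,
not kernel code): pkeyshift() places each key's two AMR bits from the
top of the 64-bit register down, matching the IBM convention of
numbering bits from the left. The model below reuses the defines from
this patch.

#include <stdio.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define AMR_RD_BIT 0x1UL
#define AMR_WR_BIT 0x2UL
#define PKEY_REG_BITS (sizeof(uint64_t) * 8)
#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey + 1) * AMR_BITS_PER_PKEY))

int main(void)
{
        int pkey = 2;
        uint64_t amr = 0;

        /* Deny read and write for key 2, as PKEY_DISABLE_ACCESS would. */
        amr |= (uint64_t)(AMR_RD_BIT | AMR_WR_BIT) << pkeyshift(pkey);
        printf("pkeyshift(%d) = %d, AMR = 0x%016llx\n",
               pkey, (int)pkeyshift(pkey), (unsigned long long)amr);
        /* prints: pkeyshift(2) = 58, AMR = 0x0c00000000000000 */
        return 0;
}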



[PATCH v10 07/27] powerpc: cleanup AMR, IAMR when a key is allocated or freed

2018-01-18 Thread Ram Pai
Clean up the bits corresponding to a key in the AMR and IAMR
registers when the key is newly allocated/activated or is freed.
We don't want residual bits to cause the hardware to enforce
unintended behavior when the key is activated or freed.

Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 1e8cef2..9964b46 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -69,6 +69,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, 
int pkey)
__mm_pkey_is_allocated(mm, pkey));
 }
 
+extern void __arch_activate_pkey(int pkey);
+extern void __arch_deactivate_pkey(int pkey);
 /*
  * Returns a positive, 5-bit key on success, or -1 on failure.
  * Relies on the mmap_sem to protect against concurrency in mm_pkey_alloc() and
@@ -96,6 +98,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
 
ret = ffz((u32)mm_pkey_allocation_map(mm));
__mm_pkey_allocated(mm, ret);
+
+   /*
+* Enable the key in the hardware
+*/
+   if (ret > 0)
+   __arch_activate_pkey(ret);
return ret;
 }
 
@@ -107,6 +115,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int 
pkey)
if (!mm_pkey_is_allocated(mm, pkey))
return -EINVAL;
 
+   /*
+* Disable the key in the hardware
+*/
+   __arch_deactivate_pkey(pkey);
__mm_pkey_free(mm, pkey);
 
return 0;
-- 
1.7.1
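
Why the cleanup matters (an editor's model in plain C, not kernel
code): if freeing a key left its AMR bits behind, the next allocation
of the same key would silently inherit the previous owner's deny-all
setting.

#include <stdio.h>
#include <stdint.h>

#define pkeyshift(k) (64 - ((k) + 1) * 2)

int main(void)
{
        uint64_t amr = 0;
        int key = 2;

        amr |= (uint64_t)0x3 << pkeyshift(key); /* old owner set deny-all */
        /* key freed and reallocated without cleanup: */
        printf("residual bits for key %d: 0x%llx\n", key,
               (unsigned long long)((amr >> pkeyshift(key)) & 0x3));

        amr &= ~((uint64_t)0x3 << pkeyshift(key)); /* what this patch does */
        printf("after cleanup: 0x%llx\n",
               (unsigned long long)((amr >> pkeyshift(key)) & 0x3));
        return 0;
}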



[PATCH v10 06/27] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers

2018-01-18 Thread Ram Pai
Introduce helper functions that initialize the bits in the AMR,
IAMR and UAMOR registers that correspond to a given pkey.

Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/pkeys.c |   47 +++
 1 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 6e8df6e..e1dc45b 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -16,6 +16,10 @@
 int  pkeys_total;  /* Total pkeys as per device tree */
 u32  initial_allocation_mask;  /* Bits set for reserved keys */
 
+#define AMR_BITS_PER_PKEY 2
+#define PKEY_REG_BITS (sizeof(u64)*8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+
 int pkey_initialize(void)
 {
int os_reserved, i;
@@ -107,3 +111,46 @@ static inline void write_uamor(u64 value)
 {
mtspr(SPRN_UAMOR, value);
 }
+
+static inline void init_amr(int pkey, u8 init_bits)
+{
+   u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+   u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+   write_amr(old_amr | new_amr_bits);
+}
+
+static inline void init_iamr(int pkey, u8 init_bits)
+{
+   u64 new_iamr_bits = (((u64)init_bits & 0x1UL) << pkeyshift(pkey));
+   u64 old_iamr = read_iamr() & ~((u64)(0x1ul) << pkeyshift(pkey));
+
+   write_iamr(old_iamr | new_iamr_bits);
+}
+
+static void pkey_status_change(int pkey, bool enable)
+{
+   u64 old_uamor;
+
+   /* Reset the AMR and IAMR bits for this key */
+   init_amr(pkey, 0x0);
+   init_iamr(pkey, 0x0);
+
+   /* Enable/disable key */
+   old_uamor = read_uamor();
+   if (enable)
+   old_uamor |= (0x3ul << pkeyshift(pkey));
+   else
+   old_uamor &= ~(0x3ul << pkeyshift(pkey));
+   write_uamor(old_uamor);
+}
+
+void __arch_activate_pkey(int pkey)
+{
+   pkey_status_change(pkey, true);
+}
+
+void __arch_deactivate_pkey(int pkey)
+{
+   pkey_status_change(pkey, false);
+}
-- 
1.7.1



[PATCH v10 05/27] powerpc: helper function to read, write AMR, IAMR, UAMOR registers

2018-01-18 Thread Ram Pai
Implement helper functions to read and write the key-related
registers: AMR, IAMR and UAMOR.

The AMR register tracks the read/write permissions of a key,
the IAMR register tracks the execute permission of a key, and
the UAMOR register enables and disables a key.

Acked-by: Balbir Singh 
Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/pkeys.c |   36 
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index e2f3992..6e8df6e 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -71,3 +71,39 @@ void pkey_mm_init(struct mm_struct *mm)
return;
mm_pkey_allocation_map(mm) = initial_allocation_mask;
 }
+
+static inline u64 read_amr(void)
+{
+   return mfspr(SPRN_AMR);
+}
+
+static inline void write_amr(u64 value)
+{
+   mtspr(SPRN_AMR, value);
+}
+
+static inline u64 read_iamr(void)
+{
+   if (!likely(pkey_execute_disable_supported))
+   return 0x0UL;
+
+   return mfspr(SPRN_IAMR);
+}
+
+static inline void write_iamr(u64 value)
+{
+   if (!likely(pkey_execute_disable_supported))
+   return;
+
+   mtspr(SPRN_IAMR, value);
+}
+
+static inline u64 read_uamor(void)
+{
+   return mfspr(SPRN_UAMOR);
+}
+
+static inline void write_uamor(u64 value)
+{
+   mtspr(SPRN_UAMOR, value);
+}
-- 
1.7.1



[PATCH v10 04/27] powerpc: track allocation status of all pkeys

2018-01-18 Thread Ram Pai
A total of 32 keys are available on power7 and above. However,
pkeys 0 and 1 are reserved, so effectively we have 30 pkeys.

On 4K kernels, we do not have 5 bits in the PTE to represent
all the keys; we only have 3 bits. Two of those keys are
reserved: pkey 0 and pkey 1. So effectively we have 6 pkeys.

This patch keeps track of reserved keys, allocated keys and
keys that are currently free.

It also adds skeletal functions and macros that the
architecture-independent code expects to be available.

Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |9 +++
 arch/powerpc/include/asm/mmu_context.h   |4 +
 arch/powerpc/include/asm/pkeys.h |   90 -
 arch/powerpc/mm/mmu_context_book3s64.c   |2 +
 arch/powerpc/mm/pkeys.c  |   40 +
 5 files changed, 141 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index c9448e1..37ef23c 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -108,6 +108,15 @@ struct patb_entry {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
struct list_head iommu_group_mem_list;
 #endif
+
+#ifdef CONFIG_PPC_MEM_KEYS
+   /*
+* Each bit represents one protection key.
+* bit set   -> key allocated
+* bit unset -> key available for allocation
+*/
+   u32 pkey_allocation_map;
+#endif
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index fb5e6a3..7d0f2d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -193,5 +193,9 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
return true;
 }
 
+#ifndef CONFIG_PPC_MEM_KEYS
+#define pkey_mm_init(mm)
+#endif /* CONFIG_PPC_MEM_KEYS */
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 1280b35..1e8cef2 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -15,21 +15,101 @@
 #include 
 
 DECLARE_STATIC_KEY_TRUE(pkey_disabled);
-#define ARCH_VM_PKEY_FLAGS 0
+extern int pkeys_total; /* total pkeys as per device tree */
+extern u32 initial_allocation_mask; /* bits set for reserved keys */
+
+/*
+ * powerpc needs VM_PKEY_BIT* bit to enable pkey system.
+ * Without them, at least compilation needs to succeed.
+ */
+#ifndef VM_PKEY_BIT0
+#define VM_PKEY_SHIFT 0
+#define VM_PKEY_BIT0 0
+#define VM_PKEY_BIT1 0
+#define VM_PKEY_BIT2 0
+#define VM_PKEY_BIT3 0
+#endif
+
+/*
+ * powerpc needs an additional vma bit to support 32 keys. Till the additional
+ * vma bit lands in include/linux/mm.h we can only support 16 keys.
+ */
+#ifndef VM_PKEY_BIT4
+#define VM_PKEY_BIT4 0
+#endif
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+   VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+#define arch_max_pkey() pkeys_total
+
+#define pkey_alloc_mask(pkey) (0x1 << pkey)
+
+#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
+
+#define __mm_pkey_allocated(mm, pkey) {\
+   mm_pkey_allocation_map(mm) |= pkey_alloc_mask(pkey); \
+}
+
+#define __mm_pkey_free(mm, pkey) { \
+   mm_pkey_allocation_map(mm) &= ~pkey_alloc_mask(pkey);   \
+}
+
+#define __mm_pkey_is_allocated(mm, pkey)   \
+   (mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
+
+#define __mm_pkey_is_reserved(pkey) (initial_allocation_mask & \
+  pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-   return false;
+   /* A reserved key is never considered as 'explicitly allocated' */
+   return ((pkey < arch_max_pkey()) &&
+   !__mm_pkey_is_reserved(pkey) &&
+   __mm_pkey_is_allocated(mm, pkey));
 }
 
+/*
+ * Returns a positive, 5-bit key on success, or -1 on failure.
+ * Relies on the mmap_sem to protect against concurrency in mm_pkey_alloc() and
+ * mm_pkey_free().
+ */
 static inline int mm_pkey_alloc(struct mm_struct *mm)
 {
-   return -1;
+   /*
+* Note: this is the one and only place we make sure that the pkey is
+* valid as far as the hardware is concerned. The rest of the kernel
+* trusts that only good, valid pkeys come out of here.
+*/
+   u32 all_pkeys_mask = (u32)(~(0x0));
+   int ret;
+
+   if (static_branch_likely(&pkey_disabled))
+   return -1;
+
+   /*
+* Are we out of pkeys? We must handle this specially because ffz()
+* behavior is undefined if there are no zeros.
+*/
+   if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+   return -1;
+
+   ret = ffz((u32)mm_pkey_allocation_map(mm));
+   __mm_pkey_allocated(mm, ret);
+   
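
For reference, an editor's user-space model of the allocation bitmap
above; ffz() is approximated with __builtin_ctz() on the complement,
and, as in the kernel code, a full map must be rejected first because
ffz() is undefined when no zero bit exists.

#include <stdio.h>
#include <stdint.h>

static uint32_t pkey_allocation_map = 0x3;      /* pkeys 0 and 1 reserved */

static int pkey_alloc_sim(void)
{
        if (pkey_allocation_map == ~(uint32_t)0)
                return -1;                              /* out of keys */
        int pkey = __builtin_ctz(~pkey_allocation_map); /* first zero bit */
        pkey_allocation_map |= (uint32_t)1 << pkey;
        return pkey;
}

int main(void)
{
        printf("allocated pkey %d\n", pkey_alloc_sim());        /* 2 */
        printf("allocated pkey %d\n", pkey_alloc_sim());        /* 3 */
        return 0;
}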

[PATCH v10 03/27] powerpc: initial pkey plumbing

2018-01-18 Thread Ram Pai
Basic plumbing to initialize the pkey system. Nothing is
enabled yet. A later patch will enable it once all the
infrastructure is in place.

Signed-off-by: Ram Pai 
---
 arch/powerpc/Kconfig   |   15 +
 arch/powerpc/include/asm/mmu_context.h |1 +
 arch/powerpc/include/asm/pkeys.h   |   55 
 arch/powerpc/mm/Makefile   |1 +
 arch/powerpc/mm/hash_utils_64.c|1 +
 arch/powerpc/mm/pkeys.c|   33 +++
 6 files changed, 106 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/pkeys.h
 create mode 100644 arch/powerpc/mm/pkeys.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c51e6ce..c9660a1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -867,6 +867,21 @@ config SECCOMP
 
  If unsure, say Y. Only embedded should say N here.
 
+config PPC_MEM_KEYS
+   prompt "PowerPC Memory Protection Keys"
+   def_bool y
+   depends on PPC_BOOK3S_64
+   select ARCH_USES_HIGH_VMA_FLAGS
+   select ARCH_HAS_PKEYS
+   help
+ Memory Protection Keys provides a mechanism for enforcing
+ page-based protections, but without requiring modification of the
+ page tables when an application changes protection domains.
+
+ For details, see Documentation/vm/protection-keys.txt
+
+ If unsure, say y.
+
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 6177d43..fb5e6a3 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -192,5 +192,6 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
/* by default, allow everything */
return true;
 }
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
new file mode 100644
index 000..1280b35
--- /dev/null
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -0,0 +1,55 @@
+/*
+ * PowerPC Memory Protection Keys management
+ * Copyright (c) 2017, IBM Corporation.
+ * Author: Ram Pai 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _ASM_POWERPC_KEYS_H
+#define _ASM_POWERPC_KEYS_H
+
+#include 
+
+DECLARE_STATIC_KEY_TRUE(pkey_disabled);
+#define ARCH_VM_PKEY_FLAGS 0
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+   return false;
+}
+
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+   return -1;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+   return -EINVAL;
+}
+
+/*
+ * Try to dedicate one of the protection keys to be used as an
+ * execute-only protection key.
+ */
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+   return 0;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+ int prot, int pkey)
+{
+   return 0;
+}
+
+static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val)
+{
+   return 0;
+}
+#endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 76a6b05..181166d 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -44,3 +44,4 @@ obj-$(CONFIG_PPC_COPRO_BASE)  += copro_fault.o
 obj-$(CONFIG_SPAPR_TCE_IOMMU)  += mmu_context_iommu.o
 obj-$(CONFIG_PPC_PTDUMP)   += dump_linuxpagetables.o
 obj-$(CONFIG_PPC_HTDUMP)   += dump_hashpagetable.o
+obj-$(CONFIG_PPC_MEM_KEYS) += pkeys.o
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0c802de..8bd841a 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
new file mode 100644
index 000..de7dc48
--- /dev/null
+++ b/arch/powerpc/mm/pkeys.c
@@ -0,0 +1,33 @@
+/*
+ * PowerPC Memory Protection Keys management
+ * Copyright (c) 2017, IBM Corporation.
+ * Author: Ram Pai 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+
+DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+bool pkey_execute_disable_supported;
+
+int pkey_initialize(void)
+{
+   /*
+* Disable the pkey system till everything is in place. A subsequent
+* patch will enable it.
+*/
+   static_b

[PATCH v10 02/27] mm, powerpc, x86: introduce an additional vma bit for powerpc pkey

2018-01-18 Thread Ram Pai
Currently only 4 bits are allocated in the vma flags to hold 16
keys. This is sufficient for x86. PowerPC supports 32 keys,
which needs 5 bits. This patch allocates an additional bit.

Acked-by: Balbir Singh 
Signed-off-by: Ram Pai 
---
 fs/proc/task_mmu.c |1 +
 include/linux/mm.h |3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index b139617..0edd4da 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -680,6 +680,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct 
vm_area_struct *vma)
[ilog2(VM_PKEY_BIT1)]   = "",
[ilog2(VM_PKEY_BIT2)]   = "",
[ilog2(VM_PKEY_BIT3)]   = "",
+   [ilog2(VM_PKEY_BIT4)]   = "",
 #endif /* CONFIG_ARCH_HAS_PKEYS */
};
size_t i;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 01381d3..ebcb997 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -231,9 +231,10 @@ extern int overcommit_kbytes_handler(struct ctl_table *, 
int, void __user *,
 #ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0  VM_HIGH_ARCH_0  /* A protection key is a 4-bit value */
-# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
+# define VM_PKEY_BIT1  VM_HIGH_ARCH_1  /* on x86 and 5-bit value on ppc64   */
 # define VM_PKEY_BIT2  VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3  VM_HIGH_ARCH_3
+# define VM_PKEY_BIT4  VM_HIGH_ARCH_4
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
 #if defined(CONFIG_X86)
-- 
1.7.1
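
For orientation (editor's sketch; the numeric shift below is
illustrative, the real value comes from VM_HIGH_ARCH_BIT_0 in
include/linux/mm.h): with VM_PKEY_BIT4 in place, a 5-bit key packs
into the high vma flag bits and is recovered the same way x86's
vma_pkey() recovers its 4-bit key.

#include <stdio.h>

#define VM_PKEY_SHIFT   32                          /* illustrative stand-in */
#define VM_PKEY_MASK    (0x1fULL << VM_PKEY_SHIFT)  /* BIT0..BIT4 = 5 bits */

static int vma_pkey_sim(unsigned long long vm_flags)
{
        return (int)((vm_flags & VM_PKEY_MASK) >> VM_PKEY_SHIFT);
}

int main(void)
{
        /* key 19 = 0b10011 needs the fifth bit added by this patch */
        unsigned long long flags = 19ULL << VM_PKEY_SHIFT;
        printf("stored pkey = %d\n", vma_pkey_sim(flags));
        return 0;
}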



[PATCH v10 01/27] mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled

2018-01-18 Thread Ram Pai
VM_PKEY_BITx are defined only if CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
is enabled. Powerpc also needs these bits. Hence let's define the
VM_PKEY_BITx bits for any architecture that enables
CONFIG_ARCH_HAS_PKEYS.

Signed-off-by: Ram Pai 
---
 fs/proc/task_mmu.c |4 ++--
 include/linux/mm.h |9 +
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 339e4c1..b139617 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -674,13 +674,13 @@ static void show_smap_vma_flags(struct seq_file *m, 
struct vm_area_struct *vma)
[ilog2(VM_MERGEABLE)]   = "mg",
[ilog2(VM_UFFD_MISSING)]= "um",
[ilog2(VM_UFFD_WP)] = "uw",
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_ARCH_HAS_PKEYS
/* These come out via ProtectionKey: */
[ilog2(VM_PKEY_BIT0)]   = "",
[ilog2(VM_PKEY_BIT1)]   = "",
[ilog2(VM_PKEY_BIT2)]   = "",
[ilog2(VM_PKEY_BIT3)]   = "",
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
};
size_t i;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ea818ff..01381d3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -228,15 +228,16 @@ extern int overcommit_kbytes_handler(struct ctl_table *, 
int, void __user *,
 #define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
-#if defined(CONFIG_X86)
-# define VM_PATVM_ARCH_1   /* PAT reserves whole VMA at 
once (x86) */
-#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
+#ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0  VM_HIGH_ARCH_0  /* A protection key is a 4-bit value */
 # define VM_PKEY_BIT1  VM_HIGH_ARCH_1
 # define VM_PKEY_BIT2  VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3  VM_HIGH_ARCH_3
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+
+#if defined(CONFIG_X86)
+# define VM_PATVM_ARCH_1   /* PAT reserves whole VMA at 
once (x86) */
 #elif defined(CONFIG_PPC)
 # define VM_SAOVM_ARCH_1   /* Strong Access Ordering 
(powerpc) */
 #elif defined(CONFIG_PARISC)
-- 
1.7.1



[PATCH v10 00/27] powerpc, mm: Memory Protection Keys

2018-01-18 Thread Ram Pai
Memory protection keys enable an application to protect its
address space from inadvertent access or corruption by
itself.

These patches, along with the pte-bit freeing patch series,
enable the protection key feature on powerpc for both 4k and
64k hashpage kernels.

Will send the documentation and selftest patches separately.

All patches can be found at --
https://github.com/rampai/memorykeys.git memkey.v10


The overall idea:
---
 A process allocates a key and associates it with
 an address range within its address space.
 The process can then dynamically set read/write
 permissions on the key without involving the
 kernel. Any code that violates the permissions
 of the address range, as defined by its associated
 key, will receive a segmentation fault.
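
A minimal user-space sketch of that flow (editor's illustration,
assuming the glibc >= 2.27 pkey wrappers; on kernels without pkey
support the calls simply fail, and error handling is elided here):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        int pkey = pkey_alloc(0, 0);            /* all access allowed */

        pkey_mprotect(p, 4096, PROT_READ | PROT_WRITE, pkey);

        pkey_set(pkey, PKEY_DISABLE_WRITE);     /* no kernel involvement */
        /* p[0] = 1; here would raise SIGSEGV carrying the key number */

        pkey_set(pkey, 0);                      /* writes allowed again */
        p[0] = 1;
        pkey_free(pkey);
        return 0;
}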

This patch series enables the feature on the PPC64 HPTE
platform.

ISA3.0 section 5.7.13 describes the detailed
specifications.


High-level view of the design:
---
When an application associates a key with an address
range, the key is programmed into the Linux PTE.
When the MMU detects a page fault, a hash page is
allocated and the key is programmed into the HPTE.
Finally, when the MMU detects a key violation due to
an invalid application access, the registered signal
handler is invoked and given the violated key number.


Testing:
---
This patch series has passed all the protection key
tests available in the selftest directory. The
tests are updated to work on both x86 and powerpc.
The selftests have passed on x86 and powerpc hardware.

History:
---
version v10:
(1) key-fault in page-fault handler
is handled as normal fault
and not as a bad fault.
(2) changed device tree scanning to
use the unflattened device tree.
(3) fixed a bug in the logic that detected
the total number of available pkeys.
(4) dropped two patches. (i) sysfs interface
(ii) sys_pkey_modif() syscall

version v9:
(1) used jump-labels to optimize code
-- Balbir
(2) fixed a register initialization bug noted
by Balbir
(3) fixed inappropriate use of paca to pass
siginfo and keys to signal handler
(4) Cleanup of comment style not to be right 
justified -- mpe
(5) restructured the patches to depend on the
availability of VM_PKEY_BIT4 in
include/linux/mm.h
(6) Incorporated comments from Dave Hansen
towards changes to selftest and got
them tested on x86.

version v8:
(1) Contents of the AMR register withdrawn from
the siginfo structure. Applications can always
read the AMR register.
(2) AMR/IAMR/UAMOR are now available through 
ptrace system call. -- thanks to Thiago
(3) code changes to handle legacy power cpus
that do not support execute-disable.
(4) incorporates many code improvement
suggestions.

version v7:
(1) refers to device tree property to enable
protection keys.
(2) adds 4K PTE support.
(3) fixes a couple of bugs noticed by Thiago
(4) decouples this patch series from arch-
 independent code. This patch series can
 now stand by itself, with one kludge
 patch(2).

version v6:
(1) selftest changes are broken down into 20
incremental patches.
(2) A separate key allocation mask that
includes PKEY_DISABLE_EXECUTE is 
added for powerpc
(3) pkey feature is enabled for 64K HPT case
only. RPT and 4k HPT is disabled.
(4) Documentation is updated to better 
capture the semantics.
(5) introduced arch_pkeys_enabled() to find
out if an arch enables pkeys, and
correspondingly changed the logic that
displays the key value in smaps.
(6) code rearranged in many places based on
comments from Dave Hansen, Balbir,
Anshuman.   
(7) fixed one bug where a bogus key could be
associated successfully in
pkey_mprotect().

version v5:
(1) reverted to the old design -- store
 the key in the pte, instead of bypassing
 it. The v4 design slowed down the hash
 page path.
(2) detects key violation when kernel is told
to access user pages.
(3) further refined the patches into smaller
consumable units

linux-next: build failure after merge of the powerpc tree

2018-01-18 Thread Stephen Rothwell
Hi all,

After merging the powerpc tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

In file included from include/asm-generic/bug.h:18:0,
 from arch/powerpc/include/asm/bug.h:128,
 from include/linux/bug.h:5,
 from arch/powerpc/include/asm/mmu.h:126,
 from arch/powerpc/include/asm/lppaca.h:36,
 from arch/powerpc/include/asm/paca.h:21,
 from arch/powerpc/include/asm/current.h:16,
 from include/linux/sched.h:12,
 from arch/powerpc/kernel/setup_64.c:15:
arch/powerpc/kernel/setup_64.c: In function 'init_fallback_flush':
arch/powerpc/kernel/setup_64.c:864:14: error: implicit declaration of function 
'safe_stack_limit'; did you mean 'save_stack_trace'? 
[-Werror=implicit-function-declaration]
  limit = min(safe_stack_limit(), ppc64_rma_size);
  ^
include/linux/kernel.h:790:2: note: in definition of macro '__min'
  t1 min1 = (x); \
  ^~
arch/powerpc/kernel/setup_64.c:864:10: note: in expansion of macro 'min'
  limit = min(safe_stack_limit(), ppc64_rma_size);
  ^~~
include/linux/kernel.h:792:16: error: comparison of distinct pointer types 
lacks a cast [-Werror]
  (void) (&min1 == &min2);   \
^
include/linux/kernel.h:801:2: note: in expansion of macro '__min'
  __min(typeof(x), typeof(y),   \
  ^
arch/powerpc/kernel/setup_64.c:864:10: note: in expansion of macro 'min'
  limit = min(safe_stack_limit(), ppc64_rma_size);
  ^~~

Caused by commit

  1af19331a3a1 ("powerpc/64s: Relax PACA address limitations")

interacting with commit

  aa8a5e0062ac ("powerpc/64s: Add support for RFI flush of L1-D cache")

from Linus' tree.

I applied the following fix patch.

From: Stephen Rothwell 
Date: Fri, 19 Jan 2018 09:21:44 +1100
Subject: [PATCH] powerpc: fix up for safe_stack_limit rename

Signed-off-by: Stephen Rothwell 
---
 arch/powerpc/kernel/setup_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 9e23c74896cc..f2b532f00861 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -861,7 +861,7 @@ static void init_fallback_flush(void)
int cpu;
 
l1d_size = ppc64_caches.l1d.size;
-   limit = min(safe_stack_limit(), ppc64_rma_size);
+   limit = min(ppc64_bolted_size(), ppc64_rma_size);
 
/*
 * Align to L1d size, and size it at 2x L1d size, to catch possible
-- 
2.15.1

-- 
Cheers,
Stephen Rothwell


Re: [PATCH -next] ipmi/powernv: Fix error return code in ipmi_powernv_probe()

2018-01-18 Thread Corey Minyard

On 01/17/2018 10:04 PM, Alexey Kardashevskiy wrote:

On 17/01/18 22:25, Wei Yongjun wrote:

Fix to return a negative error code from the request_irq() error
handling case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun 


Reviewed-by: Alexey Kardashevskiy 


Queued for next release.  Thanks!

-corey





---
  drivers/char/ipmi/ipmi_powernv.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_powernv.c b/drivers/char/ipmi/ipmi_powernv.c
index c687c8d..bcf493d 100644
--- a/drivers/char/ipmi/ipmi_powernv.c
+++ b/drivers/char/ipmi/ipmi_powernv.c
@@ -250,8 +250,9 @@ static int ipmi_powernv_probe(struct platform_device *pdev)
ipmi->irq = opal_event_request(prop);
}
  
-	if (request_irq(ipmi->irq, ipmi_opal_event, IRQ_TYPE_LEVEL_HIGH,

-   "opal-ipmi", ipmi)) {
+   rc = request_irq(ipmi->irq, ipmi_opal_event, IRQ_TYPE_LEVEL_HIGH,
+"opal-ipmi", ipmi);
+   if (rc) {
dev_warn(dev, "Unable to request irq\n");
goto err_dispose;
}







Re: [PATCH 3/5] powerpc/ftw: Implement a simple FTW driver

2018-01-18 Thread Sukadev Bhattiprolu
Randy Dunlap [rdun...@infradead.org] wrote:
> > +
> > +   default:
> > +   return -EINVAL;
> > +   }
> > +}
> 
> Nit:  some versions of gcc (or maybe clang) complain about a typed function
> not always having a return value in code like above, so it is often done as:

Ok.
> 
> > +static long ftw_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
> > +{
> > +   switch (cmd) {
> > +
> > +   case FTW_SETUP:
> > +   return ftw_ioc_ftw_setup(fp, arg);
> > +
> > +   default:
> > +   break;
> > +   }
> 
>   return -EINVAL;
> > +}
> 
> Do you expect to implement more ioctls?  If not, just change the switch to
> an if ().
Maybe a couple more but changed it to an 'if' for now (and fixed an
error handling issue in ftw_file_init()).
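
For reference, the if-based version reads roughly like this (an
editor's paraphrase of the described change, not a verbatim quote of
the updated patch):

static long ftw_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
{
        /* Single command so far; an if also avoids the missing-return
         * warning Randy mentioned for the switch form. */
        if (cmd == FTW_SETUP)
                return ftw_ioc_ftw_setup(fp, arg);

        return -EINVAL;
}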

Here is the updated patch.

---
>From 344ffbcc2cd1e64dd87249d508cf6000e6e41a0c Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu 
Date: Fri, 4 Aug 2017 16:45:34 -0500
Subject: [PATCH 3/5] powerpc/ftw: Implement a simple FTW driver

The Fast Thread Wake-up (FTW) driver provides user space applications an
interface to the low latency Core-to-Core wakeup functionality in POWER9.

This mechanism allows a thread on one core to efficiently send a message
to a "waiting thread" on another core on the same chip, using the Virtual
Accelerator Switchboard (VAS) subsystem.

This initial FTW driver implements the ioctl and mmap operations on an
FTW device node. Using these operations, a pair of application threads
can establish a "communication channel" and use the COPY, PASTE and WAIT
instructions to wait/wake up.

PATCH 5/5 documents the API and includes an example of the usage.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v2]
- [Michael Neuling] Rename: drop "nx" from the name "nx-ftw".
- [Michael Neuling] Use a single VAS_FTW_SETUP ioctl to simplify
  interface.
- [Michael Ellerman] To work with paste emulation patch, mark
  PTE dirty in ->mmap() to ensure there is no fault on paste
  (the emulation patch must disable pagefaults when updating
  thread reconfig registers).
- [Randy Dunlap] Minor cleanup in ftw_ioctl().
- Fix cleanup code in ftw_file_init()
- Check return value from set_thread_tidr().
- Move driver drivers/misc/ftw.
---
 drivers/misc/Kconfig  |   1 +
 drivers/misc/Makefile |   1 +
 drivers/misc/ftw/Kconfig  |  16 +++
 drivers/misc/ftw/Makefile |   4 +
 drivers/misc/ftw/ftw.c| 346 ++
 5 files changed, 368 insertions(+)
 create mode 100644 drivers/misc/ftw/Kconfig
 create mode 100644 drivers/misc/ftw/Makefile
 create mode 100644 drivers/misc/ftw/ftw.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f1a5c23..a9b161f 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -508,4 +508,5 @@ source "drivers/misc/mic/Kconfig"
 source "drivers/misc/genwqe/Kconfig"
 source "drivers/misc/echo/Kconfig"
 source "drivers/misc/cxl/Kconfig"
+source "drivers/misc/ftw/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 5ca5f64..338668c 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -52,6 +52,7 @@ obj-$(CONFIG_GENWQE)  += genwqe/
 obj-$(CONFIG_ECHO) += echo/
 obj-$(CONFIG_VEXPRESS_SYSCFG)  += vexpress-syscfg.o
 obj-$(CONFIG_CXL_BASE) += cxl/
+obj-$(CONFIG_PPC_FTW)  += ftw/
 obj-$(CONFIG_ASPEED_LPC_CTRL)  += aspeed-lpc-ctrl.o
 obj-$(CONFIG_ASPEED_LPC_SNOOP) += aspeed-lpc-snoop.o
 obj-$(CONFIG_PCI_ENDPOINT_TEST)+= pci_endpoint_test.o
diff --git a/drivers/misc/ftw/Kconfig b/drivers/misc/ftw/Kconfig
new file mode 100644
index 000..5454d40
--- /dev/null
+++ b/drivers/misc/ftw/Kconfig
@@ -0,0 +1,16 @@
+
+config PPC_FTW
+   tristate "IBM Fast Thread-Wakeup (FTW)"
+   depends on PPC_VAS
+   default n
+   help
+  This enables support for the IBM Fast Thread-Wakeup driver.
+
+  The FTW driver allows applications to utilize a low-overhead
+  core-to-core wake-up mechanism in the IBM Virtual Accelerator
+  Switchboard (VAS) to improve performance.
+
+  VAS adapters are found in POWER9 based systems and are required
+  for the FTW driver to be operational.
+
+  If unsure, say N.
diff --git a/drivers/misc/ftw/Makefile b/drivers/misc/ftw/Makefile
new file mode 100644
index 000..2cfe566
--- /dev/null
+++ b/drivers/misc/ftw/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+ccflags-y  := $(call cc-disable-warning, 
unused-const-variable)
+ccflags-$(CONFIG_PPC_WERROR)   += -Werror
+obj-$(CONFIG_PPC_FTW)  += ftw.o
diff --git a/drivers/misc/ftw/ftw.c b/drivers/misc/ftw/ftw.c
new file mode 100644
index 000..ea02a19
--- /dev/null
+++ b/drivers/misc/ftw/ftw.c
@@ -0,0 +1,346 @@
+/*
+ * Copyright 2018 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms 

Re: [PATCH 2/5] powerpc/ftw: Define FTW_SETUP ioctl API

2018-01-18 Thread Sukadev Bhattiprolu
Randy Dunlap [rdun...@infradead.org] wrote:

> > +#define FTW_FLAGS_PIN_WINDOW   0x1
> > +
> > +#define FTW_SETUP  _IOW('v', 1, struct ftw_setup_attr)
> 
> ioctls should be documented in Documentation/ioctl/ioctl-number.txt.
> Please update that file.

Ok. Here is the updated patch.

Thanks for the review.

Sukadev
---
>From 1f347c199a0b1bbc528705c8e9ddd11c825a80fc Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu 
Date: Thu, 2 Feb 2017 06:20:07 -0500
Subject: [PATCH 2/5] powerpc/ftw: Define FTW_SETUP ioctl API

Define the FTW_SETUP ioctl interface for fast thread wakeup (FTW). A
follow-on patch will implement the FTW driver and ioctl.

Thanks to input from Ben Herrenschmidt, Michael Neuling, Michael Ellerman.

Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v2]
- [Michael Neuling] Use a single VAS_FTW_SETUP ioctl and simplify
  the interface.
- [Randy Dunlap] Reserve/document the ioctl number used.
---
 Documentation/ioctl/ioctl-number.txt |  1 +
 include/uapi/misc/ftw.h  | 35 +++
 2 files changed, 36 insertions(+)
 create mode 100644 include/uapi/misc/ftw.h

diff --git a/Documentation/ioctl/ioctl-number.txt 
b/Documentation/ioctl/ioctl-number.txt
index 3e3fdae..b0f323c 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -277,6 +277,7 @@ Code  Seq#(hex) Include FileComments
 'v'00-1F   linux/fs.h  conflict!
 'v'00-0F   linux/sonypi.h  conflict!
 'v'C0-FF   linux/meye.hconflict!
+'v'20-27   include/uapi/misc/ftw.h
 'w'all CERN SCI driver
 'y'00-1F   packet based user level communications

diff --git a/include/uapi/misc/ftw.h b/include/uapi/misc/ftw.h
new file mode 100644
index 000..99676b2
--- /dev/null
+++ b/include/uapi/misc/ftw.h
@@ -0,0 +1,35 @@
+/*
+ * Copyright 2018 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_MISC_FTW_H
+#define _UAPI_MISC_FTW_H
+
+#include 
+#include 
+
+#define FTW_FLAGS_PIN_WINDOW   0x1
+
+/*
+ * Note: The range 0x20-27 for letter 'v' is reserved for FTW ioctls in
+ *  Documentation/ioctl/ioctl-number.txt.
+ */
+#define FTW_SETUP  _IOW('v', 0x20, struct ftw_setup_attr)
+
+struct ftw_setup_attr {
+   __s16   version;
+   __s16   vas_id; /* specific instance of vas or -1 for default */
+   __u32   reserved;
+
+   __u64   reserved1;
+
+   __u64   flags;
+   __u64   reserved2;
+};
+
+#endif /* _UAPI_MISC_FTW_H */
-- 
2.7.4
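
An editor's sketch of how an application might drive the new ioctl;
the device node path /dev/ftw0 is hypothetical (the node is created
by the driver patch and its name is not shown here), and the field
values are assumptions based on the header comments above.

#include <sys/ioctl.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <misc/ftw.h>           /* the uapi header added by this patch */

int ftw_setup_default(void)
{
        struct ftw_setup_attr attr;
        int fd = open("/dev/ftw0", O_RDWR);     /* hypothetical path */

        if (fd < 0)
                return -1;

        memset(&attr, 0, sizeof(attr));
        attr.version = 1;       /* assumed initial version */
        attr.vas_id = -1;       /* default VAS instance, per the header */

        if (ioctl(fd, FTW_SETUP, &attr) < 0) {
                close(fd);
                return -1;
        }
        return fd;      /* next step would be mmap() of the paste window */
}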



[PATCH v4.14 backport 10/10] powerpc/powernv: Check device-tree for RFI flush settings

2018-01-18 Thread Michael Ellerman
From: Oliver O'Halloran 

commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream.

New device-tree properties are available which tell the hypervisor
settings related to the RFI flush. Use them to determine the
appropriate flush instruction to use, and whether the flush is
required.

Signed-off-by: Oliver O'Halloran 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/powernv/setup.c | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/setup.c 
b/arch/powerpc/platforms/powernv/setup.c
index bfe2aa702973..7966a314d93a 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -36,13 +36,62 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "powernv.h"
 
+static void pnv_setup_rfi_flush(void)
+{
+   struct device_node *np, *fw_features;
+   enum l1d_flush_type type;
+   int enable;
+
+   /* Default to fallback in case fw-features are not available */
+   type = L1D_FLUSH_FALLBACK;
+   enable = 1;
+
+   np = of_find_node_by_name(NULL, "ibm,opal");
+   fw_features = of_get_child_by_name(np, "fw-features");
+   of_node_put(np);
+
+   if (fw_features) {
+   np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
+   if (np && of_property_read_bool(np, "enabled"))
+   type = L1D_FLUSH_MTTRIG;
+
+   of_node_put(np);
+
+   np = of_get_child_by_name(fw_features, 
"inst-l1d-flush-ori30,30,0");
+   if (np && of_property_read_bool(np, "enabled"))
+   type = L1D_FLUSH_ORI;
+
+   of_node_put(np);
+
+   /* Enable unless firmware says NOT to */
+   enable = 2;
+   np = of_get_child_by_name(fw_features, 
"needs-l1d-flush-msr-hv-1-to-0");
+   if (np && of_property_read_bool(np, "disabled"))
+   enable--;
+
+   of_node_put(np);
+
+   np = of_get_child_by_name(fw_features, 
"needs-l1d-flush-msr-pr-0-to-1");
+   if (np && of_property_read_bool(np, "disabled"))
+   enable--;
+
+   of_node_put(np);
+   of_node_put(fw_features);
+   }
+
+   setup_rfi_flush(type, enable > 0);
+}
+
 static void __init pnv_setup_arch(void)
 {
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
 
+   pnv_setup_rfi_flush();
+
/* Initialize SMP */
pnv_smp_init();
 
-- 
2.14.3



[PATCH v4.14 backport 09/10] powerpc/pseries: Query hypervisor for RFI flush settings

2018-01-18 Thread Michael Ellerman
From: Michael Neuling 

commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream.

A new hypervisor call is available which tells the guest settings
related to the RFI flush. Use it to query the appropriate flush
instruction(s), and whether the flush is required.

Signed-off-by: Michael Neuling 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/pseries/setup.c | 35 ++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index a8531e012658..ae4f596273b5 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,6 +459,39 @@ static void __init find_and_init_phbs(void)
of_pci_check_probe_only();
 }
 
+static void pseries_setup_rfi_flush(void)
+{
+   struct h_cpu_char_result result;
+   enum l1d_flush_type types;
+   bool enable;
+   long rc;
+
+   /* Enable by default */
+   enable = true;
+
+   rc = plpar_get_cpu_characteristics(&result);
+   if (rc == H_SUCCESS) {
+   types = L1D_FLUSH_NONE;
+
+   if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+   types |= L1D_FLUSH_MTTRIG;
+   if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+   types |= L1D_FLUSH_ORI;
+
+   /* Use fallback if nothing set in hcall */
+   if (types == L1D_FLUSH_NONE)
+   types = L1D_FLUSH_FALLBACK;
+
+   if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+   enable = false;
+   } else {
+   /* Default to fallback in case the hcall is not available */
+   types = L1D_FLUSH_FALLBACK;
+   }
+
+   setup_rfi_flush(types, enable);
+}
+
 static void __init pSeries_setup_arch(void)
 {
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -476,6 +509,8 @@ static void __init pSeries_setup_arch(void)
 
fwnmi_init();
 
+   pseries_setup_rfi_flush();
+
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);
 
-- 
2.14.3



[PATCH v4.14 backport 08/10] powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti

2018-01-18 Thread Michael Ellerman
commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream.

Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.

We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate
everyone about a different command line option.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/setup_64.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 79e81a783547..935059cb9e40 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -788,8 +788,29 @@ early_initcall(disable_hardlockup_detector);
 #ifdef CONFIG_PPC_BOOK3S_64
 static enum l1d_flush_type enabled_flush_types;
 static void *l1d_flush_fallback_area;
+static bool no_rfi_flush;
 bool rfi_flush;
 
+static int __init handle_no_rfi_flush(char *p)
+{
+   pr_info("rfi-flush: disabled on command line.");
+   no_rfi_flush = true;
+   return 0;
+}
+early_param("no_rfi_flush", handle_no_rfi_flush);
+
+/*
+ * The RFI flush is not KPTI, but because users will see doco that says to use
+ * nopti we hijack that option here to also disable the RFI flush.
+ */
+static int __init handle_no_pti(char *p)
+{
+   pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");
+   handle_no_rfi_flush(NULL);
+   return 0;
+}
+early_param("nopti", handle_no_pti);
+
 static void do_nothing(void *unused)
 {
/*
@@ -860,6 +881,7 @@ void __init setup_rfi_flush(enum l1d_flush_type types, bool 
enable)
 
enabled_flush_types = types;
 
-   rfi_flush_enable(enable);
+   if (!no_rfi_flush)
+   rfi_flush_enable(enable);
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
-- 
2.14.3



[PATCH v4.14 backport 07/10] powerpc/64s: Add support for RFI flush of L1-D cache

2018-01-18 Thread Michael Ellerman
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream.

On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.

This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.

The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.

In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.

In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.

For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.

In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, back ports testing etc.
Many thanks to all of them.

Tested-by: Jon Masters 
Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64s.h  | 40 ---
 arch/powerpc/include/asm/feature-fixups.h | 13 +
 arch/powerpc/include/asm/paca.h   | 10 
 arch/powerpc/include/asm/setup.h  | 13 +
 arch/powerpc/kernel/asm-offsets.c |  5 ++
 arch/powerpc/kernel/exceptions-64s.S  | 84 +++
 arch/powerpc/kernel/setup_64.c| 79 +
 arch/powerpc/kernel/vmlinux.lds.S |  9 
 arch/powerpc/lib/feature-fixups.c | 41 +++
 9 files changed, 286 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 0c8863ff65f7..ccf10c2f8899 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,34 +69,58 @@
  */
 #define EX_R3  EX_DAR
 
-/* Macros for annotating the expected destination of (h)rfid */
+/*
+ * Macros for annotating the expected destination of (h)rfid
+ *
+ * The nop instructions allow us to insert one or more instructions to flush 
the
+ * L1-D cache when returning to userspace or a guest.
+ */
+#define RFI_FLUSH_SLOT \
+   RFI_FLUSH_FIXUP_SECTION;\
+   nop;\
+   nop;\
+   nop
 
 #define RFI_TO_KERNEL  \
rfid
 
 #define RFI_TO_USER\
-   rfid
+   RFI_FLUSH_SLOT; \
+   rfid;   \
+   b   rfi_flush_fallback
 
 #define RFI_TO_USER_OR_KERNEL  \
-   rfid
+   RFI_FLUSH_SLOT; \
+   rfid;   \
+   b   rfi_flush_fallback
 
 #define RFI_TO_GUEST   \
-   rfid
+   RFI_FLUSH_SLOT; \
+   rfid;   \
+   b   rfi_flush_fallback
 
 #define HRFI_TO_KERNEL \
hrfid
 
 #define HRFI_TO_USER   \
-   hrfid
+   RFI_FLUSH_SLOT; \
+   hrfid;  \
+   b   hrfi_flush_fallback
 
 #define HRFI_TO_USER_OR_KERNEL \
-   hrfid
+   RFI_FLUSH_SLOT; 

[PATCH v4.14 backport 06/10] powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL

2018-01-18 Thread Michael Ellerman
From: Nicholas Piggin 

commit c7305645eb0c1621351cfc104038831ae87c0053 upstream.

In the SLB miss handler we may be returning to user or kernel. We need
to add a check early on and save the result in the cr4 register, and
then we bifurcate the return path based on that.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/exceptions-64s.S | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 85795d41f6ca..845291357f0b 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -596,6 +596,9 @@ EXC_COMMON_BEGIN(slb_miss_common)
stw r9,PACA_EXSLB+EX_CCR(r13)   /* save CR in exc. frame */
std r10,PACA_EXSLB+EX_LR(r13)   /* save LR */
 
+   andi.   r9,r11,MSR_PR   // Check for exception from userspace
+   cmpdi   cr4,r9,MSR_PR   // And save the result in CR4 for later
+
/*
 * Test MSR_RI before calling slb_allocate_realmode, because the
 * MSR in r11 gets clobbered. However we still want to allocate
@@ -622,9 +625,32 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
 
/* All done -- return from exception. */
 
+   bne cr4,1f  /* returning to kernel */
+
+.machine   push
+.machine   "power4"
+   mtcrf   0x80,r9
+   mtcrf   0x08,r9 /* MSR[PR] indication is in cr4 */
+   mtcrf   0x04,r9 /* MSR[RI] indication is in cr5 */
+   mtcrf   0x02,r9 /* I/D indication is in cr6 */
+   mtcrf   0x01,r9 /* slb_allocate uses cr0 and cr7 */
+.machine   pop
+
+   RESTORE_CTR(r9, PACA_EXSLB)
+   RESTORE_PPR_PACA(PACA_EXSLB, r9)
+   mr  r3,r12
+   ld  r9,PACA_EXSLB+EX_R9(r13)
+   ld  r10,PACA_EXSLB+EX_R10(r13)
+   ld  r11,PACA_EXSLB+EX_R11(r13)
+   ld  r12,PACA_EXSLB+EX_R12(r13)
+   ld  r13,PACA_EXSLB+EX_R13(r13)
+   RFI_TO_USER
+   b   .   /* prevent speculative execution */
+1:
 .machine   push
 .machine   "power4"
mtcrf   0x80,r9
+   mtcrf   0x08,r9 /* MSR[PR] indication is in cr4 */
mtcrf   0x04,r9 /* MSR[RI] indication is in cr5 */
mtcrf   0x02,r9 /* I/D indication is in cr6 */
mtcrf   0x01,r9 /* slb_allocate uses cr0 and cr7 */
@@ -638,9 +664,10 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
ld  r11,PACA_EXSLB+EX_R11(r13)
ld  r12,PACA_EXSLB+EX_R12(r13)
ld  r13,PACA_EXSLB+EX_R13(r13)
-   rfid
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 
+
 2: std r3,PACA_EXSLB+EX_DAR(r13)
mr  r3,r12
mfspr   r11,SPRN_SRR0
-- 
2.14.3



[PATCH v4.14 backport 05/10] powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL

2018-01-18 Thread Michael Ellerman
From: Nicholas Piggin 

commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream.

Similar to the syscall return path, in fast_exception_return we may be
returning to user or kernel context. We already have a test for that,
because we conditionally restore r13. So use that existing test and
branch, and bifurcate the return based on that.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/entry_64.S | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 150c402728d1..8a8a6d7ddcc6 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -892,7 +892,7 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ACCOUNT_CPU_USER_EXIT(r13, r2, r4)
REST_GPR(13, r1)
-1:
+
mtspr   SPRN_SRR1,r3
 
ld  r2,_CCR(r1)
@@ -905,8 +905,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld  r3,GPR3(r1)
ld  r4,GPR4(r1)
ld  r1,GPR1(r1)
+   RFI_TO_USER
+   b   .   /* prevent speculative execution */
+
+1: mtspr   SPRN_SRR1,r3
+
+   ld  r2,_CCR(r1)
+   mtcrf   0xFF,r2
+   ld  r2,_NIP(r1)
+   mtspr   SPRN_SRR0,r2
 
-   rfid
+   ld  r0,GPR0(r1)
+   ld  r2,GPR2(r1)
+   ld  r3,GPR3(r1)
+   ld  r4,GPR4(r1)
+   ld  r1,GPR1(r1)
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 
 #endif /* CONFIG_PPC_BOOK3E */
-- 
2.14.3



[PATCH v4.14 backport 04/10] powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL

2018-01-18 Thread Michael Ellerman
From: Nicholas Piggin 

commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream.

In the syscall exit path we may be returning to user or kernel
context. We already have a test for that, because we conditionally
restore r13. So use that existing test and branch, and bifurcate the
return based on that.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/entry_64.S | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 5ae0a435417d..150c402728d1 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -267,13 +267,23 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
ld  r13,GPR13(r1)   /* only restore r13 if returning to usermode */
+   ld  r2,GPR2(r1)
+   ld  r1,GPR1(r1)
+   mtlrr4
+   mtcrr5
+   mtspr   SPRN_SRR0,r7
+   mtspr   SPRN_SRR1,r8
+   RFI_TO_USER
+   b   .   /* prevent speculative execution */
+
+   /* exit to kernel */
 1: ld  r2,GPR2(r1)
ld  r1,GPR1(r1)
mtlrr4
mtcrr5
mtspr   SPRN_SRR0,r7
mtspr   SPRN_SRR1,r8
-   RFI
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 
 .Lsyscall_error:
-- 
2.14.3



[PATCH v4.14 backport 03/10] powerpc/64s: Simple RFI macro conversions

2018-01-18 Thread Michael Ellerman
From: Nicholas Piggin 

commit 222f20f140623ef6033491d0103ee0875fe87d35 upstream.

This commit does simple conversions of rfi/rfid to the new macros that
include the expected destination context. By simple we mean cases
where there is a single well known destination context, and it's
simply a matter of substituting the instruction for the appropriate
macro.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64s.h |  4 ++--
 arch/powerpc/kernel/entry_64.S   | 14 +-
 arch/powerpc/kernel/exceptions-64s.S | 22 +++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  7 +++
 arch/powerpc/kvm/book3s_rmhandlers.S |  7 +--
 arch/powerpc/kvm/book3s_segment.S|  4 ++--
 6 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 6d65a561b0ee..0c8863ff65f7 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -242,7 +242,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
mtspr   SPRN_##h##SRR0,r12; \
mfspr   r12,SPRN_##h##SRR1; /* and SRR1 */  \
mtspr   SPRN_##h##SRR1,r10; \
-   h##rfid;\
+   h##RFI_TO_KERNEL;   \
b   .   /* prevent speculative execution */
 #define EXCEPTION_PROLOG_PSERIES_1(label, h)   \
__EXCEPTION_PROLOG_PSERIES_1(label, h)
@@ -256,7 +256,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
mtspr   SPRN_##h##SRR0,r12; \
mfspr   r12,SPRN_##h##SRR1; /* and SRR1 */  \
mtspr   SPRN_##h##SRR1,r10; \
-   h##rfid;\
+   h##RFI_TO_KERNEL;   \
b   .   /* prevent speculative execution */
 
 #define EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)  \
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 4a0fd4f40245..5ae0a435417d 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -37,6 +37,11 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_PPC_BOOK3S
+#include 
+#else
+#include 
+#endif
 
 /*
  * System calls.
@@ -397,8 +402,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
mtmsrd  r10, 1
mtspr   SPRN_SRR0, r11
mtspr   SPRN_SRR1, r12
-
-   rfid
+   RFI_TO_USER
b   .   /* prevent speculative execution */
 #endif
 _ASM_NOKPROBE_SYMBOL(system_call_common);
@@ -1073,7 +1077,7 @@ __enter_rtas:

mtspr   SPRN_SRR0,r5
mtspr   SPRN_SRR1,r6
-   rfid
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 
 rtas_return_loc:
@@ -1098,7 +1102,7 @@ rtas_return_loc:
 
mtspr   SPRN_SRR0,r3
mtspr   SPRN_SRR1,r4
-   rfid
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 _ASM_NOKPROBE_SYMBOL(__enter_rtas)
 _ASM_NOKPROBE_SYMBOL(rtas_return_loc)
@@ -1171,7 +1175,7 @@ _GLOBAL(enter_prom)
LOAD_REG_IMMEDIATE(r12, MSR_SF | MSR_ISF | MSR_LE)
andcr11,r11,r12
mtsrr1  r11
-   rfid
+   RFI_TO_KERNEL
 #endif /* CONFIG_PPC_BOOK3E */
 
 1: /* Return from OF */
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 06598142d755..85795d41f6ca 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -254,7 +254,7 @@ BEGIN_FTR_SECTION
LOAD_HANDLER(r12, machine_check_handle_early)
 1: mtspr   SPRN_SRR0,r12
mtspr   SPRN_SRR1,r11
-   rfid
+   RFI_TO_KERNEL
b   .   /* prevent speculative execution */
 2:
/* Stack overflow. Stay on emergency stack and panic.
@@ -443,7 +443,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
li  r3,MSR_ME
andcr10,r10,r3  /* Turn off MSR_ME */
mtspr   SPRN_SRR1,r10
-   rfid
+   RFI_TO_KERNEL
b   .
 2:
/*
@@ -461,7 +461,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 */
bl  machine_check_queue_event
MACHINE_CHECK_HANDLER_WINDUP
-   rfid
+   RFI_TO_USER_OR_KERNEL
 9:
/* Deliver the machine check to host kernel in V mode. */
MACHINE_CHECK_HANDLER_WINDUP
@@ -649,7 +649,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
mtspr   SPRN_SRR0,r10
ld  r10,PACAKMSR(r13)
mtspr   SPRN_SRR1,r10
-   rfid
+   RFI_TO_KERNEL
b   .
 
 8: std r3,PACA_EXSLB+EX_DAR(r13)
@@ -660,7 +660,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
  

[PATCH v4.14 backport 01/10] powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper

2018-01-18 Thread Michael Ellerman
From: Michael Neuling 

commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream.

A new hypervisor call has been defined to communicate various
characteristics of the CPU to guests. Add definitions for the hcall
number, flags and a wrapper function.

Signed-off-by: Michael Neuling 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/hvcall.h | 17 +
 arch/powerpc/include/asm/plpar_wrappers.h | 14 ++
 2 files changed, 31 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index a409177be8bd..f0461618bf7b 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -241,6 +241,7 @@
 #define H_GET_HCA_INFO  0x1B8
 #define H_GET_PERF_COUNT0x1BC
 #define H_MANAGE_TRACE  0x1C0
+#define H_GET_CPU_CHARACTERISTICS 0x1C8
 #define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
 #define H_QUERY_INT_STATE   0x1E4
 #define H_POLL_PENDING 0x1D8
@@ -330,6 +331,17 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS  -2
 /* >= 0 values are CPU number */
 
+/* H_GET_CPU_CHARACTERISTICS return values */
+#define H_CPU_CHAR_SPEC_BAR_ORI31  (1ull << 63) // IBM bit 0
+#define H_CPU_CHAR_BCCTRL_SERIALISED   (1ull << 62) // IBM bit 1
+#define H_CPU_CHAR_L1D_FLUSH_ORI30 (1ull << 61) // IBM bit 2
+#define H_CPU_CHAR_L1D_FLUSH_TRIG2 (1ull << 60) // IBM bit 3
+#define H_CPU_CHAR_L1D_THREAD_PRIV (1ull << 59) // IBM bit 4
+
+#define H_CPU_BEHAV_FAVOUR_SECURITY(1ull << 63) // IBM bit 0
+#define H_CPU_BEHAV_L1D_FLUSH_PR   (1ull << 62) // IBM bit 1
+#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR  (1ull << 61) // IBM bit 2
+
 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK 0x18
 #define PROC_TABLE_DEREG   0x10
@@ -436,6 +448,11 @@ static inline unsigned int get_longbusy_msecs(int longbusy_rc)
}
 }
 
+struct h_cpu_char_result {
+   u64 character;
+   u64 behaviour;
+};
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_HVCALL_H */
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 7f01b22fa6cb..55eddf50d149 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -326,4 +326,18 @@ static inline long plapr_signal_sys_reset(long cpu)
return plpar_hcall_norets(H_SIGNAL_SYS_RESET, cpu);
 }
 
+static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
+{
+   unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+   long rc;
+
+   rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);
+   if (rc == H_SUCCESS) {
+   p->character = retbuf[0];
+   p->behaviour = retbuf[1];
+   }
+
+   return rc;
+}
+
 #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
-- 
2.14.3
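
As a rough illustration of how a caller is expected to use the new
wrapper and flag definitions, consider the sketch below. It is not part
of the patch: the check_cpu_characteristics() name and the pr_info()
reporting are purely illustrative; only the hcall wrapper and the flag
names come from the patch itself.

#include <linux/printk.h>
#include <asm/hvcall.h>
#include <asm/plpar_wrappers.h>

static void __init check_cpu_characteristics(void)
{
	struct h_cpu_char_result result;
	long rc;

	rc = plpar_get_cpu_characteristics(&result);
	if (rc != H_SUCCESS) {
		/* Older hypervisors do not implement the hcall at all. */
		pr_info("H_GET_CPU_CHARACTERISTICS not supported\n");
		return;
	}

	/* "character" reports what the CPU/hypervisor can do ... */
	if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
		pr_info("L1D flush via ori 30,30,0 is available\n");

	/* ... "behaviour" reports what the OS is expected to do. */
	if (result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)
		pr_info("L1D flush wanted when returning to userspace\n");
}

The real consumers of these bits arrive later in the series, in the
pseries setup code; the sketch above only demonstrates the calling
convention.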



[PATCH v4.14 backport 02/10] powerpc/64: Add macros for annotating the destination of rfid/hrfid

2018-01-18 Thread Michael Ellerman
From: Nicholas Piggin 

commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream.

The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is
used for switching from the kernel to userspace, and from the
hypervisor to the guest kernel. However it can and is also used for
other transitions, eg. from real mode kernel code to virtual mode
kernel code, and it's not always clear from the code what the
destination context is.

To make it clearer when reading the code, add macros which encode the
expected destination context.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64e.h |  6 ++
 arch/powerpc/include/asm/exception-64s.h | 29 +
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index a703452d67b6..555e22d5e07f 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -209,5 +209,11 @@ exc_##label##_book3e:
ori r3,r3,vector_offset@l;  \
mtspr   SPRN_IVOR##vector_number,r3;
 
+#define RFI_TO_KERNEL  \
+   rfi
+
+#define RFI_TO_USER\
+   rfi
+
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
 
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9a318973af05..6d65a561b0ee 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,6 +69,35 @@
  */
 #define EX_R3  EX_DAR
 
+/* Macros for annotating the expected destination of (h)rfid */
+
+#define RFI_TO_KERNEL  \
+   rfid
+
+#define RFI_TO_USER\
+   rfid
+
+#define RFI_TO_USER_OR_KERNEL  \
+   rfid
+
+#define RFI_TO_GUEST   \
+   rfid
+
+#define HRFI_TO_KERNEL \
+   hrfid
+
+#define HRFI_TO_USER   \
+   hrfid
+
+#define HRFI_TO_USER_OR_KERNEL \
+   hrfid
+
+#define HRFI_TO_GUEST  \
+   hrfid
+
+#define HRFI_TO_UNKNOWN						\
+   hrfid
+
 #ifdef CONFIG_RELOCATABLE
 #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)   \
mfspr   r11,SPRN_##h##SRR0; /* save SRR0 */ \
-- 
2.14.3
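
The annotations are no-ops here, and that is the point: once every rfid
states its destination, a follow-up patch can redefine just the
user-bound variants to do extra work on kernel-to-user transitions. A
simplified sketch of that shape follows; the RFI_FLUSH_SLOT name and
the plain nop slot are illustrative stand-ins for whatever the real
series installs (the actual slot is patched at runtime):

/* Sketch only: a patchable nop slot ahead of the return to userspace. */
#define RFI_FLUSH_SLOT							\
	nop;								\
	nop;								\
	nop

#define RFI_TO_USER							\
	RFI_FLUSH_SLOT;							\
	rfid;								\
	b	rfi_flush_fallback

#define RFI_TO_KERNEL							\
	rfid

Because each rfid was first annotated with its destination context, the
extra flush can be confined to the exits that actually cross the
kernel/user boundary, leaving kernel-to-kernel returns untouched.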



[PATCH] watchdog: core: make sure the watchdog_worker is not deferred

2018-01-18 Thread Christophe Leroy
commit 4cd13c21b207e ("softirq: Let ksoftirqd do its job") has the
effect of deferring timer handling under high CPU load, hence
delaying the delayed work although the worker runs with high
realtime priority.

As hrtimers are not managed by softirqs, this patch replaces the
delayed work with a plain work item and uses an hrtimer to schedule
that work.

Signed-off-by: Christophe Leroy 
---
 drivers/watchdog/watchdog_dev.c | 86 +
 1 file changed, 52 insertions(+), 34 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 68bc29e6e79e..ffbdc4642ea5 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -36,10 +36,10 @@
 #include <linux/errno.h>	/* For the -ENODEV/... values */
 #include <linux/fs.h>	/* For file operations */
 #include <linux/init.h>	/* For __init/__exit/... */
-#include <linux/jiffies.h>	/* For timeout functions */
+#include <linux/hrtimer.h>	/* For hrtimers */
 #include <linux/kernel.h>	/* For printk/panic/... */
 #include <linux/kref.h>	/* For data references */
-#include <linux/kthread.h>	/* For kthread_delayed_work */
+#include <linux/kthread.h>	/* For kthread_work */
 #include <linux/miscdevice.h>	/* For handling misc devices */
 #include <linux/module.h>	/* For module stuff/... */
 #include <linux/mutex.h>	/* For mutexes */
@@ -67,9 +67,10 @@ struct watchdog_core_data {
struct cdev cdev;
struct watchdog_device *wdd;
struct mutex lock;
-   unsigned long last_keepalive;
-   unsigned long last_hw_keepalive;
-   struct kthread_delayed_work work;
+   ktime_t last_keepalive;
+   ktime_t last_hw_keepalive;
+   struct hrtimer timer;
+   struct kthread_work work;
unsigned long status;   /* Internal status bits */
 #define _WDOG_DEV_OPEN 0   /* Opened ? */
 #define _WDOG_ALLOW_RELEASE1   /* Did we receive the magic char ? */
@@ -109,18 +110,19 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
 }
 
-static long watchdog_next_keepalive(struct watchdog_device *wdd)
+static ktime_t watchdog_next_keepalive(struct watchdog_device *wdd)
 {
struct watchdog_core_data *wd_data = wdd->wd_data;
unsigned int timeout_ms = wdd->timeout * 1000;
-   unsigned long keepalive_interval;
-   unsigned long last_heartbeat;
-   unsigned long virt_timeout;
+   ktime_t keepalive_interval;
+   ktime_t last_heartbeat, latest_heartbeat;
+   ktime_t virt_timeout;
unsigned int hw_heartbeat_ms;
 
-   virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
+   virt_timeout = ktime_add(wd_data->last_keepalive,
+ms_to_ktime(timeout_ms));
hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
-   keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
+   keepalive_interval = ms_to_ktime(hw_heartbeat_ms / 2);
 
if (!watchdog_active(wdd))
return keepalive_interval;
@@ -130,8 +132,11 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
 * after the most recent ping from userspace, the last
 * worker ping has to come in hw_heartbeat_ms before this timeout.
 */
-   last_heartbeat = virt_timeout - msecs_to_jiffies(hw_heartbeat_ms);
-   return min_t(long, last_heartbeat - jiffies, keepalive_interval);
+   last_heartbeat = ktime_sub(virt_timeout, ms_to_ktime(hw_heartbeat_ms));
+   latest_heartbeat = ktime_sub(last_heartbeat, ktime_get());
+   if (ktime_before(latest_heartbeat, keepalive_interval))
+   return latest_heartbeat;
+   return keepalive_interval;
 }
 
 static inline void watchdog_update_worker(struct watchdog_device *wdd)
@@ -139,30 +144,33 @@ static inline void watchdog_update_worker(struct watchdog_device *wdd)
struct watchdog_core_data *wd_data = wdd->wd_data;
 
if (watchdog_need_worker(wdd)) {
-   long t = watchdog_next_keepalive(wdd);
+   ktime_t t = watchdog_next_keepalive(wdd);
 
if (t > 0)
-   kthread_mod_delayed_work(watchdog_kworker,
-&wd_data->work, t);
+   hrtimer_start(&wd_data->timer, t, HRTIMER_MODE_REL);
} else {
-   kthread_cancel_delayed_work_sync(&wd_data->work);
+   hrtimer_cancel(&wd_data->timer);
}
 }
 
 static int __watchdog_ping(struct watchdog_device *wdd)
 {
struct watchdog_core_data *wd_data = wdd->wd_data;
-   unsigned long earliest_keepalive = wd_data->last_hw_keepalive +
-   msecs_to_jiffies(wdd->min_hw_heartbeat_ms);
+   ktime_t earliest_keepalive, now;
int err;
 
-   if (time_is_after_jiffies(earliest_keepalive)) {
-   kthread_mod_delayed_work(watchdog_kworker, &wd_data->work,
-earliest
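
The mail is truncated above, but the remaining piece of the mechanism
is small: the hrtimer fires in hard-IRQ context, where the ping must
not run, so its callback only queues the plain kthread work on the
existing worker. A minimal sketch of such a callback, reusing the field
names from the patch (the exact body in the submitted patch may differ):

static enum hrtimer_restart watchdog_timer_expired(struct hrtimer *timer)
{
	struct watchdog_core_data *wd_data =
		container_of(timer, struct watchdog_core_data, timer);

	/* hrtimer callbacks run in hard-IRQ context: defer the actual
	 * ping to the high-priority kthread worker. */
	kthread_queue_work(watchdog_kworker, &wd_data->work);

	return HRTIMER_NORESTART;
}

Since hrtimers are driven directly from the timer interrupt rather than
from the timer softirq, a loaded system that pushes softirq processing
to ksoftirqd can no longer starve the keepalive ping.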

Re: DPAA Ethernet traffic troubles with Linux kernel

2018-01-18 Thread Joakim Tjernlund
On Thu, 1970-01-01 at 00:00 +, Joakim Tjernlund wrote:
> On Thu, 1970-01-01 at 00:00 +, Madalin-cristian Bucur wrote:
> > 
> > 
> > > -Original Message-
> > > From: Joakim Tjernlund [mailto:joakim.tjernl...@infinera.com]
> > > Sent: Tuesday, January 16, 2018 7:58 PM
> > > To: and...@lunn.ch
> > > Subject: Re: DPAA Ethernet traffic troubles with Linux kernel
> > > 
> > > On Thu, 1970-01-01 at 00:00 +, Andrew Lunn wrote:
> > > > 
> > > > Hi Joakim
> > > > 
> > > > You appear to be using an old kernel. Take a look at:
> > > 
> > > Not really, I am using 4.14.x and I don't think that is old. Seems like
> > > this patch hasn't been sent to 4.14.x.
> > > 
> > > I wonder if I might be missing something else; we just moved to 4.14
> > > and noticed that all our fixed PHYs are non-functioning:
> > > fsl_mac ffe4e2000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> > > fsl_mac dpaa-ethernet.0: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.0 failed with error -16
> > > fsl_mac ffe4e4000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> > > fsl_mac dpaa-ethernet.1: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.1 failed with error -16
> > > fsl_mac ffe4e6000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> > > fsl_mac dpaa-ethernet.2: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.2 failed with error -16
> > > fsl_mac ffe4e8000.ethernet: FMan MEMAC
> > > fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> > > fsl_mac dpaa-ethernet.3: __devm_request_mem_region(mac) failed
> > > fsl_mac: probe of dpaa-ethernet.3 failed with error -16
> > > 
> > > Feels like FMan still thinks there are real PHYs there?
> > 
> > Hi Joakim,
> > 
> > These errors are issued when the same MAC node is probed a second
> > time. The issue was introduced by this commit:
> > 
> > commit 4d8ee1935bcd666360311dfdadeee235d682d69a
> > Author: Florian Fainelli 
> > Date: Tue Aug 22 15:24:47 2017 -0700
> > fsl/man: Inherit parent device and of_node
> > 
> > and was later addressed by this patch set:
> > 
> > http://patchwork.ozlabs.org/project/netdev/list/?series=8462&state=*
> > 
> > Even with these errors printed, all is working fine; it's just the
> > second probing that fails. Adding the latter patches or reverting
> > the one above makes the error prints disappear.
> > 
> > Madalin
> 
> Ahh, now it starts to look better, reverting "fsl/man: Inherit parent device 
> and of_node" on 4.14 gives:
> libphy: Fixed MDIO Bus: probed
> tun: Universal TUN/TAP device driver, 1.6
> libphy: Freescale XGMAC MDIO Bus: probed
> iommu: Adding device ffe488000.port to group 10
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e1000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe489000.port to group 22
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e3000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48a000.port to group 23
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e5000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48b000.port to group 24
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3
> iommu: Adding device ffe48c000.port to group 25
> libphy: Freescale XGMAC MDIO Bus: probed
> mdio_bus ffe4e9000: Error while reading PHY0 reg at 3.3
> fsl_mac ffe4e2000.ethernet: FMan MEMAC
> fsl_mac ffe4e2000.ethernet: FMan MAC address: 00:06:9c:0b:06:20
> fsl_mac ffe4e4000.ethernet: FMan MEMAC
> fsl_mac ffe4e4000.ethernet: FMan MAC address: 00:06:9c:0b:06:21
> fsl_mac ffe4e6000.ethernet: FMan MEMAC
> fsl_mac ffe4e6000.ethernet: FMan MAC address: 00:06:9c:0b:06:22
> fsl_mac ffe4e8000.ethernet: FMan MEMAC
> fsl_mac ffe4e8000.ethernet: FMan MAC address: 00:06:9c:0b:06:23
> fsl_mac ffe4e.ethernet: FMan MEMAC
> fsl_mac ffe4e.ethernet: FMan MAC address: 00:06:9c:0b:06:1f
> fsl_dpa dpaa-ethernet.0 eth0: Probed interface eth0
> fsl_dpa dpaa-ethernet.1 eth1: Probed interface eth1
> fsl_dpa dpaa-ethernet.2 eth2: Probed interface eth2
> fsl_dpa dpaa-ethernet.3 eth3: Probed interface eth3
> fsl_dpa dpaa-ethernet.4 eth4: Probed interface eth4
> 
> Still some minor errors: mdio_bus ffe4e7000: Error while reading PHY0 reg at 3.3,
> but this is going the right way (I have not had a chance to try whether
> they work, due to external modules not being ported/ready yet).
> 
> The other patch series is still to be tested, but I already wanted
> to stress the importance of getting all upstream fixes into stable, ASAP.
> You know what they are; I have no idea.

FYI, applied 
http://patchwork.ozlabs.org/project/ne