Re: [PATCH V9 1/3] powernv: powercap: Add support for powercap framework

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 07:54 +0530, Shilpasri G Bhat wrote:
> Adds a generic powercap framework to change the system powercap
> inband through the OPAL-OCC command/response interface.
> 
> Signed-off-by: Shilpasri G Bhat 
> ---
> Changes from V8:
> - Use __pa() while passing pointer in opal call
> - Use mutex_lock_interruptible()
> - Fix error codes returned to user
> - Allocate and add sysfs attributes in a single loop
> 
>  arch/powerpc/include/asm/opal-api.h            |   5 +-
>  arch/powerpc/include/asm/opal.h                |   4 +
>  arch/powerpc/platforms/powernv/Makefile        |   2 +-
>  arch/powerpc/platforms/powernv/opal-powercap.c | 243 +
>  arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
>  arch/powerpc/platforms/powernv/opal.c          |   4 +
>  6 files changed, 258 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/platforms/powernv/opal-powercap.c
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h 
> b/arch/powerpc/include/asm/opal-api.h
> index 3130a73..c3e0c4a 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -42,6 +42,7 @@
>  #define OPAL_I2C_STOP_ERR      -24
>  #define OPAL_XIVE_PROVISIONING -31
>  #define OPAL_XIVE_FREE_ACTIVE  -32
> +#define OPAL_TIMEOUT           -33
>  
>  /* API Tokens (in r0) */
>  #define OPAL_INVALID_CALL   -1
> @@ -190,7 +191,9 @@
>  #define OPAL_NPU_INIT_CONTEXT    146
>  #define OPAL_NPU_DESTROY_CONTEXT 147
>  #define OPAL_NPU_MAP_LPAR        148
> -#define OPAL_LAST                148
> +#define OPAL_GET_POWERCAP        152
> +#define OPAL_SET_POWERCAP        153
> +#define OPAL_LAST                153
>  
>  /* Device tree flags */
>  
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 588fb1c..ec2087c 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -267,6 +267,8 @@ int64_t opal_xive_set_vp_info(uint64_t vp,
>  int64_t opal_xive_free_irq(uint32_t girq);
>  int64_t opal_xive_sync(uint32_t type, uint32_t id);
>  int64_t opal_xive_dump(uint32_t type, uint32_t id);
> +int opal_get_powercap(u32 handle, int token, u32 *pcap);
> +int opal_set_powercap(u32 handle, int token, u32 pcap);
>  
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
> @@ -345,6 +347,8 @@ static inline int opal_get_async_rc(struct opal_msg msg)
>  
>  void opal_wake_poller(void);
>  
> +void opal_powercap_init(void);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* _ASM_POWERPC_OPAL_H */
> diff --git a/arch/powerpc/platforms/powernv/Makefile 
> b/arch/powerpc/platforms/powernv/Makefile
> index b5d98cb..e79f806 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -2,7 +2,7 @@ obj-y += setup.o opal-wrappers.o opal.o opal-async.o idle.o
>  obj-y += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
>  obj-y += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
>  obj-y += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
> -obj-y += opal-kmsg.o
> +obj-y += opal-kmsg.o opal-powercap.o
>  
>  obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o
>  obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o
> diff --git a/arch/powerpc/platforms/powernv/opal-powercap.c 
> b/arch/powerpc/platforms/powernv/opal-powercap.c
> new file mode 100644
> index 0000000..9be5093
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/opal-powercap.c
> @@ -0,0 +1,243 @@
> +/*
> + * PowerNV OPAL Powercap interface
> + *
> + * Copyright 2017 IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#define pr_fmt(fmt) "opal-powercap: " fmt
> +
> +#include <linux/of.h>
> +#include <linux/kobject.h>
> +#include <linux/slab.h>
> +
> +#include <asm/opal.h>
> +
> +DEFINE_MUTEX(powercap_mutex);
> +
> +static struct kobject *powercap_kobj;
> +
> +struct powercap_attr {
> + u32 handle;
> + struct kobj_attribute attr;
> +};
> +
> +static struct pcap {
> + struct attribute_group pg;
> + struct powercap_attr *pattrs;
> +} *pcaps;
> +
> +static ssize_t powercap_show(struct kobject *kobj, struct kobj_attribute 
> *attr,
> +  char *buf)
> +{
> + struct powercap_attr *pcap_attr = container_of(attr,
> + struct powercap_attr, attr);
> + struct opal_msg msg;
> + u32 pcap;
> + int ret, token;
> +
> + token = opal_async_get_token_interruptible();
> + if (token < 0) {
> +

Re: [PATCH v4 2/5] powerpc/lib/sstep: Add popcnt instruction emulation

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 10:58 +1000, Matt Brown wrote:
> This adds emulations for the popcntb, popcntw, and popcntd instructions.
> Tested for correctness against the popcnt{b,w,d} instructions on ppc64le.
> 
> Signed-off-by: Matt Brown 

Unlike the rest of this series, it isn't immediately clear that this
one is correct; we're definitely on the other side of the optimisation
vs readability line. It looks like it is correct, but perhaps add some
comments to clarify.
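
For the record, the mask steps are the classic divide-and-conquer
(side-ways addition) popcount. A commented user-space rendering of the
same sequence (my sketch, not part of the patch):

#include <assert.h>

/* Same side-ways addition as do_popcnt(), with the steps spelled out. */
static unsigned long long popcnt64(unsigned long long out)
{
        /* Each 2-bit field becomes the popcount of those two bits. */
        out -= (out >> 1) & 0x5555555555555555;
        /* Each 4-bit field becomes the sum of its two 2-bit counts. */
        out = (0x3333333333333333 & out) + (0x3333333333333333 & (out >> 2));
        /* Each byte now holds its own popcount (at most 8). */
        out = (out + (out >> 4)) & 0x0f0f0f0f0f0f0f0f;
        /* popcntb stops here: one count per byte. */
        out += out >> 8;        /* fold to per-halfword sums */
        out += out >> 16;       /* fold to per-word sums, low byte of each word */
        /* popcntw stops here, masking with 0x0000003f0000003f. */
        return (out + (out >> 32)) & 0x7f;      /* full 64-bit count, max 64 */
}

int main(void)
{
        assert(popcnt64(0) == 0);
        assert(popcnt64(~0ULL) == 64);
        assert(popcnt64(0xf0f0) == 8);
        return 0;
}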

Otherwise,

Reviewed-by: Cyril Bur 

> ---
> v4:
>   - change ifdef macro from __powerpc64__ to CONFIG_PPC64
>   - slight optimisations 
>   (now identical to the popcntb implementation in kernel/traps.c)
> v3:
>   - optimised using the Giles-Miller method of side-ways addition
> v2:
>   - fixed opcodes
>   - fixed typecasting
>   - fixed bitshifting error for both 32 and 64bit arch
> ---
>  arch/powerpc/lib/sstep.c | 42 +-
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 87d277f..2fd7377 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -612,6 +612,34 @@ static nokprobe_inline void do_cmpb(struct pt_regs 
> *regs, unsigned long v1,
>   regs->gpr[rd] = out_val;
>  }
>  
> +/*
> + * The size parameter is used to adjust the equivalent popcnt instruction.
> + * popcntb = 8, popcntw = 32, popcntd = 64
> + */
> +static nokprobe_inline void do_popcnt(struct pt_regs *regs, unsigned long v1,
> + int size, int ra)
> +{
> + unsigned long long out = v1;
> +
> + out -= (out >> 1) & 0x5555555555555555;
> + out = (0x3333333333333333 & out) + (0x3333333333333333 & (out >> 2));
> + out = (out + (out >> 4)) & 0x0f0f0f0f0f0f0f0f;
> +
> + if (size == 8) {/* popcntb */
> + regs->gpr[ra] = out;
> + return;
> + }
> + out += out >> 8;
> + out += out >> 16;
> + if (size == 32) {   /* popcntw */
> + regs->gpr[ra] = out & 0x0000003f0000003f;
> + return;
> + }
> +
> + out = (out + (out >> 32)) & 0x7f;
> + regs->gpr[ra] = out;/* popcntd */
> +}
> +
>  static nokprobe_inline int trap_compare(long v1, long v2)
>  {
>   int ret = 0;
> @@ -1194,6 +1222,10 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   regs->gpr[ra] = regs->gpr[rd] & ~regs->gpr[rb];
>   goto logical_done;
>  
> + case 122:   /* popcntb */
> + do_popcnt(regs, regs->gpr[rd], 8, ra);
> + goto logical_done;
> +
>   case 124:   /* nor */
>   regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
>   goto logical_done;
> @@ -1206,6 +1238,10 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   regs->gpr[ra] = regs->gpr[rd] ^ regs->gpr[rb];
>   goto logical_done;
>  
> + case 378:   /* popcntw */
> + do_popcnt(regs, regs->gpr[rd], 32, ra);
> + goto logical_done;
> +
>   case 412:   /* orc */
>   regs->gpr[ra] = regs->gpr[rd] | ~regs->gpr[rb];
>   goto logical_done;
> @@ -1217,7 +1253,11 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   case 476:   /* nand */
>   regs->gpr[ra] = ~(regs->gpr[rd] & regs->gpr[rb]);
>   goto logical_done;
> -
> +#ifdef CONFIG_PPC64
> + case 506:   /* popcntd */
> + do_popcnt(regs, regs->gpr[rd], 64, ra);
> + goto logical_done;
> +#endif
>   case 922:   /* extsh */
>   regs->gpr[ra] = (signed short) regs->gpr[rd];
>   goto logical_done;


Re: [PATCH v4 5/5] powerpc/lib/sstep: Add isel instruction emulation

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 10:58 +1000, Matt Brown wrote:
> This adds emulation for the isel instruction.
> Tested for correctness against the isel instruction and its extended
> mnemonics (lt, gt, eq) on ppc64le.
> 
> Signed-off-by: Matt Brown 

Reviewed-by: Cyril Bur 

> ---
> v4:
>   - simplify if statement to ternary op
>   (same as isel emulation in kernel/traps.c)
> v2:
>   - fixed opcode
>   - fixed definition to include the 'if RA=0, a=0' clause
>   - fixed ccr bitshifting error
> ---
>  arch/powerpc/lib/sstep.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index af4eef9..473bab5 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1240,6 +1240,14 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>  /*
>   * Logical instructions
>   */
> + case 15:/* isel */
> + mb = (instr >> 6) & 0x1f; /* bc */
> + val = (regs->ccr >> (31 - mb)) & 1;
> + val2 = (ra) ? regs->gpr[ra] : 0;
> +
> + regs->gpr[rd] = (val) ? val2 : regs->gpr[rb];
> + goto logical_done;
> +
>   case 26:/* cntlzw */
>   asm("cntlzw %0,%1" : "=r" (regs->gpr[ra]) :
>   "r" (regs->gpr[rd]));
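
For reference, the semantics being emulated: isel RT,RA,RB,BC yields
GPR[RA] (or the constant 0 when RA is 0) if CR bit BC is set, and
GPR[RB] otherwise. A small user-space model (names are illustrative):

#include <assert.h>
#include <stdint.h>

/* cr is the 32-bit condition register; bc counts bits from the MSB. */
static uint64_t isel_model(uint32_t cr, int bc, int ra,
                           uint64_t gpr_ra, uint64_t gpr_rb)
{
        int bit = (cr >> (31 - bc)) & 1;

        return bit ? (ra ? gpr_ra : 0) : gpr_rb;
}

int main(void)
{
        assert(isel_model(0x80000000, 0, 3, 111, 222) == 111); /* LT set */
        assert(isel_model(0x00000000, 0, 3, 111, 222) == 222); /* LT clear */
        assert(isel_model(0x80000000, 0, 0, 111, 222) == 0);   /* RA=0 -> 0 */
        return 0;
}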


Re: [PATCH v4 4/5] powerpc/lib/sstep: Add prty instruction emulation

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 10:58 +1000, Matt Brown wrote:
> This adds emulation for the prtyw and prtyd instructions.
> Tested for logical correctness against the prtyw and prtyd instructions
> on ppc64le.
> 
> Signed-off-by: Matt Brown 

Reviewed-by: Cyril Bur 

> ---
> v4:
>   - use simpler xor method
> v3:
>   - optimised using the Giles-Miller method of side-ways addition
> v2:
>   - fixed opcodes
>   - fixed bitshifting and typecast errors
>   - merged do_prtyw and do_prtyd into single function
> ---
>  arch/powerpc/lib/sstep.c | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index c9fd613..af4eef9 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -657,6 +657,24 @@ static nokprobe_inline void do_bpermd(struct pt_regs 
> *regs, unsigned long v1,
>   regs->gpr[ra] = perm;
>  }
>  #endif /* CONFIG_PPC64 */
> +/*
> + * The size parameter adjusts the equivalent prty instruction.
> + * prtyw = 32, prtyd = 64
> + */
> +static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
> + int size, int ra)
> +{
> + unsigned long long res = v ^ (v >> 8);
> +
> + res ^= res >> 16;
> + if (size == 32) {   /* prtyw */
> + regs->gpr[ra] = res & 0x0000000100000001;
> + return;
> + }
> +
> + res ^= res >> 32;
> + regs->gpr[ra] = res & 1;    /* prtyd */
> +}
>  
>  static nokprobe_inline int trap_compare(long v1, long v2)
>  {
> @@ -1247,6 +1265,14 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   case 124:   /* nor */
>   regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
>   goto logical_done;
> +
> + case 154:   /* prtyw */
> + do_prty(regs, regs->gpr[rd], 32, ra);
> + goto logical_done;
> +
> + case 186:   /* prtyd */
> + do_prty(regs, regs->gpr[rd], 64, ra);
> + goto logical_done;
>  #ifdef CONFIG_PPC64
>   case 252:   /* bpermd */
>   do_bpermd(regs, regs->gpr[rd], regs->gpr[rb], ra);
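
For reference, prtyw/prtyd compute the parity of the least-significant
bit of each byte, per word or per doubleword respectively; the xor-fold
above collects exactly those bits. A user-space model of the prtyd case
(names are illustrative):

#include <assert.h>
#include <stdint.h>

/* Parity of the low bit of each of the eight bytes of v. */
static uint64_t prtyd_model(uint64_t v)
{
        uint64_t res = v ^ (v >> 8);    /* fold odd bytes onto even bytes */

        res ^= res >> 16;               /* fold halfwords */
        res ^= res >> 32;               /* fold words */
        return res & 1;                 /* xor of all eight byte-low-bits */
}

int main(void)
{
        assert(prtyd_model(0x0000000000000000ULL) == 0);
        assert(prtyd_model(0x0000000000000001ULL) == 1); /* one odd byte */
        assert(prtyd_model(0x0101010101010101ULL) == 0); /* eight odd bytes */
        assert(prtyd_model(0x0101010000000000ULL) == 1); /* three odd bytes */
        return 0;
}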


Re: [PATCH v4 3/5] powerpc/lib/sstep: Add bpermd instruction emulation

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 10:58 +1000, Matt Brown wrote:
> This adds emulation for the bpermd instruction.
> Tested for correctness against the bpermd instruction on ppc64le.
> 
> Signed-off-by: Matt Brown 

Reviewed-by: Cyril Bur 

> ---
> v4:
>   - change ifdef macro from __powerpc64__ to CONFIG_PPC64
> v2:
>   - fixed opcode
>   - added ifdef tags to do_bpermd func
>   - fixed bitshifting errors
> ---
>  arch/powerpc/lib/sstep.c | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 2fd7377..c9fd613 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -640,6 +640,24 @@ static nokprobe_inline void do_popcnt(struct pt_regs 
> *regs, unsigned long v1,
>   regs->gpr[ra] = out;/* popcntd */
>  }
>  
> +#ifdef CONFIG_PPC64
> +static nokprobe_inline void do_bpermd(struct pt_regs *regs, unsigned long v1,
> + unsigned long v2, int ra)
> +{
> + unsigned char perm, idx;
> + unsigned int i;
> +
> + perm = 0;
> + for (i = 0; i < 8; i++) {
> + idx = (v1 >> (i * 8)) & 0xff;
> + if (idx < 64)
> + if (v2 & PPC_BIT(idx))
> + perm |= 1 << i;
> + }
> + regs->gpr[ra] = perm;
> +}
> +#endif /* CONFIG_PPC64 */
> +
>  static nokprobe_inline int trap_compare(long v1, long v2)
>  {
>   int ret = 0;
> @@ -1229,7 +1247,11 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   case 124:   /* nor */
>   regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
>   goto logical_done;
> -
> +#ifdef CONFIG_PPC64
> + case 252:   /* bpermd */
> + do_bpermd(regs, regs->gpr[rd], regs->gpr[rb], ra);
> + goto logical_done;
> +#endif
>   case 284:   /* xor */
>   regs->gpr[ra] = ~(regs->gpr[rd] ^ regs->gpr[rb]);
>   goto logical_done;
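
For reference: each byte of the first operand is a bit index into the
second (IBM numbering, so PPC_BIT(0) is the MSB), and the gathered bits
form the low byte of the result. A user-space model of the loop above
(names are illustrative):

#include <assert.h>
#include <stdint.h>

#define PPC_BIT(i)      (1ULL << (63 - (i)))    /* IBM numbering */

static uint64_t bpermd_model(uint64_t v1, uint64_t v2)
{
        uint64_t perm = 0;
        int i;

        for (i = 0; i < 8; i++) {
                uint8_t idx = (v1 >> (i * 8)) & 0xff;   /* byte i picks a bit */

                if (idx < 64 && (v2 & PPC_BIT(idx)))
                        perm |= 1ULL << i;
        }
        return perm;
}

int main(void)
{
        /* Byte 0 selects bit 0 (the MSB); bytes 1-7 select bit 63 (the LSB). */
        assert(bpermd_model(0x3f3f3f3f3f3f3f00ULL, 0x8000000000000000ULL) == 0x01);
        /* Indices >= 64 always contribute 0. */
        assert(bpermd_model(0xffffffffffffffffULL, ~0ULL) == 0);
        return 0;
}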


[PATCH V9 3/3] powernv: Add support to clear sensor groups data

2017-07-30 Thread Shilpasri G Bhat
Adds support for clearing different sensor groups. OCC inband sensor
groups such as CSM, Profiler, and Job Scheduler can be cleared using
this driver. Clearing a group resets the min/max of all sensors
belonging to it.

Signed-off-by: Shilpasri G Bhat 
---
Changes from V8:
- Use mutex_lock_interruptible()
- Fix error codes returned to user

 arch/powerpc/include/asm/opal-api.h            |   3 +-
 arch/powerpc/include/asm/opal.h                |   2 +
 arch/powerpc/include/uapi/asm/opal-occ.h       |  23 +
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 arch/powerpc/platforms/powernv/opal-occ.c      | 115 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 arch/powerpc/platforms/powernv/opal.c          |   3 +
 7 files changed, 147 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/opal-occ.h
 create mode 100644 arch/powerpc/platforms/powernv/opal-occ.c

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 92e31fd..0841659 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -195,7 +195,8 @@
 #define OPAL_SET_POWERCAP  153
 #define OPAL_GET_POWER_SHIFT_RATIO 154
 #define OPAL_SET_POWER_SHIFT_RATIO 155
-#define OPAL_LAST  155
+#define OPAL_SENSOR_GROUPS_CLEAR   156
+#define OPAL_LAST  156
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index b9ea77f..a716def 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -271,6 +271,7 @@ int64_t opal_xive_set_vp_info(uint64_t vp,
 int opal_set_powercap(u32 handle, int token, u32 pcap);
 int opal_get_power_shift_ratio(u32 handle, int token, u32 *psr);
 int opal_set_power_shift_ratio(u32 handle, int token, u32 psr);
+int opal_sensor_groups_clear(u32 group_hndl, int token);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
@@ -351,6 +352,7 @@ static inline int opal_get_async_rc(struct opal_msg msg)
 
 void opal_powercap_init(void);
 void opal_psr_init(void);
+int opal_sensor_groups_clear_history(u32 handle);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/uapi/asm/opal-occ.h 
b/arch/powerpc/include/uapi/asm/opal-occ.h
new file mode 100644
index 0000000..97c45e2
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/opal-occ.h
@@ -0,0 +1,23 @@
+/*
+ * OPAL OCC command interface
+ * Supported on POWERNV platform
+ *
+ * (C) Copyright IBM 2017
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _UAPI_ASM_POWERPC_OPAL_OCC_H_
+#define _UAPI_ASM_POWERPC_OPAL_OCC_H_
+
+#define OPAL_OCC_IOCTL_CLEAR_SENSOR_GROUPS _IOR('o', 1, u32)
+
+#endif /* _UAPI_ASM_POWERPC_OPAL_OCC_H */
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index 9ed7d33..f193b33 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,7 +2,7 @@ obj-y   += setup.o opal-wrappers.o opal.o opal-async.o idle.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
-obj-y  += opal-kmsg.o opal-powercap.o opal-psr.o
+obj-y  += opal-kmsg.o opal-powercap.o opal-psr.o opal-occ.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-ioda.o npu-dma.o
diff --git a/arch/powerpc/platforms/powernv/opal-occ.c 
b/arch/powerpc/platforms/powernv/opal-occ.c
new file mode 100644
index 0000000..f3ce880
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-occ.c
@@ -0,0 +1,115 @@
+/*
+ * Copyright IBM Corporation 2017
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#define pr_fmt(fmt) "opal-occ: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
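
The uapi header above is the entire user-visible contract. Usage from
user space would look roughly like the sketch below; the device node
path is a guess (the driver text is cut off above before the device
registration), and the typedef papers over the header's use of the
kernel-internal u32 name:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/types.h>

typedef __u32 u32;              /* the _IOR() macro above references "u32" */
#include <asm/opal-occ.h>       /* OPAL_OCC_IOCTL_CLEAR_SENSOR_GROUPS */

int main(void)
{
        u32 handle = 0;         /* sensor-group handle from the device tree */
        int fd = open("/dev/opal-occ", O_RDWR); /* path is an assumption */

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (ioctl(fd, OPAL_OCC_IOCTL_CLEAR_SENSOR_GROUPS, &handle) < 0)
                perror("ioctl");
        close(fd);
        return 0;
}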

[PATCH V9 2/3] powernv: Add support to set power-shifting-ratio

2017-07-30 Thread Shilpasri G Bhat
This patch adds support to set the power-shifting ratio, which hints
the firmware on how to distribute/throttle power between different
entities in the system (e.g. CPU vs GPU). This ratio is used by the
OCC in its power-capping algorithm.

Signed-off-by: Shilpasri G Bhat 
---
Changes from V8:
- Use __pa() while passing pointer in opal call
- Use mutex_lock_interruptible()
- Fix error codes returned to user
- Allocate and add sysfs attributes in a single loop

 arch/powerpc/include/asm/opal-api.h            |   4 +-
 arch/powerpc/include/asm/opal.h                |   3 +
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 arch/powerpc/platforms/powernv/opal-psr.c      | 173 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
 arch/powerpc/platforms/powernv/opal.c          |   3 +
 6 files changed, 185 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-psr.c

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index c3e0c4a..92e31fd 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -193,7 +193,9 @@
 #define OPAL_NPU_MAP_LPAR  148
 #define OPAL_GET_POWERCAP  152
 #define OPAL_SET_POWERCAP  153
-#define OPAL_LAST  153
+#define OPAL_GET_POWER_SHIFT_RATIO 154
+#define OPAL_SET_POWER_SHIFT_RATIO 155
+#define OPAL_LAST  155
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index ec2087c..b9ea77f 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -269,6 +269,8 @@ int64_t opal_xive_set_vp_info(uint64_t vp,
 int64_t opal_xive_dump(uint32_t type, uint32_t id);
 int opal_get_powercap(u32 handle, int token, u32 *pcap);
 int opal_set_powercap(u32 handle, int token, u32 pcap);
+int opal_get_power_shift_ratio(u32 handle, int token, u32 *psr);
+int opal_set_power_shift_ratio(u32 handle, int token, u32 psr);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
@@ -348,6 +350,7 @@ static inline int opal_get_async_rc(struct opal_msg msg)
 void opal_wake_poller(void);
 
 void opal_powercap_init(void);
+void opal_psr_init(void);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index e79f806..9ed7d33 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,7 +2,7 @@ obj-y   += setup.o opal-wrappers.o opal.o opal-async.o idle.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
-obj-y  += opal-kmsg.o opal-powercap.o
+obj-y  += opal-kmsg.o opal-powercap.o opal-psr.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-ioda.o npu-dma.o
diff --git a/arch/powerpc/platforms/powernv/opal-psr.c 
b/arch/powerpc/platforms/powernv/opal-psr.c
new file mode 100644
index 0000000..e2cb335
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-psr.c
@@ -0,0 +1,173 @@
+/*
+ * PowerNV OPAL Power-Shift-Ratio interface
+ *
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "opal-psr: " fmt
+
+#include <linux/of.h>
+#include <linux/kobject.h>
+#include <linux/slab.h>
+
+#include <asm/opal.h>
+
+DEFINE_MUTEX(psr_mutex);
+
+static struct kobject *psr_kobj;
+
+struct psr_attr {
+   u32 handle;
+   struct kobj_attribute attr;
+} *psr_attrs;
+
+static ssize_t psr_show(struct kobject *kobj, struct kobj_attribute *attr,
+   char *buf)
+{
+   struct psr_attr *psr_attr = container_of(attr, struct psr_attr, attr);
+   struct opal_msg msg;
+   int psr, ret, token;
+
+   token = opal_async_get_token_interruptible();
+   if (token < 0) {
+   pr_devel("Failed to get token\n");
+   return token;
+   }
+
+   ret = mutex_lock_interruptible(&psr_mutex);
+   if (ret)
+   return ret;
+
+   ret = opal_get_power_shift_ratio(psr_attr->handle, token,
+   (u32 *)__pa(&psr));
+   switch (ret) {
+   case OPAL_ASYNC_COMPLETION:
+   ret = opal_async_wait_response(token, &msg);
+   if (ret) {
+   pr_devel("Failed to wait for the async response\n");
+   ret = -EIO;
+   goto out;
+   }
+

[PATCH V9 1/3] powernv: powercap: Add support for powercap framework

2017-07-30 Thread Shilpasri G Bhat
Adds a generic powercap framework to change the system powercap
inband through the OPAL-OCC command/response interface.

Signed-off-by: Shilpasri G Bhat 
---
Changes from V8:
- Use __pa() while passing pointer in opal call
- Use mutex_lock_interruptible()
- Fix error codes returned to user
- Allocate and add sysfs attributes in a single loop

 arch/powerpc/include/asm/opal-api.h            |   5 +-
 arch/powerpc/include/asm/opal.h                |   4 +
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 arch/powerpc/platforms/powernv/opal-powercap.c | 243 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
 arch/powerpc/platforms/powernv/opal.c          |   4 +
 6 files changed, 258 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-powercap.c

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 3130a73..c3e0c4a 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -42,6 +42,7 @@
 #define OPAL_I2C_STOP_ERR  -24
 #define OPAL_XIVE_PROVISIONING -31
 #define OPAL_XIVE_FREE_ACTIVE  -32
+#define OPAL_TIMEOUT   -33
 
 /* API Tokens (in r0) */
 #define OPAL_INVALID_CALL -1
@@ -190,7 +191,9 @@
 #define OPAL_NPU_INIT_CONTEXT  146
 #define OPAL_NPU_DESTROY_CONTEXT   147
 #define OPAL_NPU_MAP_LPAR  148
-#define OPAL_LAST  148
+#define OPAL_GET_POWERCAP  152
+#define OPAL_SET_POWERCAP  153
+#define OPAL_LAST  153
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 588fb1c..ec2087c 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -267,6 +267,8 @@ int64_t opal_xive_set_vp_info(uint64_t vp,
 int64_t opal_xive_free_irq(uint32_t girq);
 int64_t opal_xive_sync(uint32_t type, uint32_t id);
 int64_t opal_xive_dump(uint32_t type, uint32_t id);
+int opal_get_powercap(u32 handle, int token, u32 *pcap);
+int opal_set_powercap(u32 handle, int token, u32 pcap);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
@@ -345,6 +347,8 @@ static inline int opal_get_async_rc(struct opal_msg msg)
 
 void opal_wake_poller(void);
 
+void opal_powercap_init(void);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_OPAL_H */
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index b5d98cb..e79f806 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,7 +2,7 @@ obj-y   += setup.o opal-wrappers.o opal.o opal-async.o idle.o
 obj-y  += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
-obj-y  += opal-kmsg.o
+obj-y  += opal-kmsg.o opal-powercap.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-ioda.o npu-dma.o
diff --git a/arch/powerpc/platforms/powernv/opal-powercap.c 
b/arch/powerpc/platforms/powernv/opal-powercap.c
new file mode 100644
index 0000000..9be5093
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-powercap.c
@@ -0,0 +1,243 @@
+/*
+ * PowerNV OPAL Powercap interface
+ *
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "opal-powercap: " fmt
+
+#include <linux/of.h>
+#include <linux/kobject.h>
+#include <linux/slab.h>
+
+#include <asm/opal.h>
+
+DEFINE_MUTEX(powercap_mutex);
+
+static struct kobject *powercap_kobj;
+
+struct powercap_attr {
+   u32 handle;
+   struct kobj_attribute attr;
+};
+
+static struct pcap {
+   struct attribute_group pg;
+   struct powercap_attr *pattrs;
+} *pcaps;
+
+static ssize_t powercap_show(struct kobject *kobj, struct kobj_attribute *attr,
+char *buf)
+{
+   struct powercap_attr *pcap_attr = container_of(attr,
+   struct powercap_attr, attr);
+   struct opal_msg msg;
+   u32 pcap;
+   int ret, token;
+
+   token = opal_async_get_token_interruptible();
+   if (token < 0) {
+   pr_devel("Failed to get token\n");
+   return token;
+   }
+
+   ret = mutex_lock_interruptible(&powercap_mutex);
+   if (ret)
+   return ret;
+
+   ret = opal_get_powercap(pcap_attr->handle, token, (u32 *)__pa(&pcap));
+   switch (ret) {
+   case OPAL_ASYNC_COMPLETION:
+   ret = 

[PATCH V9 0/3] powernv : Add support for OPAL-OCC command/response interface

2017-07-30 Thread Shilpasri G Bhat
In P9, the OCC (On-Chip Controller) supports a shared-memory-based
command-response interface. Within the shared memory there is an OPAL
command buffer and an OCC response buffer that can be used to send
inband commands to the OCC. The following commands are supported:

1) Set system powercap
2) Set CPU-GPU power shifting ratio
3) Clear min/max for OCC sensor groups
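
All three commands ride on the same OPAL asynchronous-completion
sequence that the individual patches repeat: get a token, issue the
call, and if it returns OPAL_ASYNC_COMPLETION wait for the response
message. A condensed sketch of that shared flow (error handling
trimmed; opal_get_powercap stands in for any of the calls):

#include <asm/opal.h>

static int opal_async_call_sketch(u32 handle, u32 *val)
{
        struct opal_msg msg;
        int token, ret;

        token = opal_async_get_token_interruptible();
        if (token < 0)
                return token;

        ret = opal_get_powercap(handle, token, (u32 *)__pa(val));
        switch (ret) {
        case OPAL_ASYNC_COMPLETION:
                /* Firmware completes later; block until its response. */
                ret = opal_async_wait_response(token, &msg);
                if (!ret)
                        ret = opal_error_code(opal_get_async_rc(msg));
                break;
        case OPAL_SUCCESS:
                ret = 0;
                break;
        default:
                ret = opal_error_code(ret);
                break;
        }

        opal_async_release_token(token);
        return ret;
}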

Shilpasri G Bhat (3):
  powernv: powercap: Add support for powercap framework
  powernv: Add support to set power-shifting-ratio
  powernv: Add support to clear sensor groups data

 arch/powerpc/include/asm/opal-api.h            |   8 +-
 arch/powerpc/include/asm/opal.h                |   9 +
 arch/powerpc/include/uapi/asm/opal-occ.h       |  23 +++
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 arch/powerpc/platforms/powernv/opal-occ.c      | 115
 arch/powerpc/platforms/powernv/opal-powercap.c | 243 +
 arch/powerpc/platforms/powernv/opal-psr.c      | 173 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |   5 +
 arch/powerpc/platforms/powernv/opal.c          |  10 +
 9 files changed, 586 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/opal-occ.h
 create mode 100644 arch/powerpc/platforms/powernv/opal-occ.c
 create mode 100644 arch/powerpc/platforms/powernv/opal-powercap.c
 create mode 100644 arch/powerpc/platforms/powernv/opal-psr.c

-- 
1.8.3.1



Re: [PATCH v4 1/5] powerpc/lib/sstep: Add cmpb instruction emulation

2017-07-30 Thread Cyril Bur
On Mon, 2017-07-31 at 10:58 +1000, Matt Brown wrote:
> This patch adds emulation of the cmpb instruction, enabling xmon to
> emulate this instruction.
> Tested for correctness against the cmpb asm instruction on ppc64le.
> 
> Signed-off-by: Matt Brown 

Reviewed-by: Cyril Bur 

> ---
> v2: 
>   - fixed opcode
>   - fixed mask typecasting
> ---
>  arch/powerpc/lib/sstep.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 33117f8..87d277f 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -596,6 +596,22 @@ static nokprobe_inline void do_cmp_unsigned(struct 
> pt_regs *regs, unsigned long
>   regs->ccr = (regs->ccr & ~(0xf << shift)) | (crval << shift);
>  }
>  
> +static nokprobe_inline void do_cmpb(struct pt_regs *regs, unsigned long v1,
> + unsigned long v2, int rd)
> +{
> + unsigned long long out_val, mask;
> + int i;
> +
> + out_val = 0;
> + for (i = 0; i < 8; i++) {
> + mask = 0xffUL << (i * 8);
> + if ((v1 & mask) == (v2 & mask))
> + out_val |= mask;
> + }
> +
> + regs->gpr[rd] = out_val;
> +}
> +
>  static nokprobe_inline int trap_compare(long v1, long v2)
>  {
>   int ret = 0;
> @@ -1049,6 +1065,10 @@ int analyse_instr(struct instruction_op *op, struct 
> pt_regs *regs,
>   do_cmp_unsigned(regs, val, val2, rd >> 2);
>   goto instr_done;
>  
> + case 508: /* cmpb */
> + do_cmpb(regs, regs->gpr[rd], regs->gpr[rb], ra);
> + goto instr_done;
> +
>  /*
>   * Arithmetic instructions
>   */
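
For reference, cmpb writes 0xff into each result byte where the two
source bytes match and 0x00 where they differ. A quick user-space model
(names are illustrative):

#include <assert.h>
#include <stdint.h>

static uint64_t cmpb_model(uint64_t v1, uint64_t v2)
{
        uint64_t out = 0;
        int i;

        for (i = 0; i < 8; i++) {
                uint64_t mask = 0xffULL << (i * 8);

                if ((v1 & mask) == (v2 & mask))
                        out |= mask;    /* byte i matches */
        }
        return out;
}

int main(void)
{
        assert(cmpb_model(0x1122334455667788ULL, 0x1122334455667788ULL) == ~0ULL);
        assert(cmpb_model(0x1122334455667788ULL, 0x1122334455667700ULL) ==
               0xffffffffffffff00ULL);
        return 0;
}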


[PATCH v4 5/5] powerpc/lib/sstep: Add isel instruction emulation

2017-07-30 Thread Matt Brown
This adds emulation for the isel instruction.
Tested for correctness against the isel instruction and its extended
mnemonics (lt, gt, eq) on ppc64le.

Signed-off-by: Matt Brown 
---
v4:
- simplify if statement to ternary op
(same as isel emulation in kernel/traps.c)
v2:
- fixed opcode
- fixed definition to include the 'if RA=0, a=0' clause
- fixed ccr bitshifting error
---
 arch/powerpc/lib/sstep.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index af4eef9..473bab5 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1240,6 +1240,14 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
 /*
  * Logical instructions
  */
+   case 15:/* isel */
+   mb = (instr >> 6) & 0x1f; /* bc */
+   val = (regs->ccr >> (31 - mb)) & 1;
+   val2 = (ra) ? regs->gpr[ra] : 0;
+
+   regs->gpr[rd] = (val) ? val2 : regs->gpr[rb];
+   goto logical_done;
+
case 26:/* cntlzw */
asm("cntlzw %0,%1" : "=r" (regs->gpr[ra]) :
"r" (regs->gpr[rd]));
-- 
2.9.3



[PATCH v4 4/5] powerpc/lib/sstep: Add prty instruction emulation

2017-07-30 Thread Matt Brown
This adds emulation for the prtyw and prtyd instructions.
Tested for logical correctness against the prtyw and prtyd instructions
on ppc64le.

Signed-off-by: Matt Brown 
---
v4:
- use simpler xor method
v3:
- optimised using the Giles-Miller method of side-ways addition
v2:
- fixed opcodes
- fixed bitshifting and typecast errors
- merged do_prtyw and do_prtyd into single function
---
 arch/powerpc/lib/sstep.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index c9fd613..af4eef9 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -657,6 +657,24 @@ static nokprobe_inline void do_bpermd(struct pt_regs 
*regs, unsigned long v1,
regs->gpr[ra] = perm;
 }
 #endif /* CONFIG_PPC64 */
+/*
+ * The size parameter adjusts the equivalent prty instruction.
+ * prtyw = 32, prtyd = 64
+ */
+static nokprobe_inline void do_prty(struct pt_regs *regs, unsigned long v,
+   int size, int ra)
+{
+   unsigned long long res = v ^ (v >> 8);
+
+   res ^= res >> 16;
+   if (size == 32) {   /* prtyw */
+   regs->gpr[ra] = res & 0x0000000100000001;
+   return;
+   }
+
+   res ^= res >> 32;
+   regs->gpr[ra] = res & 1;    /* prtyd */
+}
 
 static nokprobe_inline int trap_compare(long v1, long v2)
 {
@@ -1247,6 +1265,14 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
case 124:   /* nor */
regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
goto logical_done;
+
+   case 154:   /* prtyw */
+   do_prty(regs, regs->gpr[rd], 32, ra);
+   goto logical_done;
+
+   case 186:   /* prtyd */
+   do_prty(regs, regs->gpr[rd], 64, ra);
+   goto logical_done;
 #ifdef CONFIG_PPC64
case 252:   /* bpermd */
do_bpermd(regs, regs->gpr[rd], regs->gpr[rb], ra);
-- 
2.9.3



[PATCH v4 3/5] powerpc/lib/sstep: Add bpermd instruction emulation

2017-07-30 Thread Matt Brown
This adds emulation for the bpermd instruction.
Tested for correctness against the bpermd instruction on ppc64le.

Signed-off-by: Matt Brown 
---
v4:
- change ifdef macro from __powerpc64__ to CONFIG_PPC64
v2:
- fixed opcode
- added ifdef tags to do_bpermd func
- fixed bitshifting errors
---
 arch/powerpc/lib/sstep.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 2fd7377..c9fd613 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -640,6 +640,24 @@ static nokprobe_inline void do_popcnt(struct pt_regs 
*regs, unsigned long v1,
regs->gpr[ra] = out;/* popcntd */
 }
 
+#ifdef CONFIG_PPC64
+static nokprobe_inline void do_bpermd(struct pt_regs *regs, unsigned long v1,
+   unsigned long v2, int ra)
+{
+   unsigned char perm, idx;
+   unsigned int i;
+
+   perm = 0;
+   for (i = 0; i < 8; i++) {
+   idx = (v1 >> (i * 8)) & 0xff;
+   if (idx < 64)
+   if (v2 & PPC_BIT(idx))
+   perm |= 1 << i;
+   }
+   regs->gpr[ra] = perm;
+}
+#endif /* CONFIG_PPC64 */
+
 static nokprobe_inline int trap_compare(long v1, long v2)
 {
int ret = 0;
@@ -1229,7 +1247,11 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
case 124:   /* nor */
regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
goto logical_done;
-
+#ifdef CONFIG_PPC64
+   case 252:   /* bpermd */
+   do_bpermd(regs, regs->gpr[rd], regs->gpr[rb], ra);
+   goto logical_done;
+#endif
case 284:   /* xor */
regs->gpr[ra] = ~(regs->gpr[rd] ^ regs->gpr[rb]);
goto logical_done;
-- 
2.9.3



[PATCH v4 2/5] powerpc/lib/sstep: Add popcnt instruction emulation

2017-07-30 Thread Matt Brown
This adds emulations for the popcntb, popcntw, and popcntd instructions.
Tested for correctness against the popcnt{b,w,d} instructions on ppc64le.

Signed-off-by: Matt Brown 
---
v4:
- change ifdef macro from __powerpc64__ to CONFIG_PPC64
- slight optimisations 
(now identical to the popcntb implementation in kernel/traps.c)
v3:
- optimised using the Giles-Miller method of side-ways addition
v2:
- fixed opcodes
- fixed typecasting
- fixed bitshifting error for both 32 and 64bit arch
---
 arch/powerpc/lib/sstep.c | 42 +-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 87d277f..2fd7377 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -612,6 +612,34 @@ static nokprobe_inline void do_cmpb(struct pt_regs *regs, 
unsigned long v1,
regs->gpr[rd] = out_val;
 }
 
+/*
+ * The size parameter is used to adjust the equivalent popcnt instruction.
+ * popcntb = 8, popcntw = 32, popcntd = 64
+ */
+static nokprobe_inline void do_popcnt(struct pt_regs *regs, unsigned long v1,
+   int size, int ra)
+{
+   unsigned long long out = v1;
+
+   out -= (out >> 1) & 0x5555555555555555;
+   out = (0x3333333333333333 & out) + (0x3333333333333333 & (out >> 2));
+   out = (out + (out >> 4)) & 0x0f0f0f0f0f0f0f0f;
+
+   if (size == 8) {/* popcntb */
+   regs->gpr[ra] = out;
+   return;
+   }
+   out += out >> 8;
+   out += out >> 16;
+   if (size == 32) {   /* popcntw */
+   regs->gpr[ra] = out & 0x0000003f0000003f;
+   return;
+   }
+
+   out = (out + (out >> 32)) & 0x7f;
+   regs->gpr[ra] = out;/* popcntd */
+}
+
 static nokprobe_inline int trap_compare(long v1, long v2)
 {
int ret = 0;
@@ -1194,6 +1222,10 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
regs->gpr[ra] = regs->gpr[rd] & ~regs->gpr[rb];
goto logical_done;
 
+   case 122:   /* popcntb */
+   do_popcnt(regs, regs->gpr[rd], 8, ra);
+   goto logical_done;
+
case 124:   /* nor */
regs->gpr[ra] = ~(regs->gpr[rd] | regs->gpr[rb]);
goto logical_done;
@@ -1206,6 +1238,10 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
regs->gpr[ra] = regs->gpr[rd] ^ regs->gpr[rb];
goto logical_done;
 
+   case 378:   /* popcntw */
+   do_popcnt(regs, regs->gpr[rd], 32, ra);
+   goto logical_done;
+
case 412:   /* orc */
regs->gpr[ra] = regs->gpr[rd] | ~regs->gpr[rb];
goto logical_done;
@@ -1217,7 +1253,11 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
case 476:   /* nand */
regs->gpr[ra] = ~(regs->gpr[rd] & regs->gpr[rb]);
goto logical_done;
-
+#ifdef CONFIG_PPC64
+   case 506:   /* popcntd */
+   do_popcnt(regs, regs->gpr[rd], 64, ra);
+   goto logical_done;
+#endif
case 922:   /* extsh */
regs->gpr[ra] = (signed short) regs->gpr[rd];
goto logical_done;
-- 
2.9.3



[PATCH v4 1/5] powerpc/lib/sstep: Add cmpb instruction emulation

2017-07-30 Thread Matt Brown
This patch adds emulation of the cmpb instruction, enabling xmon to
emulate this instruction.
Tested for correctness against the cmpb asm instruction on ppc64le.

Signed-off-by: Matt Brown 
---
v2: 
- fixed opcode
- fixed mask typecasting
---
 arch/powerpc/lib/sstep.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 33117f8..87d277f 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -596,6 +596,22 @@ static nokprobe_inline void do_cmp_unsigned(struct pt_regs 
*regs, unsigned long
regs->ccr = (regs->ccr & ~(0xf << shift)) | (crval << shift);
 }
 
+static nokprobe_inline void do_cmpb(struct pt_regs *regs, unsigned long v1,
+   unsigned long v2, int rd)
+{
+   unsigned long long out_val, mask;
+   int i;
+
+   out_val = 0;
+   for (i = 0; i < 8; i++) {
+   mask = 0xffUL << (i * 8);
+   if ((v1 & mask) == (v2 & mask))
+   out_val |= mask;
+   }
+
+   regs->gpr[rd] = out_val;
+}
+
 static nokprobe_inline int trap_compare(long v1, long v2)
 {
int ret = 0;
@@ -1049,6 +1065,10 @@ int analyse_instr(struct instruction_op *op, struct 
pt_regs *regs,
do_cmp_unsigned(regs, val, val2, rd >> 2);
goto instr_done;
 
+   case 508: /* cmpb */
+   do_cmpb(regs, regs->gpr[rd], regs->gpr[rb], ra);
+   goto instr_done;
+
 /*
  * Arithmetic instructions
  */
-- 
2.9.3



[RFC v7 25/25] powerpc: Enable pkey subsystem

2017-07-30 Thread Ram Pai
PAPR defines the 'ibm,processor-storage-keys' property. It exports
two values: the first indicates the number of data-access keys and
the second the number of instruction-access keys. Though this
suggests that a key can only be used for either data access or
instruction access, that is not the case in reality: any key can be
of any kind. This patch therefore adds the two values together and
uses the sum as the total number of keys available to us.

Non-PAPR platforms do not define this property in the device tree
yet. Here, we hardcode the CPUs that support pkeys by consulting
PowerISA 3.0.
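
(The prom.c hunk that actually reads the property is cut off below;
for illustration, the flat device-tree scan would look roughly like
this sketch — the hook name is made up, and the real patch may differ:)

#include <linux/of_fdt.h>       /* of_get_flat_dt_prop(), of_scan_flat_dt() */

/* Invoked for each node via of_scan_flat_dt(early_scan_pkeys, NULL). */
static int __init early_scan_pkeys(unsigned long node, const char *uname,
                                   int depth, void *data)
{
        const __be32 *prop;

        prop = of_get_flat_dt_prop(node, "ibm,processor-storage-keys", NULL);
        if (!prop)
                return 0;       /* keep scanning */

        /* Two cells: data-access keys, then instruction-access keys.
         * Any key can be used for either, so treat them as one pool. */
        pkey_mmu_values(be32_to_cpu(prop[0]), be32_to_cpu(prop[1]));
        return 1;               /* stop the scan */
}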

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/cputable.h    |    8 +---
 arch/powerpc/include/asm/mmu_context.h |    1 +
 arch/powerpc/include/asm/pkeys.h       |   32 ++--
 arch/powerpc/kernel/prom.c             |   19 +++
 4 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index d02ad93..1d9ed84 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -214,6 +214,7 @@ enum {
 #define CPU_FTR_DAWR   LONG_ASM_CONST(0x0400)
 #define CPU_FTR_DABRX  LONG_ASM_CONST(0x0800)
 #define CPU_FTR_PMAO_BUG   LONG_ASM_CONST(0x1000)
+#define CPU_FTR_PKEY   LONG_ASM_CONST(0x2000)
 #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000)
 
 #ifndef __ASSEMBLY__
@@ -452,7 +453,7 @@ enum {
CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | \
-   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX)
+   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER8 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -462,7 +463,7 @@ enum {
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP)
+   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
 #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
 #define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
@@ -474,7 +475,8 @@ enum {
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300)
+   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \
+   CPU_FTR_PKEY)
 #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \
 (~CPU_FTR_SAO))
 #define CPU_FTRS_CELL  (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index a1cfcca..acd59d8 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -188,6 +188,7 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
 
 #define pkey_initialize()
 #define pkey_mm_init(mm)
+#define pkey_mmu_values(total_data, total_execute)
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index ba7bff6..e61ed6c 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_PPC64_PKEYS_H
 #define _ASM_PPC64_PKEYS_H
 
+#include <asm/firmware.h>
+
 extern bool pkey_inited;
 extern int pkeys_total; /* total pkeys as per device tree */
 extern u32 initial_allocation_mask;/* bits set for reserved keys */
@@ -227,6 +229,24 @@ static inline void pkey_mm_init(struct mm_struct *mm)
mm->context.execute_only_pkey = -1;
 }
 
+static inline void pkey_mmu_values(int total_data, int total_execute)
+{
+   /*
+    * Since any pkey can be used for data or execute, we
+    * will just treat all keys as equal and track them
+    * as one entity.
+    */
+   pkeys_total = total_data + total_execute;
+}
+
+static inline bool pkey_mmu_enabled(void)
+{
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   return pkeys_total;
+   else
+   return cpu_has_feature(CPU_FTR_PKEY);
+}
+
 static inline void pkey_initialize(void)
 {
int os_reserved, i;
@@ -236,9 +256,17 @@ static inline void pkey_initialize(void)
 * line will enable it.
 */
pkey_inited = false;
+  

[RFC v7 24/25] powerpc: Deliver SEGV signal on pkey violation

2017-07-30 Thread Ram Pai
The value of the AMR register at the time of the exception
is made available in gp_regs[PT_AMR] of the signal context.

The value of the pkey whose protection got violated is made
available in the si_pkey field of the siginfo structure.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/uapi/asm/ptrace.h |    1 +
 arch/powerpc/kernel/signal_32.c        |    5 +
 arch/powerpc/kernel/signal_64.c        |    4
 arch/powerpc/kernel/traps.c            |   15 +++
 4 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/ptrace.h 
b/arch/powerpc/include/uapi/asm/ptrace.h
index 8036b38..fc9c9c0 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -110,6 +110,7 @@ struct pt_regs {
 #define PT_RESULT 43
 #define PT_DSCR 44
 #define PT_REGS_COUNT 44
+#define PT_AMR 45
 
 #define PT_FPR048  /* each FP reg occupies 2 slots in this space */
 
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 97bb138..9c4a7f3 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -500,6 +500,11 @@ static int save_user_regs(struct pt_regs *regs, struct 
mcontext __user *frame,
   (unsigned long) &frame->tramp[2]);
}
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (__put_user(get_paca()->paca_amr, &frame->mc_gregs[PT_AMR]))
+   return 1;
+#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
return 0;
 }
 
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index c83c115..86a4262 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -174,6 +174,10 @@ static long setup_sigcontext(struct sigcontext __user *sc,
if (set != NULL)
 err |=  __put_user(set->sig[0], &sc->oldmask);
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   err |= __put_user(get_paca()->paca_amr, &sc->gp_regs[PT_AMR]);
+#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
return err;
 }
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index d4e545d..fe1e7c7 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -247,6 +248,15 @@ void user_single_step_siginfo(struct task_struct *tsk,
info->si_addr = (void __user *)regs->nip;
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long 
addr)
+{
+   if (si_code != SEGV_PKUERR)
+   return;
+   info->si_pkey = get_paca()->paca_pkey;
+}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 {
siginfo_t info;
@@ -274,6 +284,11 @@ void _exception(int signr, struct pt_regs *regs, int code, 
unsigned long addr)
info.si_signo = signr;
info.si_code = code;
info.si_addr = (void __user *) addr;
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   fill_sig_info_pkey(code, &info, addr);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 force_sig_info(signr, &info, current);
 }
 
-- 
1.7.1
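
For completeness, consuming the new information from user space looks
roughly like this (a sketch; it assumes kernel and libc headers new
enough to expose SEGV_PKUERR and si_pkey, and 45 is the PT_AMR index
added above):

#include <signal.h>
#include <stdio.h>
#include <ucontext.h>
#include <unistd.h>

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
        ucontext_t *uc = ctx;

        if (info->si_code == SEGV_PKUERR)
                fprintf(stderr, "pkey %d violated at %p, AMR=%llx\n",
                        info->si_pkey, info->si_addr,
                        (unsigned long long)uc->uc_mcontext.gp_regs[45]);
        _exit(1);
}

int main(void)
{
        struct sigaction sa = { 0 };

        sa.sa_sigaction = segv_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
        /* ... access memory protected by a pkey here ... */
        return 0;
}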



[RFC v7 23/25] powerpc: capture the violated protection key on fault

2017-07-30 Thread Ram Pai
Capture the protection key that got violated in the paca.
This value will later be used to inform the signal
handler.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/paca.h   |    1 +
 arch/powerpc/kernel/asm-offsets.c |    1 +
 arch/powerpc/mm/fault.c           |    8
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index c8bd1fc..0c06188 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -94,6 +94,7 @@ struct paca_struct {
u64 dscr_default;   /* per-CPU default DSCR */
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
u64 paca_amr;   /* value of amr at exception */
+   u16 paca_pkey;  /* exception causing pkey */
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #ifdef CONFIG_PPC_STD_MMU_64
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 17f5d8a..7dff862 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -244,6 +244,7 @@ int main(void)
 
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
OFFSET(PACA_AMR, paca_struct, paca_amr);
+   OFFSET(PACA_PKEY, paca_struct, paca_pkey);
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
OFFSET(ACCOUNT_STARTTIME, paca_struct, accounting.starttime);
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index a6710f5..7fee303 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -265,6 +265,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
if (error_code & DSISR_KEYFAULT) {
code = SEGV_PKUERR;
get_paca()->paca_amr = read_amr();
+   get_paca()->paca_pkey = get_pte_pkey(current->mm, address);
goto bad_area_nosemaphore;
}
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
@@ -453,6 +454,13 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
is_exec, 0)) {
get_paca()->paca_amr = read_amr();
+   /*
+    * The pgd-pud-pmd-pte tree may not have been fully set up.
+    * Hence we cannot walk the tree to locate the pte and its
+    * key, so use vma_pkey() to get the key instead of
+    * get_pte_pkey().
+    */
+   get_paca()->paca_pkey = vma_pkey(vma);
code = SEGV_PKUERR;
goto bad_area;
}
-- 
1.7.1



[RFC v7 22/25] powerpc: introduce get_pte_pkey() helper

2017-07-30 Thread Ram Pai
The get_pte_pkey() helper returns the pkey associated with an
address in a given mm_struct.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    5 +
 arch/powerpc/mm/hash_utils_64.c               |   25 +
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index f7a6ed3..369f9ff 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -450,6 +450,11 @@ extern int hash_page(unsigned long ea, unsigned long 
access, unsigned long trap,
 int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long 
vsid,
 pte_t *ptep, unsigned long trap, unsigned long flags,
 int ssize, unsigned int shift, unsigned int mmu_psize);
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+u16 get_pte_pkey(struct mm_struct *mm, unsigned long address);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern int __hash_page_thp(unsigned long ea, unsigned long access,
   unsigned long vsid, pmd_t *pmdp, unsigned long trap,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 545f291..44ebf2a 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1570,6 +1570,31 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
local_irq_restore(flags);
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+/*
+ * return the protection key associated with the given address
+ * and the mm_struct.
+ */
+u16 get_pte_pkey(struct mm_struct *mm, unsigned long address)
+{
+   pte_t *ptep;
+   u16 pkey = 0;
+   unsigned long flags;
+
+   if (!mm || !mm->pgd)
+   return 0;
+
+   local_irq_save(flags);
+   ptep = find_linux_pte_or_hugepte(mm->pgd, address,
+   NULL, NULL);
+   if (ptep)
+   pkey = pte_to_pkey_bits(pte_val(READ_ONCE(*ptep)));
+   local_irq_restore(flags);
+
+   return pkey;
+}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 static inline void tm_flush_hash_page(int local)
 {
-- 
1.7.1



[RFC v7 21/25] powerpc: capture AMR register content on pkey violation

2017-07-30 Thread Ram Pai
Capture the AMR register contents and save them in the paca
whenever a pkey violation is detected.

This value will be needed to deliver the pkey-violation
signal to the task.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/paca.h   |    3 +++
 arch/powerpc/kernel/asm-offsets.c |    5 +
 arch/powerpc/mm/fault.c           |    2 ++
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 1c09f8f..c8bd1fc 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -92,6 +92,9 @@ struct paca_struct {
struct dtl_entry *dispatch_log_end;
 #endif /* CONFIG_PPC_STD_MMU_64 */
u64 dscr_default;   /* per-CPU default DSCR */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   u64 paca_amr;   /* value of amr at exception */
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #ifdef CONFIG_PPC_STD_MMU_64
/*
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 709e234..17f5d8a 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -241,6 +241,11 @@ int main(void)
OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
OFFSET(PACA_DSCR_DEFAULT, paca_struct, dscr_default);
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   OFFSET(PACA_AMR, paca_struct, paca_amr);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
OFFSET(ACCOUNT_STARTTIME, paca_struct, accounting.starttime);
OFFSET(ACCOUNT_STARTTIME_USER, paca_struct, accounting.starttime_user);
OFFSET(ACCOUNT_USER_TIME, paca_struct, accounting.utime);
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index ea74fe2..a6710f5 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -264,6 +264,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
if (error_code & DSISR_KEYFAULT) {
code = SEGV_PKUERR;
+   get_paca()->paca_amr = read_amr();
goto bad_area_nosemaphore;
}
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
@@ -451,6 +452,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
is_exec, 0)) {
+   get_paca()->paca_amr = read_amr();
code = SEGV_PKUERR;
goto bad_area;
}
-- 
1.7.1
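
(read_amr() is introduced earlier in this series and is not quoted in
this digest; presumably it is just an mfspr wrapper along these lines:)

static inline u64 read_amr(void)
{
        return mfspr(SPRN_AMR);
}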



[RFC v7 20/25] powerpc: Handle exceptions caused by pkey violation

2017-07-30 Thread Ram Pai
Handle data and instruction exceptions caused by memory
protection keys.

The CPU will detect the key fault if the HPTE is already
programmed with the key.

However, if the HPTE is not hashed, a key fault will not
be detected by the hardware. The software will detect the
pkey violation in such a case.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/reg.h |    3 ++-
 arch/powerpc/mm/fault.c        |   21 +
 2 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index ee04bc0..b7cbc8c 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -286,7 +286,8 @@
 #define   DSISR_SET_RC         0x00040000      /* Failed setting of R/C bits */
 #define   DSISR_PGDIRFAULT     0x00020000      /* Fault on page directory */
 #define   DSISR_PAGE_FAULT_MASK (DSISR_BIT32 | DSISR_PAGEATTR_CONFLT | \
-   DSISR_BADACCESS | DSISR_DABRMATCH | DSISR_BIT43)
+   DSISR_BADACCESS | DSISR_KEYFAULT | \
+   DSISR_DABRMATCH | DSISR_BIT43)
 #define SPRN_TBRL  0x10C   /* Time Base Read Lower Register (user, R/O) */
 #define SPRN_TBRU  0x10D   /* Time Base Read Upper Register (user, R/O) */
 #define SPRN_CIR   0x11B   /* Chip Information Register (hyper, R/0) */
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 3a7d580..ea74fe2 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -261,6 +261,13 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
}
 #endif
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (error_code & DSISR_KEYFAULT) {
+   code = SEGV_PKUERR;
+   goto bad_area_nosemaphore;
+   }
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
/* We restore the interrupt state now */
if (!arch_irq_disabled_regs(regs))
local_irq_enable();
@@ -441,6 +448,20 @@ int do_page_fault(struct pt_regs *regs, unsigned long 
address,
WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
 #endif /* CONFIG_PPC_STD_MMU */
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+   is_exec, 0)) {
+   code = SEGV_PKUERR;
+   goto bad_area;
+   }
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
+
+   /* handle_mm_fault() needs to know if it's an instruction access
+* fault.
+*/
+   if (is_exec)
+   flags |= FAULT_FLAG_INSTRUCTION;
/*
 * If for any reason at all we couldn't handle the fault,
 * make sure we exit gracefully rather than endlessly redo
-- 
1.7.1



[RFC v7 19/25] powerpc: implementation for arch_vma_access_permitted()

2017-07-30 Thread Ram Pai
This patch provides the implementation for
arch_vma_access_permitted(). It returns true if the
requested access is allowed by the pkey associated with
the vma.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu_context.h |    5 +++-
 arch/powerpc/mm/pkeys.c                |   43
 2 files changed, 47 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 2eb7f80..a1cfcca 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -175,6 +175,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool execute, bool foreign);
+#else /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
 {
@@ -182,7 +186,6 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
return true;
 }
 
-#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 8d756ef..1424c79 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -229,3 +229,46 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool 
execute)
return pkey_access_permitted(pte_to_pkey_bits(pte),
write, execute);
 }
+
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to AMR/IAMR for other
+ * processes or any way to tell *which * AMR/IAMR in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+   if (!current->mm)
+   return true;
+   /*
+* if the VMA is from another process, then AMR/IAMR has no
+* relevance and should not be enforced.
+*/
+   if (current->mm != vma->vm_mm)
+   return true;
+
+   return false;
+}
+
+bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool execute, bool foreign)
+{
+   int pkey;
+
+   if (!pkey_inited)
+   return true;
+
+   /* allow access if the VMA is not one from this process */
+   if (foreign || vma_is_foreign(vma))
+   return true;
+
+   pkey = vma_pkey(vma);
+
+   if (!pkey)
+   return true;
+
+   return pkey_access_permitted(pkey, write, execute);
+}
-- 
1.7.1



[RFC v7 18/25] powerpc: Macro the mask used for checking DSI exception

2017-07-30 Thread Ram Pai
Replace the magic number used to check for DSI exception
with a meaningful value.
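
For the record, the new mask is bit-for-bit identical to the old
magic number: the @h operator in the assembly below selects the
high halfword of each constant, and

    0x8000 (DSISR_BIT32)     | 0x2000 (DSISR_PAGEATTR_CONFLT) |
    0x0400 (DSISR_BADACCESS) | 0x0040 (DSISR_DABRMATCH)       |
    0x0010 (DSISR_BIT43)     = 0xa450

so the change is purely cosmetic.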

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/reg.h   |7 ++-
 arch/powerpc/kernel/exceptions-64s.S |2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 7e50e47..ee04bc0 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -272,16 +272,21 @@
 #define SPRN_DAR   0x013   /* Data Address Register */
 #define SPRN_DBCR  0x136   /* e300 Data Breakpoint Control Reg */
 #define SPRN_DSISR 0x012   /* Data Storage Interrupt Status Register */
+#define   DSISR_BIT32  0x8000  /* not defined */
 #define   DSISR_NOHPTE 0x4000  /* no translation found */
+#define   DSISR_PAGEATTR_CONFLT0x2000  /* page attribute conflict */
+#define   DSISR_BIT35  0x1000  /* not defined */
 #define   DSISR_PROTFAULT  0x0800  /* protection fault */
 #define   DSISR_BADACCESS  0x0400  /* bad access to CI or G */
 #define   DSISR_ISSTORE0x0200  /* access was a store */
 #define   DSISR_DABRMATCH  0x0040  /* hit data breakpoint */
-#define   DSISR_NOSEGMENT  0x0020  /* SLB miss */
 #define   DSISR_KEYFAULT   0x0020  /* Key fault */
+#define   DSISR_BIT43  0x0010  /* not defined */
 #define   DSISR_UNSUPP_MMU 0x0008  /* Unsupported MMU config */
 #define   DSISR_SET_RC 0x0004  /* Failed setting of R/C bits */
 #define   DSISR_PGDIRFAULT  0x0002  /* Fault on page directory */
+#define   DSISR_PAGE_FAULT_MASK (DSISR_BIT32 | DSISR_PAGEATTR_CONFLT | \
+   DSISR_BADACCESS | DSISR_DABRMATCH | DSISR_BIT43)
 #define SPRN_TBRL  0x10C   /* Time Base Read Lower Register (user, R/O) */
 #define SPRN_TBRU  0x10D   /* Time Base Read Upper Register (user, R/O) */
 #define SPRN_CIR   0x11B   /* Chip Information Register (hyper, R/0) */
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index b886795..e154bfe 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1411,7 +1411,7 @@ USE_TEXT_SECTION()
.balign IFETCH_ALIGN_BYTES
 do_hash_page:
 #ifdef CONFIG_PPC_STD_MMU_64
-   andis.  r0,r4,0xa450/* weird error? */
+   andis.  r0,r4,DSISR_PAGE_FAULT_MASK@h
bne-handle_page_fault   /* if not, try to insert a HPTE */
CURRENT_THREAD_INFO(r11, r1)
lwz r0,TI_PREEMPT(r11)  /* If we're in an "NMI" */
-- 
1.7.1



[RFC v7 17/25] powerpc: check key protection for user page access

2017-07-30 Thread Ram Pai
Make sure that the kernel does not access user pages without
checking their key-protection.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 2bec0f6..d1e1460 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -479,6 +479,20 @@ static inline void write_uamor(u64 value)
 
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+
+#define pte_access_permitted(pte, write) \
+   (pte_present(pte) && \
+((!(write) || pte_write(pte)) && \
+ arch_pte_access_permitted(pte_val(pte), !!write, 0)))
+
+/*
+ * We store key in pmd for huge tlb pages. So need
+ * to check for key protection.
+ */
+#define pmd_access_permitted(pmd, write) \
+   (pmd_present(pmd) && \
+((!(write) || pmd_write(pmd)) && \
+ arch_pte_access_permitted(pmd_val(pmd), !!write, 0)))
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-- 
1.7.1



[RFC v7 16/25] powerpc: helper to validate key-access permissions of a pte

2017-07-30 Thread Ram Pai
Helper function that checks if read/write/execute is allowed
on the pte.
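
A worked example of the bit arithmetic in pkey_access_permitted()
below (AMR_BITS_PER_PKEY = 2 on a 64-bit register): for pkey 2,
pkey_shift = 64 - (3 * 2) = 58, so key 2 owns AMR bits 58 and 59:

    read  is denied if amr & (AMR_RD_BIT << 58), i.e. 0x0400000000000000
    write is denied if amr & (AMR_WR_BIT << 58), i.e. 0x0800000000000000

If UAMOR has (0x3 << 58) clear, the key is not user-managed and
access is simply allowed.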

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |4 +++
 arch/powerpc/include/asm/pkeys.h |   12 +++
 arch/powerpc/mm/pkeys.c  |   28 ++
 3 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 060a1b2..2bec0f6 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -477,6 +477,10 @@ static inline void write_uamor(u64 value)
mtspr(SPRN_UAMOR, value);
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 4b7e3f4..ba7bff6 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -85,6 +85,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
 }
 
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+   if (!pkey_inited)
+   return 0x0UL;
+
+   return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index edcbf48..8d756ef 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -201,3 +201,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct 
*vma, int prot,
 */
return vma_pkey(vma);
 }
+
+static bool pkey_access_permitted(int pkey, bool write, bool execute)
+{
+   int pkey_shift;
+   u64 amr;
+
+   if (!pkey)
+   return true;
+
+   pkey_shift = pkeyshift(pkey);
+   if (!(read_uamor() & (0x3UL << pkey_shift)))
+   return true;
+
+   if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
+   return true;
+
+   amr = read_amr(); /* delay reading amr until absolutely needed */
+   return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
+   (write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
+}
+
+bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
+{
+   if (!pkey_inited)
+   return true;
+   return pkey_access_permitted(pte_to_pkey_bits(pte),
+   write, execute);
+}
-- 
1.7.1



[RFC v7 15/25] powerpc: Program HPTE key protection bits

2017-07-30 Thread Ram Pai
Map the PTE protection key bits to the HPTE key protection bits
while creating HPTE entries.
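
For example, a PTE carrying H_PAGE_PKEY_BIT2 and H_PAGE_PKEY_BIT4
(i.e. pkey 5, binary 00101) yields HPTE_R_KEY_BIT2 | HPTE_R_KEY_BIT4
in the second doubleword of the HPTE, both of which sit inside
HPTE_R_KEY_LO per the hunk below.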

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |5 +
 arch/powerpc/include/asm/mmu_context.h|6 ++
 arch/powerpc/include/asm/pkeys.h  |   13 +
 arch/powerpc/mm/hash_utils_64.c   |1 +
 4 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 6981a52..f7a6ed3 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -90,6 +90,8 @@
 #define HPTE_R_PP0 ASM_CONST(0x8000)
 #define HPTE_R_TS  ASM_CONST(0x4000)
 #define HPTE_R_KEY_HI  ASM_CONST(0x3000)
+#define HPTE_R_KEY_BIT0ASM_CONST(0x2000)
+#define HPTE_R_KEY_BIT1ASM_CONST(0x1000)
 #define HPTE_R_RPN_SHIFT   12
 #define HPTE_R_RPN ASM_CONST(0x0000)
 #define HPTE_R_RPN_3_0 ASM_CONST(0x01fff000)
@@ -104,6 +106,9 @@
 #define HPTE_R_C   ASM_CONST(0x0080)
 #define HPTE_R_R   ASM_CONST(0x0100)
 #define HPTE_R_KEY_LO  ASM_CONST(0x0e00)
+#define HPTE_R_KEY_BIT2ASM_CONST(0x0800)
+#define HPTE_R_KEY_BIT3ASM_CONST(0x0400)
+#define HPTE_R_KEY_BIT4ASM_CONST(0x0200)
 
 #define HPTE_V_1TB_SEG ASM_CONST(0x4000)
 #define HPTE_V_VRMA_MASK   ASM_CONST(0x4001ff00)
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 7232484..2eb7f80 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -190,6 +190,12 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 {
return 0;
 }
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+   return 0x0UL;
+}
+
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 1ded6df..4b7e3f4 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -72,6 +72,19 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
 #define IAMR_EX_BIT 0x1UL
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+   if (!pkey_inited)
+   return 0x0UL;
+
+   return (((pteflags & H_PAGE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+   ((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index f88423b..545f291 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -231,6 +231,7 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 */
rflags |= HPTE_R_M;
 
+   rflags |= pte_to_hpte_pkey_bits(pteflags);
return rflags;
 }
 
-- 
1.7.1



[RFC v7 14/25] powerpc: sys_pkey_mprotect() system call

2017-07-30 Thread Ram Pai
This patch provides the ability for a process to
associate a pkey with an address range.
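
Userspace has no glibc wrapper for this yet; a sketch of how it
would be invoked (hypothetical, using the raw syscall number 386
added in the hunks below):

    #include <sys/syscall.h>
    #include <unistd.h>

    /* hypothetical wrapper until glibc catches up */
    static int pkey_mprotect(void *addr, unsigned long len,
                             int prot, int pkey)
    {
            return syscall(386 /* __NR_pkey_mprotect */,
                           addr, len, prot, pkey);
    }

Passing pkey == -1 makes the call behave exactly like plain
mprotect(2).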

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/systbl.h  |1 +
 arch/powerpc/include/asm/unistd.h  |4 +---
 arch/powerpc/include/uapi/asm/unistd.h |1 +
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h 
b/arch/powerpc/include/asm/systbl.h
index 22dd776..b33b551 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -390,3 +390,4 @@
 SYSCALL(statx)
 SYSCALL(pkey_alloc)
 SYSCALL(pkey_free)
+SYSCALL(pkey_mprotect)
diff --git a/arch/powerpc/include/asm/unistd.h 
b/arch/powerpc/include/asm/unistd.h
index e0273bc..daf1ba9 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,12 +12,10 @@
 #include 
 
 
-#define NR_syscalls386
+#define NR_syscalls387
 
 #define __NR__exit __NR_exit
 
-#define __IGNORE_pkey_mprotect
-
 #ifndef __ASSEMBLY__
 
 #include 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h 
b/arch/powerpc/include/uapi/asm/unistd.h
index 7993a07..71ae45e 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -396,5 +396,6 @@
 #define __NR_statx 383
 #define __NR_pkey_alloc384
 #define __NR_pkey_free 385
+#define __NR_pkey_mprotect 386
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1



[RFC v7 13/25] powerpc: map vma key-protection bits to pte key bits.

2017-07-30 Thread Ram Pai
Map the key protection bits of the vma to the pkey bits in
the PTE.

The PTE bits used for the pkey are 3, 4, 5, 6 and 57. The first
four are the same four bits that were freed up initially in
this patch series. remember? :-) Without those four bits this
patch wouldn't be possible.

BUT, on a 4K kernel, bits 3 and 4 could not be freed up. remember?
Hence we have to be satisfied with bits 5, 6 and 57.
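
Note the ordering flip in vmflag_to_page_pkey_bits() below:
VM_PKEY_BIT0, the least significant key bit, lands in
H_PAGE_PKEY_BIT4, and so on. For example pkey 5 (binary 00101)
is stored as VM_PKEY_BIT0 | VM_PKEY_BIT2 in the vma, which
becomes H_PAGE_PKEY_BIT4 | H_PAGE_PKEY_BIT2 in the PTE.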

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   25 -
 arch/powerpc/include/asm/mman.h  |8 
 arch/powerpc/include/asm/pkeys.h |   12 
 3 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index d4da0e9..060a1b2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -37,6 +37,7 @@
 #define _RPAGE_RSV20x0800UL
 #define _RPAGE_RSV30x0400UL
 #define _RPAGE_RSV40x0200UL
+#define _RPAGE_RSV50x00040UL
 
 #define _PAGE_PTE  0x4000UL/* distinguishes PTEs 
from pointers */
 #define _PAGE_PRESENT  0x8000UL/* pte contains a 
translation */
@@ -56,6 +57,25 @@
 /* Max physical address bit as per radix table */
 #define _RPAGE_PA_MAX  57
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_PPC_64K_PAGES
+#define H_PAGE_PKEY_BIT0   _RPAGE_RSV1
+#define H_PAGE_PKEY_BIT1   _RPAGE_RSV2
+#else /* CONFIG_PPC_64K_PAGES */
+#define H_PAGE_PKEY_BIT0   0 /* _RPAGE_RSV1 is not available */
+#define H_PAGE_PKEY_BIT1   0 /* _RPAGE_RSV2 is not available */
+#endif /* CONFIG_PPC_64K_PAGES */
+#define H_PAGE_PKEY_BIT2   _RPAGE_RSV3
+#define H_PAGE_PKEY_BIT3   _RPAGE_RSV4
+#define H_PAGE_PKEY_BIT4   _RPAGE_RSV5
+#else /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+#define H_PAGE_PKEY_BIT0   0
+#define H_PAGE_PKEY_BIT1   0
+#define H_PAGE_PKEY_BIT2   0
+#define H_PAGE_PKEY_BIT3   0
+#define H_PAGE_PKEY_BIT4   0
+#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 /*
  * Max physical address bit we will use for now.
  *
@@ -116,13 +136,16 @@
 #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |   \
 _PAGE_SOFT_DIRTY)
+
+#define H_PAGE_PKEY  (H_PAGE_PKEY_BIT0 | H_PAGE_PKEY_BIT1 | H_PAGE_PKEY_BIT2 | 
\
+   H_PAGE_PKEY_BIT3 | H_PAGE_PKEY_BIT4)
 /*
  * Mask of bits returned by pte_pgprot()
  */
 #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | 
\
-_PAGE_SOFT_DIRTY)
+_PAGE_SOFT_DIRTY | H_PAGE_PKEY)
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 067eec2..3f7220f 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -32,12 +32,20 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned 
long prot,
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
+
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   return (vm_flags & VM_SAO) ?
+   __pgprot(_PAGE_SAO | vmflag_to_page_pkey_bits(vm_flags)) :
+   __pgprot(0 | vmflag_to_page_pkey_bits(vm_flags));
+#else
return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
+#endif
 }
 #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
 
+
 static inline bool arch_validate_prot(unsigned long prot)
 {
if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 03f7ea2..1ded6df 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -46,6 +46,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
 }
 
+static inline u64 vmflag_to_page_pkey_bits(u64 vm_flags)
+{
+   if (!pkey_inited)
+   return 0x0UL;
+
+   return (((vm_flags & VM_PKEY_BIT0) ? H_PAGE_PKEY_BIT4 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT1) ? H_PAGE_PKEY_BIT3 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT2) ? H_PAGE_PKEY_BIT2 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT3) ? H_PAGE_PKEY_BIT1 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT4) ? H_PAGE_PKEY_BIT0 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 

[RFC v7 12/25] powerpc: implementation for arch_override_mprotect_pkey()

2017-07-30 Thread Ram Pai
Arch-independent code calls arch_override_mprotect_pkey()
to obtain the pkey that best matches the requested protection.

This patch provides the implementation.
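
The resulting behavior, sketched for the three interesting cases
(assuming the syscalls from earlier in the series):

    mprotect(p, len, PROT_EXEC);        /* pkey == -1 and exec-only  */
                                        /* -> execute_only_pkey(mm)  */
    mprotect(p, len, PROT_READ);        /* exec-only vma, read asked */
                                        /* -> back to default pkey 0 */
    pkey_mprotect(p, len, prot, key);   /* pkey != -1                */
                                        /* -> the user's key wins    */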

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mmu_context.h |5 +++
 arch/powerpc/include/asm/pkeys.h   |   17 ++-
 arch/powerpc/mm/pkeys.c|   47 
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 4705dab..7232484 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -185,6 +185,11 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
 #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+   return 0;
+}
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index a715a08..03f7ea2 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -46,6 +46,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
 }
 
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+   VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+   if (!pkey_inited)
+   return 0;
+   return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
 #define arch_max_pkey()  pkeys_total
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
@@ -146,11 +156,14 @@ static inline int execute_only_pkey(struct mm_struct *mm)
return __execute_only_pkey(mm);
 }
 
-
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+   int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
int prot, int pkey)
 {
-   return 0;
+   if (!pkey_inited)
+   return 0;
+   return __arch_override_mprotect_pkey(vma, prot, pkey);
 }
 
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 25749bf..edcbf48 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -154,3 +154,50 @@ int __execute_only_pkey(struct mm_struct *mm)
mm->context.execute_only_pkey = execute_only_pkey;
return execute_only_pkey;
 }
+
+static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
+{
+   /* Do this check first since the vm_flags should be hot */
+   if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+   return false;
+
+   return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
+}
+
+/*
+ * This should only be called for *plain* mprotect calls.
+ */
+int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
+   int pkey)
+{
+   /*
+* Is this an mprotect_pkey() call?  If so, never
+* override the value that came from the user.
+*/
+   if (pkey != -1)
+   return pkey;
+
+   /*
+* If the currently associated pkey is execute-only,
+* but the requested protection requires read or write,
+* move it back to the default pkey.
+*/
+   if (vma_is_pkey_exec_only(vma) &&
+   (prot & (PROT_READ|PROT_WRITE)))
+   return 0;
+
+   /*
+* the requested protection is execute-only. Hence
+* lets use a execute-only pkey.
+*/
+   if (prot == PROT_EXEC) {
+   pkey = execute_only_pkey(vma->vm_mm);
+   if (pkey > 0)
+   return pkey;
+   }
+
+   /*
+* nothing to override.
+*/
+   return vma_pkey(vma);
+}
-- 
1.7.1



[RFC v7 11/25] powerpc: ability to associate pkey to a vma

2017-07-30 Thread Ram Pai
Arch-independent code expects the arch to map
a pkey into the vma's protection bit setting.
This patch provides that ability.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/mman.h  |8 +++-
 arch/powerpc/include/asm/pkeys.h |   12 
 2 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 30922f6..067eec2 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -22,7 +23,12 @@
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey)
 {
-   return (prot & PROT_SAO) ? VM_SAO : 0;
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   return (((prot & PROT_SAO) ? VM_SAO : 0) |
+   pkey_to_vmflag_bits(pkey));
+#else
+   return ((prot & PROT_SAO) ? VM_SAO : 0);
+#endif
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 5b17d93..a715a08 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -34,6 +34,18 @@
PKEY_DISABLE_WRITE  |\
PKEY_DISABLE_EXECUTE)
 
+static inline u64 pkey_to_vmflag_bits(u16 pkey)
+{
+   if (!pkey_inited)
+   return 0x0UL;
+
+   return (((pkey & 0x1UL) ? VM_PKEY_BIT0 : 0x0UL) |
+   ((pkey & 0x2UL) ? VM_PKEY_BIT1 : 0x0UL) |
+   ((pkey & 0x4UL) ? VM_PKEY_BIT2 : 0x0UL) |
+   ((pkey & 0x8UL) ? VM_PKEY_BIT3 : 0x0UL) |
+   ((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
+}
+
 #define arch_max_pkey()  pkeys_total
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
-- 
1.7.1



[RFC v7 10/25] powerpc: introduce execute-only pkey

2017-07-30 Thread Ram Pai
This patch provides the implementation of the execute-only pkey.
The architecture-independent code expects the ability to create
and manage a special key which has execute-only permission.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |1 +
 arch/powerpc/include/asm/pkeys.h |8 -
 arch/powerpc/mm/pkeys.c  |   57 ++
 3 files changed, 65 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 104ad72..0c0a2a8 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -116,6 +116,7 @@ struct patb_entry {
 * bit unset -> key available for allocation
 */
u32 pkey_allocation_map;
+   s16 execute_only_pkey; /* key holding execute-only protection */
 #endif
 } mm_context_t;
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index dac8751..5b17d93 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -126,11 +126,15 @@ static inline int mm_pkey_free(struct mm_struct *mm, int 
pkey)
  * Try to dedicate one of the protection keys to be used as an
  * execute-only protection key.
  */
+extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-   return 0;
+   if (!pkey_inited)
+   return -1;
+   return __execute_only_pkey(mm);
 }
 
+
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
int prot, int pkey)
 {
@@ -157,6 +161,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
if (!pkey_inited)
return;
mm_pkey_allocation_map(mm) = initial_allocation_mask;
+   /* -1 means unallocated or invalid */
+   mm->context.execute_only_pkey = -1;
 }
 
 static inline void pkey_initialize(void)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index c1c40c3..25749bf 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -97,3 +97,60 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int 
pkey,
init_iamr(pkey, new_iamr_bits);
return 0;
 }
+
+static inline bool pkey_allows_readwrite(int pkey)
+{
+   int pkey_shift = pkeyshift(pkey);
+
+   if (!(read_uamor() & (0x3UL << pkey_shift)))
+   return true;
+
+   return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
+}
+
+int __execute_only_pkey(struct mm_struct *mm)
+{
+   bool need_to_set_mm_pkey = false;
+   int execute_only_pkey = mm->context.execute_only_pkey;
+   int ret;
+
+   /* Do we need to assign a pkey for mm's execute-only maps? */
+   if (execute_only_pkey == -1) {
+   /* Go allocate one to use, which might fail */
+   execute_only_pkey = mm_pkey_alloc(mm);
+   if (execute_only_pkey < 0)
+   return -1;
+   need_to_set_mm_pkey = true;
+   }
+
+   /*
+* We do not want to go through the relatively costly
+* dance to set AMR if we do not need to.  Check it
+* first and assume that if the execute-only pkey is
+* readwrite-disabled then we do not have to set it
+* ourselves.
+*/
+   if (!need_to_set_mm_pkey &&
+   !pkey_allows_readwrite(execute_only_pkey))
+   return execute_only_pkey;
+
+   /*
+* Set up AMR so that it denies access for everything
+* other than execution.
+*/
+   ret = __arch_set_user_pkey_access(current, execute_only_pkey,
+   (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+   /*
+* If the AMR-set operation failed somehow, just return
+* 0 and effectively disable execute-only support.
+*/
+   if (ret) {
+   mm_set_pkey_free(mm, execute_only_pkey);
+   return -1;
+   }
+
+   /* We got one, store it and use it from here on out */
+   if (need_to_set_mm_pkey)
+   mm->context.execute_only_pkey = execute_only_pkey;
+   return execute_only_pkey;
+}
-- 
1.7.1



[RFC v7 09/25] powerpc: store and restore the pkey state across context switches

2017-07-30 Thread Ram Pai
Store and restore the AMR, IAMR and UAMOR register state of the task
before scheduling out and after scheduling in, respectively.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |5 +
 arch/powerpc/include/asm/processor.h |5 +
 arch/powerpc/kernel/process.c|   25 +
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 9d59619..dac8751 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -147,6 +147,11 @@ static inline int arch_set_user_pkey_access(struct 
task_struct *tsk, int pkey,
return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+   return pkey_inited;
+}
+
 static inline void pkey_mm_init(struct mm_struct *mm)
 {
if (!pkey_inited)
diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 1189d04..dcb1cf0 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -309,6 +309,11 @@ struct thread_struct {
struct thread_vr_state ckvr_state; /* Checkpointed VR state */
unsigned long   ckvrsave; /* Checkpointed VRSAVE */
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   unsigned long   amr;
+   unsigned long   iamr;
+   unsigned long   uamor;
+#endif
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
void*   kvm_shadow_vcpu; /* KVM internal data */
 #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 2ad725e..af373f4 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1096,6 +1097,13 @@ static inline void save_sprs(struct thread_struct *t)
t->tar = mfspr(SPRN_TAR);
}
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (arch_pkeys_enabled()) {
+   t->amr = mfspr(SPRN_AMR);
+   t->iamr = mfspr(SPRN_IAMR);
+   t->uamor = mfspr(SPRN_UAMOR);
+   }
+#endif
 }
 
 static inline void restore_sprs(struct thread_struct *old_thread,
@@ -1131,6 +1139,16 @@ static inline void restore_sprs(struct thread_struct 
*old_thread,
mtspr(SPRN_TAR, new_thread->tar);
}
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (arch_pkeys_enabled()) {
+   if (old_thread->amr != new_thread->amr)
+   mtspr(SPRN_AMR, new_thread->amr);
+   if (old_thread->iamr != new_thread->iamr)
+   mtspr(SPRN_IAMR, new_thread->iamr);
+   if (old_thread->uamor != new_thread->uamor)
+   mtspr(SPRN_UAMOR, new_thread->uamor);
+   }
+#endif
 }
 
 struct task_struct *__switch_to(struct task_struct *prev,
@@ -1689,6 +1707,13 @@ void start_thread(struct pt_regs *regs, unsigned long 
start, unsigned long sp)
current->thread.tm_tfiar = 0;
current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   if (arch_pkeys_enabled()) {
+   current->thread.amr   = 0x0ul;
+   current->thread.iamr  = 0x0ul;
+   current->thread.uamor = 0x0ul;
+   }
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 }
 EXPORT_SYMBOL(start_thread);
 
-- 
1.7.1



[RFC v7 08/25] powerpc: ability to create execute-disabled pkeys

2017-07-30 Thread Ram Pai
powerpc has hardware support to disable execute on a pkey.
This patch enables the ability to create execute-disabled
keys.
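
From userspace this looks like the following sketch (hypothetical
raw syscall; the number comes from patch 07 of this series, and
PKEY_DISABLE_EXECUTE is defined in the hunk below):

    /* allocate a key whose pages can be read/written, not executed */
    int pkey = syscall(384 /* __NR_pkey_alloc */, 0UL,
                       0x4UL /* PKEY_DISABLE_EXECUTE */);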

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |   12 
 arch/powerpc/mm/pkeys.c  |5 +
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index aec3f49..9d59619 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -22,6 +22,18 @@
 #define VM_PKEY_BIT4   VM_HIGH_ARCH_4
 #endif
 
+/* override any generic PKEY Permission defines */
+#undef  PKEY_DISABLE_ACCESS
+#define PKEY_DISABLE_ACCESS0x1
+#undef  PKEY_DISABLE_WRITE
+#define PKEY_DISABLE_WRITE 0x2
+#undef  PKEY_DISABLE_EXECUTE
+#define PKEY_DISABLE_EXECUTE   0x4
+#undef  PKEY_ACCESS_MASK
+#define PKEY_ACCESS_MASK   (PKEY_DISABLE_ACCESS |\
+   PKEY_DISABLE_WRITE  |\
+   PKEY_DISABLE_EXECUTE)
+
 #define arch_max_pkey()  pkeys_total
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 5b3531e..c1c40c3 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -78,6 +78,7 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int 
pkey,
unsigned long init_val)
 {
u64 new_amr_bits = 0x0ul;
+   u64 new_iamr_bits = 0x0ul;
 
if (!is_pkey_enabled(pkey))
return -EINVAL;
@@ -90,5 +91,9 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int 
pkey,
 
init_amr(pkey, new_amr_bits);
 
+   if ((init_val & PKEY_DISABLE_EXECUTE))
+   new_iamr_bits |= IAMR_EX_BIT;
+
+   init_iamr(pkey, new_iamr_bits);
return 0;
 }
-- 
1.7.1



[RFC v7 07/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls

2017-07-30 Thread Ram Pai
Finally this patch provides the ability for a process to
allocate and free a protection key.
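
A sketch of the resulting userspace interface (hypothetical raw
syscalls, numbers 384/385 as per the hunks below; glibc wrappers
do not exist yet):

    long pkey = syscall(384 /* __NR_pkey_alloc */, 0UL, 0UL);
    if (pkey < 0)
            return -1;      /* out of keys, or pkeys not enabled */
    /* ... use the key via pkey_mprotect() ... */
    syscall(385 /* __NR_pkey_free */, pkey);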

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/systbl.h  |2 ++
 arch/powerpc/include/asm/unistd.h  |4 +---
 arch/powerpc/include/uapi/asm/unistd.h |2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h 
b/arch/powerpc/include/asm/systbl.h
index 1c94708..22dd776 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -388,3 +388,5 @@
 COMPAT_SYS_SPU(pwritev2)
 SYSCALL(kexec_file_load)
 SYSCALL(statx)
+SYSCALL(pkey_alloc)
+SYSCALL(pkey_free)
diff --git a/arch/powerpc/include/asm/unistd.h 
b/arch/powerpc/include/asm/unistd.h
index 9ba11db..e0273bc 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,13 +12,11 @@
 #include 
 
 
-#define NR_syscalls384
+#define NR_syscalls386
 
 #define __NR__exit __NR_exit
 
 #define __IGNORE_pkey_mprotect
-#define __IGNORE_pkey_alloc
-#define __IGNORE_pkey_free
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h 
b/arch/powerpc/include/uapi/asm/unistd.h
index b85f142..7993a07 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -394,5 +394,7 @@
 #define __NR_pwritev2  381
 #define __NR_kexec_file_load   382
 #define __NR_statx 383
+#define __NR_pkey_alloc384
+#define __NR_pkey_free 385
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1



[RFC v7 06/25] powerpc: implementation for arch_set_user_pkey_access()

2017-07-30 Thread Ram Pai
This patch provides the detailed implementation for
a user to allocate a key and enable it in the hardware.

It provides the plumbing, but it cannot be used until
the system call is implemented; the next patch will
do so.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |9 -
 arch/powerpc/mm/pkeys.c  |   28 
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 40bfaac..aec3f49 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -23,6 +23,9 @@
 #endif
 
 #define arch_max_pkey()  pkeys_total
+#define AMR_RD_BIT 0x1UL
+#define AMR_WR_BIT 0x2UL
+#define IAMR_EX_BIT 0x1UL
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
@@ -122,10 +125,14 @@ static inline int arch_override_mprotect_pkey(struct 
vm_area_struct *vma,
return 0;
 }
 
+extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val);
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val)
 {
-   return 0;
+   if (!pkey_inited)
+   return -EINVAL;
+   return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
 static inline void pkey_mm_init(struct mm_struct *mm)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 846d984..5b3531e 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -22,6 +22,11 @@
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
+static bool is_pkey_enabled(int pkey)
+{
+   return !!(read_uamor() & (0x3ul << pkeyshift(pkey)));
+}
+
 static inline void init_amr(int pkey, u8 init_bits)
 {
u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
@@ -64,3 +69,26 @@ void __arch_deactivate_pkey(int pkey)
 {
pkey_status_change(pkey, false);
 }
+
+/*
+ * set the access rights in the AMR, IAMR and UAMOR registers
+ * for @pkey to those specified in @init_val.
+ */
+int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val)
+{
+   u64 new_amr_bits = 0x0ul;
+
+   if (!is_pkey_enabled(pkey))
+   return -EINVAL;
+
+   /* Set the bits we need in AMR:  */
+   if (init_val & PKEY_DISABLE_ACCESS)
+   new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
+   else if (init_val & PKEY_DISABLE_WRITE)
+   new_amr_bits |= AMR_WR_BIT;
+
+   init_amr(pkey, new_amr_bits);
+
+   return 0;
+}
-- 
1.7.1



[RFC v7 05/25] powerpc: cleanup AMR, IAMR when a key is allocated or freed

2017-07-30 Thread Ram Pai
Clean up the bits corresponding to a key in the AMR and IAMR
registers when the key is newly allocated/activated or freed.
We don't want residual bits to cause the hardware to enforce
unintended behavior when the key is activated or freed.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index f1c55e0..40bfaac 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -53,6 +53,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, 
int pkey)
mm_set_pkey_is_allocated(mm, pkey));
 }
 
+extern void __arch_activate_pkey(int pkey);
+extern void __arch_deactivate_pkey(int pkey);
 /*
  * Returns a positive, 5-bit key on success, or -1 on failure.
  */
@@ -79,6 +81,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
 
ret = ffz((u32)mm_pkey_allocation_map(mm));
mm_set_pkey_allocated(mm, ret);
+
+   /*
+* enable the key in the hardware
+*/
+   if (ret > 0)
+   __arch_activate_pkey(ret);
return ret;
 }
 
@@ -90,6 +98,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int 
pkey)
if (!mm_pkey_is_allocated(mm, pkey))
return -EINVAL;
 
+   /*
+* Disable the key in the hardware
+*/
+   __arch_deactivate_pkey(pkey);
mm_set_pkey_free(mm, pkey);
 
return 0;
-- 
1.7.1



[RFC v7 04/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers

2017-07-30 Thread Ram Pai
Introduce helper functions that initialize the bits in the AMR,
IAMR and UAMOR registers that correspond to the given pkey.
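
The keys are packed from the top of the register down, two bits
each (see pkeyshift() below): pkeyshift(0) = 64 - 2 = 62, so key 0
owns bits 62-63, while pkeyshift(31) = 64 - 64 = 0, so key 31 owns
bits 0-1. init_amr()/init_iamr() mask out exactly that 2-bit field
and OR in the new value.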

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |1 +
 arch/powerpc/mm/pkeys.c  |   46 ++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index def385f..f1c55e0 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -25,6 +25,7 @@
 #define arch_max_pkey()  pkeys_total
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
VM_PKEY_BIT3 | VM_PKEY_BIT4)
+#define AMR_BITS_PER_PKEY 2
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 37dacc5..846d984 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -18,3 +18,49 @@
 bool pkey_inited;
 int  pkeys_total;  /* total pkeys as per device tree */
 u32  initial_allocation_mask;  /* bits set for reserved keys */
+
+#define PKEY_REG_BITS (sizeof(u64)*8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+
+static inline void init_amr(int pkey, u8 init_bits)
+{
+   u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+   u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+   write_amr(old_amr | new_amr_bits);
+}
+
+static inline void init_iamr(int pkey, u8 init_bits)
+{
+   u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+   u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+   write_iamr(old_iamr | new_iamr_bits);
+}
+
+static void pkey_status_change(int pkey, bool enable)
+{
+   u64 old_uamor;
+
+   /* reset the AMR and IAMR bits for this key */
+   init_amr(pkey, 0x0);
+   init_iamr(pkey, 0x0);
+
+   /* enable/disable key */
+   old_uamor = read_uamor();
+   if (enable)
+   old_uamor |= (0x3ul << pkeyshift(pkey));
+   else
+   old_uamor &= ~(0x3ul << pkeyshift(pkey));
+   write_uamor(old_uamor);
+}
+
+void __arch_activate_pkey(int pkey)
+{
+   pkey_status_change(pkey, true);
+}
+
+void __arch_deactivate_pkey(int pkey)
+{
+   pkey_status_change(pkey, false);
+}
-- 
1.7.1



[RFC v7 03/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers

2017-07-30 Thread Ram Pai
Implement helper functions to read and write the key-related
registers: AMR, IAMR, UAMOR.

The AMR register tracks the read/write permission of a key.
The IAMR register tracks the execute permission of a key.
The UAMOR register enables and disables a key.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 85bc987..d4da0e9 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -428,6 +428,32 @@ static inline void huge_ptep_set_wrprotect(struct 
mm_struct *mm,
pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
 }
 
+#include 
+static inline u64 read_amr(void)
+{
+   return mfspr(SPRN_AMR);
+}
+static inline void write_amr(u64 value)
+{
+   mtspr(SPRN_AMR, value);
+}
+static inline u64 read_iamr(void)
+{
+   return mfspr(SPRN_IAMR);
+}
+static inline void write_iamr(u64 value)
+{
+   mtspr(SPRN_IAMR, value);
+}
+static inline u64 read_uamor(void)
+{
+   return mfspr(SPRN_UAMOR);
+}
+static inline void write_uamor(u64 value)
+{
+   mtspr(SPRN_UAMOR, value);
+}
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
-- 
1.7.1



[RFC v7 02/25] powerpc: track allocation status of all pkeys

2017-07-30 Thread Ram Pai
A total of 32 keys are available on power7 and above. However,
pkeys 0 and 1 are reserved, so effectively we have 30 pkeys.

On 4K kernels, we do not have 5 bits in the PTE to represent
all 32 keys; we only have 3 bits, i.e. 8 keys. Two of those
are reserved (pkey 0 and pkey 1), so effectively we have
6 pkeys.

This patch keeps track of reserved keys, allocated keys and
keys that are currently free.

It also adds the skeletal functions and macros that the
architecture-independent code expects to be available.
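
A quick walk-through of the bookkeeping, assuming keys 0 and 1 are
the reserved ones (so initial_allocation_mask = 0x3): pkey_mm_init()
starts the per-mm map at 0x3; the first mm_pkey_alloc() computes
ffz(0x3) = 2, marks the map as 0x7 and hands out pkey 2; a later
mm_pkey_free(mm, 2) clears that bit again. Since
mm_pkey_is_allocated() reports reserved keys as never 'explicitly
allocated', they can never be freed by mistake.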

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |9 +++
 arch/powerpc/include/asm/mmu_context.h   |1 +
 arch/powerpc/include/asm/pkeys.h |   98 -
 arch/powerpc/mm/mmu_context_book3s64.c   |2 +
 arch/powerpc/mm/pkeys.c  |2 +
 5 files changed, 108 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 77529a3..104ad72 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -108,6 +108,15 @@ struct patb_entry {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
struct list_head iommu_group_mem_list;
 #endif
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+   /*
+* Each bit represents one protection key.
+* bit set   -> key allocated
+* bit unset -> key available for allocation
+*/
+   u32 pkey_allocation_map;
+#endif
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 4b93547..4705dab 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -184,6 +184,7 @@ static inline bool arch_vma_access_permitted(struct 
vm_area_struct *vma,
 
 #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
+#define pkey_mm_init(mm)
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 4ccb8f5..def385f 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -2,6 +2,8 @@
 #define _ASM_PPC64_PKEYS_H
 
 extern bool pkey_inited;
+extern int pkeys_total; /* total pkeys as per device tree */
+extern u32 initial_allocation_mask;/* bits set for reserved keys */
 
 /*
  * powerpc needs an additional vma bit to support 32 keys.
@@ -20,21 +22,76 @@
 #define VM_PKEY_BIT4   VM_HIGH_ARCH_4
 #endif
 
-#define ARCH_VM_PKEY_FLAGS 0
+#define arch_max_pkey()  pkeys_total
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+   VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+#define pkey_alloc_mask(pkey) (0x1 << pkey)
+
+#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
+
+#define mm_set_pkey_allocated(mm, pkey) {  \
+   mm_pkey_allocation_map(mm) |= pkey_alloc_mask(pkey); \
+}
+
+#define mm_set_pkey_free(mm, pkey) {   \
+   mm_pkey_allocation_map(mm) &= ~pkey_alloc_mask(pkey);   \
+}
+
+#define mm_set_pkey_is_allocated(mm, pkey) \
+   (mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
+
+#define mm_set_pkey_is_reserved(mm, pkey) (initial_allocation_mask & \
+   pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-   return (pkey == 0);
+   /* a reserved key is never considered as 'explicitly allocated' */
+   return ((pkey < arch_max_pkey()) &&
+   !mm_set_pkey_is_reserved(mm, pkey) &&
+   mm_set_pkey_is_allocated(mm, pkey));
 }
 
+/*
+ * Returns a positive, 5-bit key on success, or -1 on failure.
+ */
 static inline int mm_pkey_alloc(struct mm_struct *mm)
 {
-   return -1;
+   /*
+* Note: this is the one and only place we make sure
+* that the pkey is valid as far as the hardware is
+* concerned.  The rest of the kernel trusts that
+* only good, valid pkeys come out of here.
+*/
+   u32 all_pkeys_mask = (u32)(~(0x0));
+   int ret;
+
+   if (!pkey_inited)
+   return -1;
+   /*
+* Are we out of pkeys?  We must handle this specially
+* because ffz() behavior is undefined if there are no
+* zeros.
+*/
+   if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+   return -1;
+
+   ret = ffz((u32)mm_pkey_allocation_map(mm));
+   mm_set_pkey_allocated(mm, ret);
+   return ret;
 }
 
 static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 {
-   return -EINVAL;
+   if (!pkey_inited)
+   return -1;
+
+   if (!mm_pkey_is_allocated(mm, pkey))
+   return -EINVAL;
+
+   mm_set_pkey_free(mm, pkey);
+
+   return 0;
 }
 
 /*
@@ -58,12 +115,45 @@ static inline int arch_set_user_pkey_access(struct 
task_struct *tsk, int pkey,
return 0;
 }
 
+static inline 

[RFC v7 01/25] powerpc: define an additional vma bit for protection keys.

2017-07-30 Thread Ram Pai
powerpc needs an additional vma bit to support 32 keys.
Until that bit lands in include/linux/mm.h we have to
define it in a powerpc-specific header file. This is
needed to get pkeys working on power.

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/pkeys.h |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index acad7c2..4ccb8f5 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -2,6 +2,24 @@
 #define _ASM_PPC64_PKEYS_H
 
 extern bool pkey_inited;
+
+/*
+ * powerpc needs an additional vma bit to support 32 keys.
+ * Till the additional vma bit lands in include/linux/mm.h
+ * we have to carry the hunk below. This is  needed to get
+ * pkeys working on power. -- Ram
+ */
+#ifndef VM_HIGH_ARCH_BIT_4
+#define VM_HIGH_ARCH_BIT_4 36
+#define VM_HIGH_ARCH_4 BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
+#define VM_PKEY_BIT0   VM_HIGH_ARCH_0
+#define VM_PKEY_BIT1   VM_HIGH_ARCH_1
+#define VM_PKEY_BIT2   VM_HIGH_ARCH_2
+#define VM_PKEY_BIT3   VM_HIGH_ARCH_3
+#define VM_PKEY_BIT4   VM_HIGH_ARCH_4
+#endif
+
 #define ARCH_VM_PKEY_FLAGS 0
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
-- 
1.7.1



[RFC v7 00/25] powerpc: Memory Protection Keys

2017-07-30 Thread Ram Pai
Memory protection keys enable an application to protect its
address space from inadvertent access or corruption by
itself.

The overall idea:
-----------------
 A process allocates a key and associates it with an
 address range within its address space. The process
 can then dynamically set read/write permissions on
 the key without involving the kernel. Any code that
 violates the permissions of the address range, as
 defined by its associated key, will receive a
 segmentation fault.

This patch series enables the feature on the PPC64 HPTE
platform.

ISA 3.0 section 5.7.13 describes the detailed
specification.


High-level view of the design:
------------------------------
When an application associates a key with an address
range, program the key into the Linux PTE. When the
MMU detects a page fault, allocate a hash page and
program the key into the HPTE. And finally, when the
MMU detects a key violation due to an invalid
application access, invoke the registered signal
handler and provide the violated key number as well
as the state of the key register (AMR) at the time
of the fault.


Testing:
--------
This patch series has passed all the protection key
tests available in the selftests directory. The
tests are updated to work on both x86 and powerpc.
NOTE: all the selftest-related patches will be part
of a separate patch series.


Outstanding issues:
-------------------
How will the application know if pkeys are enabled,
and if so, how many pkeys are available? Is
PKEY_DISABLE_EXECUTE supported?  - Ben.


History:
--------
version v7:
	(1) refers to a device tree property to enable
	    protection keys.
	(2) adds 4K PTE support.
	(3) fixes a couple of bugs noticed by Thiago.
	(4) decouples this patch series from arch-
	    independent code. This patch series can
	    now stand by itself, with one kludge
	    patch (2).

version v6:
	(1) selftest changes are broken down into 20
	    incremental patches.
	(2) a separate key allocation mask that
	    includes PKEY_DISABLE_EXECUTE is added
	    for powerpc.
	(3) the pkey feature is enabled for the 64K HPT
	    case only; RPT and 4K HPT are disabled.
	(4) documentation is updated to better
	    capture the semantics.
	(5) introduced arch_pkeys_enabled() to find
	    out whether an arch enables pkeys, and
	    correspondingly changed the logic that
	    displays the key value in smaps.
	(6) code rearranged in many places based on
	    comments from Dave Hansen, Balbir,
	    Anshuman.
	(7) fixed one bug where a bogus key could be
	    associated successfully in
	    pkey_mprotect().

version v5:
	(1) reverted back to the old design -- store
	    the key in the pte, instead of bypassing
	    it. The v4 design slowed down the hash
	    page path.
	(2) detects key violation when the kernel is
	    told to access user pages.
	(3) further refined the patches into smaller
	    consumable units.
	(4) page fault handlers capture the faulting
	    key from the pte instead of the vma. This
	    closes a race between the key update in
	    the vma and a key fault caused by the key
	    programmed in the pte.
	(5) a key created as access-denied should also
	    be set up to deny write. Fixed it.
	(6) the protection-key number is displayed in
	    smaps the x86 way.

version v4:
	(1) patches no longer depend on the pte bits
	    to program the hpte
		-- comment by Balbir
	(2) documentation updates.
	(3) fixed a bug in the selftest.
	(4) unlike x86, powerpc lets the signal handler
	    change key permission bits; the change
	    will persist across signal handler
	    boundaries. Earlier we allowed the signal
	    handler to modify a field in the siginfo
	    structure which would then be used by the
	    kernel to program the key protection
	    register (AMR).
		-- resolves an issue raised by Ben:
		"Calls to sys_swapcontext with a
		made-up context will end up with a
		crap AMR if done by code who didn't
		know about that register".
	(5) these changes enable protection keys on
	    the 4k-page kernel as well.

version v3:
	(1) split the patches into smaller consumable
	    patches.
	(2) added the ability to disable execute
[PATCH 7/7] powerpc: use helper functions to get and set hash slots

2017-07-30 Thread Ram Pai
Replace redundant code in __hash_page_64K(), __hash_page_huge(),
both variants of __hash_page_4K(), and flush_hash_page() with the
helper functions pte_get_hash_gslot() and pte_set_hash_slot().

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/hash64_4k.c  |   14 +++-
 arch/powerpc/mm/hash64_64k.c |   58 +++--
 arch/powerpc/mm/hash_utils_64.c  |   13 ++-
 arch/powerpc/mm/hugetlbpage-hash64.c |   28 ++--
 4 files changed, 27 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 6fa450c..a1eebc1 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -20,6 +20,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
   pte_t *ptep, unsigned long trap, unsigned long flags,
   int ssize, int subpg_prot)
 {
+   real_pte_t rpte;
unsigned long hpte_group;
unsigned long rflags, pa;
unsigned long old_pte, new_pte;
@@ -54,6 +55,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
 * need to add in 0x1 if it's a read-only user page
 */
rflags = htab_convert_pte_flags(new_pte);
+   rpte = __real_pte(__pte(old_pte), ptep);
 
if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -64,13 +66,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
/*
 * There MIGHT be an HPTE for this pte
 */
-   hash = hpt_hash(vpn, shift, ssize);
-   if (old_pte & H_PAGE_F_SECOND)
-   hash = ~hash;
-   slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-   slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
+   unsigned long gslot = pte_get_hash_gslot(vpn, shift,
+   ssize, rpte, 0);
 
-   if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+   if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_4K,
   MMU_PAGE_4K, ssize, flags) == -1)
old_pte &= ~_PAGE_HPTEFLAGS;
}
@@ -118,8 +117,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
return -1;
}
new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-   new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-   (H_PAGE_F_SECOND | H_PAGE_F_GIX);
+   new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
}
*ptep = __pte(new_pte & ~H_PAGE_BUSY);
return 0;
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index e922a70..6c1c87a 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -39,9 +39,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
 {
real_pte_t rpte;
unsigned long hpte_group;
-   unsigned long *hidxp;
unsigned int subpg_index;
-   unsigned long rflags, pa, hidx;
+   unsigned long rflags, pa;
unsigned long old_pte, new_pte, subpg_pte;
unsigned long vpn, hash, slot, gslot;
unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
@@ -114,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long 
access, unsigned long vsid,
if (__rpte_sub_valid(rpte, subpg_index)) {
int ret;
 
-   hash = hpt_hash(vpn, shift, ssize);
-   hidx = __rpte_to_hidx(rpte, subpg_index);
-   if (hidx & _PTEIDX_SECONDARY)
-   hash = ~hash;
-   slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-   slot += hidx & _PTEIDX_GROUP_IX;
-
-   ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
+   gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
+   subpg_index);
+   ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
 MMU_PAGE_4K, MMU_PAGE_4K,
 ssize, flags);
/*
-*if we failed because typically the HPTE wasn't really here
+* if we failed because typically the HPTE wasn't really here
 * we try an insertion.
 */
if (ret == -1)
@@ -221,20 +215,10 @@ int __hash_page_4K(unsigned long ea, unsigned long 
access, unsigned long vsid,
   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
return -1;
}
-   /*
-* Insert slot number & secondary bit in PTE second half,
-* clear H_PAGE_BUSY and 

[PATCH 6/7] powerpc: introduce pte_get_hash_gslot() helper

2017-07-30 Thread Ram Pai
Introduce pte_get_hash_gslot(), which returns the slot number of the
HPTE in the global hash table.

This function will come in handy as we work towards re-arranging the
PTE bits in the later patches.
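
A worked example, assuming the usual constants (HPTES_PER_GROUP = 8,
_PTEIDX_SECONDARY = 0x8, _PTEIDX_GROUP_IX = 0x7): if __rpte_to_hidx()
yields hidx = 0xb, the HPTE sits in the secondary bucket (bit 0x8 is
set, so the hash is inverted) at index 3 within its group, giving

    gslot = (~hash & htab_hash_mask) * 8 + 3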

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/hash.h |3 +++
 arch/powerpc/mm/hash_utils_64.c   |   18 ++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 509ace1..1b2ac7e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -155,6 +155,9 @@ static inline int hash__pte_none(pte_t pte)
return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
 }
 
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+   int ssize, real_pte_t rpte, unsigned int subpg_index);
+
 /* This low level function performs the actual PTE insertion
  * Setting the PTE depends on the MMU type and other factors. It's
  * an horrible mess that I'm not going to try to clean up now but
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1b494d0..d3604da 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
 }
 #endif
 
+/*
+ * return the global hash slot, corresponding to the given
+ * pte, which contains the hpte.
+ */
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+   int ssize, real_pte_t rpte, unsigned int subpg_index)
+{
+   unsigned long hash, slot, hidx;
+
+   hash = hpt_hash(vpn, shift, ssize);
+   hidx = __rpte_to_hidx(rpte, subpg_index);
+   if (hidx & _PTEIDX_SECONDARY)
+   hash = ~hash;
+   slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+   slot += hidx & _PTEIDX_GROUP_IX;
+   return slot;
+}
+
 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
  *  do not forget to update the assembly call site !
  */
-- 
1.7.1



[PATCH 5/7] powerpc: introduce pte_set_hash_slot() helper

2017-07-30 Thread Ram Pai
Introduce pte_set_hash_slot(). It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX)
bits at the appropriate location in the PTE for the 4K PTE format. For
the 64K PTE format, it sets the bits in the second part of the PTE. Though
the implementation for the former just needs the slot parameter, it does
take some additional parameters to keep the prototype consistent.

This function will be handy as we work towards re-arranging the
bits in the later patches.
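
As an illustration (not part of the patch), the expected call pattern
in a hash fault path looks roughly like this, assuming 'slot' holds the
value returned by the HPTE insertion and the other variables are as in
__hash_page_4K():

	/*
	 * Illustrative sketch only: commit the slot and fold whatever
	 * bits the helper returns into the PTE before publishing it.
	 */
	new_pte |= pte_set_hash_slot(ptep, rpte, subpg_index, slot);
	new_pte |= H_PAGE_HASHPTE;
	*ptep = __pte(new_pte & ~H_PAGE_BUSY);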

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   15 +++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 +
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 778e2f4..5187249 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -54,6 +54,21 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
 }
 #endif
 
+/*
+ * 4k pte format is  different  from  64k  pte  format.  Saving  the
+ * hash_slot is just a matter of returning the pte bits that need to
+ * be modified. On 64k pte, things are a  little  more  involved and
+ * hence  needs   many   more  parameters  to  accomplish  the  same.
+ * However we  want  to abstract this out from the caller by keeping
+ * the prototype consistent across the two formats.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+   unsigned int subpg_index, unsigned long slot)
+{
+   return (slot << H_PAGE_F_GIX_SHIFT) &
+   (H_PAGE_F_SECOND | H_PAGE_F_GIX);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
 static inline char *get_hpte_slot_array(pmd_t *pmdp)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index e83ae86..8576060 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -68,6 +68,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, 
unsigned long index)
return ((rpte.hidx >> (index<<2)) & 0xfUL);
 }
 
+/*
+ * Commit the hash slot and return pte bits that needs to be modified.
+ * The caller is expected to modify the pte bits accordingly and
+ * commit the pte to memory.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+   unsigned int subpg_index, unsigned long slot)
+{
+   unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+
+   rpte.hidx &= ~(0xfUL << (subpg_index << 2));
+   *hidxp = rpte.hidx  | (slot << (subpg_index << 2));
+   /*
+* Commit the hidx bits to memory before returning.
+* Anyone reading  pte  must  ensure hidx bits are
+* read  only  after  reading the pte by using the
+* read-side  barrier  smp_rmb(). __real_pte() can
+* help ensure that.
+*/
+   smp_wmb();
+
+   /* no pte bits to be modified, return 0x0UL */
+   return 0x0UL;
+}
+
 #define __rpte_to_pte(r)   ((r).pte)
 extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 /*
-- 
1.7.1



[PATCH 4/7] powerpc: capture the PTE format changes in the dump pte report

2017-07-30 Thread Ram Pai
The H_PAGE_F_SECOND and H_PAGE_F_GIX bits are no longer in the 64K main PTE.
Capture these changes in the pte dump report.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/mm/dump_linuxpagetables.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/dump_linuxpagetables.c 
b/arch/powerpc/mm/dump_linuxpagetables.c
index 44fe483..5627edd 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -213,7 +213,7 @@ struct flag_info {
.val= H_PAGE_4K_PFN,
.set= "4K_pfn",
}, {
-#endif
+#else /* CONFIG_PPC_64K_PAGES */
.mask   = H_PAGE_F_GIX,
.val= H_PAGE_F_GIX,
.set= "f_gix",
@@ -224,6 +224,7 @@ struct flag_info {
.val= H_PAGE_F_SECOND,
.set= "f_second",
}, {
+#endif /* CONFIG_PPC_64K_PAGES */
 #endif
.mask   = _PAGE_SPECIAL,
.val= _PAGE_SPECIAL,
-- 
1.7.1



[PATCH 3/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6

2017-07-30 Thread Ram Pai
We need PTE bits 3, 4, 5, 6 and 57 to support protection keys,
because these are the bits we want to consolidate on across all
configurations to support protection keys.

Bits 3, 4, 5 and 6 are currently used on 4K-pte kernels, but bits 9
and 10 are available. Hence we use the two available bits and
free up bits 5 and 6. We will still not be able to free up bits 3
and 4. In the absence of any other free bits, we will have to
stay satisfied with what we have :-(. This means we will not
be able to support 32 protection keys, but only 8. The bit
numbers are big-endian as defined in the ISA 3.0.

This patch does the following change to the 4K PTE.

H_PAGE_F_SECOND (S), which occupied bit 4, moves to bit 7.
H_PAGE_F_GIX (G,I,X), which occupied bits 5, 6 and 7, moves
to bits 8, 9 and 10 respectively.
H_PAGE_HASHPTE (H), which occupied bit 8, moves to bit 4.

Before the patch, the 4k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10..................57.....63
 : : : : :  :  :  :  : : :                    :      :
 v v v v v  v  v  v  v v v                    v      v
,-,-,-,-,--,--,--,--,-,-,-,-,-,--------------,-,-,-,-,
|x|x|x|B|S |G |I |X |H| | |x|x|..............| |x|x|x|
'_'_'_'_'__'__'__'__'_'_'_'_'_'______________'_'_'_'_'

After the patch, the 4k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10..................57.....63
 : : : : :  :  :  :  : : :                    :      :
 v v v v v  v  v  v  v v v                    v      v
,-,-,-,-,--,--,--,--,-,-,-,-,-,--------------,-,-,-,-,
|x|x|x|B|H |  |  |S |G|I|X|x|x|..............| |.|.|.|
'_'_'_'_'__'__'__'__'_'_'_'_'_'______________'_'_'_'_'

The patch has no code changes; it just swizzles bits around.
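
For illustration (not part of the patch), recovering the 4-bit hidx
from a 4K PTE under the new layout remains a single mask-and-shift; the
helper name below is hypothetical:

	/*
	 * Illustrative sketch only: H_PAGE_F_SECOND and H_PAGE_F_GIX
	 * are now adjacent, so the group index lands in bits 0..2 of
	 * the extracted value (_PTEIDX_GROUP_IX) and the secondary bit
	 * in bit 3 (_PTEIDX_SECONDARY).
	 */
	static inline unsigned long example_4k_pte_hidx(unsigned long ptev)
	{
		return (ptev & (H_PAGE_F_SECOND | H_PAGE_F_GIX)) >>
			H_PAGE_F_GIX_SHIFT;
	}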

Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |7 ---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |1 +
 arch/powerpc/include/asm/book3s/64/hash.h |1 -
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index d2cf949..778e2f4 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,10 +16,11 @@
 #define H_PUD_TABLE_SIZE   (sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE   (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
-#define H_PAGE_F_GIX_SHIFT 56
-#define H_PAGE_F_SECOND    _RPAGE_RSV2 /* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX       (_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
+#define H_PAGE_F_GIX_SHIFT 53
+#define H_PAGE_F_SECOND    _RPAGE_RPN44 /* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX       (_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
 #define H_PAGE_BUSY        _RPAGE_RSV1 /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE     _RPAGE_RSV2 /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index c281f18..e83ae86 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -13,6 +13,7 @@
 #define H_PAGE_COMBO   _RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN  _RPAGE_RPN1 /* PFN is for a single 4k page */
 #define H_PAGE_BUSY        _RPAGE_RPN44 /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE     _RPAGE_RPN43 /* PTE has associated HPTE */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index d27f885..509ace1 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -8,7 +8,6 @@
  *
  */
 #define H_PTE_NONE_MASK    _PAGE_HPTEFLAGS
-#define H_PAGE_HASHPTE     _RPAGE_RPN43 /* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include 
-- 
1.7.1



[PATCH 2/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages

2017-07-30 Thread Ram Pai
Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
in the 64K backed HPTE pages. This, along with the earlier
patch, will entirely free up the four bits from the 64K PTE.
The bit numbers are big-endian as defined in the ISA 3.0.

This patch does the following change to 64K PTEs backed
by 64K HPTEs.

H_PAGE_F_SECOND (S), which occupied bit 4, moves to the
second part of the pte, to bit 60.
H_PAGE_F_GIX (G,I,X), which occupied bits 5, 6 and 7, also
moves to the second part of the pte, to bits 61,
62 and 63 respectively.

Since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
bit 9 to bit 7.

The second part of the PTE will hold
(H_PAGE_F_SECOND|H_PAGE_F_GIX) at bits 60, 61, 62 and 63.
NOTE: none of the bits in the secondary PTE were in use
by the 64K-HPTE backed PTE before this change.

Before the patch, the 64K HPTE backed 64k PTE format was
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                             :
 v v v v v  v  v  v  v v v                             v

,-,-,-,-,--,--,--,--,-,-,-,-,-,-----------------,-,-,-,-,
|x|x|x| |S |G |I |X |x|B| |x|x|.................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'_________________'_'_'_'_'
| | | | |  |  |  |  | | | | |...................| | | | | <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'___________________'_'_'_'_'

After the patch, the 64k HPTE backed 64k PTE format is
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                             :
 v v v v v  v  v  v  v v v                             v

,-,-,-,-,--,--,--,--,-,-,-,-,-,-----------------,-,-,-,-,
|x|x|x| |  |  |  |B |x| | |x|x|.................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'_________________'_'_'_'_'
| | | | |  |  |  |  | | | | |...................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'___________________'_'_'_'_'

The above PTE changes apply to hugetlb pages as well.

The patch does the following code changes:

a) moves H_PAGE_F_SECOND and H_PAGE_F_GIX to the 4K PTE
header since they are no longer needed by the 64K PTEs.
b) abstracts out __real_pte() and __rpte_to_hidx() so the
caller need not know the bit location of the slot (see the
sketch after this list).
c) moves the slot bits to the secondary pte.
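
A minimal sketch (not part of the patch) of the reader side after this
change:

	/*
	 * Illustrative sketch only: __real_pte() issues the smp_rmb()
	 * pairing with the writer's smp_wmb(), and __rpte_to_hidx()
	 * extracts the 4-bit hidx without the caller knowing which
	 * half of the PTE the slot bits live in.
	 */
	real_pte_t rpte = __real_pte(*ptep, ptep);
	unsigned long hidx = __rpte_to_hidx(rpte, subpg_index);

	if (hidx & _PTEIDX_SECONDARY)
		hash = ~hash;	/* HPTE lives in the secondary bucket */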

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |3 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +---
 arch/powerpc/include/asm/book3s/64/hash.h |3 --
 arch/powerpc/mm/hash64_64k.c  |   34 +---
 arch/powerpc/mm/hugetlbpage-hash64.c  |   26 +++---
 5 files changed, 61 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f959c00..d2cf949 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,9 @@
 #define H_PUD_TABLE_SIZE   (sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE   (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_F_GIX_SHIFT 56
+#define H_PAGE_F_SECOND    _RPAGE_RSV2 /* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX       (_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_BUSY        _RPAGE_RSV1 /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 62e580c..c281f18 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,7 +12,7 @@
  */
 #define H_PAGE_COMBO   _RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN  _RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY        _RPAGE_RPN42 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY        _RPAGE_RPN44 /* software: PTE & hash are busy */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
@@ -21,8 +21,7 @@
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
-H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)
 /*
  * we support 16 fragments per PTE page of 64K size.
  */
@@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
unsigned long *hidxp;
 
rpte.pte = pte;
-   rpte.hidx = 0;
-   if (pte_val(pte) & H_PAGE_COMBO) {
-   /*
-* Make sure we order the hidx load against the H_PAGE_COMBO
-* check. The store side ordering is done in __hash_page_4K
-*/
-   smp_rmb();
-   hidxp = (unsigned long *)(ptep + 

[PATCH 1/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages

2017-07-30 Thread Ram Pai
Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
in the 4K backed HPTE pages. These bits continue to be used
for 64K backed HPTE pages in this patch, but will be freed
up in the next patch. The bit numbers are big-endian as
defined in the ISA 3.0.

The patch does the following change to the 4K HPTE backed
64K PTE format.

H_PAGE_BUSY moves from bit 3 to bit 9 (the B bit in the figure
below).
V0, which occupied bit 4, is not used anymore.
V1, which occupied bit 5, is not used anymore.
V2, which occupied bit 6, is not used anymore.
V3, which occupied bit 7, is not used anymore.

Before the patch, the 4k backed 64k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                             :
 v v v v v  v  v  v  v v v                             v

,-,-,-,-,--,--,--,--,-,-,-,-,-,-----------------,-,-,-,-,
|x|x|x|B|V0|V1|V2|V3|x| | |x|x|.................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'_________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|...................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'___________________'_'_'_'_'

After the patch, the 4k backed 64k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                             :
 v v v v v  v  v  v  v v v                             v

,-,-,-,-,--,--,--,--,-,-,-,-,-,-----------------,-,-,-,-,
|x|x|x| |  |  |  |  |x|B| |x|x|.................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'_________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|...................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'___________________'_'_'_'_'

The four bits S,G,I,X (one quadruplet per 4k HPTE) that
cache the hash-bucket slot value are initialized to
1,1,1,1, indicating an invalid slot. If an HPTE gets
cached in a 0xF slot (i.e. the 7th slot of the secondary
hash bucket), it is released immediately. In other words,
even though 0xF is a valid slot value in the hash
bucket, we consider it invalid and release the slot and
the HPTE. This gives us the opportunity to determine
the validity of the S,G,I,X bits based on their contents and
not on any of the bits V0, V1, V2 or V3 in the primary PTE.

When we release an HPTE cached in the 0xF slot,
we also release a legitimate slot in the primary
hash bucket and unmap its corresponding HPTE. This
ensures that we do get an HPTE cached in a slot
of the primary hash bucket the next time we retry.

Though treating the 0xF slot as invalid reduces the
number of available slots in the hash bucket and may
have an effect on performance, the probability of
hitting a 0xF slot is extremely low.
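
A minimal sketch (not part of the patch) of the resulting validity
test; the macro name is hypothetical:

	/* hypothetical name for the all-ones quadruplet */
	#define EXAMPLE_HIDX_INVALID	0xfUL

	/*
	 * Illustrative sketch only: with every S,G,I,X quadruplet
	 * initialized to 1,1,1,1, slot validity is derived from the
	 * hidx contents alone, with no need for the old V0..V3 bits.
	 */
	static inline bool example_subpg_slot_valid(unsigned long hidx)
	{
		return hidx != EXAMPLE_HIDX_INVALID;
	}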

Compared to the current scheme, the above described
scheme reduces the number of false hash table updates
significantly and has the added advantage of
releasing four valuable PTE bits for other purposes.

NOTE: even though bits 3, 4, 5, 6 and 7 are not used when
the 64K PTE is backed by a 4K HPTE, they continue to be
used if the PTE gets backed by a 64K HPTE. The next
patch will decouple that as well, and truly release the
bits.

This idea was jointly developed by Paul Mackerras,
Aneesh, Michael Ellerman and myself.

4K PTE format remains unchanged currently.

The patch does the following code changes:
a) PTE flags are split between the 64K and 4K header files.
b) __hash_page_4K() is reimplemented to reflect the
   above logic.

Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Ram Pai 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |2 +
 arch/powerpc/include/asm/book3s/64/hash-64k.h |8 +--
 arch/powerpc/include/asm/book3s/64/hash.h |1 -
 arch/powerpc/mm/hash64_64k.c  |   74 -
 arch/powerpc/mm/hash_utils_64.c   |4 +-
 5 files changed, 55 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 0c4e470..f959c00 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,8 @@
 #define H_PUD_TABLE_SIZE   (sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE   (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_BUSY        _RPAGE_RSV1 /* software: PTE & hash are busy */
+
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
 H_PAGE_F_SECOND | H_PAGE_F_GIX)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 9732837..62e580c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,18 +12,14 @@
  */
 #define H_PAGE_COMBO   _RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN  

[PATCH 0/7] powerpc: Free up RPAGE_RSV bits

2017-07-30 Thread Ram Pai
RPAGE_RSV0..4 pte bits are currently used for hpte slot
tracking. We need these bits for memory-protection
keys. Luckily these four bits are relatively easy
to move among all the other candidate bits.

For 64K linux-ptes backed by 4K hptes, these bits
are used for tracking the validity of the slot value
stored in the second-part-of-the-pte. We devise a new
mechanism for tracking the validity without using
those bits. The mechanism is explained in the first
patch.

For 64K linux-ptes backed by 64K hptes, we simply move
the slot tracking bits to the second-part-of-the-pte.

The above mechanism is also used to free the bits for
hugetlb linux-ptes.

For 4K linux-ptes, we have only 3 free bits available.
We swizzle around the bits and release RPAGE_RSV{2,3,4}
for memory protection keys.

Testing:

Has survived kernel compilation on multiple platforms:
p8 powernv hash-mode, p9 powernv hash-mode, p7 powervm,
p8-powervm, p8-kvm-guest.

Has survived git-bisect on p8 powernv with 64K page
and 4K page configurations.

History:
---
This patchset is a spin-off from the memkey patchset.

version v8:
	(1) an additional patch added to free up
	    RSV{2,3,4} on 4K linux-pte.

version v7:
	(1) GIX bit reset change moved to the second
	    patch -- noticed by Aneesh.
	(2) Separated these patches from the memkey patchset.
	(3) Merged a bunch of patches that used the
	    helper function into one.

version v6:
	(1) No changes related to pte.

version v5:
	(1) No changes related to pte.

version v4:
	(1) No changes related to pte.

version v3:
	(1) Split the patches into smaller consumable
	    patches.
	(2) A bug fix while invalidating a hpte slot
	    in __hash_page_4K()
	    -- noticed by Aneesh

version v2:
	(1) Fixed a bug in 4K hpte backed 64K pte
	    where page invalidation was not
	    done correctly, and initialization
	    of the second-part-of-the-pte was not
	    done correctly if the pte was not
	    yet hashed with a hpte.
	    -- Reported by Aneesh.

version v1: Initial version

Ram Pai (7):
  powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  powerpc: capture the PTE format changes in the dump pte report
  powerpc: introduce pte_set_hash_slot() helper
  powerpc: introduce pte_get_hash_gslot() helper
  powerpc: use helper functions to get and set hash slots

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   21 
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   61 
 arch/powerpc/include/asm/book3s/64/hash.h |8 +-
 arch/powerpc/mm/dump_linuxpagetables.c|3 +-
 arch/powerpc/mm/hash64_4k.c   |   14 +--
 arch/powerpc/mm/hash64_64k.c  |  124 +
 arch/powerpc/mm/hash_utils_64.c   |   35 +--
 arch/powerpc/mm/hugetlbpage-hash64.c  |   16 +--
 8 files changed, 167 insertions(+), 115 deletions(-)



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-30 Thread Paul E. McKenney
On Sun, Jul 30, 2017 at 09:37:47PM +0800, Boqun Feng wrote:
> On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> > 
> > [ . . . ]
> > 
> > > > Even though Jonathan's testing indicates that it didn't fix this
> > > > particular problem:
> > > > 
> > > > Acked-by: Paul E. McKenney 
> > > 
> > > And while we are at it:
> > > 
> > > Tested-by: Paul E. McKenney 
> > 
> > Not because it fixed the TREE01 issue -- it did not.  But as near
> > as I can see, it didn't cause any additional issues.
> > 
> 
> Understood.
> 
> Still working on waketorture for a test case which could trigger this
> problem in the real world. My old plan was to send this out once I could
> use waketorture to show this patch actually resolves some potential
> bugs, but I put it ahead here in case it may help.
> 
> Will send it out with your Tested-by and Acked-by and continue to work
> on waketorture.

Sounds good!

Given that Jonathan's traces didn't show a timer expiration, the problems
might be in timers.

Thanx, Paul



Re: [PATCH v3 7/7] ima: Support module-style appended signatures for appraisal

2017-07-30 Thread Mimi Zohar
On Thu, 2017-07-06 at 19:17 -0300, Thiago Jung Bauermann wrote:
> This patch introduces the modsig keyword to the IMA policy syntax to
> specify that a given hook should expect the file to have the IMA signature
> appended to it. Here is how it can be used in a rule:
> 
> appraise func=KEXEC_KERNEL_CHECK appraise_type=modsig|imasig
> 
> With this rule, IMA will accept either an appended signature or a signature
> stored in the extended attribute. In that case, it will first check whether
> there is an appended signature, and if not it will read it from the
> extended attribute.
> 
> The format of the appended signature is the same used for signed kernel
> modules. This means that the file can be signed with the scripts/sign-file
> tool, with a command line such as this:
> 
> $ sign-file sha256 privkey_ima.pem x509_ima.der vmlinux
> 
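For context, a minimal user-space sketch -- independent of this patch
and of any kernel API -- that detects such an appended signature by the
well-known marker string scripts/sign-file places at the end of the
file:

	#include <string.h>

	/* Marker appended by scripts/sign-file (MODULE_SIG_STRING). */
	static const char magic[] = "~Module signature appended~\n";

	/* Illustrative sketch only: does buf end with the marker? */
	static int has_appended_modsig(const unsigned char *buf, size_t len)
	{
		size_t mlen = sizeof(magic) - 1;

		return len >= mlen && !memcmp(buf + len - mlen, magic, mlen);
	}
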
> This code only works for files that are hashed from a memory buffer, not
> for files that are read from disk at the time of hash calculation. In other
> words, only hooks that use kernel_read_file can support appended
> signatures. This means that only FIRMWARE_CHECK, KEXEC_KERNEL_CHECK,
> KEXEC_INITRAMFS_CHECK and POLICY_CHECK can be supported.
> 
> This feature warrants a separate config option because enabling it brings
> in many other config options.
> 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  security/integrity/ima/Kconfig|  13 +++
>  security/integrity/ima/Makefile   |   1 +
>  security/integrity/ima/ima.h  |  60 ++--
>  security/integrity/ima/ima_appraise.c | 102 ++---
>  security/integrity/ima/ima_main.c |   7 +-
>  security/integrity/ima/ima_modsig.c   | 147 
> ++
>  security/integrity/ima/ima_policy.c   |  26 --
>  security/integrity/ima/ima_template_lib.c |  14 ++-
>  security/integrity/integrity.h|   4 +-
>  9 files changed, 343 insertions(+), 31 deletions(-)
> 
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index 35ef69312811..55f734a6124b 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -163,6 +163,19 @@ config IMA_APPRAISE_BOOTPARAM
> This option enables the different "ima_appraise=" modes
> (eg. fix, log) from the boot command line.
> 
> +config IMA_APPRAISE_MODSIG
> + bool "Support module-style signatures for appraisal"
> + depends on IMA_APPRAISE
> + depends on INTEGRITY_ASYMMETRIC_KEYS
> + select PKCS7_MESSAGE_PARSER
> + select MODULE_SIG_FORMAT
> + default n
> + help
> +Adds support for signatures appended to files. The format of the
> +appended signature is the same used for signed kernel modules.
> +The modsig keyword can be used in the IMA policy to allow a hook
> +to accept such signatures.
> +
>  config IMA_TRUSTED_KEYRING
>   bool "Require all keys on the .ima keyring be signed (deprecated)"
>   depends on IMA_APPRAISE && SYSTEM_TRUSTED_KEYRING
> diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
> index 29f198bde02b..c72026acecc3 100644
> --- a/security/integrity/ima/Makefile
> +++ b/security/integrity/ima/Makefile
> @@ -8,5 +8,6 @@ obj-$(CONFIG_IMA) += ima.o
>  ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
>ima_policy.o ima_template.o ima_template_lib.o
>  ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
> +ima-$(CONFIG_IMA_APPRAISE_MODSIG) += ima_modsig.o
>  ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
>  obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index d52b487ad259..1e1e7c41ca19 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -190,6 +190,8 @@ enum ima_hooks {
>   __ima_hooks(__ima_hook_enumify)
>  };
> 
> +extern const char *const func_tokens[];
> +
>  /* LIM API function definitions */
>  int ima_get_action(struct inode *inode, int mask,
>  enum ima_hooks func, int *pcr);
> @@ -236,9 +238,10 @@ int ima_policy_show(struct seq_file *m, void *v);
>  #ifdef CONFIG_IMA_APPRAISE
>  int ima_appraise_measurement(enum ima_hooks func,
>struct integrity_iint_cache *iint,
> -  struct file *file, const unsigned char *filename,
> -  struct evm_ima_xattr_data *xattr_value,
> -  int xattr_len, int opened);
> +  struct file *file, const void *buf, loff_t size,
> +  const unsigned char *filename,
> +  struct evm_ima_xattr_data **xattr_value,
> +  int *xattr_len, int opened);
>  int ima_must_appraise(struct inode *inode, int mask, enum ima_hooks func);
>  void ima_update_xattr(struct integrity_iint_cache *iint, struct file *file);
>  enum integrity_status 

Linux 4.13: Reported regressions as of Sunday, 2017-07-30

2017-07-30 Thread Thorsten Leemhuis
Hi! Find below my first regression report for Linux 4.13. It lists 8
regressions I'm currently aware of (a few others I had on my list got
fixed in the past few days). You can also find it at
http://bit.ly/lnxregrep413 where I try to update it every now and then.

As always: Are you aware of any other regressions? Then please let me
know. For details see http://bit.ly/lnxregtrackid
And please tell me if there is anything in the report that shouldn't be
there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports or provided
other input, it makes compiling these reports a whole lot easier!

== Current regressions ==

[x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
2017-07-10 http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
Due to https://git.kernel.org/torvalds/c/e585513b76

Null dereference in rt5677_i2c_probe()
2017-07-17 lr#96bd63 https://bugzilla.kernel.org/show_bug.cgi?id=196397
Due to https://git.kernel.org/torvalds/c/a36afb0ab6
Status: Takashi proposed a patch that fixes the issue
Latest discussion: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
(2017-07-17)

[I945GM] Pasted text not shown after mouse middle-click
2017-07-17 lr#d672f3 https://bugs.freedesktop.org/show_bug.cgi?id=101819
Status: could not be reproduced yet
Notes: related to the regression that was fixed rc2+
https://bugs.freedesktop.org/show_bug.cgi?id=101790

[Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
keyboard
2017-07-24 lr#bd29ab https://bugzilla.kernel.org/show_bug.cgi?id=196459
Due to https://git.kernel.org/torvalds/c/33e4f80ee6
Status: it's a tracking bug; looks like the issue is handled by Intel devs
already
Notes: suspend-to-idle is rare

[lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail
2017-07-26 lr#a7d273 https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
Due to https://git.kernel.org/torvalds/c/28785f70ef

Dell XPS 13 9360: Touchscreen does not report events
2017-07-28 lr#fe68bb https://bugzilla.kernel.org/show_bug.cgi?id=196519
Status: afaics waiting to get forwarded to linux-usb by reporter
Notes: might be the same as
https://bugzilla.kernel.org/show_bug.cgi?id=196431

Xen HVM guest with KASLR enabled wouldn't boot any longer
2017-07-28 https://lkml.kernel.org/r/20170728102314.29100-1-jgr...@suse.com
Status: WIP, patches up for review

NULL pointer deref in networking
2017-07-29 lr#084be9 https://bugzilla.kernel.org/show_bug.cgi?id=196529
Status: told reporter he might be better off posting to netdev


Re: Linux 4.13: Reported regressions as of Sunday, 2017-07-30

2017-07-30 Thread Andy Shevchenko
On Sun, Jul 30, 2017 at 4:49 PM, Thorsten Leemhuis wrote:
> Hi! Find below my first regression report for Linux 4.13. It lists 8
> regressions I'm currently aware of (a few others I had on my list got
> fixed in the past few days). You can also find it at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
>
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid
> And please tell me if there is anything in the report that shouldn't be
> there.

> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!

> Null dereference in rt5677_i2c_probe()
> 2017-07-17 lr#96bd63 https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Due to https://git.kernel.org/torvalds/c/a36afb0ab6

> Status: Takashi proposed a patch that fixes the issue

The status is outdated as per the discussion; the patch is available upstream
as commit ddc9e69b9dc2.

> Latest discussion: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6
> (2017-07-17)

...while this one is the correct link.

-- 
With Best Regards,
Andy Shevchenko


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-07-30 Thread Boqun Feng
On Fri, Jul 28, 2017 at 12:09:56PM -0700, Paul E. McKenney wrote:
> On Fri, Jul 28, 2017 at 11:41:29AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 28, 2017 at 07:55:30AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jul 28, 2017 at 08:54:16PM +0800, Boqun Feng wrote:
> 
> [ . . . ]
> 
> > > Even though Jonathan's testing indicates that it didn't fix this
> > > particular problem:
> > > 
> > > Acked-by: Paul E. McKenney 
> > 
> > And while we are at it:
> > 
> > Tested-by: Paul E. McKenney 
> 
> Not because it fixed the TREE01 issue -- it did not.  But as near
> as I can see, it didn't cause any additional issues.
> 

Understood.

Still working on waketorture for a test case which could trigger this
problem in the real world. My old plan was to send this out once I could
use waketorture to show this patch actually resolves some potential
bugs, but I put it ahead here in case it may help.

Will send it out with your Tested-by and Acked-by and continue to work
on waketorture.

Regards,
Boqun

>   Thanx, Paul
> 


signature.asc
Description: PGP signature