Re: [PATCH net] ibmveth: Fix max MTU limit

2020-06-19 Thread David Miller
From: Thomas Falcon 
Date: Thu, 18 Jun 2020 10:43:46 -0500

> The max MTU limit defined for ibmveth is not accounting for
> virtual ethernet buffer overhead, which is twenty-two additional
> bytes set aside for the ethernet header and eight additional bytes
> of an opaque handle reserved for use by the hypervisor. Update the
> max MTU to reflect this overhead.
> 
> Signed-off-by: Thomas Falcon 

Applied with Fixes: tags added and queued up for -stable.

Thank you.


Re: [V2 PATCH 2/3] dt-bindings: chosen: Document ima-kexec-buffer

2020-06-19 Thread Thiago Jung Bauermann


Prakhar Srivastava  writes:

> The Integrity Measurement Architecture (IMA) validates whether files
> have been accidentally or maliciously altered, both remotely and
> locally, appraises a file's measurement against a "good" value stored
> as an extended attribute, and enforces local file integrity.
>
> IMA also measures signatures of the kernel and initrd during kexec, along
> with the command line used for kexec.
> These measurements are critical to verify the security posture of the OS.
>
> Reserving memory and adding the memory information to a device tree node
> acts as the mechanism to carry over IMA measurement logs.
>
> Update devicetree documentation to reflect the addition of new property
> under the chosen node.

Thank you for writing this documentation patch. It's something I should
have done when I added the powerpc IMA kexec support.

You addressed Rob Herring's comments regarding the commit message, but
not the ones regarding the patch contents.

When posting a new version of the patches, make sure to address all
comments made so far. Addressing a comment doesn't necessarily mean
implementing the requested change. If you don't then you should at least
explain why you chose a different path.

I mention it because this has occurred before with this patch series,
and it's hard to make forward progress if review comments get ignored.

> ---
>  Documentation/devicetree/bindings/chosen.txt | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/chosen.txt 
> b/Documentation/devicetree/bindings/chosen.txt
> index 45e79172a646..a15f70c007ef 100644
> --- a/Documentation/devicetree/bindings/chosen.txt
> +++ b/Documentation/devicetree/bindings/chosen.txt
> @@ -135,3 +135,20 @@ e.g.
>   linux,initrd-end = <0x8280>;
>   };
>  };
> +
> +linux,ima-kexec-buffer
> +----------------------
> +
> +This property(currently used by powerpc, arm64) holds the memory range,

There should be a space before the parenthesis.

> +the address and the size, of the IMA measurement logs that are being carried

Maybe it's because English isn't my first language, but IMHO it's
clearer if "the address and the size" is between parentheses rather than
commas.

> +over to the kexec session.

I don't think there's a "kexec session", but I'm not sure what a good
term would be. "linux,booted-from-kexec" uses "new kernel" so perhaps
that's a good option to use instead of "kexec session".

> +
> +/ {
> + chosen {
> + linux,ima-kexec-buffer = <0x9 0x8200 0x0 0x8000>;
> + };
> +};
> +
> +This property does not represent real hardware, but the memory allocated for
> +carrying the IMA measurement logs. The address and the size are expressed in
> +#address-cells and #size-cells, respectively, of the root node.


--
Thiago Jung Bauermann
IBM Linux Technology Center


Re: [PATCH] powerpc/pseries: new lparcfg key/value pair: partition_affinity_score

2020-06-19 Thread Nathan Lynch
Hi Tyrel,

Tyrel Datwyler  writes:
> On 6/19/20 8:34 AM, Scott Cheloha wrote:
>> The H_GetPerformanceCounterInfo PHYP hypercall has a subcall,
>> Affinity_Domain_Info_By_Partition, which returns, among other things,
>> a "partition affinity score" for a given LPAR.  This score, a value on
>> [0-100], represents the processor-memory affinity for the LPAR in
>> question.  A score of 0 indicates the worst possible affinity while a
>> score of 100 indicates perfect affinity.  The score can be used to
>> reason about performance.
>> 
>> This patch adds the score for the local LPAR to the lparcfg procfile
>> under a new 'partition_affinity_score' key.
>
> I expect that you will probably get a NACK from Michael on this. The overall
> desire is to move away from these dated /proc interfaces. While it's true
> that I did add a new value recently, it was strictly to facilitate and
> correct the calculation of a derived value that was already dependent on a
> couple of other existing values in lparcfg.
>
> With that said, I would expect that you would likely be advised to expose
> this as a sysfs attribute. The question is where? We should probably put
> some thought into this, as I would like to port each lparcfg value over to
> sysfs so that we can move to deprecating lparcfg. Putting everything under
> something like /sys/kernel/lparcfg/* maybe. Michael may have a better
> suggestion.

I think this score fits pretty naturally in lparcfg: it's a simple
metric that is specific to the pseries/papr platform, like everything
else in there.

A few dozen key=value pairs contained in a single file is simple and
efficient, unlike sysfs with its rather inconsistently applied
one-value-per-file convention. Surely it's OK if lparcfg gains a line
every few years?


>> The H_GetPerformanceCounterInfo hypercall is already used elsewhere in
>> the kernel, in powerpc/perf/hv-gpci.c.  Refactoring that code and this
>> code into a more general API might be worthwhile if additional modules
>> require the hypercall in the future.
>
> If you are duplicating code, it's likely you should already be doing this.
> See the rest of my comments below.
>
>> 
>> Signed-off-by: Scott Cheloha 
>> ---
>>  arch/powerpc/platforms/pseries/lparcfg.c | 53 
>>  1 file changed, 53 insertions(+)
>> 
>> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
>> b/arch/powerpc/platforms/pseries/lparcfg.c
>> index b8d28ab88178..b75151eee0f0 100644
>> --- a/arch/powerpc/platforms/pseries/lparcfg.c
>> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
>> @@ -136,6 +136,57 @@ static unsigned int h_get_ppp(struct hvcall_ppp_data 
>> *ppp_data)
>>  return rc;
>>  }
>>  
>> +/*
>> + * Based on H_GetPerformanceCounterInfo v1.10.
>> + */
>> +static void show_gpci_data(struct seq_file *m)
>> +{
>> +struct perf_counter_info_params {
>> +__be32 counter_request;
>> +__be32 starting_index;
>> +__be16 secondary_index;
>> +__be16 returned_values;
>> +__be32 detail_rc;
>> +__be16 counter_value_element_size;
>> +u8 counter_info_version_in;
>> +u8 counter_info_version_out;
>> +u8 reserved[0xC];
>> +} __packed;
>
> This looks to duplicate the hv_get_perf_counter_info_params struct from
> arch/powerpc/perf/hv-gpci.h. Maybe this include file needs to move to
> arch/powerpc/include/asm so you don't have to redefine this struct.
>
>> +struct hv_gpci_request_buffer {
>> +struct perf_counter_info_params params;
>> +u8 output[4096 - sizeof(struct perf_counter_info_params)];
>> +} __packed;
>
> This struct is code duplication of the one defined in
> arch/powerpc/perf/hv-gpci.c and could be moved into hv-gpci.h along with
> HGPCI_MAX_DATA_BYTES so that you can use those versions here.

I tend to agree with these comments.


>> +struct hv_gpci_request_buffer *buf;
>> +long ret;
>> +unsigned int affinity_score;
>> +
>> +buf = kmalloc(sizeof(*buf), GFP_KERNEL);
>> +if (buf == NULL)
>> +return;
>> +
>> +/*
>> + * Show the local LPAR's affinity score.
>> + *
>> + * 0xB1 selects the Affinity_Domain_Info_By_Partition subcall.
>> + * The score is at byte 0xB in the output buffer.
>> + */
>> +memset(&buf->params, 0, sizeof(buf->params));
>> +buf->params.counter_request = cpu_to_be32(0xB1);
>> +buf->params.starting_index = cpu_to_be32(-1);   /* local LPAR */
>> +buf->params.counter_info_version_in = 0x5;  /* v5+ for score */
>> +ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, virt_to_phys(buf),
>> + sizeof(*buf));
>> +if (ret != H_SUCCESS) {
>> +pr_debug("hcall failed: H_GET_PERF_COUNTER_INFO: %ld, %x\n",
>> + ret, be32_to_cpu(buf->params.detail_rc));
>> +goto out;
>> +}
>> +affinity_score = buf->output[0xB];
>> +seq_printf(m, "partition_affinity_score=%u\n", affinity_score);

Re: [V2 PATCH 1/3] Refactoring powerpc code for carrying over IMA measurement logs, to move non architecture specific code to security/ima.

2020-06-19 Thread Thiago Jung Bauermann


Prakhar Srivastava  writes:

> Powerpc has support to carry over the IMA measurement logs. Refactor the
> non-architecture-specific code out of arch/powerpc and into security/ima.
>
> The code adds support for reserving and freeing memory for the IMA
> measurement logs.

Last week, Mimi provided this feedback:

"From your patch description, this patch should be broken up.  Moving
the non-architecture specific code out of powerpc should be one patch.
 Additional support should be in another patch.  After each patch, the
code should work properly."

That's not what you do here. You move the code, but you also make other
changes at the same time. This has two problems:

1. It makes the patch harder to review, because it's very easy to miss a
   change.

2. If a git bisect later points to this patch, it's not
   clear whether the problem is because of the code movement, or because
   of the other changes.

When you move code, ideally the patch should only make the changes
necessary to make the code work at its new location. The patch which
does code movement should not cause any change in behavior.

Other changes should go in separate patches, either before or after the
one moving the code.

More comments below.

>
> ---
>  arch/powerpc/include/asm/ima.h |  10 ---
>  arch/powerpc/kexec/ima.c   | 126 ++---
>  security/integrity/ima/ima_kexec.c | 116 ++
>  3 files changed, 124 insertions(+), 128 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ima.h b/arch/powerpc/include/asm/ima.h
> index ead488cf3981..c29ec86498f8 100644
> --- a/arch/powerpc/include/asm/ima.h
> +++ b/arch/powerpc/include/asm/ima.h
> @@ -4,15 +4,6 @@
>
>  struct kimage;
>
> -int ima_get_kexec_buffer(void **addr, size_t *size);
> -int ima_free_kexec_buffer(void);
> -
> -#ifdef CONFIG_IMA
> -void remove_ima_buffer(void *fdt, int chosen_node);
> -#else
> -static inline void remove_ima_buffer(void *fdt, int chosen_node) {}
> -#endif
> -
>  #ifdef CONFIG_IMA_KEXEC
>  int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
> size_t size);
> @@ -22,7 +13,6 @@ int setup_ima_buffer(const struct kimage *image, void *fdt, 
> int chosen_node);
>  static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
>  int chosen_node)
>  {
> - remove_ima_buffer(fdt, chosen_node);
>   return 0;
>  }

This is wrong. Even if the currently running kernel doesn't have
CONFIG_IMA_KEXEC, it should remove the IMA buffer property and memory
reservation from the FDT that is being prepared for the next kernel.

This is because the IMA kexec buffer is useless for the next kernel,
regardless of whether the current kernel supports CONFIG_IMA_KEXEC or
not. Keeping it around would be a waste of memory.
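
For reference, here is a minimal sketch of what that removal has to do,
whichever kernel runs it. It assumes #address-cells = #size-cells = 2 (as
on ppc64) purely to keep the decoding short; the real code derives the
cell counts from the device tree:

	#include <libfdt.h>

	static void remove_ima_buffer_sketch(void *fdt, int chosen_node)
	{
		const char *name = "linux,ima-kexec-buffer";
		uint64_t addr, size, rsv_addr, rsv_size;
		const fdt64_t *prop;
		int i, len;

		prop = fdt_getprop(fdt, chosen_node, name, &len);
		if (!prop || len < 2 * sizeof(uint64_t))
			return;

		addr = fdt64_to_cpu(prop[0]);
		size = fdt64_to_cpu(prop[1]);

		/* Drop the property so the next kernel doesn't see it. */
		fdt_delprop(fdt, chosen_node, name);

		/* Drop the matching memory reservation as well. */
		for (i = 0; i < fdt_num_mem_rsv(fdt); i++) {
			if (fdt_get_mem_rsv(fdt, i, &rsv_addr, &rsv_size))
				continue;
			if (rsv_addr == addr && rsv_size == size) {
				fdt_del_mem_rsv(fdt, i);
				break;
			}
		}
	}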

> @@ -179,13 +64,18 @@ int setup_ima_buffer(const struct kimage *image, void 
> *fdt, int chosen_node)
>   int ret, addr_cells, size_cells, entry_size;
>   u8 value[16];
>
> - remove_ima_buffer(fdt, chosen_node);

This is wrong, for the same reason stated above.

>   if (!image->arch.ima_buffer_size)
>   return 0;
>
> - ret = get_addr_size_cells(&addr_cells, &size_cells);
> - if (ret)
> + ret = fdt_address_cells(fdt, chosen_node);
> + if (ret < 0)
> + return ret;
> + addr_cells = ret;
> +
> + ret = fdt_size_cells(fdt, chosen_node);
> + if (ret < 0)
>   return ret;
> + size_cells = ret;
>
>   entry_size = 4 * (addr_cells + size_cells);
>

I liked this change. Thanks! I agree it's better to use
fdt_address_cells() and fdt_size_cells() here.

But it should be in a separate patch. Either before or after the one
moving the code.
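
As an aside, for anyone following along: once addr_cells/size_cells are
known, the encoding side is just a pair of big-endian writes. A minimal
sketch, assuming cell counts of 1 or 2 as the existing code does
(write_cells() is an illustrative helper, not something from this patch):

	static void write_cells(u8 *p, u64 val, int cells)
	{
		if (cells == 2)
			*(__be64 *)p = cpu_to_be64(val);
		else
			*(__be32 *)p = cpu_to_be32(val);
	}

	/* Pack (addr, size) into value[] before fdt_setprop(). */
	write_cells(value, image->arch.ima_buffer_addr, addr_cells);
	write_cells(value + 4 * addr_cells, image->arch.ima_buffer_size,
		    size_cells);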

> diff --git a/security/integrity/ima/ima_kexec.c 
> b/security/integrity/ima/ima_kexec.c
> index 121de3e04af2..e1e6d6154015 100644
> --- a/security/integrity/ima/ima_kexec.c
> +++ b/security/integrity/ima/ima_kexec.c
> @@ -10,8 +10,124 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include "ima.h"
>
> +static int get_addr_size_cells(int *addr_cells, int *size_cells)
> +{
> + struct device_node *root;
> +
> + root = of_find_node_by_path("/");
> + if (!root)
> + return -EINVAL;
> +
> + *addr_cells = of_n_addr_cells(root);
> + *size_cells = of_n_size_cells(root);
> +
> + of_node_put(root);
> +
> + return 0;
> +}
> +
> +static int do_get_kexec_buffer(const void *prop, int len, unsigned long *addr,
> +			       size_t *size)
> +{
> + int ret, addr_cells, size_cells;
> +
> + ret = get_addr_size_cells(&addr_cells, &size_cells);
> + if (ret)
> + return ret;
> +
> + if (len < 4 * (addr_cells + size_cells))
> + return -ENOENT;
> +
> + *addr = of_read_number(prop, addr_cells);
> + *size = of_read_number(prop + 4 * addr_cells, size_cells);
> +
> + return 0;
> +}
> +
> 

[PATCH v3 4/4] KVM: PPC: Book3S HV: migrate hot plugged memory

2020-06-19 Thread Ram Pai
From: Laurent Dufour 

When a memory slot is hot plugged to an SVM, the PFNs associated with the
GFNs in that slot must be migrated to secure-PFNs, aka device-PFNs.

kvmppc_uv_migrate_mem_slot() is called to accomplish this. The UV_PAGE_IN
ucall is skipped, since the ultravisor does not trust the content of
those pages and hence ignores it.

Signed-off-by: Laurent Dufour 
Signed-off-by: Ram Pai 
[resolved conflicts, and modified the commit log]
---
 arch/powerpc/kvm/book3s_hv.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6717d24..fcea41c 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4531,10 +4531,12 @@ static void kvmppc_core_commit_memory_region_hv(struct 
kvm *kvm,
case KVM_MR_CREATE:
if (kvmppc_uvmem_slot_init(kvm, new))
return;
-   uv_register_mem_slot(kvm->arch.lpid,
-new->base_gfn << PAGE_SHIFT,
-new->npages * PAGE_SIZE,
-0, new->id);
+   if (uv_register_mem_slot(kvm->arch.lpid,
+new->base_gfn << PAGE_SHIFT,
+new->npages * PAGE_SIZE,
+0, new->id))
+   return;
+   kvmppc_uv_migrate_mem_slot(kvm, new);
break;
case KVM_MR_DELETE:
uv_unregister_mem_slot(kvm->arch.lpid, old->id);
-- 
1.8.3.1



[PATCH v3 3/4] KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in H_SVM_INIT_DONE

2020-06-19 Thread Ram Pai
H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
called H_SVM_PAGE_IN for all secure pages. These GFNs continue to be
normal GFNs associated with normal PFNs, when in fact they should have
been secure GFNs associated with device PFNs.

Move all the PFNs associated with the SVM's GFNs, to secure-PFNs, in
H_SVM_INIT_DONE. Skip the GFNs that are already Paged-in or Shared
through H_SVM_PAGE_IN, or Paged-in followed by a Paged-out through
UV_PAGE_OUT.

Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Bharata B Rao 
Cc: Aneesh Kumar K.V 
Cc: Sukadev Bhattiprolu 
Cc: Laurent Dufour 
Cc: Thiago Jung Bauermann 
Cc: David Gibson 
Cc: Claudio Carvalho 
Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai 
---
 Documentation/powerpc/ultravisor.rst|   2 +
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   2 +
 arch/powerpc/kvm/book3s_hv_uvmem.c  | 154 +++-
 3 files changed, 132 insertions(+), 26 deletions(-)

diff --git a/Documentation/powerpc/ultravisor.rst 
b/Documentation/powerpc/ultravisor.rst
index 363736d..3bc8957 100644
--- a/Documentation/powerpc/ultravisor.rst
+++ b/Documentation/powerpc/ultravisor.rst
@@ -933,6 +933,8 @@ Return values
* H_UNSUPPORTED if called from the wrong context (e.g.
from an SVM or before an H_SVM_INIT_START
hypercall).
+   * H_STATE   if the hypervisor could not successfully
+   transition the VM to Secure VM.
 
 Description
 ~~~
diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h 
b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index 5a9834e..b9cd7eb 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -22,6 +22,8 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
 unsigned long kvmppc_h_svm_init_abort(struct kvm *kvm);
 void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 struct kvm *kvm, bool skip_page_out);
+int kvmppc_uv_migrate_mem_slot(struct kvm *kvm,
+   const struct kvm_memory_slot *memslot);
 #else
 static inline int kvmppc_uvmem_init(void)
 {
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index c8c0290..449e8a7 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -93,6 +93,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static struct dev_pagemap kvmppc_uvmem_pgmap;
 static unsigned long *kvmppc_uvmem_bitmap;
@@ -339,6 +340,21 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, 
struct kvm *kvm,
return false;
 }
 
+/* return true, if the GFN is a shared-GFN, or a secure-GFN */
+bool kvmppc_gfn_has_transitioned(unsigned long gfn, struct kvm *kvm)
+{
+   struct kvmppc_uvmem_slot *p;
+
+   list_for_each_entry(p, &kvm->arch.uvmem_pfns, list) {
+   if (gfn >= p->base_pfn && gfn < p->base_pfn + p->nr_pfns) {
+   unsigned long index = gfn - p->base_pfn;
+
+   return (p->pfns[index] & KVMPPC_GFN_FLAG_MASK);
+   }
+   }
+   return false;
+}
+
 unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 {
struct kvm_memslots *slots;
@@ -379,12 +395,31 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
 
 unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
 {
+   struct kvm_memslots *slots;
+   struct kvm_memory_slot *memslot;
+   int srcu_idx;
+   long ret = H_SUCCESS;
+
if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
return H_UNSUPPORTED;
 
+   /* migrate any unmoved normal pfn to device pfns */
+   srcu_idx = srcu_read_lock(&kvm->srcu);
+   slots = kvm_memslots(kvm);
+   kvm_for_each_memslot(memslot, slots) {
+   ret = kvmppc_uv_migrate_mem_slot(kvm, memslot);
+   if (ret) {
+   ret = H_STATE;
+   goto out;
+   }
+   }
+
kvm->arch.secure_guest |= KVMPPC_SECURE_INIT_DONE;
pr_info("LPID %d went secure\n", kvm->arch.lpid);
-   return H_SUCCESS;
+
+out:
+   srcu_read_unlock(&kvm->srcu, srcu_idx);
+   return ret;
 }
 
 /*
@@ -505,12 +540,14 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
 }
 
 /*
- * Alloc a PFN from private device memory pool and copy page from normal
- * memory to secure memory using UV_PAGE_IN uvcall.
+ * Alloc a PFN from private device memory pool. If @pagein is true,
+ * copy page from normal memory to secure memory using UV_PAGE_IN uvcall.
  */
-static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
-  unsigned long end, unsigned long gpa, struct kvm *kvm,
-  unsigned long page_shift, bool *downgrade)
+static int kvmppc_svm_migrate_page(struct vm_area_struct 

[PATCH v3 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs

2020-06-19 Thread Ram Pai
During the life of an SVM, its GFNs transition through normal, secure and
shared states. Since the kernel does not track GFNs that are shared, it
is not possible to disambiguate a shared GFN from a GFN whose PFN has
not yet been migrated to a secure-PFN. It is also not possible to
disambiguate a secure-GFN from a GFN whose PFN has been paged out from
the ultravisor.

The ability to identify the state of a GFN is needed to skip migration
of its PFN to secure-PFN during ESM transition.

The code is re-organized to track the states of a GFN as explained
below.


 1. States of a GFN
 -------------------
 The GFN can be in one of the following states.

 (a) Secure - The GFN is secure. The GFN is associated with
a Secure VM, and its contents are not accessible
to the Hypervisor.  This GFN can be backed by a secure-PFN,
or by a normal-PFN with encrypted contents.
The former is true when the GFN is paged in to the
ultravisor. The latter is true when the GFN is paged out
of the ultravisor.

 (b) Shared - The GFN is shared. The GFN is associated with
a secure VM. The contents of the GFN are accessible to the
Hypervisor. This GFN is backed by a normal-PFN and its
contents are un-encrypted.

 (c) Normal - The GFN is normal. The GFN is associated with
a normal VM. The contents of the GFN are accessible to
the Hypervisor. Its contents are never encrypted.

 2. States of a VM.
 -------------------

 (a) Normal VM:  A VM whose contents are always accessible to
the hypervisor.  All its GFNs are normal-GFNs.

 (b) Secure VM: A VM whose contents are not accessible to the
hypervisor without the VM's consent.  Its GFNs are
either Shared-GFN or Secure-GFNs.

 (c) Transient VM: A Normal VM that is transitioning to a secure VM.
The transition starts on successful return of
H_SVM_INIT_START, and ends on successful return
of H_SVM_INIT_DONE. This transient VM can have GFNs
in any of the three states; i.e. Secure-GFN, Shared-GFN,
and Normal-GFN. The VM never executes in this state
in supervisor-mode.

 3. Memory slot State.
 ----------------------
The state of a memory slot mirrors the state of the
VM the memory slot is associated with.

 4. VM State transition.
 ------------------------

  A VM always starts in Normal Mode.

  H_SVM_INIT_START moves the VM into transient state. During this
  time the Ultravisor may request some of its GFNs to be shared or
  secured. So its GFNs can be in one of the three GFN states.

  H_SVM_INIT_DONE moves the VM entirely from transient state to
  secure-state. At this point any left-over normal-GFNs are
  transitioned to Secure-GFN.

  H_SVM_INIT_ABORT moves the transient VM back to normal VM.
  All its GFNs are moved to Normal-GFNs.

  UV_TERMINATE transitions the secure-VM back to normal-VM. All
  the secure-GFNs and shared-GFNs are transitioned to normal-GFNs.
  Note: The contents of the normal-GFN are undefined at this point.

 5. GFN state implementation:
 -----------------------------

 Secure GFN is associated with a secure-PFN; also called uvmem_pfn,
 when the GFN is paged-in. Its pfn[] has KVMPPC_GFN_UVMEM_PFN flag
 set, and contains the value of the secure-PFN.
 It is associated with a normal-PFN; also called mem_pfn, when
 the GFN is paged out. Its pfn[] has the KVMPPC_GFN_MEM_PFN flag set.
 The value of the normal-PFN is not tracked.

 Shared GFN is associated with a normal-PFN. Its pfn[] has
 KVMPPC_UVMEM_SHARED_PFN flag set. The value of the normal-PFN
 is not tracked.

 Normal GFN is associated with normal-PFN. Its pfn[] has
 no flag set. The value of the normal-PFN is not tracked.
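
To make the description concrete, one possible pfn[] encoding consistent
with the above (the exact bit positions here are illustrative, not
necessarily the ones the patch uses):

	/* High bits of pfn[] hold the GFN state; low bits hold the PFN. */
	#define KVMPPC_GFN_UVMEM_PFN	(1UL << 63) /* secure, backed by device PFN */
	#define KVMPPC_GFN_MEM_PFN	(1UL << 62) /* secure, paged out to normal PFN */
	#define KVMPPC_UVMEM_SHARED_PFN	(1UL << 61) /* shared with the hypervisor */
	#define KVMPPC_GFN_FLAG_MASK	(KVMPPC_GFN_UVMEM_PFN | \
					 KVMPPC_GFN_MEM_PFN | \
					 KVMPPC_UVMEM_SHARED_PFN)

	/* A normal GFN carries no flag, which is what the new
	 * kvmppc_gfn_has_transitioned() in this series tests for. */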

 6. Life cycle of a GFN

 ---------------------------------------------------------------
 |        |   Share    |  Unshare   |    SVM     |H_SVM_INIT_DONE|
 |        | operation  | operation  |   abort/   |               |
 |        |            |            | terminate  |               |
 ---------------------------------------------------------------
 |        |            |            |            |               |
 | Secure |   Shared   |   Secure   |   Normal   |    Secure     |
 |        |            |            |            |               |
 | Shared |   Shared   |   Secure   |   Normal   |    Shared     |
 |        |            |            |            |               |
 | Normal |   Shared   |   Secure   |   Normal   |    Secure     |
 ---------------------------------------------------------------

 7. Life cycle of a VM

 ------------------------------------------------------------------
 |        |  start  |  H_SVM_    |  H_SVM_    |  H_SVM_    |  UV_SVM_  |
 |        |  VM     | INIT_START | INIT_DONE  | INIT_ABORT | TERMINATE |
 |        |         |            |            |            |           |
 ------------------------------------------------------------------

[PATCH v3 1/4] KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c

2020-06-19 Thread Ram Pai
Without this fix, git is confused. It generates wrong
function context for code changes in subsequent patches.
Weird, but true.

Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Bharata B Rao 
Cc: Aneesh Kumar K.V 
Cc: Sukadev Bhattiprolu 
Cc: Laurent Dufour 
Cc: Thiago Jung Bauermann 
Cc: David Gibson 
Cc: Claudio Carvalho 
Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index ad950f89..3599aaa 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -369,8 +369,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
  * Alloc a PFN from private device memory pool and copy page from normal
  * memory to secure memory using UV_PAGE_IN uvcall.
  */
-static int
-kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
+static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
   unsigned long end, unsigned long gpa, struct kvm *kvm,
   unsigned long page_shift, bool *downgrade)
 {
@@ -437,8 +436,8 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
  * In the former case, uses dev_pagemap_ops.migrate_to_ram handler
  * to unmap the device page from QEMU's page tables.
  */
-static unsigned long
-kvmppc_share_page(struct kvm *kvm, unsigned long gpa, unsigned long page_shift)
+static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
+   unsigned long page_shift)
 {
 
int ret = H_PARAMETER;
@@ -487,9 +486,9 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
  * H_PAGE_IN_SHARED flag makes the page shared which means that the same
  * memory in is visible from both UV and HV.
  */
-unsigned long
-kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
-unsigned long flags, unsigned long page_shift)
+unsigned long kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
+   unsigned long flags,
+   unsigned long page_shift)
 {
bool downgrade = false;
unsigned long start, end;
@@ -546,10 +545,10 @@ static struct page *kvmppc_uvmem_get_page(unsigned long 
gpa, struct kvm *kvm)
  * Provision a new page on HV side and copy over the contents
  * from secure memory using UV_PAGE_OUT uvcall.
  */
-static int
-kvmppc_svm_page_out(struct vm_area_struct *vma, unsigned long start,
-   unsigned long end, unsigned long page_shift,
-   struct kvm *kvm, unsigned long gpa)
+static int kvmppc_svm_page_out(struct vm_area_struct *vma,
+   unsigned long start,
+   unsigned long end, unsigned long page_shift,
+   struct kvm *kvm, unsigned long gpa)
 {
unsigned long src_pfn, dst_pfn = 0;
struct migrate_vma mig;
-- 
1.8.3.1



[PATCH v3 0/4] Migrate non-migrated pages of a SVM.

2020-06-19 Thread Ram Pai
The time taken to switch a VM to a Secure-VM increases with the size of the
VM. A 100GB VM takes about 7 minutes. This is unacceptable. This linear
increase is caused by suboptimal behavior of the Ultravisor and the
Hypervisor. The Ultravisor unnecessarily migrates all the GFNs of the VM
from normal-memory to secure-memory. It only needs to migrate the necessary
and sufficient GFNs.

However, when the optimization is incorporated in the Ultravisor, the
Hypervisor starts misbehaving. The Hypervisor has an inbuilt assumption
that the Ultravisor will explicitly request to migrate each and every GFN
of the VM. If only necessary and sufficient GFNs are requested for
migration, the Hypervisor continues to manage the remaining GFNs as normal
GFNs. This leads to memory corruption, manifested consistently when the
SVM reboots.

The same is true when a memory slot is hotplugged into an SVM. The Hypervisor
expects the Ultravisor to request migration of all GFNs to secure-GFNs. But at
the same time, the Hypervisor is unable to handle any H_SVM_PAGE_IN requests
from the Ultravisor, done in the context of the UV_REGISTER_MEM_SLOT ucall.
This problem manifests as random errors in the SVM when a memory slot is
hotplugged.

This patch series automatically migrates the non-migrated pages of an SVM,
and thus solves the problem.

Testing: Passed rigorous SVM test using various sized SVMs.

Changelog:

v2: . fixed a bug observed by Laurent. The states of the GFNs associated
with Secure-VMs were not reset during memslot flush.
. Re-organized the code, for easier review.
. Better description of the patch series.

v1: fixed a bug observed by Bharata. Pages that were paged-in and later
paged-out must also be skipped from migration during H_SVM_INIT_DONE.

Laurent Dufour (1):
  KVM: PPC: Book3S HV: migrate hot plugged memory

Ram Pai (3):
  KVM: PPC: Book3S HV: Fix function definition in book3s_hv_uvmem.c
  KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: migrate remaining normal-GFNs to secure-GFNs in
H_SVM_INIT_DONE

 Documentation/powerpc/ultravisor.rst|   2 +
 arch/powerpc/include/asm/kvm_book3s_uvmem.h |   2 +
 arch/powerpc/kvm/book3s_hv.c|  10 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c  | 360 +++-
 4 files changed, 315 insertions(+), 59 deletions(-)

-- 
1.8.3.1



Re: [PATCH 6/6] kernel: add a kernel_wait helper

2020-06-19 Thread Luis Chamberlain
On Thu, Jun 18, 2020 at 04:46:27PM +0200, Christoph Hellwig wrote:
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, 
> int options,
>   return ret;
>  }
>  
> +int kernel_wait(pid_t pid, int *stat)
> +{
> + struct wait_opts wo = {
> + .wo_type= PIDTYPE_PID,
> + .wo_pid = find_get_pid(pid),
> + .wo_flags   = WEXITED,
> + };
> + int ret;
> +
> + ret = do_wait(&wo);
> + if (ret > 0 && wo.wo_stat)
> + *stat = wo.wo_stat;

Since all we care about is WEXITED, that could be simplified
to something like this:

if (ret > 0 && KWIFEXITED(wo.wo_stat))
	*stat = KWEXITSTATUS(wo.wo_stat);

Otherwise callers have to use W*() wrappers.

> + put_pid(wo.wo_pid);
> + return ret;
> +}

Then we don't get *any* in-kernel code dealing with the W*() crap.
I just unwrapped this for the umh [0], given that otherwise we'd
have to use KW*() callers elsewhere. Doing it one level further
up would be even better.

[0] https://lkml.kernel.org/r/20200610154923.27510-1-mcg...@kernel.org  
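
For illustration, a caller would then reduce to something like this (a
sketch, assuming the simplified kernel_wait() above; pid is whatever the
caller forked):

	int exit_code;

	if (kernel_wait(pid, &exit_code) > 0)
		pr_info("helper %d exited with status %d\n", pid, exit_code);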


  Luis


Re: [PATCH] powerpc/pseries: new lparcfg key/value pair: partition_affinity_score

2020-06-19 Thread Tyrel Datwyler
On 6/19/20 8:34 AM, Scott Cheloha wrote:
> The H_GetPerformanceCounterInfo PHYP hypercall has a subcall,
> Affinity_Domain_Info_By_Partition, which returns, among other things,
> a "partition affinity score" for a given LPAR.  This score, a value on
> [0-100], represents the processor-memory affinity for the LPAR in
> question.  A score of 0 indicates the worst possible affinity while a
> score of 100 indicates perfect affinity.  The score can be used to
> reason about performance.
> 
> This patch adds the score for the local LPAR to the lparcfg procfile
> under a new 'partition_affinity_score' key.

I expect that you will probably get a NACK from Michael on this. The overall
desire is to move away from these dated /proc interfaces. While it's true that
I did add a new value recently, it was strictly to facilitate and correct the
calculation of a derived value that was already dependent on a couple of other
existing values in lparcfg.

With that said, I would expect that you would likely be advised to expose this
as a sysfs attribute. The question is where? We should probably put some
thought into this, as I would like to port each lparcfg value over to sysfs so
that we can move to deprecating lparcfg. Putting everything under something
like /sys/kernel/lparcfg/* maybe. Michael may have a better suggestion.

> 
> The H_GetPerformanceCounterInfo hypercall is already used elsewhere in
> the kernel, in powerpc/perf/hv-gpci.c.  Refactoring that code and this
> code into a more general API might be worthwhile if additional modules
> require the hypercall in the future.

If you are duplicating code, it's likely you should already be doing this.
See the rest of my comments below.

> 
> Signed-off-by: Scott Cheloha 
> ---
>  arch/powerpc/platforms/pseries/lparcfg.c | 53 
>  1 file changed, 53 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
> b/arch/powerpc/platforms/pseries/lparcfg.c
> index b8d28ab88178..b75151eee0f0 100644
> --- a/arch/powerpc/platforms/pseries/lparcfg.c
> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
> @@ -136,6 +136,57 @@ static unsigned int h_get_ppp(struct hvcall_ppp_data 
> *ppp_data)
>   return rc;
>  }
>  
> +/*
> + * Based on H_GetPerformanceCounterInfo v1.10.
> + */
> +static void show_gpci_data(struct seq_file *m)
> +{
> + struct perf_counter_info_params {
> + __be32 counter_request;
> + __be32 starting_index;
> + __be16 secondary_index;
> + __be16 returned_values;
> + __be32 detail_rc;
> + __be16 counter_value_element_size;
> + u8 counter_info_version_in;
> + u8 counter_info_version_out;
> + u8 reserved[0xC];
> + } __packed;

This looks to duplicate the hv_get_perf_counter_info_params struct from
arch/powerpc/perf/hv-gpci.h. Maybe this include file needs to move to
arch/powerpc/include/asm so you don't have to redefine this struct.

> + struct hv_gpci_request_buffer {
> + struct perf_counter_info_params params;
> + u8 output[4096 - sizeof(struct perf_counter_info_params)];
> + } __packed;

This struct is code duplication of the one defined in
arch/powerpc/perf/hv-gpci.c and could be moved into hv-gpci.h along with
HGPCI_MAX_DATA_BYTES so that you can use those versions here.

> + struct hv_gpci_request_buffer *buf;
> + long ret;
> + unsigned int affinity_score;
> +
> + buf = kmalloc(sizeof(*buf), GFP_KERNEL);
> + if (buf == NULL)
> + return;
> +
> + /*
> +  * Show the local LPAR's affinity score.
> +  *
> +  * 0xB1 selects the Affinity_Domain_Info_By_Partition subcall.
> +  * The score is at byte 0xB in the output buffer.
> +  */
> + memset(&buf->params, 0, sizeof(buf->params));
> + buf->params.counter_request = cpu_to_be32(0xB1);
> + buf->params.starting_index = cpu_to_be32(-1);   /* local LPAR */
> + buf->params.counter_info_version_in = 0x5;  /* v5+ for score */
> + ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, virt_to_phys(buf),
> +  sizeof(*buf));
> + if (ret != H_SUCCESS) {
> + pr_debug("hcall failed: H_GET_PERF_COUNTER_INFO: %ld, %x\n",
> +  ret, be32_to_cpu(buf->params.detail_rc));
> + goto out;
> + }
> + affinity_score = buf->output[0xB];
> + seq_printf(m, "partition_affinity_score=%u\n", affinity_score);
> +out:
> + kfree(buf);
> +}
> +

IIUC we should already be able to get this value from userspace using the perf
tool, right? If that's the case, can't we also programmatically retrieve it via
the perf_event interface in userspace as well?

-Tyrel

>  static unsigned h_pic(unsigned long *pool_idle_time,
> unsigned long *num_procs)
>  {
> @@ -487,6 +538,8 @@ static int pseries_lparcfg_data(struct seq_file *m, void 
> *v)
>  partition_active_processors * 100);
>  

Re: [PATCH 00/22] ReST conversion patches (final?)

2020-06-19 Thread Jonathan Corbet
On Mon, 15 Jun 2020 08:50:05 +0200
Mauro Carvalho Chehab  wrote:

> That's my final(*) series of conversion patches from .txt to ReST.
> 
> (*) Well, running the script I'm using to check, I noticed a couple of new 
> *.txt files.
> If I have some time, I'll try to address those last pending things for v5.9.

OK, I've applied the set except for parts:

 1: |copy| as mentioned before
 18: because of the license boilerplate
 19: doesn't apply at all (perhaps because of one of the above)
 22: because I don't like the latex markup.

Also, I took the liberty of just reverting the |copy| change in #10.

Getting there..!

Thanks,

jon


Re: [PATCH] pci: pcie: AER: Fix logging of Correctable errors

2020-06-19 Thread Joe Perches
On Fri, 2020-06-19 at 13:17 -0400, Sinan Kaya wrote:
> On 6/18/2020 11:55 AM, Matt Jolly wrote:
> 
> > +   pci_warn(dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> > +   dev->vendor, dev->device,
> > +   info->status, info->mask);
> > +   } else {
> 
> 
> 
> > +   pci_err(dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> > +   dev->vendor, dev->device,
> > +   info->status, info->mask);
> 
> Function pointers for pci_warn vs. pci_err ?

Not really possible as both are function-like macros.
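
One way to keep a single call site without function pointers is to pick
the printk level instead (a sketch, assuming the severity check the AER
code already makes):

	pci_printk(info->severity == AER_CORRECTABLE ? KERN_WARNING : KERN_ERR,
		   dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
		   dev->vendor, dev->device, info->status, info->mask);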




[PATCH] hvc: unify console setup naming

2020-06-19 Thread Sergey Senozhatsky
Use the 'common' foo_console_setup() naming scheme. There are 71
foo_console_setup() callbacks and only one foo_setup_console().

Signed-off-by: Sergey Senozhatsky 
Cc: Andy Shevchenko 
Cc: Steven Rostedt 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
---
 drivers/tty/hvc/hvc_xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 5ef08905fe05..2a0e51a20e34 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -603,7 +603,7 @@ static void xen_hvm_early_write(uint32_t vtermno, const 
char *str, int len) { }
 #endif
 
 #ifdef CONFIG_EARLY_PRINTK
-static int __init xenboot_setup_console(struct console *console, char *string)
+static int __init xenboot_console_setup(struct console *console, char *string)
 {
static struct xencons_info xenboot;
 
@@ -647,7 +647,7 @@ static void xenboot_write_console(struct console *console, 
const char *string,
 struct console xenboot_console = {
.name   = "xenboot",
.write  = xenboot_write_console,
-   .setup  = xenboot_setup_console,
+   .setup  = xenboot_console_setup,
.flags  = CON_PRINTBUFFER | CON_BOOT | CON_ANYTIME,
.index  = -1,
 };
-- 
2.27.0



Re: [PATCH] pci: pcie: AER: Fix logging of Correctable errors

2020-06-19 Thread Sinan Kaya
On 6/18/2020 11:55 AM, Matt Jolly wrote:

> + pci_warn(dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> + dev->vendor, dev->device,
> + info->status, info->mask);
> + } else {



> + pci_err(dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> + dev->vendor, dev->device,
> + info->status, info->mask);


Function pointers for pci_warn vs. pci_err ?

This looks like a lot of copy/paste.


[PATCH 16/26] mm/powerpc: Use general page fault accounting

2020-06-19 Thread Peter Xu
Use the general page fault accounting by passing regs into handle_mm_fault().

CC: Michael Ellerman 
CC: Benjamin Herrenschmidt 
CC: Paul Mackerras 
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Peter Xu 
---
 arch/powerpc/mm/fault.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 992b10c3761c..e325d13efaf5 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -563,7 +563,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned 
long address,
 * make sure we exit gracefully rather than endlessly redo
 * the fault.
 */
-   fault = handle_mm_fault(vma, address, flags, NULL);
+   fault = handle_mm_fault(vma, address, flags, regs);
 
 #ifdef CONFIG_PPC_MEM_KEYS
/*
@@ -604,14 +604,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned 
long address,
/*
 * Major/minor page fault accounting.
 */
-   if (major) {
-   current->maj_flt++;
-   perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+   if (major)
cmo_account_page_fault();
-   } else {
-   current->min_flt++;
-   perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-   }
+
return 0;
 }
 NOKPROBE_SYMBOL(__do_page_fault);
-- 
2.26.2



[PATCH] powerpc/pseries: new lparcfg key/value pair: partition_affinity_score

2020-06-19 Thread Scott Cheloha
The H_GetPerformanceCounterInfo PHYP hypercall has a subcall,
Affinity_Domain_Info_By_Partition, which returns, among other things,
a "partition affinity score" for a given LPAR.  This score, a value on
[0-100], represents the processor-memory affinity for the LPAR in
question.  A score of 0 indicates the worst possible affinity while a
score of 100 indicates perfect affinity.  The score can be used to
reason about performance.

This patch adds the score for the local LPAR to the lparcfg procfile
under a new 'partition_affinity_score' key.

The H_GetPerformanceCounterInfo hypercall is already used elsewhere in
the kernel, in powerpc/perf/hv-gpci.c.  Refactoring that code and this
code into a more general API might be worthwhile if additional modules
require the hypercall in the future.

Signed-off-by: Scott Cheloha 
---
 arch/powerpc/platforms/pseries/lparcfg.c | 53 
 1 file changed, 53 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
b/arch/powerpc/platforms/pseries/lparcfg.c
index b8d28ab88178..b75151eee0f0 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -136,6 +136,57 @@ static unsigned int h_get_ppp(struct hvcall_ppp_data 
*ppp_data)
return rc;
 }
 
+/*
+ * Based on H_GetPerformanceCounterInfo v1.10.
+ */
+static void show_gpci_data(struct seq_file *m)
+{
+   struct perf_counter_info_params {
+   __be32 counter_request;
+   __be32 starting_index;
+   __be16 secondary_index;
+   __be16 returned_values;
+   __be32 detail_rc;
+   __be16 counter_value_element_size;
+   u8 counter_info_version_in;
+   u8 counter_info_version_out;
+   u8 reserved[0xC];
+   } __packed;
+   struct hv_gpci_request_buffer {
+   struct perf_counter_info_params params;
+   u8 output[4096 - sizeof(struct perf_counter_info_params)];
+   } __packed;
+   struct hv_gpci_request_buffer *buf;
+   long ret;
+   unsigned int affinity_score;
+
+   buf = kmalloc(sizeof(*buf), GFP_KERNEL);
+   if (buf == NULL)
+   return;
+
+   /*
+* Show the local LPAR's affinity score.
+*
+* 0xB1 selects the Affinity_Domain_Info_By_Partition subcall.
+* The score is at byte 0xB in the output buffer.
+*/
+   memset(&buf->params, 0, sizeof(buf->params));
+   buf->params.counter_request = cpu_to_be32(0xB1);
+   buf->params.starting_index = cpu_to_be32(-1);   /* local LPAR */
+   buf->params.counter_info_version_in = 0x5;  /* v5+ for score */
+   ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, virt_to_phys(buf),
+sizeof(*buf));
+   if (ret != H_SUCCESS) {
+   pr_debug("hcall failed: H_GET_PERF_COUNTER_INFO: %ld, %x\n",
+ret, be32_to_cpu(buf->params.detail_rc));
+   goto out;
+   }
+   affinity_score = buf->output[0xB];
+   seq_printf(m, "partition_affinity_score=%u\n", affinity_score);
+out:
+   kfree(buf);
+}
+
 static unsigned h_pic(unsigned long *pool_idle_time,
  unsigned long *num_procs)
 {
@@ -487,6 +538,8 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
   partition_active_processors * 100);
}
 
+   show_gpci_data(m);
+
seq_printf(m, "partition_active_processors=%d\n",
   partition_active_processors);
 
-- 
2.24.1



[PATCH v1 6/8] powerpc/32s: Only leave NX unset on segments used for modules

2020-06-19 Thread Christophe Leroy
Instead of leaving NX unset on all segments above the start
of vmalloc space, only leave NX unset on segments used for
modules.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/book3s32/mmu.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 03b6ba54460e..c0162911f6cb 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -187,6 +187,17 @@ unsigned long __init mmu_mapin_ram(unsigned long base, 
unsigned long top)
return __mmu_mapin_ram(border, top);
 }
 
+static bool is_module_segment(unsigned long addr)
+{
+   if (!IS_ENABLED(CONFIG_MODULES))
+   return false;
+   if (addr < ALIGN_DOWN(VMALLOC_START, SZ_256M))
+   return false;
+   if (addr >= ALIGN(VMALLOC_END, SZ_256M))
+   return false;
+   return true;
+}
+
 void mmu_mark_initmem_nx(void)
 {
int nb = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -223,9 +234,9 @@ void mmu_mark_initmem_nx(void)
 
for (i = TASK_SIZE >> 28; i < 16; i++) {
/* Do not set NX on VM space for modules */
-   if (IS_ENABLED(CONFIG_MODULES) &&
-   (VMALLOC_START & 0xf0000000) == i << 28)
-   break;
+   if (is_module_segment(i << 28))
+   continue;
+
mtsrin(mfsrin(i << 28) | 0x10000000, i << 28);
}
 }
-- 
2.25.0



[PATCH v1 5/8] powerpc: Use MODULES_VADDR if defined

2020-06-19 Thread Christophe Leroy
In order to allow allocation of modules outside of vmalloc space,
use MODULES_VADDR and MODULES_END when MODULES_VADDR is defined.

Redefine module_alloc() when MODULES_VADDR defined.
Unmap corresponding KASAN shadow memory.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/module.c  | 11 +++
 arch/powerpc/mm/kasan/kasan_init_32.c |  6 ++
 2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index df649acb5631..a211b0253cdb 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -86,3 +86,14 @@ int module_finalize(const Elf_Ehdr *hdr,
 
return 0;
 }
+
+#ifdef MODULES_VADDR
+void *module_alloc(unsigned long size)
+{
+   BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
+
+   return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, GFP_KERNEL,
+   PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
+   __builtin_return_address(0));
+}
+#endif
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index 0760e1e754e4..f1bc267d42af 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -115,6 +115,12 @@ static void __init kasan_unmap_early_shadow_vmalloc(void)
unsigned long k_end = (unsigned long)kasan_mem_to_shadow((void *)VMALLOC_END);
 
kasan_update_early_region(k_start, k_end, __pte(0));
+
+#ifdef MODULES_VADDR
+   k_start = (unsigned long)kasan_mem_to_shadow((void *)MODULES_VADDR);
+   k_end = (unsigned long)kasan_mem_to_shadow((void *)MODULES_END);
+   kasan_update_early_region(k_start, k_end, __pte(0));
+#endif
 }
 
 static void __init kasan_mmu_init(void)
-- 
2.25.0



[PATCH v1 7/8] powerpc/32s: Kernel space starts at TASK_SIZE

2020-06-19 Thread Christophe Leroy
Kernel space starts at TASK_SIZE. Select kernel page table
when address is over TASK_SIZE.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_32.S   | 12 ++--
 arch/powerpc/mm/book3s32/hash_low.S |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index 705c042309d8..bbef6ce8322b 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -474,7 +474,7 @@ InstructionTLBMiss:
/* Get PTE (linux-style) and check access */
mfspr   r3,SPRN_IMISS
 #if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC)
-   lis r1,PAGE_OFFSET@h/* check if kernel address */
+   lis r1, TASK_SIZE@h /* check if kernel address */
cmplw   0,r1,r3
 #endif
mfspr   r2, SPRN_SPRG_PGDIR
@@ -484,7 +484,7 @@ InstructionTLBMiss:
li  r1,_PAGE_PRESENT | _PAGE_EXEC
 #endif
 #if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC)
-   bge-112f
+   bgt-112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha   /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l   /* kernel page table */
 #endif
@@ -541,7 +541,7 @@ DataLoadTLBMiss:
  */
/* Get PTE (linux-style) and check access */
mfspr   r3,SPRN_DMISS
-   lis r1,PAGE_OFFSET@h/* check if kernel address */
+   lis r1, TASK_SIZE@h /* check if kernel address */
cmplw   0,r1,r3
mfspr   r2, SPRN_SPRG_PGDIR
 #ifdef CONFIG_SWAP
@@ -549,7 +549,7 @@ DataLoadTLBMiss:
 #else
li  r1, _PAGE_PRESENT
 #endif
-   bge-112f
+   bgt-112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha   /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l   /* kernel page table */
 112:   rlwimi  r2,r3,12,20,29  /* insert top 10 bits of address */
@@ -621,7 +621,7 @@ DataStoreTLBMiss:
  */
/* Get PTE (linux-style) and check access */
mfspr   r3,SPRN_DMISS
-   lis r1,PAGE_OFFSET@h/* check if kernel address */
+   lis r1, TASK_SIZE@h /* check if kernel address */
cmplw   0,r1,r3
mfspr   r2, SPRN_SPRG_PGDIR
 #ifdef CONFIG_SWAP
@@ -629,7 +629,7 @@ DataStoreTLBMiss:
 #else
li  r1, _PAGE_RW | _PAGE_DIRTY | _PAGE_PRESENT
 #endif
-   bge-112f
+   bgt-112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha   /* if kernel address, use */
addi r2, r2, (swapper_pg_dir - PAGE_OFFSET)@l   /* kernel page table */
 112:   rlwimi  r2,r3,12,20,29  /* insert top 10 bits of address */
diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
b/arch/powerpc/mm/book3s32/hash_low.S
index 923ad8f374eb..1690d369688b 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -62,7 +62,7 @@ _GLOBAL(hash_page)
isync
 #endif
/* Get PTE (linux-style) and check access */
-   lis r0,KERNELBASE@h /* check if kernel address */
+   lis r0, TASK_SIZE@h /* check if kernel address */
cmplw   0,r4,r0
ori r3,r3,_PAGE_USER|_PAGE_PRESENT /* test low addresses as user */
mfspr   r5, SPRN_SPRG_PGDIR /* phys page-table root */
-- 
2.25.0



[PATCH v1 8/8] powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX

2020-06-19 Thread Christophe Leroy
When STRICT_KERNEL_RWX is set, we want to set NX bit on vmalloc
segments. But modules require exec.

Use a dedicated segment for modules. There is not much space
above the kernel, and we don't want to waste vmalloc space on alignment.
Therefore, we take the segment just below PAGE_OFFSET for modules.
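
The resulting layout is then (a sketch, using the defaults from this
series):

	/*
	 * 0x00000000 - 0xafffffff : user     (TASK_SIZE = 0xb0000000)
	 * 0xb0000000 - 0xbfffffff : modules  (MODULES_VADDR..MODULES_END)
	 * 0xc0000000 - ...        : kernel   (PAGE_OFFSET and above)
	 */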

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig |  1 +
 arch/powerpc/include/asm/book3s/32/pgtable.h | 15 +--
 arch/powerpc/mm/ptdump/ptdump.c  |  8 
 3 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fa23eb320ff..2ba6ac9da46f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1198,6 +1198,7 @@ config TASK_SIZE_BOOL
 config TASK_SIZE
hex "Size of user task space" if TASK_SIZE_BOOL
default "0x8000" if PPC_8xx
+   default "0xb000" if PPC_BOOK3S_32 && STRICT_KERNEL_RWX
default "0xc000"
 endmenu
 
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 224912432821..36443cda8dcf 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -184,17 +184,7 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, 
pgprot_t prot);
  */
 #define VMALLOC_OFFSET (0x1000000) /* 16M */
 
-/*
- * With CONFIG_STRICT_KERNEL_RWX, kernel segments are set NX. But when modules
- * are used, NX cannot be set on VMALLOC space. So vmalloc VM space and linear
- * memory shall not share segments.
- */
-#if defined(CONFIG_STRICT_KERNEL_RWX) && defined(CONFIG_MODULES)
-#define VMALLOC_START ((ALIGN((long)high_memory, 256L << 20) + VMALLOC_OFFSET) & \
-  ~(VMALLOC_OFFSET - 1))
-#else
 #define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
-#endif
 
 #ifdef CONFIG_KASAN_VMALLOC
 #define VMALLOC_END ALIGN_DOWN(ioremap_bot, PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT)
@@ -202,6 +192,11 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
 #define VMALLOC_END ioremap_bot
 #endif
 
+#ifdef CONFIG_STRICT_KERNEL_RWX
+#define MODULES_END ALIGN_DOWN(PAGE_OFFSET, SZ_256M)
+#define MODULES_VADDR  (MODULES_END - SZ_256M)
+#endif
+
 #ifndef __ASSEMBLY__
 #include 
 #include 
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index e995f2e9e9f7..51aab1b7be31 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -74,6 +74,10 @@ struct addr_marker {
 
 static struct addr_marker address_markers[] = {
{ 0,"Start of kernel VM" },
+#ifdef MODULES_VADDR
+   { 0,"modules start" },
+   { 0,"modules end" },
+#endif
{ 0,"vmalloc() Area" },
{ 0,"vmalloc() End" },
 #ifdef CONFIG_PPC64
@@ -352,6 +356,10 @@ static void populate_markers(void)
int i = 0;
 
address_markers[i++].start_address = TASK_SIZE;
+#ifdef MODULES_VADDR
+   address_markers[i++].start_address = MODULES_VADDR;
+   address_markers[i++].start_address = MODULES_END;
+#endif
address_markers[i++].start_address = VMALLOC_START;
address_markers[i++].start_address = VMALLOC_END;
 #ifdef CONFIG_PPC64
-- 
2.25.0



[PATCH v1 2/8] powerpc/ptdump: Refactor update of pg_state

2020-06-19 Thread Christophe Leroy
In note_page(), the pg_state is updated the same way in two places.

Add note_page_update_state() to do it.

Also include the display of boundary markers there, as it is missing
from the "no level" leg, leading to a mismatch when the first two
markers are at the same address and the first displayed area uses that
address.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/ptdump.c | 34 +++--
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index d5e42b958e86..b71cc628facd 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -195,6 +195,24 @@ static void note_prot_wx(struct pg_state *st, unsigned 
long addr)
st->wx_pages += (addr - st->start_address) / PAGE_SIZE;
 }
 
+static void note_page_update_state(struct pg_state *st, unsigned long addr,
+  unsigned int level, u64 val, unsigned long page_size)
+{
+   u64 flag = val & pg_level[level].mask;
+   u64 pa = val & PTE_RPN_MASK;
+
+   st->level = level;
+   st->current_flags = flag;
+   st->start_address = addr;
+   st->start_pa = pa;
+   st->page_size = page_size;
+
+   while (addr >= st->marker[1].start_address) {
+   st->marker++;
+   pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
+   }
+}
+
 static void note_page(struct pg_state *st, unsigned long addr,
   unsigned int level, u64 val, unsigned long page_size)
 {
@@ -203,12 +221,8 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
 
/* At first no level is set */
if (!st->level) {
-   st->level = level;
-   st->current_flags = flag;
-   st->start_address = addr;
-   st->start_pa = pa;
-   st->page_size = page_size;
pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
+   note_page_update_state(st, addr, level, val, page_size);
/*
 * Dump the section of virtual memory when:
 *   - the PTE flags from one entry to the next differs.
@@ -240,15 +254,7 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
 * Address indicates we have passed the end of the
 * current section of virtual memory
 */
-   while (addr >= st->marker[1].start_address) {
-   st->marker++;
-   pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
-   }
-   st->start_address = addr;
-   st->start_pa = pa;
-   st->page_size = page_size;
-   st->current_flags = flag;
-   st->level = level;
+   note_page_update_state(st, addr, level, val, page_size);
}
st->last_pa = pa;
 }
-- 
2.25.0



[PATCH v1 4/8] powerpc/lib: Prepare code-patching for modules allocated outside vmalloc space

2020-06-19 Thread Christophe Leroy
Use is_vmalloc_or_module_addr() instead of is_vmalloc_addr()

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/lib/code-patching.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 0a051dfeb177..8c3934ea6220 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -93,7 +93,7 @@ static int map_patch_area(void *addr, unsigned long 
text_poke_addr)
unsigned long pfn;
int err;
 
-   if (is_vmalloc_addr(addr))
+   if (is_vmalloc_or_module_addr(addr))
pfn = vmalloc_to_pfn(addr);
else
pfn = __pa_symbol(addr) >> PAGE_SHIFT;
-- 
2.25.0



[PATCH v1 0/8] powerpc/32s: Allocate modules outside of vmalloc space for STRICT_KERNEL_RWX

2020-06-19 Thread Christophe Leroy
On book3s32 (hash), exec protection is set per 256Mb segments with NX bit.
Instead of clearing NX bit on vmalloc space when CONFIG_MODULES is selected,
allocate modules in a dedicated segment (0xb000-0xbfff by default).
This allows to keep exec protection on vmalloc space while allowing exec
on modules.

Christophe Leroy (8):
  powerpc/ptdump: Refactor update of st->last_pa
  powerpc/ptdump: Refactor update of pg_state
  powerpc: Set user/kernel boundary at TASK_SIZE instead of PAGE_OFFSET
  powerpc/lib: Prepare code-patching for modules allocated outside
vmalloc space
  powerpc: Use MODULES_VADDR if defined
  powerpc/32s: Only leave NX unset on segments used for modules
  powerpc/32s: Kernel space starts at TASK_SIZE
  powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX

 arch/powerpc/Kconfig |  1 +
 arch/powerpc/include/asm/book3s/32/pgtable.h | 15 ++
 arch/powerpc/include/asm/page.h  |  2 +-
 arch/powerpc/kernel/head_32.S| 12 ++---
 arch/powerpc/kernel/module.c | 11 
 arch/powerpc/lib/code-patching.c |  2 +-
 arch/powerpc/mm/book3s32/hash_low.S  |  2 +-
 arch/powerpc/mm/book3s32/mmu.c   | 17 +--
 arch/powerpc/mm/kasan/kasan_init_32.c|  6 +++
 arch/powerpc/mm/ptdump/ptdump.c  | 53 
 10 files changed, 78 insertions(+), 43 deletions(-)

-- 
2.25.0



[PATCH v1 1/8] powerpc/ptdump: Refactor update of st->last_pa

2020-06-19 Thread Christophe Leroy
st->last_pa is always updated in note_page() so it can
be done outside the if/elseif/else block.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/ptdump.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index de6e05ef871c..d5e42b958e86 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -207,7 +207,6 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
st->current_flags = flag;
st->start_address = addr;
st->start_pa = pa;
-   st->last_pa = pa;
st->page_size = page_size;
pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
/*
@@ -247,13 +246,11 @@ static void note_page(struct pg_state *st, unsigned long 
addr,
}
st->start_address = addr;
st->start_pa = pa;
-   st->last_pa = pa;
st->page_size = page_size;
st->current_flags = flag;
st->level = level;
-   } else {
-   st->last_pa = pa;
}
+   st->last_pa = pa;
 }
 
 static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
-- 
2.25.0



[PATCH v1 3/8] powerpc: Set user/kernel boundary at TASK_SIZE instead of PAGE_OFFSET

2020-06-19 Thread Christophe Leroy
User space stops at TASK_SIZE. At the moment, kernel space starts
at PAGE_OFFSET.

In order to use the space between TASK_SIZE and PAGE_OFFSET for modules,
make TASK_SIZE the limit between user and kernel space.

Note that fault.c already considers TASK_SIZE as the boundary between
user and kernel space.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/page.h | 2 +-
 arch/powerpc/mm/ptdump/ptdump.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index a63fe6f3a0ff..352a2b80d505 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -256,7 +256,7 @@ static inline bool pfn_valid(unsigned long pfn)
 #ifdef CONFIG_PPC_BOOK3E_64
 #define is_kernel_addr(x)  ((x) >= 0x8000ul)
 #else
-#define is_kernel_addr(x)  ((x) >= PAGE_OFFSET)
+#define is_kernel_addr(x)  ((x) >= TASK_SIZE)
 #endif
 
 #ifndef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index b71cc628facd..e995f2e9e9f7 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -351,7 +351,7 @@ static void populate_markers(void)
 {
int i = 0;
 
-   address_markers[i++].start_address = PAGE_OFFSET;
+   address_markers[i++].start_address = TASK_SIZE;
address_markers[i++].start_address = VMALLOC_START;
address_markers[i++].start_address = VMALLOC_END;
 #ifdef CONFIG_PPC64
@@ -388,7 +388,7 @@ static int ptdump_show(struct seq_file *m, void *v)
struct pg_state st = {
.seq = m,
.marker = address_markers,
-   .start_address = PAGE_OFFSET,
+   .start_address = TASK_SIZE,
};
 
 #ifdef CONFIG_PPC64
@@ -432,7 +432,7 @@ void ptdump_check_wx(void)
.seq = NULL,
.marker = address_markers,
.check_wx = true,
-   .start_address = PAGE_OFFSET,
+   .start_address = TASK_SIZE,
};
 
 #ifdef CONFIG_PPC64
-- 
2.25.0
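
To make the effect concrete, a hedged example (addresses are assumptions
based on the cover letter, with the module area sitting between TASK_SIZE
and PAGE_OFFSET):

	unsigned long mod_addr = 0xb0001000UL;	/* inside the module segment */

	/* Before: is_kernel_addr(mod_addr) was false (mod_addr < PAGE_OFFSET) */
	/* After:  is_kernel_addr(mod_addr) is true   (mod_addr >= TASK_SIZE)  */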



[PATCH v5] ocxl: control via sysfs whether the FPGA is reloaded on a link reset

2020-06-19 Thread Frederic Barrat
From: Philippe Bergheaud 

Some opencapi FPGA images allow controlling whether the FPGA should be
reloaded on the next adapter reset. If it is supported, the image specifies
it through a Vendor Specific DVSEC in the config space of function 0.

Signed-off-by: Philippe Bergheaud 
Signed-off-by: Frederic Barrat 
---
Changelog:
v2:
  - refine ResetReload debug message
  - do not call get_function_0() if pci_dev is for function 0
v3:
  - avoid get_function_0() in ocxl_config_set_reset_reload also
v4:
  - simplify parsing of Vendor Specific DVSEC during AFU init
  - only set/unset bit 0 of the config space register
  - commonize code to fetch the right PCI function and DVSEC offset
  - use kstrtoint() when parsing the sysfs buffer
v5:
  - update documentation (Andrew)


 Documentation/ABI/testing/sysfs-class-ocxl | 11 +++
 drivers/misc/ocxl/config.c | 81 --
 drivers/misc/ocxl/ocxl_internal.h  |  6 ++
 drivers/misc/ocxl/sysfs.c  | 35 ++
 include/misc/ocxl-config.h |  1 +
 5 files changed, 129 insertions(+), 5 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-ocxl 
b/Documentation/ABI/testing/sysfs-class-ocxl
index b5b1fa197592..ae1276efa45a 100644
--- a/Documentation/ABI/testing/sysfs-class-ocxl
+++ b/Documentation/ABI/testing/sysfs-class-ocxl
@@ -33,3 +33,14 @@ Date:January 2018
 Contact:   linuxppc-dev@lists.ozlabs.org
 Description:   read/write
Give access the global mmio area for the AFU
+
+What:  /sys/class/ocxl//reload_on_reset
+Date:  February 2020
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read/write
+   Control whether the FPGA is reloaded on a link reset. Enabled
+   through a vendor-specific logic block on the FPGA.
+   0   Do not reload FPGA image from flash
+   1   Reload FPGA image from flash
+   unavailable
+   The device does not support this capability
diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index c8e19bfb5ef9..42f7a1298775 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -71,6 +71,20 @@ static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 
afu_idx)
return 0;
 }
 
+/**
+ * get_function_0() - Find a related PCI device (function 0)
+ * @device: PCI device to match
+ *
+ * Returns a pointer to the related device, or null if not found
+ */
+static struct pci_dev *get_function_0(struct pci_dev *dev)
+{
+   unsigned int devfn = PCI_DEVFN(PCI_SLOT(dev->devfn), 0);
+
+   return pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
+  dev->bus->number, devfn);
+}
+
 static void read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn)
 {
u16 val;
@@ -159,14 +173,15 @@ static int read_dvsec_afu_info(struct pci_dev *dev, 
struct ocxl_fn_config *fn)
 static int read_dvsec_vendor(struct pci_dev *dev)
 {
int pos;
-   u32 cfg, tlx, dlx;
+   u32 cfg, tlx, dlx, reset_reload;
 
/*
-* vendor specific DVSEC is optional
+* vendor specific DVSEC, for IBM images only. Some older
+* images may not have it
 *
-* It's currently only used on function 0 to specify the
-* version of some logic blocks. Some older images may not
-* even have it so we ignore any errors
+* It's only used on function 0 to specify the version of some
+* logic blocks and to give access to special registers to
+* enable host-based flashing.
 */
if (PCI_FUNC(dev->devfn) != 0)
return 0;
@@ -178,11 +193,67 @@ static int read_dvsec_vendor(struct pci_dev *dev)
pci_read_config_dword(dev, pos + OCXL_DVSEC_VENDOR_CFG_VERS, );
pci_read_config_dword(dev, pos + OCXL_DVSEC_VENDOR_TLX_VERS, );
pci_read_config_dword(dev, pos + OCXL_DVSEC_VENDOR_DLX_VERS, );
+   pci_read_config_dword(dev, pos + OCXL_DVSEC_VENDOR_RESET_RELOAD,
+ _reload);
 
dev_dbg(>dev, "Vendor specific DVSEC:\n");
dev_dbg(>dev, "  CFG version = 0x%x\n", cfg);
dev_dbg(>dev, "  TLX version = 0x%x\n", tlx);
dev_dbg(>dev, "  DLX version = 0x%x\n", dlx);
+   dev_dbg(>dev, "  ResetReload = 0x%x\n", reset_reload);
+   return 0;
+}
+
+static int get_dvsec_vendor0(struct pci_dev *dev, struct pci_dev **dev0,
+int *out_pos)
+{
+   int pos;
+
+   if (PCI_FUNC(dev->devfn) != 0) {
+   dev = get_function_0(dev);
+   if (!dev)
+   return -1;
+   }
+   pos = find_dvsec(dev, OCXL_DVSEC_VENDOR_ID);
+   if (!pos)
+   return -1;
+   *dev0 = dev;
+   *out_pos = pos;
+   return 0;
+}
+
+int ocxl_config_get_reset_reload(struct pci_dev *dev, int *val)
+{
+   struct pci_dev 
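
For illustration, a minimal userspace sketch (the AFU name is a made-up
example; the sysfs path follows the ABI entry above) that reads the current
setting and requests an FPGA reload on the next link reset:

#include <stdio.h>

int main(void)
{
	const char *path = "/sys/class/ocxl/IBM,AFU.0/reload_on_reset";
	FILE *f = fopen(path, "r+");
	char buf[16];

	if (!f)
		return 1;
	if (fgets(buf, sizeof(buf), f))
		printf("current: %s", buf);	/* "0", "1" or "unavailable" */
	rewind(f);
	fputs("1\n", f);	/* reload FPGA image from flash on next reset */
	return fclose(f) ? 1 : 0;
}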

Re: linux-next: manual merge of the pidfd tree with the powerpc-fixes tree

2020-06-19 Thread Christian Brauner
On Fri, Jun 19, 2020 at 09:17:30PM +1000, Michael Ellerman wrote:
> Stephen Rothwell  writes:
> > Hi all,
> >
> > Today's linux-next merge of the pidfd tree got a conflict in:
> >
> >   arch/powerpc/kernel/syscalls/syscall.tbl
> >
> > between commit:
> >
> >   35e32a6cb5f6 ("powerpc/syscalls: Split SPU-ness out of ABI")
> >
> > from the powerpc-fixes tree and commit:
> >
> >   9b4feb630e8e ("arch: wire-up close_range()")
> >
> > from the pidfd tree.
> >
> > I fixed it up (see below) and can carry the fix as necessary. This
> > is now fixed as far as linux-next is concerned, but any non trivial
> > conflicts should be mentioned to your upstream maintainer when your tree
> > is submitted for merging.  You may also want to consider cooperating
> > with the maintainer of the conflicting tree to minimise any particularly
> > complex conflicts.
> 
> Thanks.
> 
> I thought the week between rc1 and rc2 would be a safe time to do that
> conversion of the syscall table, but I guess I was wrong :)

:)

> 
> I'm planning to send those changes to Linus for rc2, so the conflict
> will then be vs mainline. But I guess it's pretty trivial so it doesn't
> really matter.

close_range() is targeted for the v5.9 merge window. I always do
test-merges with mainline at the time I'm creating a pr and I'll just
mention to Linus that there's conflict with ppc. :)

Thanks!
Christian


[PATCH v5 26/26] powerpc/selftest/ptrace-pkey: IAMR and uamor cannot be updated by ptrace

2020-06-19 Thread Aneesh Kumar K.V
Both IAMR and UAMOR are privileged and cannot be updated by userspace. Hence
we also don't allow the ptrace interface to update them. Don't update them in
the test. Also, expected_iamr is only changed if we can allocate a
PKEY_DISABLE_EXECUTE pkey.

Signed-off-by: Aneesh Kumar K.V 
---
 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index bc33d748d95b..5c3c8222de46 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -101,15 +101,12 @@ static int child(struct shared_info *info)
 */
info->invalid_amr = info->amr2 | (~0x0UL & ~info->expected_uamor);
 
+   /*
+    * If PKEY_DISABLE_EXECUTE succeeded, we should update expected_iamr.
+    */
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
-   else
-   info->expected_iamr &= ~(1ul << pkeyshift(pkey1));
-
-   info->expected_iamr &= ~(1ul << pkeyshift(pkey2) | 1ul << 
pkeyshift(pkey3));
 
-   info->expected_uamor |= 3ul << pkeyshift(pkey1) |
-   3ul << pkeyshift(pkey2);
/*
 * Create an IAMR value different from expected value.
 * Kernel will reject an IAMR and UAMOR change.
-- 
2.26.2
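
To see why the kernel must reject that invalid_amr value, a hedged worked
sketch (the key numbers are picked for illustration; pkeyshift() is the
macro defined in the test):

	/* Suppose UAMOR only lets userspace modify keys 8 and 9: */
	unsigned long expected_uamor = (3ul << pkeyshift(8)) |
				       (3ul << pkeyshift(9));
	unsigned long amr2 = 3ul << pkeyshift(9);

	/* Force on every AMR bit that UAMOR does not allow: */
	unsigned long invalid_amr = amr2 | (~0x0UL & ~expected_uamor);

	/* ptrace must refuse this write: it touches UAMOR-denied bits. */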



[PATCH v5 25/26] powerpc/selftest/ptrace-pkey: Update the test to mark an invalid pkey correctly

2020-06-19 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 .../selftests/powerpc/ptrace/ptrace-pkey.c| 30 ---
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index f9216c7a1829..bc33d748d95b 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -66,11 +66,6 @@ static int sys_pkey_alloc(unsigned long flags, unsigned long 
init_access_rights)
return syscall(__NR_pkey_alloc, flags, init_access_rights);
 }
 
-static int sys_pkey_free(int pkey)
-{
-   return syscall(__NR_pkey_free, pkey);
-}
-
 static int child(struct shared_info *info)
 {
unsigned long reg;
@@ -100,7 +95,11 @@ static int child(struct shared_info *info)
 
info->amr1 |= 3ul << pkeyshift(pkey1);
info->amr2 |= 3ul << pkeyshift(pkey2);
-   info->invalid_amr |= info->amr2 | 3ul << pkeyshift(pkey3);
+   /*
+    * Invalid AMR value where we try to force-write
+    * bits which are denied by the UAMOR setting.
+    */
+   info->invalid_amr = info->amr2 | (~0x0UL & ~info->expected_uamor);
 
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
@@ -111,17 +110,12 @@ static int child(struct shared_info *info)
 
info->expected_uamor |= 3ul << pkeyshift(pkey1) |
3ul << pkeyshift(pkey2);
-   info->invalid_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
-   info->invalid_uamor |= 3ul << pkeyshift(pkey1);
-
/*
-* We won't use pkey3. We just want a plausible but invalid key to test
-* whether ptrace will let us write to AMR bits we are not supposed to.
-*
-* This also tests whether the kernel restores the UAMOR permissions
-* after a key is freed.
+* Create an IAMR value different from expected value.
+* Kernel will reject an IAMR and UAMOR change.
 */
-   sys_pkey_free(pkey3);
+   info->invalid_iamr = info->expected_iamr | (1ul << pkeyshift(pkey1) | 
1ul << pkeyshift(pkey2));
+   info->invalid_uamor = info->expected_uamor & ~(0x3ul << 
pkeyshift(pkey1));
 
printf("%-30s AMR: %016lx pkey1: %d pkey2: %d pkey3: %d\n",
   user_write, info->amr1, pkey1, pkey2, pkey3);
@@ -196,9 +190,9 @@ static int parent(struct shared_info *info, pid_t pid)
PARENT_SKIP_IF_UNSUPPORTED(ret, >child_sync);
PARENT_FAIL_IF(ret, >child_sync);
 
-   info->amr1 = info->amr2 = info->invalid_amr = regs[0];
-   info->expected_iamr = info->invalid_iamr = regs[1];
-   info->expected_uamor = info->invalid_uamor = regs[2];
+   info->amr1 = info->amr2 = regs[0];
+   info->expected_iamr = regs[1];
+   info->expected_uamor = regs[2];
 
/* Wake up child so that it can set itself up. */
ret = prod_child(>child_sync);
-- 
2.26.2



[PATCH v5 24/26] powerpc/selftest/ptrace-pkey: Rename variables to make it easier to follow code

2020-06-19 Thread Aneesh Kumar K.V
Rename variables to indicate that they are invalid values which we will use
to test ptrace updates of pkeys.

Signed-off-by: Aneesh Kumar K.V 
---
 .../selftests/powerpc/ptrace/ptrace-pkey.c| 26 +--
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index bdbbbe8431e0..f9216c7a1829 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -44,7 +44,7 @@ struct shared_info {
unsigned long amr2;
 
/* AMR value that ptrace should refuse to write to the child. */
-   unsigned long amr3;
+   unsigned long invalid_amr;
 
/* IAMR value the parent expects to read from the child. */
unsigned long expected_iamr;
@@ -57,8 +57,8 @@ struct shared_info {
 * (even though they're valid ones) because userspace doesn't have
 * access to those registers.
 */
-   unsigned long new_iamr;
-   unsigned long new_uamor;
+   unsigned long invalid_iamr;
+   unsigned long invalid_uamor;
 };
 
 static int sys_pkey_alloc(unsigned long flags, unsigned long 
init_access_rights)
@@ -100,7 +100,7 @@ static int child(struct shared_info *info)
 
info->amr1 |= 3ul << pkeyshift(pkey1);
info->amr2 |= 3ul << pkeyshift(pkey2);
-   info->amr3 |= info->amr2 | 3ul << pkeyshift(pkey3);
+   info->invalid_amr |= info->amr2 | 3ul << pkeyshift(pkey3);
 
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
@@ -111,8 +111,8 @@ static int child(struct shared_info *info)
 
info->expected_uamor |= 3ul << pkeyshift(pkey1) |
3ul << pkeyshift(pkey2);
-   info->new_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
-   info->new_uamor |= 3ul << pkeyshift(pkey1);
+   info->invalid_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
+   info->invalid_uamor |= 3ul << pkeyshift(pkey1);
 
/*
 * We won't use pkey3. We just want a plausible but invalid key to test
@@ -196,9 +196,9 @@ static int parent(struct shared_info *info, pid_t pid)
PARENT_SKIP_IF_UNSUPPORTED(ret, >child_sync);
PARENT_FAIL_IF(ret, >child_sync);
 
-   info->amr1 = info->amr2 = info->amr3 = regs[0];
-   info->expected_iamr = info->new_iamr = regs[1];
-   info->expected_uamor = info->new_uamor = regs[2];
+   info->amr1 = info->amr2 = info->invalid_amr = regs[0];
+   info->expected_iamr = info->invalid_iamr = regs[1];
+   info->expected_uamor = info->invalid_uamor = regs[2];
 
/* Wake up child so that it can set itself up. */
ret = prod_child(>child_sync);
@@ -234,10 +234,10 @@ static int parent(struct shared_info *info, pid_t pid)
return ret;
 
/* Write invalid AMR value in child. */
-   ret = ptrace_write_regs(pid, NT_PPC_PKEY, >amr3, 1);
+   ret = ptrace_write_regs(pid, NT_PPC_PKEY, >invalid_amr, 1);
PARENT_FAIL_IF(ret, >child_sync);
 
-   printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr3);
+   printf("%-30s AMR: %016lx\n", ptrace_write_running, info->invalid_amr);
 
/* Wake up child so that it can verify it didn't change. */
ret = prod_child(>child_sync);
@@ -249,7 +249,7 @@ static int parent(struct shared_info *info, pid_t pid)
 
/* Try to write to IAMR. */
regs[0] = info->amr1;
-   regs[1] = info->new_iamr;
+   regs[1] = info->invalid_iamr;
ret = ptrace_write_regs(pid, NT_PPC_PKEY, regs, 2);
PARENT_FAIL_IF(!ret, >child_sync);
 
@@ -257,7 +257,7 @@ static int parent(struct shared_info *info, pid_t pid)
   ptrace_write_running, regs[0], regs[1]);
 
/* Try to write to IAMR and UAMOR. */
-   regs[2] = info->new_uamor;
+   regs[2] = info->invalid_uamor;
ret = ptrace_write_regs(pid, NT_PPC_PKEY, regs, 3);
PARENT_FAIL_IF(!ret, >child_sync);
 
-- 
2.26.2
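
For reference, a hedged sketch of the NT_PPC_PKEY regset layout the test
drives (the ordering is inferred from the reads and writes above;
ptrace_write_regs is the selftest wrapper used in the test, its read
counterpart and new_amr are assumed placeholders):

	unsigned long regs[3];	/* regs[0]=AMR, regs[1]=IAMR, regs[2]=UAMOR */

	ptrace_read_regs(pid, NT_PPC_PKEY, regs, 3);
	regs[0] = new_amr;			/* AMR is user-writable...   */
	ptrace_write_regs(pid, NT_PPC_PKEY, regs, 1);
	/* ...while writes covering IAMR/UAMOR (count 2 or 3) are rejected. */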



[PATCH v5 23/26] powerpc/book3s64/kuap: Move UAMOR setup to key init function

2020-06-19 Thread Aneesh Kumar K.V
The UAMOR value is not application-specific. The kernel initializes
it based on the reserved keys. Remove the thread-specific
UAMOR value and don't switch the UAMOR on context switch.

Move UAMOR initialization to the key initialization code. Now that the
KUAP/KUEP features depend on PPC_MEM_KEYS, we can start to consolidate
all register initialization in the key initialization code.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/kup.h |  2 ++
 arch/powerpc/include/asm/processor.h |  1 -
 arch/powerpc/kernel/ptrace/ptrace-view.c | 17 
 arch/powerpc/kernel/smp.c|  5 
 arch/powerpc/mm/book3s64/pkeys.c | 35 ++--
 5 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup.h 
b/arch/powerpc/include/asm/book3s/64/kup.h
index 3a0e138d2735..942594745dfa 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -67,6 +67,8 @@
 #include 
 #include 
 
+extern u64 default_uamor;
+
 static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
 {
if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 52a67835057a..6ac12168f1fe 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -237,7 +237,6 @@ struct thread_struct {
 #ifdef CONFIG_PPC_MEM_KEYS
unsigned long   amr;
unsigned long   iamr;
-   unsigned long   uamor;
 #endif
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
void*   kvm_shadow_vcpu; /* KVM internal data */
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c 
b/arch/powerpc/kernel/ptrace/ptrace-view.c
index caeb5822a8f4..689711eb018a 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -488,14 +488,22 @@ static int pkey_active(struct task_struct *target, const 
struct user_regset *reg
 static int pkey_get(struct task_struct *target, const struct user_regset 
*regset,
unsigned int pos, unsigned int count, void *kbuf, void 
__user *ubuf)
 {
+   int ret;
+
BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
-   BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
 
if (!arch_pkeys_enabled())
return -ENODEV;
 
-   return user_regset_copyout(, , , , 
>thread.amr,
-  0, ELF_NPKEY * sizeof(unsigned long));
+   ret = user_regset_copyout(, , , , 
>thread.amr,
+ 0, 2 * sizeof(unsigned long));
+   if (ret)
+   goto err_out;
+
+   ret = user_regset_copyout(, , , , _uamor,
+ 2 * sizeof(unsigned long), 3 * 
sizeof(unsigned long));
+err_out:
+   return ret;
 }
 
 static int pkey_set(struct task_struct *target, const struct user_regset 
*regset,
@@ -518,8 +526,7 @@ static int pkey_set(struct task_struct *target, const 
struct user_regset *regset
return ret;
 
/* UAMOR determines which bits of the AMR can be set from userspace. */
-   target->thread.amr = (new_amr & target->thread.uamor) |
-(target->thread.amr & ~target->thread.uamor);
+   target->thread.amr = (new_amr & default_uamor) | (target->thread.amr & 
~default_uamor);
 
return 0;
 }
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c820c95162ff..eec40082599f 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef DEBUG
 #include 
@@ -1256,6 +1257,10 @@ void start_secondary(void *unused)
mmgrab(_mm);
current->active_mm = _mm;
 
+#ifdef CONFIG_PPC_MEM_KEYS
+   mtspr(SPRN_UAMOR, default_uamor);
+#endif
+
smp_store_cpu_info(cpu);
set_dec(tb_ticks_per_jiffy);
preempt_disable();
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index aeecc8b8e11c..3f3593f85358 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -24,7 +24,7 @@ static u32  initial_allocation_mask;   /* Bits set for the 
initially allocated k
 static u64 default_amr;
 static u64 default_iamr;
 /* Allow all keys to be modified by default */
-static u64 default_uamor = ~0x0UL;
+u64 default_uamor = ~0x0UL;
 /*
  * Key used to implement PROT_EXEC mmap. Denies READ/WRITE
  * We pick key 2 because 0 is special key and 1 is reserved as per ISA.
@@ -113,8 +113,16 @@ void __init pkey_early_init_devtree(void)
/* scan the device tree for pkey feature */
pkeys_total = scan_pkey_feature();
if (!pkeys_total) {
-   /* No support for pkey. Mark it disabled */
-   return;
+   /*
+* No key support but on radix we can use key 0
+* to 

[PATCH v5 22/26] powerpc/book3s64/kuap/kuep: Make KUAP and KUEP a subfeature of PPC_MEM_KEYS

2020-06-19 Thread Aneesh Kumar K.V
The next set of patches adds support for KUAP with hash translation.
Hence make KUAP a BOOK3S_64 feature. Also make it a subfeature of
PPC_MEM_KEYS. Hash translation is going to use pkeys to support
KUAP/KUEP. Adding this dependency reduces the code complexity and
enables us to move some of the initialization code to pkeys.c.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/kup.h | 33 ++--
 arch/powerpc/include/asm/ptrace.h|  2 +-
 arch/powerpc/kernel/asm-offsets.c|  2 +-
 arch/powerpc/platforms/Kconfig.cputype   |  4 +--
 4 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup.h 
b/arch/powerpc/include/asm/book3s/64/kup.h
index 3cecd964a63f..3a0e138d2735 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -62,7 +62,7 @@
 
 #else /* !__ASSEMBLY__ */
 
-#ifdef CONFIG_PPC_KUAP
+#ifdef CONFIG_PPC_MEM_KEYS
 
 #include 
 #include 
@@ -97,6 +97,24 @@ static inline void kuap_check_amr(void)
WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED);
 }
 
+#else /* CONFIG_PPC_MEM_KEYS */
+
+static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
+{
+}
+
+static inline void kuap_check_amr(void)
+{
+}
+
+static inline unsigned long kuap_get_and_check_amr(void)
+{
+   return 0;
+}
+#endif /* CONFIG_PPC_MEM_KEYS */
+
+
+#ifdef CONFIG_PPC_KUAP
 /*
  * We support individually allowing read or write, but we don't support nesting
  * because that would require an expensive read/modify write of the AMR.
@@ -166,19 +184,6 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long 
address, bool is_write)
(regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : 
AMR_KUAP_BLOCK_READ)),
"Bug: %s fault blocked by AMR!", is_write ? "Write" : 
"Read");
 }
-#else /* CONFIG_PPC_KUAP */
-static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
-{
-}
-
-static inline void kuap_check_amr(void)
-{
-}
-
-static inline unsigned long kuap_get_and_check_amr(void)
-{
-   return 0;
-}
 #endif /* CONFIG_PPC_KUAP */
 
 #define reset_kuap reset_kuap
diff --git a/arch/powerpc/include/asm/ptrace.h 
b/arch/powerpc/include/asm/ptrace.h
index ac3970fff0d5..1a6cadf63d14 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -53,7 +53,7 @@ struct pt_regs
 #ifdef CONFIG_PPC64
unsigned long ppr;
 #endif
-#ifdef CONFIG_PPC_KUAP
+#ifdef CONFIG_PPC_HAVE_KUAP
unsigned long kuap;
 #endif
};
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 9b9cde07e396..1694c4f531b9 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -354,7 +354,7 @@ int main(void)
STACK_PT_REGS_OFFSET(_PPR, ppr);
 #endif /* CONFIG_PPC64 */
 
-#ifdef CONFIG_PPC_KUAP
+#ifdef CONFIG_PPC_HAVE_KUAP
STACK_PT_REGS_OFFSET(STACK_REGS_KUAP, kuap);
 #endif
 
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index d349603fb889..053c46aecf80 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -99,6 +99,8 @@ config PPC_BOOK3S_64
select ARCH_SUPPORTS_NUMA_BALANCING
select IRQ_WORK
select PPC_MM_SLICES
+   select PPC_HAVE_KUAP if PPC_MEM_KEYS
+   select PPC_HAVE_KUEP if PPC_MEM_KEYS
 
 config PPC_BOOK3E_64
bool "Embedded processors"
@@ -350,8 +352,6 @@ config PPC_RADIX_MMU
bool "Radix MMU Support"
depends on PPC_BOOK3S_64
select ARCH_HAS_GIGANTIC_PAGE
-   select PPC_HAVE_KUEP
-   select PPC_HAVE_KUAP
default y
help
  Enable support for the Power ISA 3.0 Radix style MMU. Currently this
-- 
2.26.2



[PATCH v5 21/26] powerpc/book3s64/kuap: Rename MMU_FTR_RADIX_KUAP to MMU_FTR_KUAP

2020-06-19 Thread Aneesh Kumar K.V
The next set of patches adds support for KUAP with hash translation.
In preparation for that, rename/move KUAP-related functions to
non-radix names.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/kup.h | 20 ++--
 arch/powerpc/include/asm/mmu.h   |  6 +++---
 arch/powerpc/mm/book3s64/pkeys.c |  2 +-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup.h 
b/arch/powerpc/include/asm/book3s/64/kup.h
index c4cf9b1caa23..3cecd964a63f 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -24,7 +24,7 @@
mtspr   SPRN_AMR, \gpr2
/* No isync required, see kuap_restore_amr() */
 998:
-   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
+   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
 #endif
 .endm
 
@@ -36,7 +36,7 @@
sldi\gpr2, \gpr2, AMR_KUAP_SHIFT
 999:   tdne\gpr1, \gpr2
EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | 
BUGFLAG_ONCE)
-   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
+   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
 #endif
 .endm
 
@@ -56,7 +56,7 @@
mtspr   SPRN_AMR, \gpr2
isync
 99:
-   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
+   END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_KUAP, 67)
 #endif
 .endm
 
@@ -69,7 +69,7 @@
 
 static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr)
 {
-   if (mmu_has_feature(MMU_FTR_RADIX_KUAP) && unlikely(regs->kuap != amr)) 
{
+   if (mmu_has_feature(MMU_FTR_KUAP) && unlikely(regs->kuap != amr)) {
isync();
mtspr(SPRN_AMR, regs->kuap);
/*
@@ -82,7 +82,7 @@ static inline void kuap_restore_amr(struct pt_regs *regs, 
unsigned long amr)
 
 static inline unsigned long kuap_get_and_check_amr(void)
 {
-   if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) {
+   if (mmu_has_feature(MMU_FTR_KUAP)) {
unsigned long amr = mfspr(SPRN_AMR);
if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG)) /* kuap_check_amr() */
WARN_ON_ONCE(amr != AMR_KUAP_BLOCKED);
@@ -93,7 +93,7 @@ static inline unsigned long kuap_get_and_check_amr(void)
 
 static inline void kuap_check_amr(void)
 {
-   if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && 
mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_KUAP))
WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED);
 }
 
@@ -104,7 +104,7 @@ static inline void kuap_check_amr(void)
 
 static inline unsigned long get_kuap(void)
 {
-   if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (!early_mmu_has_feature(MMU_FTR_KUAP))
return 0;
 
return mfspr(SPRN_AMR);
@@ -112,7 +112,7 @@ static inline unsigned long get_kuap(void)
 
 static inline void set_kuap(unsigned long value)
 {
-   if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (!early_mmu_has_feature(MMU_FTR_KUAP))
return;
 
/*
@@ -162,7 +162,7 @@ static inline void restore_user_access(unsigned long flags)
 static inline bool
 bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 {
-   return WARN(mmu_has_feature(MMU_FTR_RADIX_KUAP) &&
+   return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
(regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : 
AMR_KUAP_BLOCK_READ)),
"Bug: %s fault blocked by AMR!", is_write ? "Write" : 
"Read");
 }
@@ -184,7 +184,7 @@ static inline unsigned long kuap_get_and_check_amr(void)
 #define reset_kuap reset_kuap
 static inline void reset_kuap(void)
 {
-   if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) {
+   if (mmu_has_feature(MMU_FTR_KUAP)) {
mtspr(SPRN_AMR, 0);
/*  Do we need isync()? We are going via a kexec reset */
isync();
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 94435f85e3bc..14d7e6803453 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -112,7 +112,7 @@
 /*
  * Supports KUAP (key 0 controlling userspace addresses) on radix
  */
-#define MMU_FTR_RADIX_KUAP ASM_CONST(0x8000)
+#define MMU_FTR_KUAP   ASM_CONST(0x8000)
 
 /* MMU feature bit sets for various CPUs */
 #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2  \
@@ -175,10 +175,10 @@ enum {
 #endif
 #ifdef CONFIG_PPC_RADIX_MMU
MMU_FTR_TYPE_RADIX |
+#endif /* CONFIG_PPC_RADIX_MMU */
 #ifdef CONFIG_PPC_KUAP
-   MMU_FTR_RADIX_KUAP |
+   MMU_FTR_KUAP |
 #endif /* CONFIG_PPC_KUAP */
-#endif /* CONFIG_PPC_RADIX_MMU */
 #ifdef CONFIG_PPC_MEM_KEYS
MMU_FTR_PKEY |
 #endif
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 8ec677a91f80..aeecc8b8e11c 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ 

[PATCH v5 20/26] powerpc/book3s64/kuep: Move KUEP related function outside radix

2020-06-19 Thread Aneesh Kumar K.V
The next set of patches adds support for KUEP with hash translation.
In preparation for that, rename/move KUEP-related functions to
non-radix names.

Also set MMU_FTR_KUEP and add the missing isync().

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/kup.h |  1 +
 arch/powerpc/mm/book3s64/pkeys.c | 21 +
 arch/powerpc/mm/book3s64/radix_pgtable.c | 20 
 3 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup.h 
b/arch/powerpc/include/asm/book3s/64/kup.h
index 54e237c093da..c4cf9b1caa23 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -7,6 +7,7 @@
 
 #define AMR_KUAP_BLOCK_READUL(0x4000)
 #define AMR_KUAP_BLOCK_WRITE   UL(0x8000)
+#define AMR_KUEP_BLOCKED   (1UL << 62)
 #define AMR_KUAP_BLOCKED   (AMR_KUAP_BLOCK_READ | AMR_KUAP_BLOCK_WRITE)
 #define AMR_KUAP_SHIFT 62
 
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index e93b65a0e6e7..8ec677a91f80 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -200,6 +200,27 @@ void __init pkey_early_init_devtree(void)
return;
 }
 
+#ifdef CONFIG_PPC_KUEP
+void __init setup_kuep(bool disabled)
+{
+   if (disabled || !early_radix_enabled())
+   return;
+
+   if (smp_processor_id() == boot_cpuid) {
+   pr_info("Activating Kernel Userspace Execution Prevention\n");
+   cur_cpu_spec->mmu_features |= MMU_FTR_KUEP;
+   }
+
+   /*
+* Radix always uses key0 of the IAMR to determine if an access is
+* allowed. We set bit 0 (IBM bit 1) of key0, to prevent instruction
+* fetch.
+*/
+   mtspr(SPRN_IAMR, AMR_KUEP_BLOCKED);
+   isync();
+}
+#endif
+
 #ifdef CONFIG_PPC_KUAP
 void __init setup_kuap(bool disabled)
 {
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 3959b7d4ad3c..2f641e1d7e82 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -514,26 +514,6 @@ static void radix_init_amor(void)
mtspr(SPRN_AMOR, (3ul << 62));
 }
 
-#ifdef CONFIG_PPC_KUEP
-void setup_kuep(bool disabled)
-{
-   if (disabled || !early_radix_enabled())
-   return;
-
-   if (smp_processor_id() == boot_cpuid) {
-   pr_info("Activating Kernel Userspace Execution Prevention\n");
-   cur_cpu_spec->mmu_features |= MMU_FTR_KUEP;
-   }
-
-   /*
-* Radix always uses key0 of the IAMR to determine if an access is
-* allowed. We set bit 0 (IBM bit 1) of key0, to prevent instruction
-* fetch.
-*/
-   mtspr(SPRN_IAMR, (1ul << 62));
-}
-#endif
-
 void __init radix__early_init_mmu(void)
 {
unsigned long lpcr;
-- 
2.26.2
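
A quick arithmetic check of the AMR_KUEP_BLOCKED value added above, using
the pkeyshift() definition from pkeys.c (AMR_BITS_PER_PKEY = 2 is implied
by the 0x3 masks used elsewhere in the series):

	/* PKEY_REG_BITS = 64, AMR_BITS_PER_PKEY = 2:
	 *   pkeyshift(0) = 64 - (0 + 1) * 2 = 62
	 * so key 0 occupies bits 63:62, and setting bit 62 (IBM bit 1)
	 * denies instruction fetch for key 0:
	 */
	#define AMR_KUEP_BLOCKED	(1UL << 62)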



[PATCH v5 19/26] powerpc/book3s64/kuap: Move KUAP related function outside radix

2020-06-19 Thread Aneesh Kumar K.V
The next set of patches adds support for KUAP with hash translation.
In preparation for that, rename/move KUAP-related functions to
non-radix names.

Signed-off-by: Aneesh Kumar K.V 
---
 .../asm/book3s/64/{kup-radix.h => kup.h}  |  6 +++---
 arch/powerpc/include/asm/kup.h|  2 +-
 arch/powerpc/kernel/syscall_64.c  |  2 +-
 arch/powerpc/mm/book3s64/pkeys.c  | 19 +++
 arch/powerpc/mm/book3s64/radix_pgtable.c  | 18 --
 5 files changed, 24 insertions(+), 23 deletions(-)
 rename arch/powerpc/include/asm/book3s/64/{kup-radix.h => kup.h} (97%)

diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h 
b/arch/powerpc/include/asm/book3s/64/kup.h
similarity index 97%
rename from arch/powerpc/include/asm/book3s/64/kup-radix.h
rename to arch/powerpc/include/asm/book3s/64/kup.h
index c57063c35833..54e237c093da 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_POWERPC_BOOK3S_64_KUP_RADIX_H
-#define _ASM_POWERPC_BOOK3S_64_KUP_RADIX_H
+#ifndef _ASM_POWERPC_BOOK3S_64_KUP_H
+#define _ASM_POWERPC_BOOK3S_64_KUP_H
 
 #include 
 #include 
@@ -202,4 +202,4 @@ static inline void reset_kuep(void)
 
 #endif /* __ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_BOOK3S_64_KUP_RADIX_H */
+#endif /* _ASM_POWERPC_BOOK3S_64_KUP_H */
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 4dc23a706910..593707e112cc 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -15,7 +15,7 @@
 #define KUAP_CURRENT   (KUAP_CURRENT_READ | KUAP_CURRENT_WRITE)
 
 #ifdef CONFIG_PPC64
-#include 
+#include 
 #endif
 #ifdef CONFIG_PPC_8xx
 #include 
diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 79edba3ab312..7e560a01afa4 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -2,7 +2,7 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 0d72c0246052..e93b65a0e6e7 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 
+#include 
 
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
 int  max_pkey; /* Maximum key value supported */
@@ -199,6 +200,24 @@ void __init pkey_early_init_devtree(void)
return;
 }
 
+#ifdef CONFIG_PPC_KUAP
+void __init setup_kuap(bool disabled)
+{
+   if (disabled || !early_radix_enabled())
+   return;
+
+   if (smp_processor_id() == boot_cpuid) {
+   pr_info("Activating Kernel Userspace Access Prevention\n");
+   cur_cpu_spec->mmu_features |= MMU_FTR_RADIX_KUAP;
+   }
+
+   /* Make sure userspace can't change the AMR */
+   mtspr(SPRN_UAMOR, 0);
+   mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
+   isync();
+}
+#endif
+
 void pkey_mm_init(struct mm_struct *mm)
 {
if (!mmu_has_feature(MMU_FTR_PKEY))
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 04fd749c6339..3959b7d4ad3c 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -534,24 +534,6 @@ void setup_kuep(bool disabled)
 }
 #endif
 
-#ifdef CONFIG_PPC_KUAP
-void setup_kuap(bool disabled)
-{
-   if (disabled || !early_radix_enabled())
-   return;
-
-   if (smp_processor_id() == boot_cpuid) {
-   pr_info("Activating Kernel Userspace Access Prevention\n");
-   cur_cpu_spec->mmu_features |= MMU_FTR_RADIX_KUAP;
-   }
-
-   /* Make sure userspace can't change the AMR */
-   mtspr(SPRN_UAMOR, 0);
-   mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);
-   isync();
-}
-#endif
-
 void __init radix__early_init_mmu(void)
 {
unsigned long lpcr;
-- 
2.26.2



[PATCH v5 18/26] powerpc/book3s64/keys/kuap: Reset AMR/IAMR values on kexec

2020-06-19 Thread Aneesh Kumar K.V
As we kexec across kernels that use AMR/IAMR for different purposes,
we need to ensure that new kernels get kexec'd with a reset value
of AMR/IAMR. For example: the new kernel can use key 0 for the kernel
mapping while the old AMR value prevents access to key 0.

This patch also removes the reset of IAMR and AMOR in kexec_sequence. The
reset of AMOR is not needed, and the IAMR reset is partial (it doesn't do
the reset on secondary CPUs) and is redundant with this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 .../powerpc/include/asm/book3s/64/kup-radix.h | 20 +++
 arch/powerpc/include/asm/kup.h| 14 +
 arch/powerpc/kernel/misc_64.S | 14 -
 arch/powerpc/kexec/core_64.c  |  3 +++
 arch/powerpc/mm/book3s64/pgtable.c|  3 +++
 5 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h 
b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index 3ee1ec60be84..c57063c35833 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -180,6 +180,26 @@ static inline unsigned long kuap_get_and_check_amr(void)
 }
 #endif /* CONFIG_PPC_KUAP */
 
+#define reset_kuap reset_kuap
+static inline void reset_kuap(void)
+{
+   if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) {
+   mtspr(SPRN_AMR, 0);
+   /*  Do we need isync()? We are going via a kexec reset */
+   isync();
+   }
+}
+
+#define reset_kuep reset_kuep
+static inline void reset_kuep(void)
+{
+   if (mmu_has_feature(MMU_FTR_KUEP)) {
+   mtspr(SPRN_IAMR, 0);
+   /*  Do we need isync()? We are going via a kexec reset */
+   isync();
+   }
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_KUP_RADIX_H */
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index c745ee41ad66..4dc23a706910 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -113,6 +113,20 @@ static inline void prevent_current_write_to_user(void)
prevent_user_access(NULL, NULL, ~0UL, KUAP_CURRENT_WRITE);
 }
 
+#ifndef reset_kuap
+#define reset_kuap reset_kuap
+static inline void reset_kuap(void)
+{
+}
+#endif
+
+#ifndef reset_kuep
+#define reset_kuep reset_kuep
+static inline void reset_kuep(void)
+{
+}
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_KUAP_H_ */
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 1864605eca29..7bb46ad98207 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -413,20 +413,6 @@ _GLOBAL(kexec_sequence)
li  r0,0
std r0,16(r1)
 
-BEGIN_FTR_SECTION
-   /*
-* This is the best time to turn AMR/IAMR off.
-* key 0 is used in radix for supervisor<->user
-* protection, but on hash key 0 is reserved
-* ideally we want to enter with a clean state.
-* NOTE, we rely on r0 being 0 from above.
-*/
-   mtspr   SPRN_IAMR,r0
-BEGIN_FTR_SECTION_NESTED(42)
-   mtspr   SPRN_AMOR,r0
-END_FTR_SECTION_NESTED_IFSET(CPU_FTR_HVMODE, 42)
-END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
-
/* save regs for local vars on new stack.
 * yes, we won't go back, but ...
 */
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index b4184092172a..a124715f33ea 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -152,6 +152,9 @@ static void kexec_smp_down(void *arg)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0, 1);
 
+   reset_kuap();
+   reset_kuep();
+
kexec_smp_wait();
/* NOTREACHED */
 }
diff --git a/arch/powerpc/mm/book3s64/pgtable.c 
b/arch/powerpc/mm/book3s64/pgtable.c
index c58ad1049909..9673f4b74c9a 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -165,6 +165,9 @@ void mmu_cleanup_all(void)
radix__mmu_cleanup_all();
else if (mmu_hash_ops.hpte_clear_all)
mmu_hash_ops.hpte_clear_all();
+
+   reset_kuap();
+   reset_kuep();
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
2.26.2
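
The kup.h hunk relies on a common kernel idiom: a generic header supplies
empty fallbacks unless an arch header has already provided overrides. A
hedged sketch of the two sides (the file split is indicated in comments;
this is an illustration, not the patch itself):

/* Arch header (e.g. book3s/64/kup-radix.h): define the function and a
 * same-named marker macro before the generic header is parsed.
 */
#define reset_kuap reset_kuap
static inline void reset_kuap(void) { /* arch-specific register reset */ }

/* Generic header (asm/kup.h): compiled out when the marker macro exists. */
#ifndef reset_kuap
#define reset_kuap reset_kuap
static inline void reset_kuap(void) { }
#endif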



[PATCH v5 17/26] powerpc/book3s64/keys: Print information during boot.

2020-06-19 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 810118123e70..0d72c0246052 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -195,6 +195,7 @@ void __init pkey_early_init_devtree(void)
 */
initial_allocation_mask |= reserved_allocation_mask;
 
+   pr_info("Enabling Memory keys with max key count %d\n", max_pkey);
return;
 }
 
-- 
2.26.2



[PATCH v5 16/26] powerpc/book3s64/pkeys: Use MMU_FTR_PKEY instead of pkey_disabled static key

2020-06-19 Thread Aneesh Kumar K.V
Instead of the pkey_disabled static key, use the MMU feature MMU_FTR_PKEY.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pkeys.h |  2 +-
 arch/powerpc/include/asm/pkeys.h   | 14 ++
 arch/powerpc/mm/book3s64/pkeys.c   | 16 +++-
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pkeys.h 
b/arch/powerpc/include/asm/book3s/64/pkeys.h
index 8174662a9173..5b178139f3c0 100644
--- a/arch/powerpc/include/asm/book3s/64/pkeys.h
+++ b/arch/powerpc/include/asm/book3s/64/pkeys.h
@@ -7,7 +7,7 @@
 
 static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return 0x0UL;
 
if (radix_enabled())
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 09fbaa409ac4..b1d448c53209 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -11,7 +11,6 @@
 #include 
 #include 
 
-DECLARE_STATIC_KEY_FALSE(pkey_disabled);
 extern int max_pkey;
 extern u32 reserved_allocation_mask; /* bits set for reserved keys */
 
@@ -38,7 +37,7 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return 0;
return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
 }
@@ -93,9 +92,8 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
u32 all_pkeys_mask = (u32)(~(0x0));
int ret;
 
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return -1;
-
/*
 * Are we out of pkeys? We must handle this specially because ffz()
 * behavior is undefined if there are no zeros.
@@ -111,7 +109,7 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
 
 static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return -1;
 
if (!mm_pkey_is_allocated(mm, pkey))
@@ -132,7 +130,7 @@ extern int __arch_override_mprotect_pkey(struct 
vm_area_struct *vma,
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
  int prot, int pkey)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return 0;
 
/*
@@ -150,7 +148,7 @@ extern int __arch_set_user_pkey_access(struct task_struct 
*tsk, int pkey,
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return -EINVAL;
 
/*
@@ -167,7 +165,7 @@ static inline int arch_set_user_pkey_access(struct 
task_struct *tsk, int pkey,
 
 static inline bool arch_pkeys_enabled(void)
 {
-   return !static_branch_likely(_disabled);
+   return mmu_has_feature(MMU_FTR_PKEY);
 }
 
 extern void pkey_mm_init(struct mm_struct *mm);
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index fed4f159011b..810118123e70 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -13,7 +13,6 @@
 #include 
 
 
-DEFINE_STATIC_KEY_FALSE(pkey_disabled);
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
 int  max_pkey; /* Maximum key value supported */
 /*
@@ -114,7 +113,6 @@ void __init pkey_early_init_devtree(void)
pkeys_total = scan_pkey_feature();
if (!pkeys_total) {
/* No support for pkey. Mark it disabled */
-   static_branch_enable(_disabled);
return;
}
 
@@ -202,7 +200,7 @@ void __init pkey_early_init_devtree(void)
 
 void pkey_mm_init(struct mm_struct *mm)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return;
mm_pkey_allocation_map(mm) = initial_allocation_mask;
mm->context.execute_only_pkey = execute_only_key;
@@ -306,7 +304,7 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, 
int pkey,
 
 void thread_pkey_regs_save(struct thread_struct *thread)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return;
 
/*
@@ -320,7 +318,7 @@ void thread_pkey_regs_save(struct thread_struct *thread)
 void thread_pkey_regs_restore(struct thread_struct *new_thread,
  struct thread_struct *old_thread)
 {
-   if (static_branch_likely(_disabled))
+   if (!mmu_has_feature(MMU_FTR_PKEY))
return;
 
if (old_thread->amr != new_thread->amr)
@@ -333,7 +331,7 @@ void thread_pkey_regs_restore(struct thread_struct 
*new_thread,
 
 void 

[PATCH v5 15/26] powerpc/book3s64/pkeys: Use execute_pkey_disabled static key

2020-06-19 Thread Aneesh Kumar K.V
Use the execute_pkey_disabled static key to check for execute key support
instead of pkey_disabled.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/pkeys.h | 10 +-
 arch/powerpc/mm/book3s64/pkeys.c |  5 -
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 47c81d41ea9a..09fbaa409ac4 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -126,15 +126,7 @@ static inline int mm_pkey_free(struct mm_struct *mm, int 
pkey)
  * Try to dedicate one of the protection keys to be used as an
  * execute-only protection key.
  */
-extern int __execute_only_pkey(struct mm_struct *mm);
-static inline int execute_only_pkey(struct mm_struct *mm)
-{
-   if (static_branch_likely(_disabled))
-   return -1;
-
-   return __execute_only_pkey(mm);
-}
-
+extern int execute_only_pkey(struct mm_struct *mm);
 extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
 int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index bbba9c601e14..fed4f159011b 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -345,8 +345,11 @@ void thread_pkey_regs_init(struct thread_struct *thread)
write_uamor(default_uamor);
 }
 
-int __execute_only_pkey(struct mm_struct *mm)
+int execute_only_pkey(struct mm_struct *mm)
 {
+   if (static_branch_likely(_pkey_disabled))
+   return -1;
+
return mm->context.execute_only_pkey;
 }
 
-- 
2.26.2
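
A hedged userspace illustration of when execute_only_pkey() comes into
play: protecting memory as execute-only lets the kernel back it with the
dedicated execute-only key, when one could be allocated:

#include <sys/mman.h>

static int make_execute_only(void *code, size_t len)
{
	/* PROT_EXEC without read/write: with an execute-only pkey available
	 * the kernel can deny reads as well; the behaviour without such a
	 * key is an assumption (a plain executable mapping).
	 */
	return mprotect(code, len, PROT_EXEC);
}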



[PATCH v5 14/26] powerpc/book3s64/kuep: Add MMU_FTR_KUEP

2020-06-19 Thread Aneesh Kumar K.V
This will be used to enable/disable Kernel Userspace Execution
Prevention (KUEP).

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/mmu.h   | 5 +
 arch/powerpc/mm/book3s64/radix_pgtable.c | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 72966d3d8f64..94435f85e3bc 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -24,6 +24,7 @@
 /* Radix page table supported and enabled */
 #define MMU_FTR_TYPE_RADIX ASM_CONST(0x0040)
 #define MMU_FTR_PKEY   ASM_CONST(0x0080)
+#define MMU_FTR_KUEP   ASM_CONST(0x0100)
 
 /*
  * Individual features below.
@@ -181,6 +182,10 @@ enum {
 #ifdef CONFIG_PPC_MEM_KEYS
MMU_FTR_PKEY |
 #endif
+#ifdef CONFIG_PPC_KUEP
+   MMU_FTR_KUEP |
+#endif /* CONFIG_PPC_KUEP */
+
0,
 };
 
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 8acb96de0e48..04fd749c6339 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -520,8 +520,10 @@ void setup_kuep(bool disabled)
if (disabled || !early_radix_enabled())
return;
 
-   if (smp_processor_id() == boot_cpuid)
+   if (smp_processor_id() == boot_cpuid) {
pr_info("Activating Kernel Userspace Execution Prevention\n");
+   cur_cpu_spec->mmu_features |= MMU_FTR_KUEP;
+   }
 
/*
 * Radix always uses key0 of the IAMR to determine if an access is
-- 
2.26.2
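
Why the new #ifdef in the possible-features mask matters: mmu_has_feature()
is masked against it, so when CONFIG_PPC_KUEP is off the check below can be
folded away at compile time (a hedged sketch of the usage pattern, not code
from the patch):

	if (mmu_has_feature(MMU_FTR_KUEP))
		pr_info("Kernel Userspace Execution Prevention active\n");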



[PATCH v5 13/26] powerpc/book3s64/pkeys: Enable MMU_FTR_PKEY

2020-06-19 Thread Aneesh Kumar K.V
Parse the storage-keys-related device tree entry in early_init_devtree
and enable the MMU feature MMU_FTR_PKEY if pkeys are supported.

An MMU feature is used instead of a CPU feature because this enables us
to group MMU_FTR_KUAP and MMU_FTR_PKEY in the asm feature fixup code.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |  6 +++
 arch/powerpc/include/asm/mmu.h   |  6 +++
 arch/powerpc/kernel/prom.c   |  5 +++
 arch/powerpc/mm/book3s64/pkeys.c | 54 ++--
 4 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 5393a535240c..3371ea05b7d3 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -209,6 +209,12 @@ extern int mmu_io_psize;
 void mmu_early_init_devtree(void);
 void hash__early_init_devtree(void);
 void radix__early_init_devtree(void);
+#ifdef CONFIG_PPC_MEM_KEYS
+void pkey_early_init_devtree(void);
+#else
+static inline void pkey_early_init_devtree(void) {}
+#endif
+
 extern void hash__early_init_mmu(void);
 extern void radix__early_init_mmu(void);
 static inline void __init early_init_mmu(void)
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index f4ac25d4df05..72966d3d8f64 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -23,6 +23,7 @@
 
 /* Radix page table supported and enabled */
 #define MMU_FTR_TYPE_RADIX ASM_CONST(0x0040)
+#define MMU_FTR_PKEY   ASM_CONST(0x0080)
 
 /*
  * Individual features below.
@@ -177,6 +178,9 @@ enum {
MMU_FTR_RADIX_KUAP |
 #endif /* CONFIG_PPC_KUAP */
 #endif /* CONFIG_PPC_RADIX_MMU */
+#ifdef CONFIG_PPC_MEM_KEYS
+   MMU_FTR_PKEY |
+#endif
0,
 };
 
@@ -356,6 +360,8 @@ extern void setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
   phys_addr_t first_memblock_size);
 static inline void mmu_early_init_devtree(void) { }
 
+static inline void pkey_early_init_devtree(void) {}
+
 extern void *abatron_pteptrs[2];
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 6a3bac357e24..6d70797352d8 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -815,6 +815,11 @@ void __init early_init_devtree(void *params)
/* Now try to figure out if we are running on LPAR and so on */
pseries_probe_fw_features();
 
+   /*
+* Initialize pkey features and default AMR/IAMR values
+*/
+   pkey_early_init_devtree();
+
 #ifdef CONFIG_PPC_PS3
/* Identify PS3 firmware */
if (of_flat_dt_is_compatible(of_get_flat_dt_root(), "sony,ps3"))
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 0ff59acdbb84..bbba9c601e14 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -10,7 +10,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+
 
 DEFINE_STATIC_KEY_FALSE(pkey_disabled);
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
@@ -38,38 +39,45 @@ static int execute_only_key = 2;
 #define PKEY_REG_BITS (sizeof(u64) * 8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
+static int __init dt_scan_storage_keys(unsigned long node,
+  const char *uname, int depth,
+  void *data)
+{
+   const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
+   const __be32 *prop;
+   int pkeys_total;
+
+   /* We are scanning "cpu" nodes only */
+   if (type == NULL || strcmp(type, "cpu") != 0)
+   return 0;
+
+   prop = of_get_flat_dt_prop(node, "ibm,processor-storage-keys", NULL);
+   if (!prop)
+   return 0;
+   pkeys_total = be32_to_cpu(prop[0]);
+   return pkeys_total;
+}
+
 static int scan_pkey_feature(void)
 {
-   u32 vals[2];
-   int pkeys_total = 0;
-   struct device_node *cpu;
+   int pkeys_total;
 
/*
 * Pkey is not supported with Radix translation.
 */
-   if (radix_enabled())
+   if (early_radix_enabled())
return 0;
 
-   cpu = of_find_node_by_type(NULL, "cpu");
-   if (!cpu)
-   return 0;
+   pkeys_total = of_scan_flat_dt(dt_scan_storage_keys, NULL);
+   if (pkeys_total == 0) {
 
-   if (of_property_read_u32_array(cpu,
-  "ibm,processor-storage-keys", vals, 2) 
== 0) {
-   /*
-* Since any pkey can be used for data or execute, we will
-* just treat all keys as equal and track them as one entity.
-*/
-   pkeys_total = vals[0];
-   /*  Should we check for IAMR support FIXME!! */
-   } else {
/*
 * Let's 

[PATCH v5 12/26] powerpc/book3s64/pkeys: Mark all the pkeys above max pkey as reserved

2020-06-19 Thread Aneesh Kumar K.V
The hypervisor can return fewer than the maximum number of pkeys (for
example, 31 instead of 32). We should mark all the pkeys above the maximum
allowed as reserved so that we avoid the allocation of a wrong pkey (for
example, key 31 in the above case) by userspace.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 73b5ef1490c8..0ff59acdbb84 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -175,9 +175,10 @@ static int pkey_initialize(void)
 
/*
 * Prevent the usage of OS reserved keys. Update UAMOR
-* for those keys.
+* for those keys. Also mark the rest of the bits in the
+* 32 bit mask as reserved.
 */
-   for (i = max_pkey; i < pkeys_total; i++) {
+   for (i = max_pkey; i < 32 ; i++) {
reserved_allocation_mask |= (0x1 << i);
default_uamor &= ~(0x3ul << pkeyshift(i));
}
-- 
2.26.2
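
A hedged worked example of the new loop bound, for the 31-keys case from
the commit message (pkeyshift() as defined in pkeys.c, with 2 AMR bits per
key):

	/* i = 31: pkeyshift(31) = 64 - (31 + 1) * 2 = 0, so the loop does */
	reserved_allocation_mask |= 0x1 << 31;	/* key 31 can't be allocated */
	default_uamor &= ~(0x3ul << 0);		/* key 31 not user-modifiable */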



[PATCH v5 11/26] powerpc/book3s64/pkeys: Make initial_allocation_mask static

2020-06-19 Thread Aneesh Kumar K.V
initial_allocation_mask is not used outside this file.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/pkeys.h | 1 -
 arch/powerpc/mm/book3s64/pkeys.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 652bad7334f3..47c81d41ea9a 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -13,7 +13,6 @@
 
 DECLARE_STATIC_KEY_FALSE(pkey_disabled);
 extern int max_pkey;
-extern u32 initial_allocation_mask; /*  bits set for the initially allocated 
keys */
 extern u32 reserved_allocation_mask; /* bits set for reserved keys */
 
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index a4d7287082a8..73b5ef1490c8 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -15,11 +15,11 @@
 DEFINE_STATIC_KEY_FALSE(pkey_disabled);
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
 int  max_pkey; /* Maximum key value supported */
-u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
 /*
  *  Keys marked in the reservation list cannot be allocated by  userspace
  */
 u32  reserved_allocation_mask;
+static u32  initial_allocation_mask;   /* Bits set for the initially allocated 
keys */
 static u64 default_amr;
 static u64 default_iamr;
 /* Allow all keys to be modified by default */
-- 
2.26.2



[PATCH v5 10/26] powerpc/book3s64/pkeys: Convert pkey_total to max_pkey

2020-06-19 Thread Aneesh Kumar K.V
max_pkey now represents the maximum key value that userspace can allocate.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/pkeys.h |  7 +--
 arch/powerpc/mm/book3s64/pkeys.c | 14 +++---
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 75d2a2c19c04..652bad7334f3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -12,7 +12,7 @@
 #include 
 
 DECLARE_STATIC_KEY_FALSE(pkey_disabled);
-extern int pkeys_total; /* total pkeys as per device tree */
+extern int max_pkey;
 extern u32 initial_allocation_mask; /*  bits set for the initially allocated 
keys */
 extern u32 reserved_allocation_mask; /* bits set for reserved keys */
 
@@ -44,7 +44,10 @@ static inline int vma_pkey(struct vm_area_struct *vma)
return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
 }
 
-#define arch_max_pkey() pkeys_total
+static inline int arch_max_pkey(void)
+{
+   return max_pkey;
+}
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 87d882a9aaf2..a4d7287082a8 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -14,7 +14,7 @@
 
 DEFINE_STATIC_KEY_FALSE(pkey_disabled);
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
-int  pkeys_total;  /* Total pkeys as per device tree */
+int  max_pkey; /* Maximum key value supported */
 u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
 /*
  *  Keys marked in the reservation list cannot be allocated by  userspace
@@ -84,7 +84,7 @@ static int scan_pkey_feature(void)
 
 static int pkey_initialize(void)
 {
-   int os_reserved, i;
+   int pkeys_total, i;
 
/*
 * We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
@@ -122,12 +122,12 @@ static int pkey_initialize(void)
 * The OS can manage only 8 pkeys due to its inability to represent them
 * in the Linux 4K PTE. Mark all other keys reserved.
 */
-   os_reserved = pkeys_total - 8;
+   max_pkey = min(8, pkeys_total);
 #else
-   os_reserved = 0;
+   max_pkey = pkeys_total;
 #endif
 
-   if (unlikely((pkeys_total - os_reserved) <= execute_only_key)) {
+   if (unlikely(max_pkey <= execute_only_key)) {
/*
 * Insufficient number of keys to support
 * execute only key. Mark it unavailable.
@@ -174,10 +174,10 @@ static int pkey_initialize(void)
default_uamor &= ~(0x3ul << pkeyshift(1));
 
/*
-* Prevent the usage of OS reserved the keys. Update UAMOR
+* Prevent the usage of OS reserved keys. Update UAMOR
 * for those keys.
 */
-   for (i = (pkeys_total - os_reserved); i < pkeys_total; i++) {
+   for (i = max_pkey; i < pkeys_total; i++) {
reserved_allocation_mask |= (0x1 << i);
default_uamor &= ~(0x3ul << pkeyshift(i));
}
-- 
2.26.2



[PATCH v5 09/26] powerpc/book3s64/pkeys: Simplify pkey disable branch

2020-06-19 Thread Aneesh Kumar K.V
Make the default value FALSE (pkey enabled) and set to TRUE when we
find the total number of keys supported to be zero.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/pkeys.h | 2 +-
 arch/powerpc/mm/book3s64/pkeys.c | 7 +++
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 5dd0a79d1809..75d2a2c19c04 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-DECLARE_STATIC_KEY_TRUE(pkey_disabled);
+DECLARE_STATIC_KEY_FALSE(pkey_disabled);
 extern int pkeys_total; /* total pkeys as per device tree */
extern u32 initial_allocation_mask; /*  bits set for the initially allocated keys */
 extern u32 reserved_allocation_mask; /* bits set for reserved keys */
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 7d400d5a4076..87d882a9aaf2 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -12,7 +12,7 @@
 #include 
 #include 
 
-DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+DEFINE_STATIC_KEY_FALSE(pkey_disabled);
 DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
 int  pkeys_total;  /* Total pkeys as per device tree */
 u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
@@ -104,9 +104,8 @@ static int pkey_initialize(void)
 
/* scan the device tree for pkey feature */
pkeys_total = scan_pkey_feature();
-   if (pkeys_total)
-   static_branch_disable(&pkey_disabled);
-   else {
+   if (!pkeys_total) {
+   /* No support for pkey. Mark it disabled */
		static_branch_enable(&pkey_disabled);
return 0;
}
-- 
2.26.2



[PATCH v5 08/26] powerpc/book3s64/pkeys: Convert execute key support to static key

2020-06-19 Thread Aneesh Kumar K.V
Convert the bool to a static key like pkey_disabled.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 9e68a08799ee..7d400d5a4076 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -13,13 +13,13 @@
 #include 
 
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
 int  pkeys_total;  /* Total pkeys as per device tree */
 u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
 /*
  *  Keys marked in the reservation list cannot be allocated by  userspace
  */
 u32  reserved_allocation_mask;
-static bool pkey_execute_disable_supported;
 static u64 default_amr;
 static u64 default_iamr;
 /* Allow all keys to be modified by default */
@@ -116,9 +116,7 @@ static int pkey_initialize(void)
 * execute_disable support. Instead we use a PVR check.
 */
if (pvr_version_is(PVR_POWER7) || pvr_version_is(PVR_POWER7p))
-   pkey_execute_disable_supported = false;
-   else
-   pkey_execute_disable_supported = true;
+   static_branch_enable(&execute_pkey_disabled);
 
 #ifdef CONFIG_PPC_4K_PAGES
/*
@@ -214,7 +212,7 @@ static inline void write_amr(u64 value)
 
 static inline u64 read_iamr(void)
 {
-   if (!likely(pkey_execute_disable_supported))
+   if (static_branch_unlikely(&execute_pkey_disabled))
return 0x0UL;
 
return mfspr(SPRN_IAMR);
@@ -222,7 +220,7 @@ static inline u64 read_iamr(void)
 
 static inline void write_iamr(u64 value)
 {
-   if (!likely(pkey_execute_disable_supported))
+   if (static_branch_unlikely(&execute_pkey_disabled))
return;
 
mtspr(SPRN_IAMR, value);
@@ -282,7 +280,7 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
return -EINVAL;
 
if (init_val & PKEY_DISABLE_EXECUTE) {
-   if (!pkey_execute_disable_supported)
+   if (static_branch_unlikely(&execute_pkey_disabled))
return -EINVAL;
new_iamr_bits |= IAMR_EX_BIT;
}
-- 
2.26.2



[PATCH v5 05/26] powerpc/book3s64/pkeys: Simplify the key initialization

2020-06-19 Thread Aneesh Kumar K.V
Add documentation explaining the execute_only_key. The reservation and
initialization mask details are also explained in this patch.

No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 186 ++-
 1 file changed, 107 insertions(+), 79 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index d60e6bfa3e03..3db0b3cfc322 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -15,48 +15,71 @@
 DEFINE_STATIC_KEY_TRUE(pkey_disabled);
 int  pkeys_total;  /* Total pkeys as per device tree */
 u32  initial_allocation_mask;   /* Bits set for the initially allocated keys */
-u32  reserved_allocation_mask;  /* Bits set for reserved keys */
+/*
+ *  Keys marked in the reservation list cannot be allocated by  userspace
+ */
+u32  reserved_allocation_mask;
 static bool pkey_execute_disable_supported;
-static bool pkeys_devtree_defined; /* property exported by device tree */
-static u64 pkey_amr_mask;  /* Bits in AMR not to be touched */
-static u64 pkey_iamr_mask; /* Bits in AMR not to be touched */
-static u64 pkey_uamor_mask;/* Bits in UMOR not to be touched */
+static u64 default_amr;
+static u64 default_iamr;
+/* Allow all keys to be modified by default */
+static u64 default_uamor = ~0x0UL;
+/*
+ * Key used to implement PROT_EXEC mmap. Denies READ/WRITE
+ * We pick key 2 because 0 is special key and 1 is reserved as per ISA.
+ */
 static int execute_only_key = 2;
 
+
 #define AMR_BITS_PER_PKEY 2
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
 #define IAMR_EX_BIT 0x1UL
-#define PKEY_REG_BITS (sizeof(u64)*8)
+#define PKEY_REG_BITS (sizeof(u64) * 8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
-static void scan_pkey_feature(void)
+static int scan_pkey_feature(void)
 {
u32 vals[2];
+   int pkeys_total = 0;
struct device_node *cpu;
 
+   /*
+* Pkey is not supported with Radix translation.
+*/
+   if (radix_enabled())
+   return 0;
+
cpu = of_find_node_by_type(NULL, "cpu");
if (!cpu)
-   return;
+   return 0;
 
if (of_property_read_u32_array(cpu,
-   "ibm,processor-storage-keys", vals, 2))
-   return;
+  "ibm,processor-storage-keys", vals, 2) == 0) {
+   /*
+* Since any pkey can be used for data or execute, we will
+* just treat all keys as equal and track them as one entity.
+*/
+   pkeys_total = vals[0];
+   /*  Should we check for IAMR support FIXME!! */
+   } else {
+   /*
+* Let's assume 32 pkeys on P8 bare metal, if its not defined by device
+* tree. We make this exception since skiboot forgot to expose this
+* property on power8.
+*/
+   if (!firmware_has_feature(FW_FEATURE_LPAR) &&
+   cpu_has_feature(CPU_FTRS_POWER8))
+   pkeys_total = 32;
+   }
 
/*
-* Since any pkey can be used for data or execute, we will just treat
-* all keys as equal and track them as one entity.
+* Adjust the upper limit, based on the number of bits supported by
+* arch-neutral code.
 */
-   pkeys_total = vals[0];
-   pkeys_devtree_defined = true;
-}
-
-static inline bool pkey_mmu_enabled(void)
-{
-   if (firmware_has_feature(FW_FEATURE_LPAR))
-   return pkeys_total;
-   else
-   return cpu_has_feature(CPU_FTR_PKEY);
+   pkeys_total = min_t(int, pkeys_total,
+   ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) + 1));
+   return pkeys_total;
 }
 
 static int pkey_initialize(void)
@@ -80,31 +103,13 @@ static int pkey_initialize(void)
!= (sizeof(u64) * BITS_PER_BYTE));
 
/* scan the device tree for pkey feature */
-   scan_pkey_feature();
-
-   /*
-* Let's assume 32 pkeys on P8 bare metal, if its not defined by device
-* tree. We make this exception since skiboot forgot to expose this
-* property on power8.
-*/
-   if (!pkeys_devtree_defined && !firmware_has_feature(FW_FEATURE_LPAR) &&
-   cpu_has_feature(CPU_FTRS_POWER8))
-   pkeys_total = 32;
-
-   /*
-* Adjust the upper limit, based on the number of bits supported by
-* arch-neutral code.
-*/
-   pkeys_total = min_t(int, pkeys_total,
-   ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)+1));
-
-   if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
-   static_branch_enable(&pkey_disabled);
-   else
+   pkeys_total = scan_pkey_feature();
+   if (pkeys_total)

[PATCH v5 07/26] powerpc/book3s64/pkeys: kill cpu feature key CPU_FTR_PKEY

2020-06-19 Thread Aneesh Kumar K.V
We don't use CPU_FTR_PKEY anymore. Remove the feature bit and mark it
free.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/cputable.h | 13 ++---
 arch/powerpc/kernel/dt_cpu_ftrs.c   |  6 --
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index bac2252c839e..dd0a2e77a695 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -198,7 +198,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_STCX_CHECKS_ADDRESSLONG_ASM_CONST(0x8000)
 #define CPU_FTR_POPCNTB
LONG_ASM_CONST(0x0001)
 #define CPU_FTR_POPCNTD
LONG_ASM_CONST(0x0002)
-#define CPU_FTR_PKEY   LONG_ASM_CONST(0x0004)
+/* LONG_ASM_CONST(0x0004) Free */
 #define CPU_FTR_VMX_COPY   LONG_ASM_CONST(0x0008)
 #define CPU_FTR_TM LONG_ASM_CONST(0x0010)
 #define CPU_FTR_CFAR   LONG_ASM_CONST(0x0020)
@@ -438,7 +438,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | \
-   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
+   CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX )
 #define CPU_FTRS_POWER8 (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -448,7 +448,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY)
+   CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP )
 #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
 #define CPU_FTRS_POWER9 (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
@@ -459,8 +459,8 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
-   CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
-   CPU_FTR_P9_TLBIE_STQ_BUG | CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
+   CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_P9_TLBIE_STQ_BUG | \
+   CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
 #define CPU_FTRS_POWER9_DD2_0 (CPU_FTRS_POWER9 | CPU_FTR_P9_RADIX_PREFETCH_BUG)
 #define CPU_FTRS_POWER9_DD2_1 (CPU_FTRS_POWER9 | \
   CPU_FTR_P9_RADIX_PREFETCH_BUG | \
@@ -477,8 +477,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
-   CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
-   CPU_FTR_ARCH_31)
+   CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31)
 #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 3a409517c031..0acec481d4d1 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -776,12 +776,6 @@ static __init void cpufeatures_cpu_quirks(void)
}
 
update_tlbie_feature_flag(version);
-   /*
-* PKEY was not in the initial base or feature node
-* specification, but it should become optional in the next
-* cpu feature version sequence.
-*/
-   cur_cpu_spec->cpu_features |= CPU_FTR_PKEY;
 }
 
 static void __init cpufeatures_setup_finished(void)
-- 
2.26.2



[PATCH v5 06/26] powerpc/book3s64/pkeys: Prevent key 1 modification from userspace.

2020-06-19 Thread Aneesh Kumar K.V
Key 1 is marked reserved by the ISA. Set up UAMOR to prevent userspace
modification of it.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 3db0b3cfc322..9e68a08799ee 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -174,6 +174,7 @@ static int pkey_initialize(void)
 * programming note.
 */
reserved_allocation_mask |= (0x1 << 1);
+   default_uamor &= ~(0x3ul << pkeyshift(1));
 
/*
 * Prevent the usage of OS reserved the keys. Update UAMOR
-- 
2.26.2



[PATCH v5 04/26] powerpc/book3s64/pkeys: Explain key 1 reservation details

2020-06-19 Thread Aneesh Kumar K.V
This explains the details w.r.t. key 1.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/pkeys.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 1199fc2bfaec..d60e6bfa3e03 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -124,7 +124,10 @@ static int pkey_initialize(void)
 #else
os_reserved = 0;
 #endif
-   /* Bits are in LE format. */
+   /*
+* key 1 is recommended not to be used. PowerISA(3.0) page 1015,
+* programming note.
+*/
reserved_allocation_mask = (0x1 << 1) | (0x1 << execute_only_key);
 
/* register mask is in BE format */
-- 
2.26.2



[PATCH v5 03/26] powerpc/book3s64/pkeys: Move pkey related bits in the linux page table

2020-06-19 Thread Aneesh Kumar K.V
To keep things simple, all the pkey related bits are kept together in the
Linux page table for the 64K config with hash translation. With hash-4k,
the kernel requires 4 bits to store slot details. This is done by
overloading some of the RPN bits for storing the slot details. Due to this,
PKEY_BIT0 on the 4K config is used for storing hash slot details.

64K before

||RSV1| RSV2| RSV3 | RSV4 | RPN44| RPN43   | | RSV5|
|| P4 |  P3 |  P2  |  P1  | Busy | HASHPTE | |  P0 |

after

||RSV1| RSV2| RSV3 | RSV4 | RPN44 | RPN43   | | RSV5 |
|| P4 |  P3 |  P2  |  P1  | P0| HASHPTE | | Busy |

4k before

|| RSV1 | RSV2 | RSV3 | RSV4 | RPN44| RPN43 | RSV5|
|| Busy |  HASHPTE |  P2  |  P1  | F_SEC| F_GIX |  P0 |

after

|| RSV1| RSV2| RSV3 | RSV4 | Free | RPN43 | RSV5 |
|| HASHPTE |  P2 |  P1  |  P0  | F_SEC| F_GIX | BUSY |

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 16 
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 12 ++--
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 17 -
 3 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f889d56bf8cf..082b98808701 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -34,11 +34,11 @@
 #define H_PUD_TABLE_SIZE   (sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE   (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
-#define H_PAGE_F_GIX_SHIFT	53
-#define H_PAGE_F_SECOND	_RPAGE_RPN44	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX	(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
-#define H_PAGE_BUSY	_RPAGE_RSV1	/* software: PTE & hash are busy */
-#define H_PAGE_HASHPTE	_RPAGE_RSV2	/* software: PTE & hash are busy */
+#define H_PAGE_F_GIX_SHIFT	_PAGE_PA_MAX
+#define H_PAGE_F_SECOND	_RPAGE_PKEY_BIT0 /* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
+#define H_PAGE_BUSY	_RPAGE_RSV1
+#define H_PAGE_HASHPTE	_RPAGE_PKEY_BIT4
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
@@ -59,9 +59,9 @@
 /* memory key bits, only 8 keys supported */
 #define H_PTE_PKEY_BIT4	0
 #define H_PTE_PKEY_BIT3	0
-#define H_PTE_PKEY_BIT2	_RPAGE_RSV3
-#define H_PTE_PKEY_BIT1	_RPAGE_RSV4
-#define H_PTE_PKEY_BIT0	_RPAGE_RSV5
+#define H_PTE_PKEY_BIT2	_RPAGE_PKEY_BIT3
+#define H_PTE_PKEY_BIT1	_RPAGE_PKEY_BIT2
+#define H_PTE_PKEY_BIT0	_RPAGE_PKEY_BIT1
 
 
 /*
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 0a15fd14cf72..f20de1149ebe 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -32,15 +32,15 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN44 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RSV1 /* software: PTE & hash are busy */
 #define H_PAGE_HASHPTE	_RPAGE_RPN43	/* PTE has associated HPTE */

 /* memory key bits. */
-#define H_PTE_PKEY_BIT4	_RPAGE_RSV1
-#define H_PTE_PKEY_BIT3	_RPAGE_RSV2
-#define H_PTE_PKEY_BIT2	_RPAGE_RSV3
-#define H_PTE_PKEY_BIT1	_RPAGE_RSV4
-#define H_PTE_PKEY_BIT0	_RPAGE_RSV5
+#define H_PTE_PKEY_BIT4	_RPAGE_PKEY_BIT4
+#define H_PTE_PKEY_BIT3	_RPAGE_PKEY_BIT3
+#define H_PTE_PKEY_BIT2	_RPAGE_PKEY_BIT2
+#define H_PTE_PKEY_BIT1	_RPAGE_PKEY_BIT1
+#define H_PTE_PKEY_BIT0	_RPAGE_PKEY_BIT0
 
 /*
  * We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f17442c3a092..b7c0ba977d6a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -32,11 +32,13 @@
 #define _RPAGE_SW1	0x00800
 #define _RPAGE_SW2	0x00400
 #define _RPAGE_SW3	0x00200
-#define _RPAGE_RSV1	0x1000000000000000UL
-#define _RPAGE_RSV2	0x0800000000000000UL
-#define _RPAGE_RSV3	0x0400000000000000UL
-#define _RPAGE_RSV4	0x0200000000000000UL
-#define _RPAGE_RSV5	0x00040UL
+#define _RPAGE_RSV1	0x00040UL
+
+#define _RPAGE_PKEY_BIT4	0x1000000000000000UL
+#define _RPAGE_PKEY_BIT3	0x0800000000000000UL
+#define _RPAGE_PKEY_BIT2	0x0400000000000000UL
+#define _RPAGE_PKEY_BIT1	0x0200000000000000UL
+#define _RPAGE_PKEY_BIT0	0x0100000000000000UL
 
 #define _PAGE_PTE  

[PATCH v5 02/26] powerpc/book3s64/pkeys: pkeys are supported only on hash on book3s.

2020-06-19 Thread Aneesh Kumar K.V
Move them to a hash-specific file and add BUG() for the radix path.
---
 .../powerpc/include/asm/book3s/64/hash-pkey.h | 32 
 arch/powerpc/include/asm/book3s/64/pkeys.h| 25 +
 arch/powerpc/include/asm/pkeys.h  | 37 ---
 3 files changed, 64 insertions(+), 30 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash-pkey.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pkeys.h

diff --git a/arch/powerpc/include/asm/book3s/64/hash-pkey.h b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
new file mode 100644
index ..795010897e5d
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
+
+static inline u64 hash__vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+   return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
+}
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+   return (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
+}
+
+static inline u16 hash__pte_to_pkey_bits(u64 pteflags)
+{
+   return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT3) ? 0x8 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT1) ? 0x2 : 0x0UL) |
+   ((pteflags & H_PTE_PKEY_BIT0) ? 0x1 : 0x0UL));
+}
+
+#endif
diff --git a/arch/powerpc/include/asm/book3s/64/pkeys.h b/arch/powerpc/include/asm/book3s/64/pkeys.h
new file mode 100644
index ..8174662a9173
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pkeys.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef _ASM_POWERPC_BOOK3S_64_PKEYS_H
+#define _ASM_POWERPC_BOOK3S_64_PKEYS_H
+
+#include 
+
+static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+   if (static_branch_likely(&pkey_disabled))
+   return 0x0UL;
+
+   if (radix_enabled())
+   BUG();
+   return hash__vmflag_to_pte_pkey_bits(vm_flags);
+}
+
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+   if (radix_enabled())
+   BUG();
+   return hash__pte_to_pkey_bits(pteflags);
+}
+
+#endif /*_ASM_POWERPC_KEYS_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index f8f4d0793789..5dd0a79d1809 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -25,23 +25,18 @@ extern u32 reserved_allocation_mask; /* bits set for reserved keys */
PKEY_DISABLE_WRITE  | \
PKEY_DISABLE_EXECUTE)
 
+#ifdef CONFIG_PPC_BOOK3S_64
+#include 
+#else
+#error "Not supported"
+#endif
+
+
 static inline u64 pkey_to_vmflag_bits(u16 pkey)
 {
return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
 }
 
-static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
-{
-   if (static_branch_likely(&pkey_disabled))
-   return 0x0UL;
-
-   return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
-}
-
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
	if (static_branch_likely(&pkey_disabled))
@@ -51,24 +46,6 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 #define arch_max_pkey() pkeys_total
 
-static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
-{
-   return (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
-}
-
-static inline u16 pte_to_pkey_bits(u64 pteflags)
-{
-   return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT3) ? 0x8 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
-   ((pteflags & H_PTE_PKEY_BIT1) ? 0x2 : 0x0UL) |
-   
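
To see why hash__pte_to_pkey_bits() and the vmflag/pte helpers above are
exact inverses: the 5-bit key value is simply scattered across non-adjacent
PTE bits and gathered back. A stand-alone round-trip check with made-up bit
positions (the real H_PTE_PKEY_BIT* values differ):

#include <assert.h>
#include <stdint.h>

/* Toy positions only -- not the real PTE layout. */
#define PTE_PKEY_BIT0	(1ULL << 57)
#define PTE_PKEY_BIT1	(1ULL << 58)
#define PTE_PKEY_BIT2	(1ULL << 59)
#define PTE_PKEY_BIT3	(1ULL << 60)
#define PTE_PKEY_BIT4	(1ULL << 61)

static uint64_t pkey_to_pte_bits(uint16_t pkey)
{
	return ((pkey & 0x1) ? PTE_PKEY_BIT0 : 0) |
	       ((pkey & 0x2) ? PTE_PKEY_BIT1 : 0) |
	       ((pkey & 0x4) ? PTE_PKEY_BIT2 : 0) |
	       ((pkey & 0x8) ? PTE_PKEY_BIT3 : 0) |
	       ((pkey & 0x10) ? PTE_PKEY_BIT4 : 0);
}

static uint16_t pte_bits_to_pkey(uint64_t pte)
{
	return ((pte & PTE_PKEY_BIT4) ? 0x10 : 0) |
	       ((pte & PTE_PKEY_BIT3) ? 0x8 : 0) |
	       ((pte & PTE_PKEY_BIT2) ? 0x4 : 0) |
	       ((pte & PTE_PKEY_BIT1) ? 0x2 : 0) |
	       ((pte & PTE_PKEY_BIT0) ? 0x1 : 0);
}

int main(void)
{
	uint16_t key;

	/* every representable key survives the scatter/gather round trip */
	for (key = 0; key < 32; key++)
		assert(pte_bits_to_pkey(pkey_to_pte_bits(key)) == key);
	return 0;
}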

[PATCH v5 01/26] powerpc/book3s64/pkeys: Fixup bit numbering

2020-06-19 Thread Aneesh Kumar K.V
This numbers the pkey bits such that they are easy to follow. PKEY_BIT0 is
the lowest-order bit. This makes further changes easy to follow.

No functional change in this patch other than the Linux page table for
hash translation now mapping pkeys differently.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  9 +++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 +++
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  8 +++
 arch/powerpc/include/asm/pkeys.h  | 24 +--
 4 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 3f9ae3585ab9..f889d56bf8cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -57,11 +57,12 @@
 #define H_PMD_FRAG_NR  (PAGE_SIZE >> H_PMD_FRAG_SIZE_SHIFT)
 
 /* memory key bits, only 8 keys supported */
-#define H_PTE_PKEY_BIT0	0
-#define H_PTE_PKEY_BIT1	0
 #define H_PTE_PKEY_BIT2	_RPAGE_RSV3
-#define H_PTE_PKEY_BIT3	_RPAGE_RSV4
-#define H_PTE_PKEY_BIT4	_RPAGE_RSV5
+#define H_PTE_PKEY_BIT4	0
+#define H_PTE_PKEY_BIT3	0
+#define H_PTE_PKEY_BIT1	_RPAGE_RSV4
+#define H_PTE_PKEY_BIT0	_RPAGE_RSV5
+
 
 /*
  * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range()
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 0729c034e56f..0a15fd14cf72 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -36,11 +36,11 @@
 #define H_PAGE_HASHPTE	_RPAGE_RPN43	/* PTE has associated HPTE */

 /* memory key bits. */
-#define H_PTE_PKEY_BIT0	_RPAGE_RSV1
-#define H_PTE_PKEY_BIT1	_RPAGE_RSV2
+#define H_PTE_PKEY_BIT4	_RPAGE_RSV1
+#define H_PTE_PKEY_BIT3	_RPAGE_RSV2
 #define H_PTE_PKEY_BIT2	_RPAGE_RSV3
-#define H_PTE_PKEY_BIT3	_RPAGE_RSV4
-#define H_PTE_PKEY_BIT4	_RPAGE_RSV5
+#define H_PTE_PKEY_BIT1	_RPAGE_RSV4
+#define H_PTE_PKEY_BIT0	_RPAGE_RSV5
 
 /*
  * We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 3fa1b962dc27..58fcc959f9d5 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -86,8 +86,8 @@
 #define HPTE_R_PP0		ASM_CONST(0x8000000000000000)
 #define HPTE_R_TS		ASM_CONST(0x4000000000000000)
 #define HPTE_R_KEY_HI		ASM_CONST(0x3000000000000000)
-#define HPTE_R_KEY_BIT0		ASM_CONST(0x2000000000000000)
-#define HPTE_R_KEY_BIT1		ASM_CONST(0x1000000000000000)
+#define HPTE_R_KEY_BIT4		ASM_CONST(0x2000000000000000)
+#define HPTE_R_KEY_BIT3		ASM_CONST(0x1000000000000000)
 #define HPTE_R_RPN_SHIFT	12
 #define HPTE_R_RPN		ASM_CONST(0x0ffffffffffff000)
 #define HPTE_R_RPN_3_0		ASM_CONST(0x01fffffffffff000)
@@ -103,8 +103,8 @@
 #define HPTE_R_R		ASM_CONST(0x0000000000000100)
 #define HPTE_R_KEY_LO		ASM_CONST(0x0000000000000e00)
 #define HPTE_R_KEY_BIT2		ASM_CONST(0x0000000000000800)
-#define HPTE_R_KEY_BIT3		ASM_CONST(0x0000000000000400)
-#define HPTE_R_KEY_BIT4		ASM_CONST(0x0000000000000200)
+#define HPTE_R_KEY_BIT1		ASM_CONST(0x0000000000000400)
+#define HPTE_R_KEY_BIT0		ASM_CONST(0x0000000000000200)
 #define HPTE_R_KEY		(HPTE_R_KEY_LO | HPTE_R_KEY_HI)

 #define HPTE_V_1TB_SEG		ASM_CONST(0x4000000000000000)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 20ebf153c871..f8f4d0793789 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -35,11 +35,11 @@ static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
	if (static_branch_likely(&pkey_disabled))
return 0x0UL;
 
-   return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT4 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+   return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT1 : 0x0UL) |
-   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT0 : 0x0UL));
+   ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+   ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
 }
 
 static inline int vma_pkey(struct vm_area_struct *vma)
@@ -53,20 +53,20 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
 {
-   return (((pteflags & H_PTE_PKEY_BIT0) ? 

[PATCH v5 00/26] powerpc/book3s/64/pkeys: Simplify the code

2020-06-19 Thread Aneesh Kumar K.V
This patch series updates the pkey subsystem with more documentation and
renames variables so that the code is easy to follow. We drop the changes
to support KUAP/KUEP with hash translation in this update. Those changes
add 200 cycles to the null syscall benchmark and I want to look at that
closely before requesting a merge. The rest of the patches are included
in this series. This should avoid having to carry a large patchset across
the upstream merge. Some of the changes in here make the hash KUEP/KUAP
addition simpler.

Changes from v4:
* Drop hash KUAP/KUEP changes.

Changes from v3:
* Fix build error reported by kernel test robot 

Changes from v2:
* Rebase to the latest kernel.
* Fixed a bug with disabling KUEP/KUAP on kernel command line
* Added a patch to make kup key dynamic.

Changes from v1:
* Rebased on latest kernel

Aneesh Kumar K.V (26):
  powerpc/book3s64/pkeys: Fixup bit numbering
  powerpc/book3s64/pkeys: pkeys are supported only on hash on book3s.
  powerpc/book3s64/pkeys: Move pkey related bits in the linux page table
  powerpc/book3s64/pkeys: Explain key 1 reservation details
  powerpc/book3s64/pkeys: Simplify the key initialization
  powerpc/book3s64/pkeys: Prevent key 1 modification from userspace.
  powerpc/book3s64/pkeys: kill cpu feature key CPU_FTR_PKEY
  powerpc/book3s64/pkeys: Convert execute key support to static key
  powerpc/book3s64/pkeys: Simplify pkey disable branch
  powerpc/book3s64/pkeys: Convert pkey_total to max_pkey
  powerpc/book3s64/pkeys: Make initial_allocation_mask static
  powerpc/book3s64/pkeys: Mark all the pkeys above max pkey as reserved
  powerpc/book3s64/pkeys: Enable MMU_FTR_PKEY
  powerpc/book3s64/kuep: Add MMU_FTR_KUEP
  powerpc/book3s64/pkeys: Use execute_pkey_disable static key
  powerpc/book3s64/pkeys: Use MMU_FTR_PKEY instead of pkey_disabled
static key
  powerpc/book3s64/keys: Print information during boot.
  powerpc/book3s64/keys/kuap: Reset AMR/IAMR values on kexec
  powerpc/book3s64/kuap: Move KUAP related function outside radix
  powerpc/book3s64/kuep: Move KUEP related function outside radix
  powerpc/book3s64/kuap: Rename MMU_FTR_RADIX_KUAP to MMU_FTR_KUAP
  powerpc/book3s64/kuap/kuep: Make KUAP and KUEP a subfeature of
PPC_MEM_KEYS
  powerpc/book3s64/kuap: Move UAMOR setup to key init function
  powerpc/selftest/ptrace-pkey: Rename variables to make it easier to
follow code
  powerpc/selftest/ptrace-pkey: Update the test to mark an invalid pkey
correctly
  powerpc/selftest/ptrace-pkey: IAMR and uamor cannot be updated by
ptrace

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  21 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  12 +-
 .../powerpc/include/asm/book3s/64/hash-pkey.h |  32 ++
 .../asm/book3s/64/{kup-radix.h => kup.h}  |  70 ++--
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   8 +-
 arch/powerpc/include/asm/book3s/64/mmu.h  |   6 +
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  17 +-
 arch/powerpc/include/asm/book3s/64/pkeys.h|  25 ++
 arch/powerpc/include/asm/cputable.h   |  13 +-
 arch/powerpc/include/asm/kup.h|  16 +-
 arch/powerpc/include/asm/mmu.h|  17 +-
 arch/powerpc/include/asm/pkeys.h  |  65 +---
 arch/powerpc/include/asm/processor.h  |   1 -
 arch/powerpc/include/asm/ptrace.h |   2 +-
 arch/powerpc/kernel/asm-offsets.c |   2 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c |   6 -
 arch/powerpc/kernel/misc_64.S |  14 -
 arch/powerpc/kernel/prom.c|   5 +
 arch/powerpc/kernel/ptrace/ptrace-view.c  |  17 +-
 arch/powerpc/kernel/smp.c |   5 +
 arch/powerpc/kernel/syscall_64.c  |   2 +-
 arch/powerpc/kexec/core_64.c  |   3 +
 arch/powerpc/mm/book3s64/pgtable.c|   3 +
 arch/powerpc/mm/book3s64/pkeys.c  | 315 +++---
 arch/powerpc/mm/book3s64/radix_pgtable.c  |  36 --
 arch/powerpc/platforms/Kconfig.cputype|   4 +-
 .../selftests/powerpc/ptrace/ptrace-pkey.c|  53 ++-
 27 files changed, 448 insertions(+), 322 deletions(-)
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash-pkey.h
 rename arch/powerpc/include/asm/book3s/64/{kup-radix.h => kup.h} (78%)
 create mode 100644 arch/powerpc/include/asm/book3s/64/pkeys.h

-- 
2.26.2
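
For context, the userspace contract this series leaves untouched is the
usual pkey syscall flow. A minimal runnable demo (glibc 2.27+ and
pkey-capable hardware assumed; pkey_alloc() fails otherwise):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 16 * 65536;
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	/* ask the kernel for a key that denies writes */
	int pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);

	if (p == MAP_FAILED || pkey < 0) {
		perror("setup");
		return 1;
	}
	/* attach the key to the mapping; reads stay fine, writes fault */
	if (pkey_mprotect(p, len, PROT_READ | PROT_WRITE, pkey)) {
		perror("pkey_mprotect");
		return 1;
	}
	printf("pkey %d guards %p\n", pkey, (void *)p);
	pkey_free(pkey);
	return 0;
}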



Re: [PATCH v2] ASoC: fsl_spdif: Add pm runtime function

2020-06-19 Thread Mark Brown
On Fri, 19 Jun 2020 15:54:33 +0800, Shengjiu Wang wrote:
> Add pm runtime support and move clock handling there.
> Close the clocks at suspend to reduce the power consumption.
> 
> fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
> fsl_spdif_resume is replaced by pm_runtime_force_resume.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_spdif: Add pm runtime function
  commit: 9cb2b3796e083169b368a7add19faec1750ad998

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH v5 00/10] Support new pmem flush and sync instructions for POWER

2020-06-19 Thread Aneesh Kumar K.V


"Aneesh Kumar K.V"  writes:

> This patch series enables the usage of new pmem flush and sync instructions
> on POWER
> architecture. POWER10 introduces two new variants of dcbf instructions 
> (dcbstps and dcbfps)
> that can be used to write modified locations back to persistent storage. 
> Additionally,
> POWER10 also introduce phwsync and plwsync which can be used to establish 
> order of these
> writes to persistent storage.
> 
> This series exposes these instructions to the rest of the kernel. The existing
> dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
> synchronization with OpenCAPI-hosted persistent storage. Hence the new 
> instructions
> are added as a variant of the old ones that old hardware won't differentiate.
>
> On POWER10, pmem devices will be represented by a different device tree compat
> strings. This ensures that older kernels won't initialize pmem devices on 
> POWER10.
>
> W.r.t userspace we want to make sure applications are enabled to use MAP_SYNC 
> only
> if they are using the new instructions. To avoid the wrong usage of MAP_SYNC 
> on
> newer hardware, we disable MAP_SYNC by default on newer hardware. The 
> namespace specific
> attribute /sys/block/pmem0/dax/sync_fault can be used to enable MAP_SYNC 
> later.
>
> With this:
> 1) vPMEM continues to work since it is a volatile region. That 
> doesn't need any flush instructions.
>
> 2) pmdk and other user applications get updated to use new instructions
> and updated packages are made available to all distributions
>
> 3) On newer hardware, the device will appear with a new compat string. 
> Hence older distributions won't initialize pmem on newer hardware.
>
> 4) If we have a newer kernel with an older distro, we use the per 
> namespace sysfs knob that prevents the usage of MAP_SYNC.
>
> 5) Sometime in the future, we mark the CONFIG_ARCH_MAP_SYNC_DISABLE=n
> on ppc64 when we are confident that everybody is using the new flush 
> instruction.
>
> Changes from V4:
> * Add namespace specific sychronous fault control.
>
> Changes from V3:
> * Add new compat string to be used for the device.
> * Use arch_pmem_flush_barrier() in dm-writecache.
>
> Aneesh Kumar K.V (10):
>   powerpc/pmem: Restrict papr_scm to P8 and above.
>   powerpc/pmem: Add new instructions for persistent storage and sync
>   powerpc/pmem: Add flush routines using new pmem store and sync
> instruction
>   libnvdimm/nvdimm/flush: Allow architecture to override the flush
> barrier
>   powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
> instruction.
>   powerpc/pmem: Avoid the barrier in flush routines
>   powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
> flush functions.
>   libnvdimm/dax: Add a dax flag to control synchronous fault support
>   powerpc/pmem: Disable synchronous fault by default
>   powerpc/pmem: Initialize pmem device on newer hardware
>
>  arch/powerpc/include/asm/cacheflush.h | 10 
>  arch/powerpc/include/asm/ppc-opcode.h | 12 
>  arch/powerpc/lib/pmem.c   | 46 --
>  arch/powerpc/platforms/Kconfig.cputype|  9 +++
>  arch/powerpc/platforms/pseries/papr_scm.c | 31 +-
>  arch/powerpc/platforms/pseries/pmem.c |  6 ++
>  drivers/dax/bus.c |  2 +-
>  drivers/dax/super.c   | 73 +++
>  drivers/md/dm-writecache.c|  2 +-
>  drivers/nvdimm/of_pmem.c  |  8 +++
>  drivers/nvdimm/pmem.c |  4 ++
>  drivers/nvdimm/region_devs.c  | 24 ++--
>  include/linux/dax.h   | 16 +
>  include/linux/libnvdimm.h |  8 +++
>  mm/Kconfig|  3 +
>  15 files changed, 243 insertions(+), 11 deletions(-)

Ping.

Are we good with the approach here? 

-aneesh
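
For readers who only see the cover letter here: per the description above,
the patch-3 flush routines amount to a per-cacheline loop over the new
store variant followed by the new persistent sync. A sketch under that
assumption (PPC_DCBSTPS/PPC_PHWSYNC are the opcode macros patch 2 adds;
the final code may differ):

/* Sketch only -- not the applied patch. */
static inline void clean_pmem_range(unsigned long start, unsigned long stop)
{
	unsigned long shift = l1_dcache_shift();
	unsigned long bytes = l1_dcache_bytes();
	void *addr = (void *)(start & ~(bytes - 1));
	unsigned long size = stop - (unsigned long)addr + (bytes - 1);
	unsigned long i;

	/* dcbstps: write each modified line back to persistent storage */
	for (i = 0; i < size >> shift; i++, addr += bytes)
		asm volatile(PPC_DCBSTPS(%0, %1) : : "i"(0), "r"(addr) : "memory");

	/* phwsync: order the line write-backs to persistent storage */
	asm volatile(PPC_PHWSYNC ::: "memory");
}

On P8/P9 the same encodings execute as plain dcbf/hwsync, which is why the
cover letter can say old hardware won't differentiate them.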


Re: [PATCH 2/2] powerpc/syscalls: Split SPU-ness out of ABI

2020-06-19 Thread Arnd Bergmann
On Tue, Jun 16, 2020 at 3:56 PM Michael Ellerman  wrote:
>
> Using the ABI field to encode whether a syscall is usable by SPU
> programs or not is a bit of kludge.
>
> The ABI of the syscall doesn't change depending on the SPU-ness, but
> in order to make the syscall generation work we have to pretend that
> it does.

The idea of the ABI field is not to identify which ABI a syscall follows
but which ABIs do or do not implement it. This is the same with e.g.
the x32 ABI on x86.

> It also means we have more duplicated syscall lines than we need to,
> and the SPU logic is not well contained, instead all of the syscall
> generation targets need to know if they are spu or nospu.
>
> So instead add a separate file which contains the information on which
> syscalls are available for SPU programs. It's just a list of syscall
> numbers with a single "spu" field. If the field has the value "spu"
> then the syscall is available to SPU programs, any other value or no
> entry entirely means the syscall is not available to SPU programs.
>
> Signed-off-by: Michael Ellerman 

I have a patch series originally from Firoz that was never quite finished
to unify the scripts across all architectures. I think making the format of
the table format more powerpc specific like you do here takes it a step
backwards and makes it harder to do that eventually.

>  4 files changed, 523 insertions(+), 128 deletions(-)
>  create mode 100644 arch/powerpc/kernel/syscalls/spu.tbl
>
>
> I'm inclined to put this in next and ask Linus to pull it before rc2, that 
> seems
> like the least disruptive way to get this in, unless anyone objects?

I still hope we can get a better solution.

> diff --git a/arch/powerpc/kernel/syscalls/spu.tbl b/arch/powerpc/kernel/syscalls/spu.tbl
> new file mode 100644
> index ..5eac04919303
> --- /dev/null
> +++ b/arch/powerpc/kernel/syscalls/spu.tbl
> @@ -0,0 +1,430 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# The format is:
> +# <number> <name> <spu>
> +#
> +# To indicate a syscall can be used by SPU programs use "spu" for the spu column.
> +#
> +# Syscalls that are not to be used by SPU programs can be left out of the 
> file
> +# entirely, or an entry with a value other than "spu" can be added.
> +0  restart_syscall -
> +1  exit-
> +2  fork-
> +3  readspu
> +4  write   spu
> +5  openspu

Having a new table format here also makes it harder for others to add
a new system call, both because it doesn't follow the syscall*.tbl naming
and because one has to first understand what the format is.

If you absolutely want to split it out, could you at least make the format
compatible with the existing scripts and avoid the change to
the syscalltbl.sh file?

   Arnd


Re: linux-next: manual merge of the pidfd tree with the powerpc-fixes tree

2020-06-19 Thread Michael Ellerman
Stephen Rothwell  writes:
> Hi all,
>
> Today's linux-next merge of the pidfd tree got a conflict in:
>
>   arch/powerpc/kernel/syscalls/syscall.tbl
>
> between commit:
>
>   35e32a6cb5f6 ("powerpc/syscalls: Split SPU-ness out of ABI")
>
> from the powerpc-fixes tree and commit:
>
>   9b4feb630e8e ("arch: wire-up close_range()")
>
> from the pidfd tree.
>
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

Thanks.

I thought the week between rc1 and rc2 would be a safe time to do that
conversion of the syscall table, but I guess I was wrong :)

I'm planning to send those changes to Linus for rc2, so the conflict
will then be vs mainline. But I guess it's pretty trivial so it doesn't
really matter.

cheers

> diff --cc arch/powerpc/kernel/syscalls/syscall.tbl
> index c0cdaacd770e,dd87a782d80e..
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@@ -480,6 -524,8 +480,7 @@@
>   434 common  pidfd_open  sys_pidfd_open
>   435 32  clone3  ppc_clone3  sys_clone3
>   435 64  clone3  sys_clone3
>  -435 spu clone3  sys_ni_syscall
> + 436 common  close_range sys_close_range
>   437 common  openat2 sys_openat2
>   438 common  pidfd_getfd sys_pidfd_getfd
>   439 common  faccessat2  sys_faccessat2


Re: [PATCH] mm/debug_vm_pgtable: Fix build failure with powerpc 8xx

2020-06-19 Thread Anshuman Khandual


On 06/18/2020 08:01 PM, Christophe Leroy wrote:
> Fix it by using the recently added ptep_get() helper.
> 
> Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
> Signed-off-by: Christophe Leroy 
> ---
>  mm/debug_vm_pgtable.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index e45623016aea..61ab16fb2e36 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -246,13 +246,13 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>  unsigned long vaddr)
>  {
> - pte_t pte = READ_ONCE(*ptep);
> + pte_t pte = ptep_get(ptep);
>  
>   pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>   set_pte_at(mm, vaddr, ptep, pte);
>   barrier();
>   pte_clear(mm, vaddr, ptep);
> - pte = READ_ONCE(*ptep);
> + pte = ptep_get(ptep);
>   WARN_ON(!pte_none(pte));
>  }

Tested this on arm64 and x86 platforms after applying the previous
series which adds ptep_get() and a follow up patch.

https://patchwork.kernel.org/project/linux-mm/list/?series=302949
https://patchwork.kernel.org/patch/11611929/

Build tested on s390 and arc platforms as well.

Reviewed-by: Anshuman Khandual 
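
For context, the ptep_get() helper being adopted here is, in its generic
form, just a READ_ONCE() wrapper; architectures whose pte_t is wider than
what READ_ONCE() can access atomically (like 8xx with 16k pages) override
it. From memory, the generic fallback looks like:

#ifndef ptep_get
static inline pte_t ptep_get(pte_t *ptep)
{
	return READ_ONCE(*ptep);
}
#endif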


Re: [PATCH v5 01/13] powerpc: Remove Xilinx PPC405/PPC440 support

2020-06-19 Thread Michael Ellerman
Nathan Chancellor  writes:
> On Thu, Jun 18, 2020 at 10:48:21AM +1000, Michael Ellerman wrote:
>> Nick Desaulniers  writes:
>> > On Wed, Jun 17, 2020 at 3:20 AM Michael Ellerman  
>> > wrote:
>> >> Michael Ellerman  writes:
>> >> > Michal Simek  writes:
>> >> 
>> >>
>> >> >> Or if bamboo requires uImage to be built by default you can do it via
>> >> >> Kconfig.
>> >> >>
>> >> >> diff --git a/arch/powerpc/platforms/44x/Kconfig
>> >> >> b/arch/powerpc/platforms/44x/Kconfig
>> >> >> index 39e93d23fb38..300864d7b8c9 100644
>> >> >> --- a/arch/powerpc/platforms/44x/Kconfig
>> >> >> +++ b/arch/powerpc/platforms/44x/Kconfig
>> >> >> @@ -13,6 +13,7 @@ config BAMBOO
>> >> >> select PPC44x_SIMPLE
>> >> >> select 440EP
>> >> >> select FORCE_PCI
>> >> >> +   select DEFAULT_UIMAGE
>> >> >> help
>> >> >>   This option enables support for the IBM PPC440EP evaluation 
>> >> >> board.
>> >> >
>> >> > Who knows what the actual bamboo board used. But I'd be happy to take a
>> >> > SOB'ed patch to do the above, because these days the qemu emulation is
>> >> > much more likely to be used than the actual board.
>> >>
>> >> I just went to see why my CI boot of 44x didn't catch this, and it's
>> >> because I don't use the uImage, I just boot the vmlinux directly:
>> >>
>> >>   $ qemu-system-ppc -M bamboo -m 128m -display none -kernel 
>> >> build~/vmlinux -append "console=ttyS0" -display none -nodefaults -serial 
>> >> mon:stdio
>> >>   Linux version 5.8.0-rc1-00118-g69119673bd50 (michael@alpine1-p1) (gcc 
>> >> (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #4 
>> >> Wed Jun 17 20:19:22 AEST 2020
>> >>   Using PowerPC 44x Platform machine description
>> >>   ioremap() called early from find_legacy_serial_ports+0x690/0x770. Use 
>> >> early_ioremap() instead
>> >>   printk: bootconsole [udbg0] enabled
>> >>
>> >>
>> >> So that's probably the simplest solution?
>> >
>> > If the uImage or zImage self decompresses, I would prefer to test that as 
>> > well.
>> 
>> The uImage is decompressed by qemu AIUI.
>> 
>> >> That means previously arch/powerpc/boot/zImage was just a hardlink to
>> >> the uImage:
>> >
>> > It sounds like we can just boot the zImage, or is that no longer
>> > created with the uImage?
>> 
>> The zImage won't boot on bamboo.
>> 
>> Because of the vagaries of the arch/powerpc/boot/Makefile the zImage
>> ends up pointing to treeImage.ebony, which is for a different board.
>> 
>> The zImage link is made to the first item in $(image-y):
>> 
>> $(obj)/zImage:   $(addprefix $(obj)/, $(image-y))
>>  $(Q)rm -f $@; ln $< $@
>>  ^
>>  first preqrequisite
>> 
>> Which for this defconfig happens to be:
>> 
>> image-$(CONFIG_EBONY)+= treeImage.ebony cuImage.ebony
>> 
>> If you turned off CONFIG_EBONY then the zImage will be a link to
>> treeImage.bamboo, but qemu can't boot that either.
>> 
>> It's kind of nuts that the zImage points to some arbitrary image
>> depending on what's configured and the order of things in the Makefile.
>> But I'm not sure how we make it less nuts without risking breaking
>> people's existing setups.
>
> Hi Michael,
>
> For what it's worth, this is squared this away in terms of our CI by
> just building and booting the uImage directly, rather than implicitly
> using the zImage:
>
> https://github.com/ClangBuiltLinux/continuous-integration/pull/282
> https://github.com/ClangBuiltLinux/boot-utils/pull/22

Great.

> We were only using the zImage because that is what Joel Stanley initially
> set us up with when PowerPC 32-bit was added to our CI:
>
> https://github.com/ClangBuiltLinux/continuous-integration/pull/100

Ah, so Joel owes us all beers then ;)

> Admittedly, we really do not have many PowerPC experts in our
> organization so we are supporting it on a "best effort" basis, which
> often involves using whatever knowledge is floating around or can be
> gained from interactions such as this :) so thank you for that!

No worries. I definitely don't expect you folks to invest much effort in
powerpc, especially the old 32-bit stuff, so always happy to help debug
things, and really appreciate the testing you do.

cheers


Re: [PATCH 2/2] powerpc/syscalls: Split SPU-ness out of ABI

2020-06-19 Thread Michael Ellerman
Michael Ellerman  writes:
> Using the ABI field to encode whether a syscall is usable by SPU
> programs or not is a bit of kludge.
>
> The ABI of the syscall doesn't change depending on the SPU-ness, but
> in order to make the syscall generation work we have to pretend that
> it does.
>
> It also means we have more duplicated syscall lines than we need to,
> and the SPU logic is not well contained, instead all of the syscall
> generation targets need to know if they are spu or nospu.
>
> So instead add a separate file which contains the information on which
> syscalls are available for SPU programs. It's just a list of syscall
> numbers with a single "spu" field. If the field has the value "spu"
> then the syscall is available to SPU programs, any other value or no
> entry entirely means the syscall is not available to SPU programs.
>
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/kernel/syscalls/Makefile  |  16 +-
>  arch/powerpc/kernel/syscalls/spu.tbl   | 430 +
>  arch/powerpc/kernel/syscalls/syscall.tbl   | 195 --
>  arch/powerpc/kernel/syscalls/syscalltbl.sh |  10 +-
>  4 files changed, 523 insertions(+), 128 deletions(-)
>  create mode 100644 arch/powerpc/kernel/syscalls/spu.tbl

For the archives, the changes to the syscall table & the generation of
the spu.tbl can be more-or-less generated with the script below
(ignoring whitespace & comments).

cheers


#!/bin/bash

git checkout v5.8-rc1

table=arch/powerpc/kernel/syscalls/syscall.tbl

# First pass: generate spu.tbl. A syscall gets "spu" if the old table had
# a live common/spu entry for it, and "-" otherwise.
for number in {0..439}
do
	line=$(grep -E "^$number\s+(common|spu)" $table)
	if [[ -n "$line" ]]; then
		read number abi name syscall compat <<< "$line"
		if [[ "$syscall" != "sys_ni_syscall" ]]; then
			if [[ "$name" == "utimesat" ]]; then # fix typo
				name="futimesat"
			fi
			echo -e "$number\t$name\tspu"
			continue
		fi
	fi

	# Fall back to the first entry for this number, if any.
	line=$(grep -m 1 -E "^$number\s+" $table)
	read number abi name syscall compat <<< "$line"
	if [[ -n "$name" ]]; then
		echo -e "$number\t$name\t-"
	fi
done > spu-generated.tbl

# Second pass: generate the new syscall.tbl by folding "nospu" entries
# into "common" and passing comments and 32/64 entries through.
cat $table | while read line
do
	read number abi name syscall compat <<< "$line"

	if [[ "$number" == "#" ]]; then
		echo $line
		continue
	fi

	case "$abi" in
	"nospu")	;&
	"common")	;&
	"32")		;&
	"64")		echo "$line" | sed -e "s/nospu/common/" ;;
	esac
done > syscall-generated.tbl

# Compare against what was actually committed.
git cat-file -p 35e32a6cb5f6:$table | diff -w -u - syscall-generated.tbl
git cat-file -p 35e32a6cb5f6:arch/powerpc/kernel/syscalls/spu.tbl | diff -w -u - spu-generated.tbl



Re: [PATCH 3/6] exec: cleanup the count() function

2020-06-19 Thread Sergei Shtylyov

Hello!

On 18.06.2020 17:46, Christoph Hellwig wrote:

> Remove the max argument as it is hard wired to MAX_ARG_STRINGS, and

   Technically, an argument is what's actually passed to a function; you're
removing a function parameter.

> give the function a slightly less generic name.
>
> Signed-off-by: Christoph Hellwig 

[...]

MBR, Sergei


[PATCH v2] ASoC: fsl_spdif: Add pm runtime function

2020-06-19 Thread Shengjiu Wang
Add pm runtime support and move clock handling there.
Close the clocks at suspend to reduce the power consumption.

fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
fsl_spdif_resume is replaced by pm_runtime_force_resume.

Signed-off-by: Shengjiu Wang 
Acked-by: Nicolin Chen 
---
changes in v2
- remove goto in startup()
- remove goto disable_spba_clk
- Add Acked-by: Nicolin Chen

 sound/soc/fsl/fsl_spdif.c | 117 ++
 1 file changed, 67 insertions(+), 50 deletions(-)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 5bc0e4729341..5b2689ae63d4 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -495,29 +496,14 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
struct platform_device *pdev = spdif_priv->pdev;
struct regmap *regmap = spdif_priv->regmap;
u32 scr, mask;
-   int i;
int ret;
 
/* Reset module and interrupts only for first initialization */
if (!snd_soc_dai_active(cpu_dai)) {
-   ret = clk_prepare_enable(spdif_priv->coreclk);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to enable core clock\n");
-   return ret;
-   }
-
-   if (!IS_ERR(spdif_priv->spbaclk)) {
-   ret = clk_prepare_enable(spdif_priv->spbaclk);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to enable spba clock\n");
-   goto err_spbaclk;
-   }
-   }
-
ret = spdif_softreset(spdif_priv);
if (ret) {
			dev_err(&pdev->dev, "failed to soft reset\n");
-   goto err;
+   return ret;
}
 
/* Disable all the interrupts */
@@ -531,18 +517,10 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
mask = SCR_TXFIFO_AUTOSYNC_MASK | SCR_TXFIFO_CTRL_MASK |
SCR_TXSEL_MASK | SCR_USRC_SEL_MASK |
SCR_TXFIFO_FSEL_MASK;
-   for (i = 0; i < SPDIF_TXRATE_MAX; i++) {
-   ret = clk_prepare_enable(spdif_priv->txclk[i]);
-   if (ret)
-   goto disable_txclk;
-   }
} else {
scr = SCR_RXFIFO_FSEL_IF8 | SCR_RXFIFO_AUTOSYNC;
mask = SCR_RXFIFO_FSEL_MASK | SCR_RXFIFO_AUTOSYNC_MASK|
SCR_RXFIFO_CTL_MASK | SCR_RXFIFO_OFF_MASK;
-   ret = clk_prepare_enable(spdif_priv->rxclk);
-   if (ret)
-   goto err;
}
regmap_update_bits(regmap, REG_SPDIF_SCR, mask, scr);
 
@@ -550,17 +528,6 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
regmap_update_bits(regmap, REG_SPDIF_SCR, SCR_LOW_POWER, 0);
 
return 0;
-
-disable_txclk:
-   for (i--; i >= 0; i--)
-   clk_disable_unprepare(spdif_priv->txclk[i]);
-err:
-   if (!IS_ERR(spdif_priv->spbaclk))
-   clk_disable_unprepare(spdif_priv->spbaclk);
-err_spbaclk:
-   clk_disable_unprepare(spdif_priv->coreclk);
-
-   return ret;
 }
 
 static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
@@ -569,20 +536,17 @@ static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
struct snd_soc_pcm_runtime *rtd = substream->private_data;
	struct fsl_spdif_priv *spdif_priv = snd_soc_dai_get_drvdata(asoc_rtd_to_cpu(rtd, 0));
struct regmap *regmap = spdif_priv->regmap;
-   u32 scr, mask, i;
+   u32 scr, mask;
 
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
scr = 0;
mask = SCR_TXFIFO_AUTOSYNC_MASK | SCR_TXFIFO_CTRL_MASK |
SCR_TXSEL_MASK | SCR_USRC_SEL_MASK |
SCR_TXFIFO_FSEL_MASK;
-   for (i = 0; i < SPDIF_TXRATE_MAX; i++)
-   clk_disable_unprepare(spdif_priv->txclk[i]);
} else {
scr = SCR_RXFIFO_OFF | SCR_RXFIFO_CTL_ZERO;
mask = SCR_RXFIFO_FSEL_MASK | SCR_RXFIFO_AUTOSYNC_MASK|
SCR_RXFIFO_CTL_MASK | SCR_RXFIFO_OFF_MASK;
-   clk_disable_unprepare(spdif_priv->rxclk);
}
regmap_update_bits(regmap, REG_SPDIF_SCR, mask, scr);
 
@@ -591,9 +555,6 @@ static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
spdif_intr_status_clear(spdif_priv);
regmap_update_bits(regmap, REG_SPDIF_SCR,
SCR_LOW_POWER, SCR_LOW_POWER);
-   if (!IS_ERR(spdif_priv->spbaclk))
-   clk_disable_unprepare(spdif_priv->spbaclk);
-   clk_disable_unprepare(spdif_priv->coreclk);
}
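
The hunks that add the runtime-PM side are cut off in this excerpt; going
by the commit message, the clock handling removed above reappears in
callbacks roughly of this shape (a sketch, not the applied hunks; error
unwinding elided for brevity):

static int fsl_spdif_runtime_suspend(struct device *dev)
{
	struct fsl_spdif_priv *spdif_priv = dev_get_drvdata(dev);
	int i;

	regcache_cache_only(spdif_priv->regmap, true);

	clk_disable_unprepare(spdif_priv->rxclk);
	for (i = 0; i < SPDIF_TXRATE_MAX; i++)
		clk_disable_unprepare(spdif_priv->txclk[i]);
	if (!IS_ERR(spdif_priv->spbaclk))
		clk_disable_unprepare(spdif_priv->spbaclk);
	clk_disable_unprepare(spdif_priv->coreclk);

	return 0;
}

static int fsl_spdif_runtime_resume(struct device *dev)
{
	struct fsl_spdif_priv *spdif_priv = dev_get_drvdata(dev);
	int i;

	clk_prepare_enable(spdif_priv->coreclk);
	if (!IS_ERR(spdif_priv->spbaclk))
		clk_prepare_enable(spdif_priv->spbaclk);
	for (i = 0; i < SPDIF_TXRATE_MAX; i++)
		clk_prepare_enable(spdif_priv->txclk[i]);
	clk_prepare_enable(spdif_priv->rxclk);

	/* restore register state now that the clocks are back */
	regcache_cache_only(spdif_priv->regmap, false);
	regcache_mark_dirty(spdif_priv->regmap);

	return regcache_sync(spdif_priv->regmap);
}

static const struct dev_pm_ops fsl_spdif_pm = {
	SET_RUNTIME_PM_OPS(fsl_spdif_runtime_suspend,
			   fsl_spdif_runtime_resume, NULL)
	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
				pm_runtime_force_resume)
};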

Re: [PATCH] mm/debug_vm_pgtable: Fix build failure with powerpc 8xx

2020-06-19 Thread Will Deacon
On Thu, Jun 18, 2020 at 02:31:29PM +0000, Christophe Leroy wrote:
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index e45623016aea..61ab16fb2e36 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -246,13 +246,13 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>  unsigned long vaddr)
>  {
> - pte_t pte = READ_ONCE(*ptep);
> + pte_t pte = ptep_get(ptep);
>  
>   pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>   set_pte_at(mm, vaddr, ptep, pte);
>   barrier();
>   pte_clear(mm, vaddr, ptep);
> - pte = READ_ONCE(*ptep);
> + pte = ptep_get(ptep);
>   WARN_ON(!pte_none(pte));
>  }

Acked-by: Will Deacon 

I wonder if there's a way to do this with coccinelle in one big go (but the
resulting diff would obviously need manual inspection)?

Will
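
For reference, the mechanical part of such a conversion can be written as
a very small Coccinelle semantic patch; an untested sketch:

@@
expression PTEP;
@@
- READ_ONCE(*PTEP)
+ ptep_get(PTEP)

The interesting part would indeed be manually inspecting sites where the
pointer is dereferenced in other ways.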


Re: [PATCH] ASoC: fsl_spdif: Add pm runtime function

2020-06-19 Thread Shengjiu Wang
On Fri, Jun 19, 2020 at 1:51 PM Nicolin Chen  wrote:
>
> On Thu, Jun 18, 2020 at 07:55:34PM +0800, Shengjiu Wang wrote:
> > Add pm runtime support and move clock handling there.
> > Close the clocks at suspend to reduce the power consumption.
> >
> > fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
> > fsl_spdif_resume is replaced by pm_runtime_force_resume.
> >
> > Signed-off-by: Shengjiu Wang 
>
> LGTM, yet some nits, please add my ack after fixing:
>
> Acked-by: Nicolin Chen 
>
> > @@ -495,25 +496,10 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
>
> >
> > -disable_txclk:
> > - for (i--; i >= 0; i--)
> > - clk_disable_unprepare(spdif_priv->txclk[i]);
> >  err:
> > - if (!IS_ERR(spdif_priv->spbaclk))
> > - clk_disable_unprepare(spdif_priv->spbaclk);
> > -err_spbaclk:
> > - clk_disable_unprepare(spdif_priv->coreclk);
> > -
> >   return ret;
>
> Only "return ret;" remains now. We could clean the goto away.
>
> > -static int fsl_spdif_resume(struct device *dev)
> > +static int fsl_spdif_runtime_resume(struct device *dev)
>
> > +disable_rx_clk:
> > + clk_disable_unprepare(spdif_priv->rxclk);
> > +disable_tx_clk:
> > +disable_spba_clk:
>
> Why have two duplicated ones? Could probably drop the 2nd one.

Seems we can drop one, will send an update.

best regards
wang shengjiu
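
For readers following the thread, a minimal sketch of the pattern the
patch adopts: clock handling lives in the runtime-PM callbacks, and
system sleep is routed through pm_runtime_force_*(). The foo_* names
are hypothetical, not the actual fsl_spdif code:

#include <linux/clk.h>
#include <linux/device.h>
#include <linux/pm_runtime.h>

struct foo_priv {
	struct clk *coreclk;
};

/* Gate the clocks whenever the device goes idle, reducing power. */
static int foo_runtime_suspend(struct device *dev)
{
	struct foo_priv *priv = dev_get_drvdata(dev);

	clk_disable_unprepare(priv->coreclk);
	return 0;
}

static int foo_runtime_resume(struct device *dev)
{
	struct foo_priv *priv = dev_get_drvdata(dev);

	return clk_prepare_enable(priv->coreclk);
}

static const struct dev_pm_ops foo_pm_ops = {
	SET_RUNTIME_PM_OPS(foo_runtime_suspend, foo_runtime_resume, NULL)
	/* System sleep reuses the runtime callbacks, as in the patch. */
	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
				pm_runtime_force_resume)
};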


[PATCH V2] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size

2020-06-19 Thread Satheesh Rajendran
Early secure guest boot hits the below crash when booting with a vCPU
count aligned to the page boundary (e.g. 64 or 128, given a PAGE size
of 64k and an LPPACA size of 1k), due to the BUG_ON assert firing when
shared_lppaca_size equals shared_lppaca_total_size,

 [0.00] Partition configured for 64 cpus.
 [0.00] CPU maps initialized for 1 thread per core
 [0.00] [ cut here ]
 [0.00] kernel BUG at arch/powerpc/kernel/paca.c:89!
 [0.00] Oops: Exception in kernel mode, sig: 5 [#1]
 [0.00] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries

which is unnecessarily strict. Relax the check to permit the equal case.

Fixes: bd104e6db6f0 ("powerpc/pseries/svm: Use shared memory for LPPACA structures")
Cc: linux-ker...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Thiago Jung Bauermann 
Cc: Ram Pai 
Cc: Sukadev Bhattiprolu 
Cc: Laurent Dufour 
Reviewed-by: Laurent Dufour 
Reviewed-by: Thiago Jung Bauermann 
Signed-off-by: Satheesh Rajendran 
---

V2:
Added Reviewed-by tags from Thiago and Laurent.
Added Fixes tag as per Thiago's suggestion.

V1: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200609105731.14032-1-sathn...@linux.vnet.ibm.com/

---
 arch/powerpc/kernel/paca.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 2168372b792d..74da65aacbc9 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -87,7 +87,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
 * This is very early in boot, so no harm done if the kernel crashes at
 * this point.
 */
-   BUG_ON(shared_lppaca_size >= shared_lppaca_total_size);
+   BUG_ON(shared_lppaca_size > shared_lppaca_total_size);
 
return ptr;
 }
-- 
2.26.2
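
To make the boundary condition concrete, a userspace toy (not the
kernel code; the sizes are the ones named in the commit message)
showing why the old >= assert trips exactly when the LPPACAs fill
the shared region:

#include <assert.h>
#include <stdio.h>

int main(void)
{
	const unsigned long lppaca_size = 1024;	/* 1k per vCPU */
	const unsigned long total = 64 * 1024;	/* one 64k page */
	unsigned long used = 0;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		used += lppaca_size;
		/* Old check, BUG_ON(used >= total): fires on the final
		 * vCPU, although the region is exactly, legally full. */
		assert(used <= total);	/* relaxed check: passes */
	}
	printf("64 x 1k LPPACAs fit exactly in a 64k page\n");
	return 0;
}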



Re: rename probe_kernel_* and probe_user_*

2020-06-19 Thread Michael Ellerman
Linus Torvalds  writes:
> [ Explicitly added architecture lists and developers to the cc to make
> this more visible ]
>
> On Wed, Jun 17, 2020 at 12:38 AM Christoph Hellwig  wrote:
>>
>> Andrew and I decided to drop the patches implementing your suggested
>> rename of the probe_kernel_* and probe_user_* helpers from -mm as there
>> were way too many conflicts.  After -rc1 might be a good time for this as
>> all the conflicts are resolved now.
>
> So I've merged this renaming now, together with my changes to make
> 'get_kernel_nofault()' look and act a lot more like 'get_user()'.
>
> It just felt wrong (and potentially dangerous) to me to have a
> 'get_kernel_nofault()' naming that implied semantics that we're all
> familiar with from 'get_user()', but acting very differently.
>
> But part of the fixups I made for the type checking are for
> architectures where I didn't even compile-test the end result. I
> looked at every case individually, and the patch looks sane, but I
> could have screwed something up.
>
> Basically, 'get_kernel_nofault()' doesn't do the same automagic type
> munging from the pointer to the target that 'get_user()' does, but at
> least now it checks that the types are superficially compatible.
> There should be build failures if they aren't, but I hopefully fixed
> everything up properly for all architectures.
>
> This email is partly to ask people to double-check, but partly just as
> a heads-up so that _if_ I screwed something up, you'll have the
> background and it won't take you by surprise.

The powerpc changes look right, compile cleanly and seem to work
correctly.

cheers
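
A sketch of the calling convention Linus describes; an illustrative
fragment, not code from the series:

#include <linux/uaccess.h>

/* get_kernel_nofault(val, ptr) fills 'val' from a kernel address and
 * returns nonzero if the access faults. Unlike get_user(), it does no
 * automatic type conversion; after this change it at least fails to
 * build when 'val' and '*ptr' are not superficially compatible. */
static int peek_kernel_long(const long *kaddr, long *out)
{
	long val;

	if (get_kernel_nofault(val, kaddr))
		return -EFAULT;

	*out = val;
	return 0;
}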


Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset

2020-06-19 Thread Oliver O'Halloran
On Wed, Jun 17, 2020 at 4:29 PM Michael Ellerman  wrote:
>
> "Oliver O'Halloran"  writes:
> > On Tue, Jun 16, 2020 at 9:55 PM Michael Ellerman  wrote:
> >> wenxi...@linux.vnet.ibm.com writes:
> >> > From: Wen Xiong 
> >> >
> >> > Several device drivers hit EEH (Extended Error Handling) when triggering
> >> > kdump on pSeries PowerVM. This patch implements a reset of the PHBs
> >> > in pci general code when triggering kdump.
> >>
> >> Actually it's in pseries specific PCI code, and the reset is done in the
> >> 2nd kernel as it boots, not when triggering the kdump.
> >>
> >> You're doing it as a:
> >>
> >>   machine_postcore_initcall(pseries, pseries_phb_reset);
> >>
> >> But we do the EEH initialisation in:
> >>
> >>   core_initcall_sync(eeh_init);
> >>
> >> Which happens first.
> >>
> >> So it seems to me that this should be called from pseries_eeh_init().
> >
> > This happens to use some of the same RTAS calls as EEH, but it's
> > entirely orthogonal to it.
>
> I don't agree. I mean it's literally calling EEH_RESET_FUNDAMENTAL etc.
> Those RTAS calls are all documented in the EEH section of PAPR.
>
> I guess you're saying it's orthogonal to the kernel handling an EEH and
> doing the recovery process etc, which I can kind of see.
>
> > Wedging the two together doesn't make any real sense IMO since this
> > should be usable even with !CONFIG_EEH.
>
> You can't turn CONFIG_EEH off for pseries or powernv.

Not yet :)

> And if you could this patch wouldn't compile because it uses EEH
> constants that are behind #ifdef CONFIG_EEH.

That's fixable.

> If you could turn CONFIG_EEH off it would presumably be because you were
> on a platform that didn't support EEH, in which case you wouldn't need
> this code.

I think there's an argument to be made for disabling EEH in some
situations. A lot of drivers do a pretty poor job of recovering in the
first place so it's conceivable that someone might want to disable it
in say, a kdump kernel. That said, the real reason is mostly for the
sake of code organisation. EEH is an optional platform feature but you
wouldn't know it looking at the implementation and I'd like to stop it
bleeding into odd places. Making it buildable without CONFIG_EEH
would probably help.

> So IMO this is EEH code, and should be with the other EEH code and
> should be behind CONFIG_EEH.

*shrug*

I wanted it to follow the model of the powernv implementation of the
same feature which is done immediately after initialising the
pci_controller and independent of all of the EEH setup. Although,
looking at it again I see it calls pnv_eeh_phb_reset() which is in
eeh_powernv.c so I guess that's pretty similar to what you're
suggesting.

> That sounds like a good cleanup. I'm not concerned about conflicts
> within arch/powerpc, I can fix them up.
>
> >> > + list_for_each_entry(phb, &hose_list, list_node) {
> >> > + config_addr = pseries_get_pdn_addr(phb);
> >> > + if (config_addr == -1)
> >> > + continue;
> >> > +
> >> > + ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> >> > + config_addr, BUID_HI(phb->buid),
> >> > + BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
> >> > +
> >> > + /* If fundamental-reset not supported, try hot-reset */
> >> > + if (ret == -8)
> >>
> >> Where does -8 come from?
> >
> > There's a comment right there.
>
> Yeah I guess. I was expecting it would map to some RTAS_ERROR_FOO value,
> but it's just literally -8 in PAPR.

Yeah, as far as I can tell the meanings of the return codes are
specific to each RTAS call, which is a bit bad.
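
For readers skimming the thread, a condensed sketch of the fallback
under discussion: attempt a fundamental reset first, fall back to hot
reset if the slot reports it unsupported. The helper name is
hypothetical; the constants and the rtas_call() arguments mirror the
quoted patch:

#include <asm/eeh.h>
#include <asm/pci-bridge.h>
#include <asm/rtas.h>

static int phb_reset_sketch(struct pci_controller *phb, int token,
			    u32 config_addr)
{
	int ret;

	ret = rtas_call(token, 4, 1, NULL, config_addr,
			BUID_HI(phb->buid), BUID_LO(phb->buid),
			EEH_RESET_FUNDAMENTAL);
	/* -8: this slot does not support fundamental reset; per the
	 * discussion above, the return codes are specific to each
	 * RTAS call rather than generic RTAS_ERROR_* values. */
	if (ret == -8)
		ret = rtas_call(token, 4, 1, NULL, config_addr,
				BUID_HI(phb->buid), BUID_LO(phb->buid),
				EEH_RESET_HOT);
	return ret;
}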