Re: [PATCH] pseries/drmem: Check for zero filled ibm,dynamic-memory property.

2018-02-15 Thread Daniel Black

Looks good to me.

Boots without backtrace.

qemu-system-ppc64 -enable-kvm -cpu POWER8 -m 4G -M pseries -nographic
-vga none -kernel /tmp/vmlinux

qemu-system-ppc64 --version
QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.20)



Tested-by: Daniel Black 



Re: [PATCH] pseries/drmem: Check for zero filled ibm,dynamic-memory property.

2018-02-15 Thread Cyril Bur
On Thu, 2018-02-15 at 21:27 -0600, Nathan Fontenot wrote:
> Some versions of QEMU will produce an ibm,dynamic-reconfiguration-memory
> node with an ibm,dynamic-memory property that is zero-filled. This causes
> the drmem code to oops trying to parse this property.
> 
> The fix for this is to validate that the property does contain LMB
> entries before trying to parse it and bail if the count is zero.
> 
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=2048
> NUMA
> pSeries
> Modules linked in:
> Supported: Yes
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.14-11.2-default #1
> task: c0007e639680 task.stack: c0007e648000
> NIP: c0c709a4 LR: c0c70998 CTR: 
> REGS: c0007e64b8d0 TRAP: 0300   Not tainted  (4.12.14-11.2-default)
> MSR: 80010280b033 
>   CR: 84000248  XER: 
> CFAR: c067018c DAR: 0010 DSISR: 4200 SOFTE: 1
> GPR00: c0c70998 c0007e64bb50 c1157b00 
> GPR04: c0007e64bb70  002f 0022
> GPR08: 0003 c6f63fac c6f63fb0 001e
> GPR12:  cfa8 c000dca8 
> GPR16:    
> GPR20:    
> GPR24: c0cccb98 c0c636f0 c0c56cd0 0007
> GPR28: c0cccba8 c0007c30 c0007e64bbf0 0010
> NIP [c0c709a4] read_drconf_v1_cell+0x54/0x9c
> LR [c0c70998] read_drconf_v1_cell+0x48/0x9c
> Call Trace:
> [c0007e64bb50] [c0c56cd0] __param_initcall_debug+0x0/0x28 
> (unreliable)
> [c0007e64bb90] [c0c70e24] drmem_init+0x144/0x2f8
> [c0007e64bc40] [c000d034] do_one_initcall+0x64/0x1d0
> [c0007e64bd00] [c0c643d0] kernel_init_freeable+0x298/0x38c
> [c0007e64bdc0] [c000dcc4] kernel_init+0x24/0x160
> [c0007e64be30] [c000b428] ret_from_kernel_thread+0x5c/0xb4
> Instruction dump:
> 7c9e2378 6000 e9429050 e93e 7c240b78 7c7f1b78 f9240021 e86a0002
> 4804e41d 6000 e9210020 39490004  f9410020 39490010 7d004c2c
> 
> The generated ibm,dynamic-reconfiguration-memory device tree node
> that causes this:
> 
> ibm,dynamic-reconfiguration-memory {
> ibm,lmb-size = <0x0 0x1000>;
> ibm,memory-flags-mask = <0xff>;
> ibm,dynamic-memory = <0x0 0x0 0x0 0x0 0x0 0x0>;
> linux,phandle = <0x7e57eed8>;
> ibm,associativity-lookup-arrays = <0x1 0x4 0x0 0x0 0x0 0x0>;
> ibm,memory-preservation-time = <0x0>;
> };
> 
> Signed-off-by: Nathan Fontenot 

Works for me.

Reviewed-by: Cyril Bur 

> ---
>  arch/powerpc/mm/drmem.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 1604110c4238..916844f99c64 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -216,6 +216,8 @@ static void __init __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
>   u32 i, n_lmbs;
>  
>   n_lmbs = of_read_number(prop++, 1);
> + if (n_lmbs == 0)
> + return;
>  
>   for (i = 0; i < n_lmbs; i++) {
>   read_drconf_v1_cell(&lmb, &prop);
> @@ -245,6 +247,8 @@ static void __init __walk_drmem_v2_lmbs(const __be32 *prop, const __be32 *usm,
>   u32 i, j, lmb_sets;
>  
>   lmb_sets = of_read_number(prop++, 1);
> + if (lmb_sets == 0)
> + return;
>  
>   for (i = 0; i < lmb_sets; i++) {
>   read_drconf_v2_cell(&dr_cell, &prop);
> @@ -354,6 +358,8 @@ static void __init init_drmem_v1_lmbs(const __be32 *prop)
>   struct drmem_lmb *lmb;
>  
>   drmem_info->n_lmbs = of_read_number(prop++, 1);
> + if (drmem_info->n_lmbs == 0)
> + return;
>  
>   drmem_info->lmbs = kcalloc(drmem_info->n_lmbs, sizeof(*lmb),
>  GFP_KERNEL);
> @@ -373,6 +379,8 @@ static void __init init_drmem_v2_lmbs(const __be32 *prop)
>   int lmb_index;
>  
>   lmb_sets = of_read_number(prop++, 1);
> + if (lmb_sets == 0)
> + return;
>  
>   /* first pass, calculate the number of LMBs */
>   p = prop;
> 


[PATCH] pseries/drmem: Check for zero filled ibm,dynamic-memory property.

2018-02-15 Thread Nathan Fontenot
Some versions of QEMU will produce an ibm,dynamic-reconfiguration-memory
node with an ibm,dynamic-memory property that is zero-filled. This causes
the drmem code to oops trying to parse this property.

The fix for this is to validate that the property does contain LMB
entries before trying to parse it and bail if the count is zero.

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048
NUMA
pSeries
Modules linked in:
Supported: Yes
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.14-11.2-default #1
task: c0007e639680 task.stack: c0007e648000
NIP: c0c709a4 LR: c0c70998 CTR: 
REGS: c0007e64b8d0 TRAP: 0300   Not tainted  (4.12.14-11.2-default)
MSR: 80010280b033 
  CR: 84000248  XER: 
CFAR: c067018c DAR: 0010 DSISR: 4200 SOFTE: 1
GPR00: c0c70998 c0007e64bb50 c1157b00 
GPR04: c0007e64bb70  002f 0022
GPR08: 0003 c6f63fac c6f63fb0 001e
GPR12:  cfa8 c000dca8 
GPR16:    
GPR20:    
GPR24: c0cccb98 c0c636f0 c0c56cd0 0007
GPR28: c0cccba8 c0007c30 c0007e64bbf0 0010
NIP [c0c709a4] read_drconf_v1_cell+0x54/0x9c
LR [c0c70998] read_drconf_v1_cell+0x48/0x9c
Call Trace:
[c0007e64bb50] [c0c56cd0] __param_initcall_debug+0x0/0x28 
(unreliable)
[c0007e64bb90] [c0c70e24] drmem_init+0x144/0x2f8
[c0007e64bc40] [c000d034] do_one_initcall+0x64/0x1d0
[c0007e64bd00] [c0c643d0] kernel_init_freeable+0x298/0x38c
[c0007e64bdc0] [c000dcc4] kernel_init+0x24/0x160
[c0007e64be30] [c000b428] ret_from_kernel_thread+0x5c/0xb4
Instruction dump:
7c9e2378 6000 e9429050 e93e 7c240b78 7c7f1b78 f9240021 e86a0002
4804e41d 6000 e9210020 39490004  f9410020 39490010 7d004c2c

The generated ibm,dynamic-reconfiguration-memory device tree node
that causes this:

ibm,dynamic-reconfiguration-memory {
ibm,lmb-size = <0x0 0x1000>;
ibm,memory-flags-mask = <0xff>;
ibm,dynamic-memory = <0x0 0x0 0x0 0x0 0x0 0x0>;
linux,phandle = <0x7e57eed8>;
ibm,associativity-lookup-arrays = <0x1 0x4 0x0 0x0 0x0 0x0>;
ibm,memory-preservation-time = <0x0>;
};
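
The check described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: read_be32_cell() and drmem_v1_usable_lmbs() are hypothetical names, the property is treated as a raw byte buffer, and the length check against a 24-byte (six-cell) v1 entry is an extra safeguard that the patch itself does not add.

```c
#include <stddef.h>
#include <stdint.h>

/* Six 32-bit cells per v1 LMB entry: base address (2), DRC index,
 * reserved, associativity index, flags. */
#define DRCONF_V1_ENTRY_BYTES 24u

/* Decode one big-endian device-tree cell regardless of host endianness. */
static uint32_t read_be32_cell(const uint8_t *cell)
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/*
 * Hypothetical helper: return the number of LMB entries that are safe
 * to parse, or 0 when the property is absent, zero-filled or too short.
 */
static uint32_t drmem_v1_usable_lmbs(const uint8_t *prop, size_t len)
{
	uint32_t n_lmbs;

	if (!prop || len < 4)
		return 0;

	n_lmbs = read_be32_cell(prop);
	if (n_lmbs == 0)
		return 0;	/* zero-filled property: nothing to parse */

	/* Extra bounds check, an assumption beyond what the patch does. */
	if ((uint64_t)n_lmbs * DRCONF_V1_ENTRY_BYTES > len - 4)
		return 0;

	return n_lmbs;
}
```

With the six zero cells shown in the node above, the helper reports zero entries, so a caller would return before touching any LMB data.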

Signed-off-by: Nathan Fontenot 
---
 arch/powerpc/mm/drmem.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 1604110c4238..916844f99c64 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -216,6 +216,8 @@ static void __init __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
u32 i, n_lmbs;
 
n_lmbs = of_read_number(prop++, 1);
+   if (n_lmbs == 0)
+   return;
 
for (i = 0; i < n_lmbs; i++) {
read_drconf_v1_cell(&lmb, &prop);
@@ -245,6 +247,8 @@ static void __init __walk_drmem_v2_lmbs(const __be32 *prop, const __be32 *usm,
u32 i, j, lmb_sets;
 
lmb_sets = of_read_number(prop++, 1);
+   if (lmb_sets == 0)
+   return;
 
for (i = 0; i < lmb_sets; i++) {
read_drconf_v2_cell(&dr_cell, &prop);
@@ -354,6 +358,8 @@ static void __init init_drmem_v1_lmbs(const __be32 *prop)
struct drmem_lmb *lmb;
 
drmem_info->n_lmbs = of_read_number(prop++, 1);
+   if (drmem_info->n_lmbs == 0)
+   return;
 
drmem_info->lmbs = kcalloc(drmem_info->n_lmbs, sizeof(*lmb),
   GFP_KERNEL);
@@ -373,6 +379,8 @@ static void __init init_drmem_v2_lmbs(const __be32 *prop)
int lmb_index;
 
lmb_sets = of_read_number(prop++, 1);
+   if (lmb_sets == 0)
+   return;
 
/* first pass, calculate the number of LMBs */
p = prop;



[PATCH v2 (skiboot)] dt: add /cpus/ibm,powerpc-cpu-features device tree bindings

2018-02-15 Thread Nicholas Piggin
This is a new CPU feature advertising interface that is fine-grained,
extensible, aware of privilege levels, and gives control of features
to all levels of the stack (firmware, hypervisor, and OS).

The design and binding specification is described in detail in doc/.
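
As a rough illustration of the table-driven approach, the sketch below filters a small feature table by CPU and privilege bitmask. The bit values are taken from the definitions in the patch; struct feature_sketch, the table contents (including the made-up "example-p9-hv-only" entry) and count_usable_features() are hypothetical and only show the matching idea, not the skiboot implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Privilege and CPU bits as defined in the patch. */
#define USABLE_PR	(1U << 0)
#define USABLE_OS	(1U << 1)
#define USABLE_HV	(1U << 2)

#define CPU_P8_DD1	(1U << 0)
#define CPU_P8_DD2	(1U << 1)
#define CPU_P9_DD1	(1U << 2)
#define CPU_P9_DD2	(1U << 3)

#define CPU_P8		(CPU_P8_DD1 | CPU_P8_DD2)
#define CPU_P9		(CPU_P9_DD1 | CPU_P9_DD2)
#define CPU_ALL		(CPU_P8 | CPU_P9)

/* Reduced, hypothetical version of a match-table entry. */
struct feature_sketch {
	const char *name;
	uint32_t cpus_supported;
	uint32_t usable_privilege;
};

static const struct feature_sketch feature_table_sketch[] = {
	{ "big-endian", CPU_ALL, USABLE_HV | USABLE_OS | USABLE_PR },
	{ "example-p9-hv-only", CPU_P9, USABLE_HV },	/* made-up entry */
};

/* Count the features available on 'cpu' at any of the given privileges. */
static size_t count_usable_features(uint32_t cpu, uint32_t privilege)
{
	size_t n_entries = sizeof(feature_table_sketch) /
			   sizeof(feature_table_sketch[0]);
	size_t i, n = 0;

	for (i = 0; i < n_entries; i++) {
		const struct feature_sketch *f = &feature_table_sketch[i];

		if ((f->cpus_supported & cpu) &&
		    (f->usable_privilege & privilege))
			n++;
	}
	return n;
}
```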

Signed-off-by: Nicholas Piggin 
---
Since v1:

- Fold branch-v3 into fixed-point-v3 since, as Segher pointed out, they
  are all fixed-point facilities.

- Fixed typo (Segher)

 core/Makefile.inc  |   2 +-
 core/cpufeatures.c | 921 +
 core/device.c  |   7 +
 core/init.c|   1 +
 .../ibm,powerpc-cpu-features/binding.txt   | 245 ++
 .../ibm,powerpc-cpu-features/design.txt| 157 
 include/device.h   |   1 +
 include/skiboot.h  |   5 +
 8 files changed, 1338 insertions(+), 1 deletion(-)
 create mode 100644 core/cpufeatures.c
 create mode 100644 doc/device-tree/ibm,powerpc-cpu-features/binding.txt
 create mode 100644 doc/device-tree/ibm,powerpc-cpu-features/design.txt

diff --git a/core/Makefile.inc b/core/Makefile.inc
index d6a7269f..5c120564 100644
--- a/core/Makefile.inc
+++ b/core/Makefile.inc
@@ -9,7 +9,7 @@ CORE_OBJS += vpd.o hostservices.o platform.o nvram.o nvram-format.o hmi.o
 CORE_OBJS += console-log.o ipmi.o time-utils.o pel.o pool.o errorlog.o
 CORE_OBJS += timer.o i2c.o rtc.o flash.o sensor.o ipmi-opal.o
 CORE_OBJS += flash-subpartition.o bitmap.o buddy.o pci-quirk.o powercap.o psr.o
-CORE_OBJS += pci-dt-slot.o direct-controls.o
+CORE_OBJS += pci-dt-slot.o direct-controls.o cpufeatures.o
 
 ifeq ($(SKIBOOT_GCOV),1)
 CORE_OBJS += gcov-profiling.o
diff --git a/core/cpufeatures.c b/core/cpufeatures.c
new file mode 100644
index 00000000..fa5c4ce5
--- /dev/null
+++ b/core/cpufeatures.c
@@ -0,0 +1,921 @@
+/* Copyright 2017-2018 IBM Corp.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ * implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This file deals with setup of /cpus/ibm,powerpc-cpu-features dt
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef DEBUG
+#define DBG(fmt, a...) prlog(PR_DEBUG, "CPUFT: " fmt, ##a)
+#else
+#define DBG(fmt, a...)
+#endif
+
+/* Device-tree visible constants follow */
+#define ISA_V2_07B 2070
+#define ISA_V3_0B  3000
+
+#define USABLE_PR  (1U << 0)
+#define USABLE_OS  (1U << 1)
+#define USABLE_HV  (1U << 2)
+
+#define HV_SUPPORT_HFSCR   (1U << 0)
+#define OS_SUPPORT_FSCR    (1U << 0)
+
+/* Following are definitions for the match tables, not the DT binding itself */
+#define ISA_BASE   0
+
+#define HV_NONE    0
+#define HV_CUSTOM  1
+#define HV_HFSCR   2
+
+#define OS_NONE    0
+#define OS_CUSTOM  1
+#define OS_FSCR    2
+
+/* CPU bitmasks for match table */
+#define CPU_P8_DD1 (1U << 0)
+#define CPU_P8_DD2 (1U << 1)
+#define CPU_P9_DD1 (1U << 2)
+#define CPU_P9_DD2 (1U << 3)
+
+#define CPU_P8 (CPU_P8_DD1|CPU_P8_DD2)
+#define CPU_P9 (CPU_P9_DD1|CPU_P9_DD2)
+#define CPU_ALL    (CPU_P8|CPU_P9)
+
+struct cpu_feature {
+   const char *name;
+   uint32_t cpus_supported;
+   uint32_t isa;
+   uint32_t usable_privilege;
+   uint32_t hv_support;
+   uint32_t os_support;
+   uint32_t hfscr_bit_nr;
+   uint32_t fscr_bit_nr;
+   uint32_t hwcap_bit_nr;
+   const char *dependencies_names; /* space-delimited names */
+};
+
+/*
+ * The base (or NULL) cpu feature set is the CPU features available
+ * when no child nodes of the /cpus/ibm,powerpc-cpu-features node exist. The
+ * base feature set is POWER8 (ISAv2.07B), less features that are listed
+ * explicitly.
+ *
+ * XXX: currently, the feature dependencies are not necessarily captured
+ * exactly or completely. This is somewhat acceptable because all
+ * implementations must be aware of all these features.
+ */
+static const struct cpu_feature cpu_features_table[] = {
+   /*
+* Big endian as in ISAv2.07B, MSR_LE=0
+*/
+   { "big-endian",
+   CPU_ALL,
+   ISA_BASE, USABLE_HV|USABLE_OS|USABLE_PR,
+   HV_CUSTOM, OS_CUSTOM,
+   -1, -1, -1,
+   NULL, },
+
+   /*
+* Little endian as in ISAv2.07B, MSR_LE=1.
+*
+* When both big and little endian are defined, 

Re: [PATCH] powerpc/npu-dma.c: Fix deadlock in mmio_invalidate

2018-02-15 Thread Mark Hairgrove


On Wed, 14 Feb 2018, Alistair Popple wrote:

> > > +struct mmio_atsd_reg {
> > > + struct npu *npu;
> > > + int reg;
> > > +};
> > > +
> > 
> > Is it just easier to move reg to inside of struct npu?
> 
> I don't think so, struct npu is global to all npu contexts whereas this is
> specific to the given invalidation. We don't have enough registers to assign
> each NPU context its own dedicated register so I'm not sure it makes sense to
> put it there either.
> 
> > > +static void acquire_atsd_reg(struct npu_context *npu_context,
> > > + struct mmio_atsd_reg mmio_atsd_reg[NV_MAX_NPUS])
> > > +{
> > > + int i, j;
> > > + struct npu *npu;
> > > + struct pci_dev *npdev;
> > > + struct pnv_phb *nphb;
> > >  
> > > - /*
> > > -  * The GPU requires two flush ATSDs to ensure all entries have
> > > -  * been flushed. We use PID 0 as it will never be used for a
> > > -  * process on the GPU.
> > > -  */
> > > - if (flush)
> > > - mmio_invalidate_pid(npu, 0, true);
> > > + for (i = 0; i <= max_npu2_index; i++) {
> > > + mmio_atsd_reg[i].reg = -1;
> > > + for (j = 0; j < NV_MAX_LINKS; j++) {
> > 
> > Is it safe to assume that npu_context->npdev will not change in this
> > loop? I guess it would need to be stronger than just this loop.
> 
> It is not safe to assume that npu_context->npdev won't change during this
> loop; however, I don't think it is a problem if it does, as we only read each
> element once during the invalidation.

Shouldn't that be enforced with READ_ONCE() then?

I assume that npdev->bus can't change until after the last
pnv_npu2_destroy_context() is called for an npu. In that case, the
mmu_notifier_unregister() in pnv_npu2_release_context() will block until
mmio_invalidate() is done using npdev. That seems safe enough, but a
comment somewhere about that would be useful.
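
To make the READ_ONCE() suggestion concrete, here is a userspace sketch. READ_ONCE_SKETCH() is a stand-in built on the GCC/Clang __typeof__ extension, not the kernel macro, and struct fake_dev, struct fake_context and first_link_id() are hypothetical names. The point it shows is that each slot is loaded exactly once into a local, so the NULL check and the later dereference see the same pointer even if another thread updates the slot.

```c
#include <stddef.h>

/* Userspace stand-in for READ_ONCE(): force a single volatile load. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

#define MAX_NPUS_SKETCH		2	/* hypothetical sizes */
#define MAX_LINKS_SKETCH	3

struct fake_dev {
	int id;
};

struct fake_context {
	struct fake_dev *npdev[MAX_NPUS_SKETCH][MAX_LINKS_SKETCH];
};

/* Return the id of the first populated link on NPU 'i', or -1 if none. */
static int first_link_id(struct fake_context *ctx, int i)
{
	int j;

	for (j = 0; j < MAX_LINKS_SKETCH; j++) {
		/* One load per slot; the local copy is used from here on. */
		struct fake_dev *npdev = READ_ONCE_SKETCH(ctx->npdev[i][j]);

		if (!npdev)
			continue;
		return npdev->id;
	}
	return -1;
}
```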

> 
> There are two possibilities for how this could change. pnv_npu2_init_context()
> will add an nvlink to the npdev, which will result in the TLB invalidation being
> sent to that GPU as well; that should not be a problem.
> 
> pnv_npu2_destroy_context() will remove the nvlink from npdev. If it happens
> prior to this loop it should not be a problem (as the destruction will have
> already invalidated the GPU TLB). If it happens after this loop it shouldn't be
> a problem either (it will just result in an extra TLB invalidate being sent to
> this GPU).
> 
> > > + npdev = npu_context->npdev[i][j];
> > > + if (!npdev)
> > > + continue;
> > > +
> > > + nphb = pci_bus_to_host(npdev->bus)->private_data;
> > > + npu = &nphb->npu;
> > > + mmio_atsd_reg[i].npu = npu;
> > > + mmio_atsd_reg[i].reg = get_mmio_atsd_reg(npu);
> > > + while (mmio_atsd_reg[i].reg < 0) {
> > > + mmio_atsd_reg[i].reg = get_mmio_atsd_reg(npu);
> > > + cpu_relax();
> > 
> > A cond_resched() as well if we have too many tries?
> 
> I don't think we can, as the invalidate_range() function is called under the
> ptl spin-lock and is not allowed to sleep (at least according to
> include/linux/mmu_notifier.h).
> 
> - Alistair
> 
> > Balbir
> > 
> 
> 
> 


Re: [PATCH v2] cxl: Check if PSL data-cache is available before issue flush request

2018-02-15 Thread Andrew Donnellan

On 16/02/18 02:49, Vaibhav Jain wrote:

PSL9D doesn't have a data-cache that needs to be flushed before
resetting the card. However, when cxl tries to flush the data-cache on
such a card, it times out, as the PSL_Control register never indicates
that the flush operation is complete because the data-cache is missing.
This is usually indicated in the kernel logs with this message:

"WARNING: cache flush timed out"

To fix this, the patch checks the PSL_Debug register CDC field (bit 27),
which indicates the absence of a data-cache, and sets a flag
'no_data_cache' in 'struct cxl_native' to indicate this. When
cxl_data_cache_flush() is called, it checks the flag and, if set, bails
out early without requesting a data-cache flush operation from the PSL.
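
A minimal sketch of that flow, assuming IBM bit numbering on a 64-bit register (bit 0 is the most significant bit). IBM_BIT64(), PSL_DEBUG_CDC_SKETCH, struct native_sketch and both functions are illustrative names, not the driver's own definitions, and the register value is passed in rather than read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IBM bit numbering, bit 0 = most significant of 64. */
#define IBM_BIT64(n)		(1ULL << (63 - (n)))
#define PSL_DEBUG_CDC_SKETCH	IBM_BIT64(27)	/* set: no data-cache */

/* Hypothetical stand-in for the relevant part of 'struct cxl_native'. */
struct native_sketch {
	bool no_data_cache;
};

/* Record at probe time whether the card lacks a data-cache. */
static void probe_data_cache(struct native_sketch *native, uint64_t psl_debug)
{
	native->no_data_cache = (psl_debug & PSL_DEBUG_CDC_SKETCH) != 0;
}

/* Return 0 when the flush is skipped, 1 when it would be requested. */
static int data_cache_flush_sketch(const struct native_sketch *native)
{
	if (native->no_data_cache)
		return 0;	/* nothing to flush, avoid the timeout */

	/* A real driver would request the flush and poll for completion. */
	return 1;
}
```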

Signed-off-by: Vaibhav Jain 


LGTM

Acked-by: Andrew Donnellan 

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



[RFC 3/4] powerpc/hotplug/drcinfo: Fix hot-add CPU issues

2018-02-15 Thread Michael Bringmann
This patch applies a common parse function for the ibm,drc-info
property, whose behavior can be customized by a callback function, to
the hot-add CPU code.  Candidate code is replaced by a call to the
parser that passes a pointer to a local, context-specific function
and local data.
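
The callback pattern can be sketched as follows. This is a simplified userspace illustration: struct drc_range_sketch, drc_ranges_walk() and match_index() are hypothetical, the walker takes an already-decoded array instead of parsing the device tree, and the match also checks the stride, which is an assumption that goes beyond a plain start/last range test.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of one decoded ibm,drc-info entry. */
struct drc_range_sketch {
	uint32_t drc_index_start;
	uint32_t num_sequential_elems;
	uint32_t sequential_inc;
};

typedef int (*drc_check_fn)(const struct drc_range_sketch *drc, void *data);

/* Call 'fn' on each range; stop at the first nonzero return value. */
static int drc_ranges_walk(const struct drc_range_sketch *ranges, size_t n,
			   drc_check_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int rc = fn(&ranges[i], data);

		if (rc)
			return rc;
	}
	return 0;
}

/* Caller-specific context, passed through the walker untouched. */
struct match_index_ctx {
	uint32_t target;
};

/* Callback: return 1 if the target DRC index is generated by this range. */
static int match_index(const struct drc_range_sketch *drc, void *data)
{
	const struct match_index_ctx *ctx = data;
	uint32_t k;

	for (k = 0; k < drc->num_sequential_elems; k++) {
		if (drc->drc_index_start + k * drc->sequential_inc ==
		    ctx->target)
			return 1;
	}
	return 0;
}
```

Keeping the per-caller state in a small context struct lets one walker serve both the "is this index valid" and the "collect indexes to add" cases.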

In addition, a bug in the release of the previous patch set may
break things in some of the CPU DLPAR operations.  For instance,
when attempting to hot-add a new CPU or set of CPUs, the original
patch failed to always properly calculate the available resources,
and aborted the operation.

Signed-off-by: Michael Bringmann 
Fixes: 3f38000eda48 ("powerpc/firmware: Add definitions for new drc-info firmware feature" -- end of patch series applied to powerpc next)
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c|  129 +--
 arch/powerpc/platforms/pseries/pseries_energy.c |  112 ++--
 2 files changed, 154 insertions(+), 87 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index a7d14aa7..9346b04 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -408,25 +408,67 @@ static bool dlpar_cpu_exists(struct device_node *parent, u32 drc_index)
return found;
 }

-static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index)
+static bool check_cpu_drc_index(struct device_node *parent,
+   int (*checkRun)(struct of_drc_info *drc,
+   void *data,
+   void *not_used,
+   int *ret_code),
+   void *cdata)
 {
-   bool found = false;
-   int rc, index;
+   int found = 0;
+
+   if (firmware_has_feature(FW_FEATURE_DRC_INFO)) {
+   found = drc_info_parser(parent, checkRun, "CPU", cdata);
+   } else {
+   int rc, index = 0;

-   index = 0;
-   while (!found) {
-   u32 drc;
+   while (!found) {
+   u32 drc;

-   rc = of_property_read_u32_index(parent, "ibm,drc-indexes",
+   rc = of_property_read_u32_index(parent,
+   "ibm,drc-indexes",
index++, &drc);
-   if (rc)
-   break;
+   if (rc)
+   break;
+   found = checkRun(NULL, cdata, &drc, NULL);
+   }
+   }

-   if (drc == drc_index)
-   found = true;
+   return (bool)found;
+}
+
+struct valid_cpu_drc_index_struct {
+   u32 targ_drc_index;
+};
+
+static int valid_cpu_drc_index_checkRun(struct of_drc_info *drc,
+   void *idata,
+   void *drc_index,
+   int *ret_code)
+{
+   struct valid_cpu_drc_index_struct *cdata = idata;
+
+   if (drc) {
+   if ((drc->drc_index_start <= cdata->targ_drc_index) &&
+   (cdata->targ_drc_index <= drc->last_drc_index)) {
+   (*ret_code) = 1;
+   return 1;
+   }
+   } else {
+   if (*((u32*)drc_index) == cdata->targ_drc_index) {
+   (*ret_code) = 1;
+   return 1;
+   }
}
+   return 0;
+}

-   return found;
+static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index)
+{
+   struct valid_cpu_drc_index_struct cdata = { drc_index };
+
+   return check_cpu_drc_index(parent, valid_cpu_drc_index_checkRun,
+   &cdata);
 }

 static ssize_t dlpar_cpu_add(u32 drc_index)
@@ -718,11 +760,45 @@ static int dlpar_cpu_remove_by_count(u32 cpus_to_remove)
return rc;
 }

+struct find_dlpar_cpus_to_add_struct {
+   struct device_node *parent;
+   u32 *cpu_drcs;
+   u32 cpus_to_add;
+   u32 cpus_found;
+};
+
+static int find_dlpar_cpus_to_add_checkRun(struct of_drc_info *drc,
+   void *idata,
+   void *drc_index,
+   int *ret_code)
+{
+   struct find_dlpar_cpus_to_add_struct *cdata = idata;
+
+   if (drc) {
+   int k;
+
+   for (k = 0; (k < drc->num_sequential_elems) &&
+   (cdata->cpus_found < cdata->cpus_to_add); k++) {
+   u32 idrc = drc->drc_index_start +
+   (k * drc->sequential_inc);
+
+   if (dlpar_cpu_exists(cdata->parent, idrc))
+   continue;
+   cdata->cpu_drcs[cdata->cpus_found++] = idrc;
+   }
+   } else {
+  

[RFC] powerpc/kernel: Add 'ibm,thread-groups' property for CPU allocation

2018-02-15 Thread Michael Bringmann
Add code to parse the new property 'ibm,thread-groups' when it is
present.  The content of this property explicitly defines the number
of threads per core as well as the PowerPC 'threads_core_mask'.
The design provides a common device-tree for both P9 normal core and
P9 fused core systems.  The new property has been observed to be
available on P9 pHyp systems, but it may not be present on OpenPower
BMC systems.

The property updates the kernel to know which CPUs/threads of each
core are actually present, and then use the map when adding cores
to the system at boot, or during hotplug operations.

* Previously, the information about the number of threads per core
  was inferred solely from the "ibm,ppc-interrupt-server#s" property
  in the system device tree.
* Also previous to this property, the mask of threads per CPU was
  inferred to be a strict linear series from 0..(nthreads-1).
* There may be a different thread group mask for each core in the
  system.
* Also after reading the property, we can determine which of the
  possible threads we are allowed to online for each CPU.  It is no
  longer a simple linear sequence, but may be discontinuous e.g.
  activate threads 1,2,3,5,6,7 on a core instead of 0-5 sequentially.
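
The discontinuous case in the last bullet can be sketched with a plain 32-bit mask. This is a userspace illustration only: thread_mask_from_list() and next_thread() are hypothetical helpers, and the kernel code would use cpumask_t and its iterators instead.

```c
#include <stdint.h>

/* Build a mask with one bit per listed thread; ids >= 32 are ignored. */
static uint32_t thread_mask_from_list(const unsigned int *threads,
				      unsigned int n)
{
	uint32_t mask = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (threads[i] < 32)
			mask |= UINT32_C(1) << threads[i];
	}
	return mask;
}

/*
 * Return the next thread set in 'mask' after 'prev' (pass -1 to start),
 * or -1 when no further thread is set.
 */
static int next_thread(uint32_t mask, int prev)
{
	int t;

	for (t = prev + 1; t < 32; t++) {
		if (mask & (UINT32_C(1) << t))
			return t;
	}
	return -1;
}
```

Iterating with next_thread() visits only the threads named by the mask, rather than assuming a linear 0..(nthreads-1) sequence.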

In the event of LPAR migration, we also provide a hook to re-process
the property in case it has changed.  Rules about fused-core
and split-core migration are outside the scope of this change, however.
We update the 'ppc_thread_group_mask' for subsequent use by DLPAR
operations.  It is the responsibility of the user to put the source
system into SMT4 mode when moving from a fused-core to split-core
target.

Implementation of the "ibm,thread-groups" property is spread across
a few files in the powerpc specific code:

* prom.c: Parse the property and create 'ppc_thread_group_mask'.
  Use the mask in operation of early_init_dt_scan_cpus().
* setup-common.c: Parse the property, create 'ppc_thread_group_mask',
  and use the value in cpu_init_thread_core_maps() and
  smp_setup_cpu_maps().
* hotplug-cpu.c: Use 'ppc_thread_group_mask' in several locations
  where the code previously expected to iterate over a
  linear series of active threads (0..nthreads-1).
* mobility.c: Look for and process changes to the thread group mask
  in the context of post-migration topology changes.

Note that the "ibm,thread-groups" property also carries 'thread-group'
semantics, i.e. it defines one or more subgroups of the available
threads, each group to be used for a specific class of task.
Translating thread group semantics into Linux kernel features
is TBD.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/cputhreads.h|6 +
 arch/powerpc/kernel/setup-common.c   |  136 --
 arch/powerpc/platforms/pseries/hotplug-cpu.c |   14 ++-
 arch/powerpc/platforms/pseries/mobility.c|6 +
 4 files changed, 150 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
index d71a909..df6ade9 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -31,6 +31,12 @@
 #define threads_core_mask  (*get_cpu_mask(0))
 #endif
 
+extern cpumask_t ppc_thread_group_mask;
+
+extern int process_thread_group_mask(struct device_node *dn,
+   const __be32 *prop, int prop_len);
+
+
 /* cpu_thread_mask_to_cores - Return a cpumask of one per cores
  *hit by the argument
  *
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 8fd3a70..1102d12 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -416,13 +416,18 @@ void __init check_for_initrd(void)
 EXPORT_SYMBOL_GPL(threads_shift);
 EXPORT_SYMBOL_GPL(threads_core_mask);
 
-static void __init cpu_init_thread_core_maps(int tpc)
+cpumask_t ppc_thread_group_mask;
+EXPORT_SYMBOL_GPL(ppc_thread_group_mask);
+
+static void __init cpu_init_thread_core_maps(int tpc,
+   cpumask_t *thread_group_mask)
 {
int i;
 
threads_per_core = tpc;
threads_per_subcore = tpc;
cpumask_clear(&threads_core_mask);
+   DBG("INFO: Entry %s (%d)\n", __FUNCTION__, tpc);
 
/* This implementation only supports power of 2 number of threads
 * for simplicity and performance
@@ -432,12 +437,112 @@ static void __init cpu_init_thread_core_maps(int tpc)
 
for (i = 0; i < tpc; i++)
cpumask_set_cpu(i, &threads_core_mask);
+   cpumask_and(&threads_core_mask, &threads_core_mask, thread_group_mask);
 
printk(KERN_INFO "CPU maps initialized for %d thread%s per core\n",
   tpc, tpc > 1 ? "s" : "");
printk(KERN_DEBUG " (thread shift is %d)\n", threads_shift);
 }
 
+int process_thread_group_mask(struct device_node *dn,
+

[RFC 3/3] postmigration/memory: Associativity & ibm,dynamic-memory-v2

2018-02-15 Thread Michael Bringmann
postmigration/memory: Now apply changes to the associativity of memory
blocks described by the 'ibm,dynamic-memory-v2' property regarding
the topology of LPARs in Post Migration events.

* Extend the previous work done for the 'ibm,associativity-lookup-array'
  to apply to either property 'ibm,dynamic-memory' or
  'ibm,dynamic-memory-v2', whichever is present.
* Add new code to parse the 'ibm,dynamic-memory-v2' property looking
  for differences in block 'assignment', associativity indexes per
  block, and any other difference currently known.

When block differences are recognized, the memory block may be removed,
added, or updated depending upon the state of the new device tree
property and differences from the migrated value of the property.
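
Locating one LMB inside the 'ibm,dynamic-memory-v2' sets can be sketched as below. It is an illustration on already-decoded (host-endian) values; struct lmb_set_sketch and lmb_set_find() are hypothetical names, and the sketch assumes that DRC indexes within a set are consecutive.

```c
#include <stdint.h>

/* Hypothetical, host-endian view of the fields needed from one LMB set. */
struct lmb_set_sketch {
	uint32_t num_seq_lmbs;
	uint64_t base_address;
	uint32_t drc_index;	/* DRC index of the first LMB in the set */
};

/*
 * Find the base address of the LMB with the given DRC index.
 * Returns 0 and fills *base_addr on success, -1 if no set contains it.
 */
static int lmb_set_find(const struct lmb_set_sketch *sets, unsigned int n_sets,
			uint32_t drc_index, uint64_t lmb_size,
			uint64_t *base_addr)
{
	unsigned int i;

	for (i = 0; i < n_sets; i++) {
		uint32_t first = sets[i].drc_index;
		uint32_t offset;

		if (drc_index < first)
			continue;

		offset = drc_index - first;
		if (offset < sets[i].num_seq_lmbs) {
			*base_addr = sets[i].base_address +
				     (uint64_t)offset * lmb_size;
			return 0;
		}
	}
	return -1;
}
```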

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/prom.h |   12 ++
 arch/powerpc/platforms/pseries/hotplug-memory.c |  171 ++-
 2 files changed, 174 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 825bd59..e16ef0f 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -92,6 +92,18 @@ struct of_drconf_cell {
u32 flags;
 };
 
+/* The of_drconf_cell_v2 struct defines the layout of the LMB array
+ * specified in the device tree property
+ * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2
+ */
+struct of_drconf_cell_v2 {
+   u32 num_seq_lmbs;
+   u64 base_address;
+   u32 drc_index;
+   u32 aa_index;
+   u32 flags;
+} __attribute__((packed));
+
 #define DRCONF_MEM_ASSIGNED    0x00000008
 #define DRCONF_MEM_AI_INVALID  0x00000040
 #define DRCONF_MEM_RESERVED    0x00000080
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 04208b0..96eaa9a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -1172,14 +1172,112 @@ static int pseries_update_drconf_memory(struct of_reconfig_data *pr)
return rc;
 }
 
+static inline int pseries_memory_v2_find_drc(u32 drc_index,
+   u64 *base_addr, unsigned long memblock_size,
+   struct of_drconf_cell_v2 **drmem,
+   struct of_drconf_cell_v2 *last_drmem)
+{
+   struct of_drconf_cell_v2 *dm = (*drmem);
+
+   while (dm < last_drmem) {
+   if ((be32_to_cpu(dm->drc_index) <= drc_index) &&
+   (drc_index <= (be32_to_cpu(dm->drc_index)+
+   be32_to_cpu(dm->num_seq_lmbs)-1))) {
+   int offset = drc_index - be32_to_cpu(dm->drc_index);
+   (*base_addr) = be64_to_cpu(dm->base_address) +
+   (offset * memblock_size);
+   break;
+   } else if (drc_index > (be32_to_cpu(dm->drc_index)+
+   be32_to_cpu(dm->num_seq_lmbs)-1)) {
+   dm++;
+   (*drmem) = dm;
+   } else if (be32_to_cpu(dm->drc_index) > drc_index) {
+   return -1;
+   }
+   }
+
+   return 0;
+}
+
+static int pseries_update_drconf_memory_v2(struct of_reconfig_data *pr)
+{
+   struct of_drconf_cell_v2 *new_drmem, *old_drmem, *last_old_drmem;
+   unsigned long memblock_size;
+   u32 new_entries, old_entries;
+   u64 old_base_addr;
+   __be32 *p;
+   int i, rc = 0;
+
+   if (rtas_hp_event)
+   return 0;
+
+   memblock_size = pseries_memory_block_size();
+   if (!memblock_size)
+   return -EINVAL;
+
+   /* The first int of the property is the number of lmb's
+* described by the property. This is followed by an array
+* of of_drconf_cell_v2 entries. Get the number of entries
+* and skip to the array of of_drconf_cell_v2's.
+*/
+   p = (__be32 *) pr->old_prop->value;
+   if (!p)
+   return -EINVAL;
+   old_entries = be32_to_cpu(*p++);
+   old_drmem = (struct of_drconf_cell_v2 *)p;
+   last_old_drmem = old_drmem +
+   (sizeof(struct of_drconf_cell_v2) * old_entries);
+
+   p = (__be32 *)pr->prop->value;
+   new_entries = be32_to_cpu(*p++);
+   new_drmem = (struct of_drconf_cell_v2 *)p;
+
+   for (i = 0; i < new_entries; i++) {
+   int j;
+   u32 new_drc_index = be32_to_cpu(new_drmem->drc_index);
+
+   for (j = 0; j < new_drmem->num_seq_lmbs; j++) {
+   if (!pseries_memory_v2_find_drc(new_drc_index+j,
+   &old_base_addr,
+   memblock_size,
+   &old_drmem,
+   last_old_drmem)) {
+ 

[RFC 2/3] postmigration/memory: Review assoc lookup array changes

2018-02-15 Thread Michael Bringmann
postmigration/memory: In an LPAR migration scenario, the property
"ibm,associativity-lookup-arrays" may change.  In the event that a
row of the array differs, locate all assigned memory blocks with that
'aa_index' and 're-add' them to the system memory block data structures.
In the process of the 're-add', the appropriate entry of the property
'ibm,dynamic-memory' would be updated as well as any other applicable
system data structures.
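
The row comparison can be sketched in userspace C as follows. ala_changed_rows() is a hypothetical helper working on plain uint32_t arrays, and treating rows beyond the old row count as changed is an assumption of this sketch. Note that the comparison length is given in bytes (cells times the cell size).

```c
#include <stdint.h>
#include <string.h>

/*
 * Mark which rows of the new lookup array differ from the old one.
 * Rows that did not exist in the old array are treated as changed.
 * 'changed' must have room for new_n entries; returns the change count.
 */
static unsigned int ala_changed_rows(const uint32_t *old_rows, uint32_t old_n,
				     const uint32_t *new_rows, uint32_t new_n,
				     uint32_t row_cells, unsigned char *changed)
{
	unsigned int count = 0;
	uint32_t i;

	for (i = 0; i < new_n; i++) {
		size_t first = (size_t)i * row_cells;
		int differs = 1;

		if (i < old_n)
			differs = memcmp(&old_rows[first], &new_rows[first],
					 row_cells * sizeof(uint32_t)) != 0;

		changed[i] = (unsigned char)differs;
		count += (unsigned int)differs;
	}
	return count;
}
```

Each row flagged here corresponds to an 'aa_index' whose assigned memory blocks would then be candidates for the 're-add'.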

Signed-off-by: Michael Bringmann 
---
Changes in RFC:
  -- Simplify code to update memory nodes during mobility checks.
 Remove functions to generate extra HP_ELOG messages in favor
 of direct function calls to dlpar_memory_readd_by_index.
  -- Resubmit as RFC pending further integration with LMB changes
 by Nathan Fontenot
---
 arch/powerpc/platforms/pseries/hotplug-memory.c |  121 +++
 1 file changed, 121 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 0e2ae20..04208b0 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -1172,6 +1172,124 @@ static int pseries_update_drconf_memory(struct of_reconfig_data *pr)
return rc;
 }
 
+struct assoc_arrays {
+   u32 n_arrays;
+   u32 array_sz;
+   const __be32 *arrays;
+};
+
+static int pseries_update_ala_memory_aai(int aa_index,
+   struct property *dmprop)
+{
+   struct of_drconf_cell *drmem;
+   u32 entries;
+   __be32 *p;
+   int i;
+   int rc = 0;
+
+   p = (__be32 *) dmprop->value;
+   if (!p)
+   return -EINVAL;
+
+   /* The first int of the property is the number of lmb's
+* described by the property. This is followed by an array
+* of of_drconf_cell entries. Get the number of entries
+* and skip to the array of of_drconf_cell's.
+*/
+   entries = be32_to_cpu(*p++);
+   drmem = (struct of_drconf_cell *)p;
+
+   for (i = 0; i < entries; i++) {
+   if ((be32_to_cpu(drmem[i].aa_index) != aa_index) &&
+   (be32_to_cpu(drmem[i].flags) & DRCONF_MEM_ASSIGNED)) {
+   rc = dlpar_memory_readd_by_index(
+   be32_to_cpu(drmem[i].drc_index),
+   dmprop);
+   }
+   }
+
+   return rc;
+}
+
+static int pseries_update_ala_memory(struct of_reconfig_data *pr)
+{
+   struct assoc_arrays new_ala, old_ala;
+   struct device_node *dn;
+   struct property *dmprop;
+   __be32 *p;
+   int i, lim;
+
+   if (rtas_hp_event)
+   return 0;
+
+   dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+   if (!dn)
+   return -ENODEV;
+
+   dmprop = of_find_property(dn, "ibm,dynamic-memory", NULL);
+   if (!dmprop) {
+   of_node_put(dn);
+   return -ENODEV;
+   }
+
+   /*
+* The layout of the ibm,associativity-lookup-arrays
+* property is a number N indicating the number of
+* associativity arrays, followed by a number M
+* indicating the size of each associativity array,
+* followed by a list of N associativity arrays.
+*/
+
+   p = (__be32 *) pr->old_prop->value;
+   if (!p) {
+   of_node_put(dn);
+   return -EINVAL;
+   }
+   old_ala.n_arrays = of_read_number(p++, 1);
+   old_ala.array_sz = of_read_number(p++, 1);
+   old_ala.arrays = p;
+
+   p = (__be32 *) pr->prop->value;
+   if (!p) {
+   of_node_put(dn);
+   return -EINVAL;
+   }
+   new_ala.n_arrays = of_read_number(p++, 1);
+   new_ala.array_sz = of_read_number(p++, 1);
+   new_ala.arrays = p;
+
+   lim = (new_ala.n_arrays > old_ala.n_arrays) ? old_ala.n_arrays :
+   new_ala.n_arrays;
+
+   if (old_ala.array_sz == new_ala.array_sz) {
+
+   for (i = 0; i < lim; i++) {
+   int index = (i * new_ala.array_sz);
+
+   if (!memcmp(&old_ala.arrays[index],
+   &new_ala.arrays[index],
+   new_ala.array_sz))
+   continue;
+
+   pseries_update_ala_memory_aai(i, dmprop);
+   }
+
+   for (i = lim; i < new_ala.n_arrays; i++)
+   pseries_update_ala_memory_aai(i, dmprop);
+
+   } else {
+   /* Update all entries representing these rows;
+* as all rows have different sizes, none can
+* have equivalent values.
+*/
+   for (i = 0; i < lim; i++)
+   pseries_update_ala_memory_aai(i, dmprop);
+   }
+
+   of_node_put(dn);
+   return 0;
+}
+
 static 

[RFC 1/3] hotplug/mobility: Apply assoc updates for Post Migration Topo

2018-02-15 Thread Michael Bringmann
hotplug/mobility: Recognize more changes to the associativity of
memory blocks described by the 'ibm,dynamic-memory' and 'cpu'
properties when processing the topology of LPARs in Post Migration
events.  Previous efforts only recognized whether a memory block's
assignment had changed in the property.  Changes here include:

* Checking the aa_index values of the old/new properties and issuing
  a 'readd' for any block whose setting has changed.
* Checking for changes in CPU associativity and making 'readd' calls
  when differences are observed.
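
For illustration only, the CPU associativity check in the second bullet can be sketched in userspace C. This is not the kernel code from the patch: the helper name assoc_changed is hypothetical, and the buffers are in host byte order, whereas the real property is big-endian. The layout (a count followed by that many values) is taken from the comment in pseries_update_cpu.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether an 'ibm,associativity'-style
 * property changed.  Each buffer holds a count followed by that many
 * domain values (host byte order here, for simplicity).
 */
static bool assoc_changed(const uint32_t *old_prop, const uint32_t *new_prop)
{
	uint32_t old_entries = old_prop[0];
	uint32_t new_entries = new_prop[0];

	/* A different number of domains is always a change. */
	if (old_entries != new_entries)
		return true;

	/* Same length: a non-zero memcmp() means the values differ. */
	return memcmp(&old_prop[1], &new_prop[1],
		      old_entries * sizeof(uint32_t)) != 0;
}
```

A caller would issue the 'readd' only when this returns true.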

Signed-off-by: Michael Bringmann 
---
Changes in RFC:
  -- Simplify code to update CPU nodes during mobility checks.
 Remove functions to generate extra HP_ELOG messages in favor
 of direct function calls to dlpar_cpu_readd_by_index.
  -- Move check for "cpu" node type from pseries_update_cpu to
 pseries_smp_notifier in 'hotplug-cpu.c'
  -- Remove functions 'pseries_memory_readd_by_index' and
 'pseries_cpu_readd_by_index' as no longer needed outside of
 'mobility.c'.
  -- Update patch for recent checkin compatibility
  -- Resubmit as RFC pending further integration with LMB changes
 by Nathan Fontenot
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c|   69 +++
 arch/powerpc/platforms/pseries/hotplug-memory.c |7 ++
 2 files changed, 76 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index a7d14aa7..91ef22a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -636,6 +636,27 @@ static int dlpar_cpu_remove_by_index(u32 drc_index)
return rc;
 }
 
+static int dlpar_cpu_readd_by_index(u32 drc_index)
+{
+   int rc = 0;
+
+   pr_info("Attempting to update CPU, drc index %x\n", drc_index);
+
+   if (dlpar_cpu_remove_by_index(drc_index))
+   rc = -EINVAL;
+   else if (dlpar_cpu_add(drc_index))
+   rc = -EINVAL;
+
+   if (rc)
+   pr_info("Failed to update cpu at drc_index %lx\n",
+   (unsigned long int)drc_index);
+   else
+   pr_info("CPU at drc_index %lx was updated\n",
+   (unsigned long int)drc_index);
+
+   return rc;
+}
+
 static int find_dlpar_cpus_to_remove(u32 *cpu_drcs, int cpus_to_remove)
 {
struct device_node *dn;
@@ -826,6 +847,9 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
else
rc = -EINVAL;
break;
+   case PSERIES_HP_ELOG_ACTION_READD:
+   rc = dlpar_cpu_readd_by_index(drc_index);
+   break;
default:
pr_err("Invalid action (%d) specified\n", hp_elog->action);
rc = -EINVAL;
@@ -876,12 +900,53 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t 
count)
 
 #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
 
+static int pseries_update_cpu(struct of_reconfig_data *pr)
+{
+   u32 old_entries, new_entries;
+   __be32 *p, *old_assoc, *new_assoc;
+   int rc = 0;
+
+   /* So far, we only handle the 'ibm,associativity' property,
+* here.
+* The first int of the property is the number of domains
+* described.  This is followed by an array of level values.
+*/
+   p = (__be32 *) pr->old_prop->value;
+   if (!p)
+   return -EINVAL;
+   old_entries = be32_to_cpu(*p++);
+   old_assoc = p;
+
+   p = (__be32 *)pr->prop->value;
+   if (!p)
+   return -EINVAL;
+   new_entries = be32_to_cpu(*p++);
+   new_assoc = p;
+
+   if (old_entries == new_entries) {
+   int sz = old_entries * sizeof(int);
+
+   if (memcmp(old_assoc, new_assoc, sz))
+   rc = dlpar_cpu_readd_by_index(
+   be32_to_cpu(pr->dn->phandle));
+
+   } else {
+   rc = dlpar_cpu_readd_by_index(
+   be32_to_cpu(pr->dn->phandle));
+   }
+
+   return rc;
+}
+
 static int pseries_smp_notifier(struct notifier_block *nb,
unsigned long action, void *data)
 {
struct of_reconfig_data *rd = data;
int err = 0;
 
+   if (strcmp(rd->dn->type, "cpu"))
+   return notifier_from_errno(err);
+
switch (action) {
case OF_RECONFIG_ATTACH_NODE:
err = pseries_add_processor(rd->dn);
@@ -889,6 +954,10 @@ static int pseries_smp_notifier(struct notifier_block *nb,
case OF_RECONFIG_DETACH_NODE:
pseries_remove_processor(rd->dn);
break;
+   case OF_RECONFIG_UPDATE_PROPERTY:
+   if (!strcmp(rd->prop->name, "ibm,associativity"))
+   err = pseries_update_cpu(rd);
+   break;
}
return notifier_from_errno(err);
 }
diff --git 

[RFC 0/3] powerpc/hotplug: Fix affinity assoc for LPAR migration

2018-02-15 Thread Michael Bringmann
The migration of LPARs across Power systems affects many attributes,
including the associativity of memory blocks and CPUs.  The patches
in this set execute when a system is coming up fresh on a migration
target.  They are intended to:

* Recognize changes to the associativity of memory and CPUs recorded
  in internal data structures when compared to the latest copies in
  the device tree (e.g. ibm,dynamic-memory, ibm,dynamic-memory-v2,
  cpus).
* Recognize changes to the associativity mapping (e.g.
  ibm,associativity-lookup-arrays), locate all assigned memory blocks
  corresponding to each changed row, and readd all such blocks.
* Generate calls to other code layers to reset the data structures
  related to associativity of the CPUs and memory.
* Re-register the 'changed' entities into the target system.
  Re-registration of CPUs and memory blocks mostly entails acting as
  if they have been newly hot-added into the target system.
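
As a rough userspace sketch of the "changed row" detection described in the list above (not the code from this series): the lookup-arrays layout of N, then M, then N rows of M cells follows the comment in pseries_update_ala_memory. The struct and helper names are hypothetical, values are in host byte order, and treating every new row as changed when the row size differs is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parsed view of a lookup-arrays buffer: N rows of M cells each. */
struct lookup_arrays {
	uint32_t n_arrays;      /* N */
	uint32_t array_sz;      /* M, in cells */
	const uint32_t *arrays; /* N * M cells */
};

static struct lookup_arrays parse_lookup_arrays(const uint32_t *p)
{
	struct lookup_arrays la = { p[0], p[1], p + 2 };

	return la;
}

/*
 * Hypothetical helper: store the indexes of rows in the new property
 * that differ from the old one into 'changed' (capacity 'max') and
 * return how many were stored.  Rows that only exist in the new
 * property, or whose size changed, count as changed.
 */
static size_t changed_rows(const struct lookup_arrays *old_la,
			   const struct lookup_arrays *new_la,
			   uint32_t *changed, size_t max)
{
	uint32_t common = old_la->n_arrays < new_la->n_arrays ?
			  old_la->n_arrays : new_la->n_arrays;
	size_t row_bytes = new_la->array_sz * sizeof(uint32_t);
	size_t n = 0;
	uint32_t i;

	for (i = 0; i < new_la->n_arrays && n < max; i++) {
		size_t first = (size_t)i * new_la->array_sz;
		int same = i < common &&
			   old_la->array_sz == new_la->array_sz &&
			   !memcmp(&old_la->arrays[first],
				   &new_la->arrays[first], row_bytes);

		if (!same)
			changed[n++] = i;
	}

	return n;
}
```

Each returned index would then be matched against the aa_index of every assigned memory block to pick the blocks to readd.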

Signed-off-by: Michael Bringmann 

Michael Bringmann (3):
  hotplug/mobility: Apply assoc lookup updates for Post Migration Topo
  postmigration/memory: Review assoc lookup array changes
  postmigration/memory: Associativity & 'ibm,dynamic-memory-v2'
---
Changes in RFC:
  -- Rename pseries_update_drconf_cpu to pseries_update_cpu
  -- Simplify code to update CPU nodes during mobility checks.
 Remove functions to generate extra HP_ELOG messages in favor
 of direct function calls to dlpar_cpu_readd_by_index, or
 dlpar_memory_readd_by_index.
  -- Move check for "cpu" node type from pseries_update_cpu to
 pseries_smp_notifier in 'hotplug-cpu.c'
  -- Remove functions 'pseries_memory_readd_by_index' and
 'pseries_cpu_readd_by_index' as no longer needed outside of
 'mobility.c'.
  -- Update patch for recent checkin compatibility
  -- Resubmit as RFC pending further integration with LMB changes
 by Nathan Fontenot



Re: [PATCH] powerpc/eeh: Add conditional check on notify_resume

2018-02-15 Thread Andrew Donnellan

On 16/02/18 05:49, Bryant G. Ly wrote:

From: "Juan J. Alvarez" 

EEH structure is not populated with function
notify resume when running on systems that do not support
it, i.e: BMC. Hence adding a conditional check for NULL for


Seems to me that by "BMC" you really mean "powernv platform"?


systems that don't add function notify_resume.

Signed-off-by: Juan J. Alvarez 
Reviewed-by: Bryant G. Ly 
Tested-by: Carol L. Soto 


Reviewed-by: Andrew Donnellan 


---
  arch/powerpc/kernel/eeh_driver.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index beea218..0c0b66f 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
eeh_pcid_put(dev);
pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
  #ifdef CONFIG_PCI_IOV
-   eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
+   if (eeh_ops->notify_resume && eeh_dev_to_pdn(edev))
+   eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
  #endif
return NULL;
  }



--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



[RFC 4/4] powerpc/hotplug/drcinfo: Improve code for ibm,drc-info device processing

2018-02-15 Thread Michael Bringmann
This patch extends the use of a common parse function for the
ibm,drc-info property that can be modified by a callback function
to the hotplug device processing.  Candidate code is replaced by
a call to the parser, passing a pointer to a local context-specific
function and local data.

In addition, the original set missed several opportunities to compress
and reuse common code which this patch attempts to provide.

Finally, a bug with the registration of slots was observed on some
systems, and the code was rewritten to prevent its reoccurrence.

Signed-off-by: Michael Bringmann 
Fixes: 3f38000eda48 ("powerpc/firmware: Add definitions for new drc-info firmware
feature" -- end of patch series applied to powerpc next)
---
 drivers/pci/hotplug/rpaphp_core.c |  188 ++---
 1 file changed, 130 insertions(+), 58 deletions(-)

diff --git a/drivers/pci/hotplug/rpaphp_core.c 
b/drivers/pci/hotplug/rpaphp_core.c
index 477a21c..2c3992d 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -236,49 +236,52 @@ static int rpaphp_check_drc_props_v1(struct device_node 
*dn, char *drc_name,
return -EINVAL;
 }
 
-static int rpaphp_check_drc_props_v2(struct device_node *dn, char *drc_name,
-   char *drc_type, unsigned int my_index)
+struct check_drc_props_v2_struct {
+   char *drc_name;
+   char *drc_type;
+unsigned int my_index;
+};
+
+static int check_drc_props_v2_checkRun(struct of_drc_info *drc,
+void *idata, void *not_used,
+   int *ret_code)
 {
-   struct property *info;
-   unsigned int entries;
-   struct of_drc_info drc;
-   const __be32 *value;
+   struct check_drc_props_v2_struct *cdata = idata;
char cell_drc_name[MAX_DRC_NAME_LEN];
-   int j, fndit;
-
-   info = of_find_property(dn->parent, "ibm,drc-info", NULL);
-   if (info == NULL)
-   return -EINVAL;
-
-   value = of_prop_next_u32(info, NULL, &entries);
-   if (!value)
-   return -EINVAL;
-   value++;
-
-   for (j = 0; j < entries; j++) {
-   of_read_drc_info_cell(&info, &value, &drc);
 
-   /* Should now know end of current entry */
+   (*ret_code) = -EINVAL;
 
-   if (my_index > drc.last_drc_index)
-   continue;
+   if (cdata->my_index > drc->last_drc_index)
+   return 0;
 
-   fndit = 1;
-   break;
+   /* Found drc_index.  Now match the rest. */
+   sprintf(cell_drc_name, "%s%d", drc->drc_name_prefix, 
+   cdata->my_index - drc->drc_index_start +
+   drc->drc_name_suffix_start);
+
+   if (((cdata->drc_name == NULL) ||
+(cdata->drc_name && !strcmp(cdata->drc_name, cell_drc_name))) &&
+   ((cdata->drc_type == NULL) ||
+(cdata->drc_type && !strcmp(cdata->drc_type, drc->drc_type)))) {
+   (*ret_code) = 0;
+   return 1;
}
-   /* Found it */
 
-   if (fndit)
-   sprintf(cell_drc_name, "%s%d", drc.drc_name_prefix, 
-   my_index);
+return 0;
+}
 
-   if (((drc_name == NULL) ||
-(drc_name && !strcmp(drc_name, cell_drc_name))) &&
-   ((drc_type == NULL) ||
-(drc_type && !strcmp(drc_type, drc.drc_type))))
-   return 0;
+static int rpaphp_check_drc_props_v2(struct device_node *dn, char *drc_name,
+   char *drc_type, unsigned int my_index)
+{
+   struct device_node *root = dn;
+   struct check_drc_props_v2_struct cdata = {
+   drc_name, drc_type, be32_to_cpu(my_index) };
 
-   return -EINVAL;
+   if (drc_type && strcmp(drc_type, "SLOT"))
+   root = dn->parent;
+
+   return drc_info_parser(root, check_drc_props_v2_checkRun,
+   drc_type, &cdata);
 }
 
 int rpaphp_check_drc_props(struct device_node *dn, char *drc_name,
@@ -301,7 +304,6 @@ int rpaphp_check_drc_props(struct device_node *dn, char 
*drc_name,
 }
 EXPORT_SYMBOL_GPL(rpaphp_check_drc_props);
 
-
 static int is_php_type(char *drc_type)
 {
unsigned long value;
@@ -361,17 +363,40 @@ static int is_php_dn(struct device_node *dn, const int 
**indexes,
  *
  * To remove a slot, it suffices to call rpaphp_deregister_slot().
  */
-int rpaphp_add_slot(struct device_node *dn)
+
+static int rpaphp_add_slot_common(struct device_node *dn,
+   u32 drc_index, char *drc_name, char *drc_type,
+   u32 drc_power_domain)
 {
struct slot *slot;
int retval = 0;
-   int i;
+
+   slot = alloc_slot_struct(dn, drc_index, drc_name,
+drc_power_domain);
+   if (!slot)
+   return -ENOMEM;
+
+   slot->type = simple_strtoul(drc_type, NULL, 10);
+
+   dbg("Found 

[RFC 2/4] powerpc/hotplug/drcinfo: Provide common parser for ibm,drc-info

2018-02-15 Thread Michael Bringmann
This patch provides a common parse function for the ibm,drc-info
property that can be modified by a callback function.  The caller
provides a pointer to the function and a pointer to their unique
data, and the parser passes the callback each entry it decodes from
the property.  The callback function returns a code indicating
whether parsing is complete or should continue, along with an error
code that may be returned to the caller.
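
The callback contract described above can be illustrated with a small userspace sketch. It is a simplification under stated assumptions, not the drc_info_parser() from this patch: it walks an already-decoded array instead of a device-tree property, drops the unused optional_data argument, and the names drc_entry, walk_drc_entries and find_index_cb are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct of_drc_info. */
struct drc_entry {
	const char *drc_type;
	unsigned int drc_index_start;
	unsigned int last_drc_index;
};

/* Return non-zero to stop the walk; report the result via *ret_code. */
typedef int (*drc_cb)(const struct drc_entry *drc, void *data,
		      int *ret_code);

static int walk_drc_entries(const struct drc_entry *entries, size_t count,
			    const char *opt_drc_type, drc_cb cb, void *data)
{
	int ret_code = -EINVAL;	/* result if no callback claims an entry */
	int done = 0;
	size_t i;

	for (i = 0; i < count && !done; i++) {
		/* Optional filter: skip entries of other types. */
		if (opt_drc_type && strcmp(opt_drc_type, entries[i].drc_type))
			continue;

		done = cb(&entries[i], data, &ret_code);
	}

	return ret_code;
}

/* Example callback: succeed once the wanted index is inside an entry. */
static int find_index_cb(const struct drc_entry *drc, void *data,
			 int *ret_code)
{
	unsigned int want = *(unsigned int *)data;

	if (want < drc->drc_index_start || want > drc->last_drc_index)
		return 0;	/* keep walking */

	*ret_code = 0;
	return 1;		/* stop */
}
```

Keeping "stop or continue" separate from the error code lets a callback end the walk early on either success or failure.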

Signed-off-by: Michael Bringmann 
Fixes: 3f38000eda48 ("powerpc/firmware: Add definitions for new drc-info firmware
feature" -- end of patch series applied to powerpc next)
---
 arch/powerpc/include/asm/prom.h |7 +++
 arch/powerpc/platforms/pseries/Makefile |2 -
 arch/powerpc/platforms/pseries/drchelpers.c |   66 +++
 3 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/pseries/drchelpers.c

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7db7958..e75963e 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -94,6 +94,13 @@ struct of_drc_info {
 extern int of_read_drc_info_cell(struct property **prop,
const __be32 **curval, struct of_drc_info *data);
 
+extern int drc_info_parser(struct device_node *dn,
+   int (*usercb)(struct of_drc_info *drc,
+   void *data,
+   void *optional_data,
+   int *ret_code),
+   char *opt_drc_type,
+   void *data);
 
 /*
  * There are two methods for telling firmware what our capabilities are.
diff --git a/arch/powerpc/platforms/pseries/Makefile 
b/arch/powerpc/platforms/pseries/Makefile
index 13eede6..38c8547 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -3,7 +3,7 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
 ccflags-$(CONFIG_PPC_PSERIES_DEBUG)+= -DDEBUG
 
 obj-y  := lpar.o hvCall.o nvram.o reconfig.o \
-  of_helpers.o \
+  of_helpers.o drchelpers.o \
   setup.o iommu.o event_sources.o ras.o \
   firmware.o power.o dlpar.o mobility.o rng.o \
   pci.o pci_dlpar.o eeh_pseries.o msi.o
diff --git a/arch/powerpc/platforms/pseries/drchelpers.c 
b/arch/powerpc/platforms/pseries/drchelpers.c
new file mode 100644
index 000..556e05d
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/drchelpers.c
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2018 Michael Bringmann , IBM
+ *
+ * pSeries specific routines for device-tree properties.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include "pseries.h"
+
+#define MAX_DRC_NAME_LEN 64
+
+int drc_info_parser(struct device_node *dn,
+   int (*usercb)(struct of_drc_info *drc,
+   void *data,
+   void *optional_data,
+   int *ret_code),
+   char *opt_drc_type,
+   void *data)
+{
+   struct property *info;
+   unsigned int entries;
+   struct of_drc_info drc;
+   const __be32 *value;
+   int j, done = 0, ret_code = -EINVAL;
+
+   info = of_find_property(dn, "ibm,drc-info", NULL);
+   if (info == NULL)
+   return -EINVAL;
+
+   value = of_prop_next_u32(info, NULL, &entries);
+   if (!value)
+   return -EINVAL;
+   value++;
+
+   for (j = 0, done = 0; (j < entries) && (!done); j++) {
+   of_read_drc_info_cell(&info, &value, &drc);
+
+   if (opt_drc_type && strcmp(opt_drc_type, drc.drc_type))
+   continue;
+
+   done = usercb(&drc, data, NULL, &ret_code);
+   }
+
+   return ret_code;
+}
+EXPORT_SYMBOL(drc_info_parser);



[RFC 1/4] powerpc/hotplug/drcinfo: Fix bugs parsing ibm,drc-info structs

2018-02-15 Thread Michael Bringmann
This patch fixes a memory parsing bug when using of_prop_next_u32
calls at the start of a structure.  Depending upon the value of the
"cur" memory pointer argument to of_prop_next_u32, the returned
memory pointer may or may not be advanced by the size of one u32.
This patch corrects the code to deal with that indexing behavior
when parsing the ibm,drc-info structs for CPUs.  The pointer also
needs to be advanced at the end of of_read_drc_info_cell for the
same reason.
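
To make the pointer behavior concrete, here is a userspace model of the iteration contract described above. It is a sketch, not the kernel helper: struct u32_prop and prop_next_u32 are stand-in names, and cells are read in host byte order rather than converted from big-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for a device-tree property made of u32 cells. */
struct u32_prop {
	const uint32_t *value;
	size_t length;		/* in bytes */
};

/*
 * With cur == NULL the first cell is read and the returned pointer
 * still points AT that cell.  Otherwise the pointer is moved forward
 * one cell before reading.
 */
static const uint32_t *prop_next_u32(const struct u32_prop *prop,
				     const uint32_t *cur, uint32_t *out)
{
	const uint32_t *next;

	if (!prop)
		return NULL;

	if (!cur) {
		next = prop->value;
	} else {
		next = cur + 1;
		if ((const char *)next >=
		    (const char *)prop->value + prop->length)
			return NULL;
	}

	*out = *next;
	return next;
}
```

After reading the entry count with cur == NULL, the returned pointer still refers to the count cell, which is why the callers in this patch add "value++" before handing the pointer to code that reads cells directly.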

Signed-off-by: Michael Bringmann 
Fixes: 3f38000eda48 ("powerpc/firmware: Add definitions for new drc-info 
firmware feature" -- end of patch series applied to powerpc next)
---
 arch/powerpc/platforms/pseries/of_helpers.c |4 +---
 arch/powerpc/platforms/pseries/pseries_energy.c |2 ++
 drivers/pci/hotplug/rpaphp_core.c   |1 +
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/of_helpers.c 
b/arch/powerpc/platforms/pseries/of_helpers.c
index 7eea891..20598b2 100644
--- a/arch/powerpc/platforms/pseries/of_helpers.c
+++ b/arch/powerpc/platforms/pseries/of_helpers.c
@@ -65,9 +65,7 @@ int of_read_drc_info_cell(struct property **prop, const 
__be32 **curval,
 
/* Get drc-index-start:encode-int */
p2 = (const __be32 *)p;
-   p2 = of_prop_next_u32(*prop, p2, &data->drc_index_start);
-   if (!p2)
-   return -EINVAL;
+   data->drc_index_start = of_read_number(p2, 1);
 
/* Get drc-name-suffix-start:encode-int */
p2 = of_prop_next_u32(*prop, p2, &data->drc_name_suffix_start);
diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c 
b/arch/powerpc/platforms/pseries/pseries_energy.c
index 6ed2212..c7d84aa 100644
--- a/arch/powerpc/platforms/pseries/pseries_energy.c
+++ b/arch/powerpc/platforms/pseries/pseries_energy.c
@@ -64,6 +64,7 @@ static u32 cpu_to_drc_index(int cpu)
value = of_prop_next_u32(info, NULL, &num_set_entries);
if (!value)
goto err_of_node_put;
+   value++;
 
for (j = 0; j < num_set_entries; j++) {
 
@@ -126,6 +127,7 @@ static int drc_index_to_cpu(u32 drc_index)
value = of_prop_next_u32(info, NULL, &num_set_entries);
if (!value)
goto err_of_node_put;
+   value++;
 
for (j = 0; j < num_set_entries; j++) {
 
diff --git a/drivers/pci/hotplug/rpaphp_core.c 
b/drivers/pci/hotplug/rpaphp_core.c
index 53902c7..477a21c 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -253,6 +253,7 @@ static int rpaphp_check_drc_props_v2(struct device_node 
*dn, char *drc_name,
value = of_prop_next_u32(info, NULL, &entries);
if (!value)
return -EINVAL;
+   value++;
 
for (j = 0; j < entries; j++) {
of_read_drc_info_cell(&info, &value, &drc);



[RFC 0/4] powerpc/drcinfo: Fix bugs 'ibm,drc-info' property

2018-02-15 Thread Michael Bringmann
This patch set corrects some errors and omissions in the previous
set of patches adding support for the "ibm,drc-info" property to
powerpc systems.

Unfortunately, some errors in the previous patch set break things
in some of the DLPAR operations.  In particular when attempting to
hot-add a new CPU or set of CPUs, the original patch failed to
properly calculate the available resources, and aborted the operation.
In addition, the original set missed several opportunities to compress
and reuse common code, especially, in the area of device processing.

Signed-off-by: Michael W. Bringmann 



Re: [RFC][PATCH bpf v2 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-15 Thread Daniel Borkmann
On 02/15/2018 05:25 PM, Daniel Borkmann wrote:
> On 02/13/2018 05:05 AM, Sandipan Das wrote:
>> The imm field of a bpf_insn is a signed 32-bit integer. For
>> JIT-ed bpf-to-bpf function calls, it stores the offset from
>> __bpf_call_base to the start of the callee function.
>>
>> For some architectures, such as powerpc64, it was found that
>> this offset may be as large as 64 bits because of which this
> >> cannot be accommodated in the imm field without truncation.
>>
>> To resolve this, we additionally make aux->func within each
>> bpf_prog associated with the functions to point to the list
>> of all function addresses determined by the verifier.
>>
>> We keep the value assigned to the off field of the bpf_insn
>> as a way to index into aux->func and also set aux->func_cnt
>> so that this can be used for performing basic upper bound
>> checks for the off field.
>>
>> Signed-off-by: Sandipan Das 
>> ---
>> v2: Make aux->func point to the list of functions determined
>> by the verifier rather than allocating a separate callee
>> list for each function.
> 
> Approach looks good to me; do you know whether s390x JIT would
> have similar requirement? I think one limitation that would still
> need to be addressed later with such approach would be regarding the
> xlated prog dump in bpftool, see 'BPF calls via JIT' in 7105e828c087
> ("bpf: allow for correlation of maps and helpers in dump"). Any
> ideas for this (potentially if we could use off + imm for calls,
> we'd get to 48 bits, but that still seems not to be enough as you say)?

One other random thought, although I'm not sure how feasible this
is for ppc64 JIT to realize ... but idea would be to have something
like the below:

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 29ca920..daa7258 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -512,6 +512,11 @@ int bpf_get_kallsym(unsigned int symnum, unsigned long 
*value, char *type,
return ret;
 }

+void * __weak bpf_jit_image_alloc(unsigned long size)
+{
+   return module_alloc(size);
+}
+
 struct bpf_binary_header *
 bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 unsigned int alignment,
@@ -525,7 +530,7 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 * random section of illegal instructions.
 */
size = round_up(proglen + sizeof(*hdr) + 128, PAGE_SIZE);
-   hdr = module_alloc(size);
+   hdr = bpf_jit_image_alloc(size);
if (hdr == NULL)
return NULL;

And ppc64 JIT could override bpf_jit_image_alloc() much as some
archs override the module_alloc() helper through a custom
implementation, usually via __vmalloc_node_range(), so we could
perhaps place the range for BPF JITed images such that they could
use the 32-bit imm in the end? There are typically not that many
progs loaded, so the range could be a bit narrower in such a case
anyway. (Not sure if this would work out though, but I thought I'd
bring it up.)
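
A small sketch of the constraint being discussed, assuming only that the call offset is the callee address minus a base address and must fit in a signed 32-bit imm. The helper name call_offset_fits_imm is made up for illustration, and the sketch relies on the usual two's-complement conversion from uint64_t to int64_t.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does (callee - base) fit in a signed 32-bit
 * immediate without truncation?
 */
static bool call_offset_fits_imm(uint64_t callee, uint64_t base)
{
	int64_t delta = (int64_t)(callee - base);

	return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

If JITed images were allocated from a range within 2 GiB of the base, a check like this would always pass, which is the point of constraining the allocator.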

Thanks,
Daniel


[PATCH] powerpc/eeh: Add conditional check on notify_resume

2018-02-15 Thread Bryant G. Ly
From: "Juan J. Alvarez" 

The EEH ops structure is not populated with the notify_resume
function when running on systems that do not support it, e.g. BMC.
Hence add a conditional check for NULL for systems that don't
provide notify_resume.
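
The guard pattern can be shown in isolation with a userspace sketch. The types below are stand-ins rather than the kernel's struct eeh_ops or struct pci_dn, and report_resume and mark_resumed are hypothetical names.

```c
#include <stddef.h>

struct pci_dn_stub {
	int resumed;
};

struct ops_stub {
	/* Optional: platforms without support leave this NULL. */
	void (*notify_resume)(struct pci_dn_stub *pdn);
};

static void mark_resumed(struct pci_dn_stub *pdn)
{
	pdn->resumed = 1;
}

/* Returns 1 if the callback ran, 0 if it was skipped. */
static int report_resume(const struct ops_stub *ops, struct pci_dn_stub *pdn)
{
	/* Call only when both the callback and its argument exist. */
	if (ops->notify_resume && pdn) {
		ops->notify_resume(pdn);
		return 1;
	}

	return 0;
}
```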

Signed-off-by: Juan J. Alvarez 
Reviewed-by: Bryant G. Ly 
Tested-by: Carol L. Soto 
---
 arch/powerpc/kernel/eeh_driver.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index beea218..0c0b66f 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -384,7 +384,8 @@ static void *eeh_report_resume(void *data, void *userdata)
eeh_pcid_put(dev);
pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
 #ifdef CONFIG_PCI_IOV
-   eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
+   if (eeh_ops->notify_resume && eeh_dev_to_pdn(edev))
+   eeh_ops->notify_resume(eeh_dev_to_pdn(edev));
 #endif
return NULL;
 }
-- 
2.7.2



Re: [PATCH 0/3] perf trace powerpc: Remove libaudit dependency for syscalls

2018-02-15 Thread Arnaldo Carvalho de Melo
Em Mon, Jan 29, 2018 at 02:04:14PM +0530, Ravi Bangoria escreveu:
> This is almost identical set of patches recently done for s390.
> 
> With this, user can run perf trace without libaudit on powerpc
> as well. Ex,
> 
>   $ make
> ... libaudit: [ OFF ]
> 
>   $ ./perf trace ls

Thanks, applied.

- Arnaldo


Re: [PATCH 0/3] perf trace powerpc: Remove libaudit dependency for syscalls

2018-02-15 Thread Arnaldo Carvalho de Melo
Em Thu, Feb 15, 2018 at 10:43:36AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 29, 2018 at 02:04:14PM +0530, Ravi Bangoria escreveu:
> > This is almost identical set of patches recently done for s390.
> > 
> > With this, user can run perf trace without libaudit on powerpc
> > as well. Ex,
> > 
> >   $ make
> > ... libaudit: [ OFF ]
> > 
> >   $ ./perf trace ls
> 
> Thanks, applied.

Ah, I already had to update the unistd.h copy to catch some new syscalls
for s/390 :-)

https://git.kernel.org/acme/c/b4ec64dc68da

- Arnaldo


Re: [RFC][PATCH bpf v2 1/2] bpf: allow 64-bit offsets for bpf function calls

2018-02-15 Thread Daniel Borkmann
On 02/13/2018 05:05 AM, Sandipan Das wrote:
> The imm field of a bpf_insn is a signed 32-bit integer. For
> JIT-ed bpf-to-bpf function calls, it stores the offset from
> __bpf_call_base to the start of the callee function.
> 
> For some architectures, such as powerpc64, it was found that
> this offset may be as large as 64 bits because of which this
> cannot be accommodated in the imm field without truncation.
> 
> To resolve this, we additionally make aux->func within each
> bpf_prog associated with the functions to point to the list
> of all function addresses determined by the verifier.
> 
> We keep the value assigned to the off field of the bpf_insn
> as a way to index into aux->func and also set aux->func_cnt
> so that this can be used for performing basic upper bound
> checks for the off field.
> 
> Signed-off-by: Sandipan Das 
> ---
> v2: Make aux->func point to the list of functions determined
> by the verifier rather than allocating a separate callee
> list for each function.

Approach looks good to me; do you know whether s390x JIT would
have similar requirement? I think one limitation that would still
need to be addressed later with such approach would be regarding the
xlated prog dump in bpftool, see 'BPF calls via JIT' in 7105e828c087
("bpf: allow for correlation of maps and helpers in dump"). Any
ideas for this (potentially if we could use off + imm for calls,
we'd get to 48 bits, but that still seems not to be enough as you say)?
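
For readers following along, the indexing scheme from the quoted description (use the off field as an index into a list of function addresses, bounded by a count) could look roughly like this in userspace C. struct func_table and callee_addr are illustrative names only, not the patch's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-program list of function addresses. */
struct func_table {
	const uint64_t *addrs;
	uint32_t func_cnt;
};

/*
 * Look up the full 64-bit callee address from a call's 16-bit 'off'
 * field.  Returns 0 when the index is out of range.
 */
static uint64_t callee_addr(const struct func_table *tbl, int16_t off)
{
	if (off < 0 || (uint32_t)off >= tbl->func_cnt)
		return 0;

	return tbl->addrs[off];
}
```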

Thanks,
Daniel


[PATCH V2] cxl: Fix timebase synchronization status on P9

2018-02-15 Thread Christophe Lombard
The PSL Timebase register is updated by the PSL to maintain the
timebase.
On P9, the Timebase value is only provided by the CAPP, as received
the last time a timebase request was performed.
The timebase requests are initiated through the adapter configuration or
application registers.
The specific sysfs entry "/sys/class/cxl/cardxx/psl_timebase_synced" is
now dynamically updated according to the content of the PSL Timebase
register.
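
As an illustration of the synchronization test only (not the driver code), the comparison can be written in userspace C as below. The 512 MHz timebase frequency is an assumption of this sketch, the 16 usec limit comes from the patch, and timebase_synced is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define TB_HZ         512000000ULL	/* assumed timebase frequency */
#define SYNC_LIMIT_NS 16000ULL		/* 16 usecs, as in the patch */

/*
 * Hypothetical helper: the two timebases count as synchronized when
 * their absolute difference is below the limit.  The limit is
 * converted to ticks once, so a large difference cannot overflow.
 */
static bool timebase_synced(uint64_t core_tb, uint64_t psl_tb)
{
	uint64_t delta = core_tb > psl_tb ? core_tb - psl_tb
					  : psl_tb - core_tb;
	uint64_t limit_ticks = SYNC_LIMIT_NS * TB_HZ / 1000000000ULL;

	return delta < limit_ticks;
}
```

Taking the difference in 64-bit unsigned arithmetic sidesteps any truncation that could occur when a 64-bit difference is stored in an int.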

Signed-off-by: Christophe Lombard 

---
Changelog[v2]
 - Missing Signed-off-by
 - Spaces required around the ':'
---
 drivers/misc/cxl/pci.c   | 35 +++
 drivers/misc/cxl/sysfs.c | 14 ++
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 758842f..270afb5 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -612,8 +612,6 @@ static u64 timebase_read_xsl(struct cxl *adapter)
 
 static void cxl_setup_psl_timebase(struct cxl *adapter, struct pci_dev *dev)
 {
-   u64 psl_tb;
-   int delta;
unsigned int retry = 0;
struct device_node *np;
 
@@ -641,20 +639,25 @@ static void cxl_setup_psl_timebase(struct cxl *adapter, 
struct pci_dev *dev)
cxl_p1_write(adapter, CXL_PSL_Control, 0x);
cxl_p1_write(adapter, CXL_PSL_Control, CXL_PSL_Control_tb);
 
-   /* Wait until CORE TB and PSL TB difference <= 16usecs */
-   do {
-   msleep(1);
-   if (retry++ > 5) {
-   dev_info(&dev->dev, "PSL timebase can't synchronize\n");
-   return;
-   }
-   psl_tb = adapter->native->sl_ops->timebase_read(adapter);
-   delta = mftb() - psl_tb;
-   if (delta < 0)
-   delta = -delta;
-   } while (tb_to_ns(delta) > 16000);
-
-   adapter->psl_timebase_synced = true;
+   if (cxl_is_power8()) {
+   u64 psl_tb;
+   int delta;
+
+   /* Wait until CORE TB and PSL TB difference <= 16usecs */
+   do {
+   msleep(1);
+   if (retry++ > 5) {
+   dev_info(&dev->dev, "PSL timebase can't synchronize\n");
+   return;
+   }
+   psl_tb = adapter->native->sl_ops->timebase_read(adapter);
+   delta = mftb() - psl_tb;
+   if (delta < 0)
+   delta = -delta;
+   } while (tb_to_ns(delta) > 16000);
+
+   adapter->psl_timebase_synced = true;
+   }
return;
 }
 
diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index a8b6d6a..5384c59 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -63,6 +63,20 @@ static ssize_t psl_timebase_synced_show(struct device 
*device,
 {
struct cxl *adapter = to_cxl_adapter(device);
 
+   /*
+* On P9, the Timebase value is only updated as a result of
+* PSL TimeBase command sent to CAPP.
+*/
+   if (cxl_is_power9()) {
+   u64 psl_tb;
+   int delta;
+
+   psl_tb = cxl_p1_read(adapter, CXL_PSL9_Timebase);
+   delta = mftb() - psl_tb;
+   if (delta < 0)
+   delta = -delta;
+   adapter->psl_timebase_synced = true ? tb_to_ns(delta) < 16000 : false;
+   }
return scnprintf(buf, PAGE_SIZE, "%i\n", adapter->psl_timebase_synced);
 }
 
-- 
2.7.4



[PATCH] cxl: Fix timebase synchronization status on P9

2018-02-15 Thread Christophe Lombard
The PSL Timebase register is updated by the PSL to maintain the
timebase.
On P9, the Timebase value is only provided by the CAPP, as received
the last time a timebase request was performed.
The timebase requests are initiated through the adapter configuration or
application registers.
The specific sysfs entry "/sys/class/cxl/cardxx/psl_timebase_synced" is
now dynamically updated according to the content of the PSL Timebase
register.
---
 drivers/misc/cxl/pci.c   | 35 +++
 drivers/misc/cxl/sysfs.c | 14 ++
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 758842f..270afb5 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -612,8 +612,6 @@ static u64 timebase_read_xsl(struct cxl *adapter)
 
 static void cxl_setup_psl_timebase(struct cxl *adapter, struct pci_dev *dev)
 {
-   u64 psl_tb;
-   int delta;
unsigned int retry = 0;
struct device_node *np;
 
@@ -641,20 +639,25 @@ static void cxl_setup_psl_timebase(struct cxl *adapter, 
struct pci_dev *dev)
cxl_p1_write(adapter, CXL_PSL_Control, 0x);
cxl_p1_write(adapter, CXL_PSL_Control, CXL_PSL_Control_tb);
 
-   /* Wait until CORE TB and PSL TB difference <= 16usecs */
-   do {
-   msleep(1);
-   if (retry++ > 5) {
-   dev_info(&dev->dev, "PSL timebase can't synchronize\n");
-   return;
-   }
-   psl_tb = adapter->native->sl_ops->timebase_read(adapter);
-   delta = mftb() - psl_tb;
-   if (delta < 0)
-   delta = -delta;
-   } while (tb_to_ns(delta) > 16000);
-
-   adapter->psl_timebase_synced = true;
+   if (cxl_is_power8()) {
+   u64 psl_tb;
+   int delta;
+
+   /* Wait until CORE TB and PSL TB difference <= 16usecs */
+   do {
+   msleep(1);
+   if (retry++ > 5) {
+   dev_info(&dev->dev, "PSL timebase can't synchronize\n");
+   return;
+   }
+   psl_tb = 
adapter->native->sl_ops->timebase_read(adapter);
+   delta = mftb() - psl_tb;
+   if (delta < 0)
+   delta = -delta;
+   } while (tb_to_ns(delta) > 16000);
+
+   adapter->psl_timebase_synced = true;
+   }
return;
 }
 
diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index a8b6d6a..f3bfc5a 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -63,6 +63,20 @@ static ssize_t psl_timebase_synced_show(struct device *device,
 {
struct cxl *adapter = to_cxl_adapter(device);
 
+   /*
+* On P9, the Timebase value is only updated as a result of
+* PSL TimeBase command sent to CAPP.
+*/
+   if (cxl_is_power9()) {
+   u64 psl_tb;
+   int delta;
+
+   psl_tb = cxl_p1_read(adapter, CXL_PSL9_Timebase);
+   delta = mftb() - psl_tb;
+   if (delta < 0)
+   delta = -delta;
+   adapter->psl_timebase_synced = true ? tb_to_ns(delta) < 16000 : false;
+   }
return scnprintf(buf, PAGE_SIZE, "%i\n", adapter->psl_timebase_synced);
 }
 
-- 
2.7.4
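
The synchronization test used in the patch above (absolute difference between the core timebase and the PSL timebase, converted to nanoseconds and compared with 16 usecs) can be sketched in plain C outside the kernel. This is only an illustration, not the driver code: the helper name timebase_synced(), the two constants as defined here, and the 512 MHz timebase frequency used in the usage note are assumptions. The sketch also uses an unsigned 64-bit difference instead of the patch's int delta.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both constants are defined locally for this sketch. */
#define NSEC_PER_SEC      1000000000ULL
#define SYNC_THRESHOLD_NS 16000ULL   /* 16 usecs, the limit used in the patch */

/*
 * Hypothetical helper: report whether two timebase readings are within
 * 16 usecs of each other. tb_hz is the timebase frequency in Hz and is
 * an input of this sketch, not something read from hardware.
 */
static bool timebase_synced(uint64_t core_tb, uint64_t psl_tb, uint64_t tb_hz)
{
    uint64_t delta;

    assert(tb_hz != 0);

    /* Absolute difference in ticks, without relying on signed arithmetic. */
    delta = core_tb > psl_tb ? core_tb - psl_tb : psl_tb - core_tb;

    /*
     * A difference this large is out of sync for any realistic frequency;
     * returning early also keeps the multiplication below from overflowing.
     */
    if (delta > UINT64_MAX / NSEC_PER_SEC)
        return false;

    /* Convert ticks to nanoseconds and compare with the threshold. */
    return delta * NSEC_PER_SEC / tb_hz < SYNC_THRESHOLD_NS;
}
```

With an assumed 512 MHz timebase, timebase_synced(core, psl, 512000000) returns true while the two counters differ by fewer than 8192 ticks (16 usecs) and false otherwise.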



[PATCH v2] cxl: Check if PSL data-cache is available before issue flush request

2018-02-15 Thread Vaibhav Jain
PSL9D doesn't have a data-cache that needs to be flushed before
resetting the card. However, when cxl tries to flush the data-cache on
such a card, it times out because the PSL_Control register never
indicates that the flush operation completed, due to the missing
data-cache. This is usually indicated in the kernel logs with this
message:

"WARNING: cache flush timed out"

To fix this, the patch checks the PSL_Debug register CDC field (bit 27),
which indicates the absence of a data-cache, and sets a flag
'no_data_cache' in 'struct cxl_native' accordingly. When
cxl_data_cache_flush() is called, it checks the flag and, if it is set,
bails out early without requesting a data-cache flush from the PSL.

Signed-off-by: Vaibhav Jain 
---
Change-log:
v2  -> Changed the dev_info to dev_dbg (Fred)
   Removed the check for DD1.0 chips (Fred)
---
 drivers/misc/cxl/cxl.h|  4 
 drivers/misc/cxl/native.c | 11 ++-
 drivers/misc/cxl/pci.c| 18 --
 3 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 4f015da78f28..4949b8d5a748 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -369,6 +369,9 @@ static const cxl_p2n_reg_t CXL_PSL_WED_An = {0x0A0};
 #define CXL_PSL_TFC_An_AE (1ull << (63-30)) /* Restart PSL with address error */
 #define CXL_PSL_TFC_An_R  (1ull << (63-31)) /* Restart PSL transaction */
 
+/** CXL_PSL_DEBUG */
+#define CXL_PSL_DEBUG_CDC  (1ull << (63-27)) /* Coherent Data cache support */
+
 /** CXL_XSL9_IERAT_ERAT - CAIA 2 **/
 #define CXL_XSL9_IERAT_MLPID(1ull << (63-0))  /* Match LPID */
 #define CXL_XSL9_IERAT_MPID (1ull << (63-1))  /* Match PID */
@@ -669,6 +672,7 @@ struct cxl_native {
irq_hw_number_t err_hwirq;
unsigned int err_virq;
u64 ps_off;
+   bool no_data_cache; /* set if no data cache on the card */
const struct cxl_service_layer_ops *sl_ops;
 };
 
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 1b3d7c65ea3f..98f867fcef24 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -353,8 +353,17 @@ int cxl_data_cache_flush(struct cxl *adapter)
u64 reg;
unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);
 
-   pr_devel("Flushing data cache\n");
+   /*
+* Do a datacache flush only if datacache is available.
+* In case of PSL9D the datacache is absent, hence the flush
+* operation would time out.
+*/
+   if (adapter->native->no_data_cache) {
+   pr_devel("No PSL data cache. Ignoring cache flush req.\n");
+   return 0;
+   }
 
+   pr_devel("Flushing data cache\n");
reg = cxl_p1_read(adapter, CXL_PSL_Control);
reg |= CXL_PSL_Control_Fr;
cxl_p1_write(adapter, CXL_PSL_Control, reg);
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 758842f65a1b..3255f89c85d0 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -456,6 +456,7 @@ static int init_implementation_adapter_regs_psl9(struct cxl *adapter,
u64 chipid;
u32 phb_index;
u64 capp_unit_id;
+   u64 psl_debug;
int rc;
 
rc = cxl_calc_capp_routing(dev, &chipid, &phb_index, &capp_unit_id);
@@ -506,6 +507,15 @@ static int init_implementation_adapter_regs_psl9(struct cxl *adapter,
} else
cxl_p1_write(adapter, CXL_PSL9_DEBUG, 0x4000ULL);
 
+   /* Check if PSL has data-cache. We need to flush adapter datacache
+* when it is about to be removed.
+*/
+   psl_debug = cxl_p1_read(adapter, CXL_PSL9_DEBUG);
+   if (psl_debug & CXL_PSL_DEBUG_CDC) {
+   dev_dbg(&dev->dev, "No data-cache present\n");
+   adapter->native->no_data_cache = true;
+   }
+
return 0;
 }
 
@@ -1449,10 +1459,8 @@ int cxl_pci_reset(struct cxl *adapter)
 
/*
 * The adapter is about to be reset, so ignore errors.
-* Not supported on P9 DD1
 */
-   if ((cxl_is_power8()) || (!(cxl_is_power9_dd1())))
-   cxl_data_cache_flush(adapter);
+   cxl_data_cache_flush(adapter);
 
/* pcie_warm_reset requests a fundamental pci reset which includes a
 * PERST assert/deassert.  PERST triggers a loading of the image
@@ -1936,10 +1944,8 @@ static void cxl_pci_remove_adapter(struct cxl *adapter)
 
/*
 * Flush adapter datacache as its about to be removed.
-* Not supported on P9 DD1.
 */
-   if ((cxl_is_power8()) || (!(cxl_is_power9_dd1())))
-   cxl_data_cache_flush(adapter);
+   cxl_data_cache_flush(adapter);
 
cxl_deconfigure_adapter(adapter);
 
-- 
2.14.3
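
The mechanism in the patch above (test the CDC bit of PSL_Debug once at init, remember the result, and skip the flush request when no data-cache exists) can be sketched as a small user-space simulation. Only the bit position (IBM bit 27, that is 1ull << (63-27)) and the flag name no_data_cache come from the patch; struct fake_adapter, probe_data_cache(), data_cache_flush() and the flush_requests counter are hypothetical stand-ins, and no real register is accessed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define IBM_BIT64(n)  (1ULL << (63 - (n)))
/* Same value as CXL_PSL_DEBUG_CDC in the patch (bit 27). */
#define PSL_DEBUG_CDC IBM_BIT64(27)

/* Hypothetical stand-in for the adapter state touched by the patch. */
struct fake_adapter {
    uint64_t psl_debug;       /* simulated PSL_Debug register contents */
    bool no_data_cache;       /* set if no data cache on the card */
    unsigned flush_requests;  /* counts flushes that reach the "hardware" */
};

/* Init-time probe: a set CDC bit means the card has no data-cache. */
static void probe_data_cache(struct fake_adapter *a)
{
    assert(a != NULL);
    a->no_data_cache = (a->psl_debug & PSL_DEBUG_CDC) != 0;
}

/* Flush path: bail out early instead of waiting for a completion
 * that a cache-less card would never signal. */
static int data_cache_flush(struct fake_adapter *a)
{
    assert(a != NULL);
    if (a->no_data_cache)
        return 0;             /* nothing to flush */
    a->flush_requests++;      /* stands in for the PSL_Control write and poll */
    return 0;
}
```

Callers can then invoke the flush unconditionally, which is what lets the patch drop the chip-revision checks at the two call sites.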



[PATCH] crypto: nx-842: Delete an error message for a failed memory allocation in nx842_pseries_init()

2018-02-15 Thread SF Markus Elfring
From: Markus Elfring 
Date: Wed, 14 Feb 2018 17:05:13 +0100

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/crypto/nx/nx-842-pseries.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/nx/nx-842-pseries.c b/drivers/crypto/nx/nx-842-pseries.c
index bf52cd1d7fca..66869976cfa2 100644
--- a/drivers/crypto/nx/nx-842-pseries.c
+++ b/drivers/crypto/nx/nx-842-pseries.c
@@ -1105,10 +1105,9 @@ static int __init nx842_pseries_init(void)
 
RCU_INIT_POINTER(devdata, NULL);
new_devdata = kzalloc(sizeof(*new_devdata), GFP_KERNEL);
-   if (!new_devdata) {
-   pr_err("Could not allocate memory for device data\n");
+   if (!new_devdata)
return -ENOMEM;
-   }
+
RCU_INIT_POINTER(devdata, new_devdata);
 
ret = vio_register_driver(_vio_driver);
-- 
2.16.1



Re: [PATCH v2] powerpc/via-pmu: Fix section mismatch warning

2018-02-15 Thread Laurent Vivier
On 14/02/2018 22:15, Mathieu Malaterre wrote:
> Make the struct via_pmu_driver const to avoid following warning:
> 
> WARNING: vmlinux.o(.data+0x4739c): Section mismatch in reference from the 
> variable via_pmu_driver to the function .init.text:pmu_init()
> The variable via_pmu_driver references
> the function __init pmu_init()
> If the reference is valid then annotate the
> variable with __init* or __refdata (see linux/init.h) or name the variable:
> *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
> 
> Signed-off-by: Mathieu Malaterre 
> Suggested-by: Laurent Vivier 
> ---
> v2: pmu_init() is really an init function, leave __init marker
> 
>  drivers/macintosh/via-pmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
> index 94c0f3f7df69..fc56c7067732 100644
> --- a/drivers/macintosh/via-pmu.c
> +++ b/drivers/macintosh/via-pmu.c
> @@ -198,7 +198,7 @@ static const struct file_operations pmu_battery_proc_fops;
>  static const struct file_operations pmu_options_proc_fops;
>  
>  #ifdef CONFIG_ADB
> -struct adb_driver via_pmu_driver = {
> +const struct adb_driver via_pmu_driver = {
>   "PMU",
>   pmu_probe,
>   pmu_init,
> 

Reviewed-by: Laurent Vivier 




Re: [PATCH] cxl: Check if PSL data-cache is available before issue flush request

2018-02-15 Thread Frederic Barrat



On 13/02/2018 at 12:10, Vaibhav Jain wrote:

PSL9D doesn't have a data-cache that needs to be flushed before
resetting the card. However, when cxl tries to flush the data-cache on
such a card, it times out because the PSL_Control register never
indicates that the flush operation completed, due to the missing
data-cache. This is usually indicated in the kernel logs with this
message:

"WARNING: cache flush timed out"

To fix this, the patch checks the PSL_Debug register CDC field (bit 27),
which indicates the absence of a data-cache, and sets a flag
'no_data_cache' in 'struct cxl_native' accordingly. When
cxl_data_cache_flush() is called, it checks the flag and, if it is set,
bails out early without requesting a data-cache flush from the PSL.

Signed-off-by: Vaibhav Jain 
---
  drivers/misc/cxl/cxl.h|  4 
  drivers/misc/cxl/native.c | 11 ++-
  drivers/misc/cxl/pci.c| 19 +--
  3 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 4f015da78f28..4949b8d5a748 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -369,6 +369,9 @@ static const cxl_p2n_reg_t CXL_PSL_WED_An = {0x0A0};
  #define CXL_PSL_TFC_An_AE (1ull << (63-30)) /* Restart PSL with address error */
  #define CXL_PSL_TFC_An_R  (1ull << (63-31)) /* Restart PSL transaction */

+/** CXL_PSL_DEBUG */
+#define CXL_PSL_DEBUG_CDC  (1ull << (63-27)) /* Coherent Data cache support */
+
  /** CXL_XSL9_IERAT_ERAT - CAIA 2 **/
  #define CXL_XSL9_IERAT_MLPID(1ull << (63-0))  /* Match LPID */
  #define CXL_XSL9_IERAT_MPID (1ull << (63-1))  /* Match PID */
@@ -669,6 +672,7 @@ struct cxl_native {
irq_hw_number_t err_hwirq;
unsigned int err_virq;
u64 ps_off;
+   bool no_data_cache; /* set if no data cache on the card */
const struct cxl_service_layer_ops *sl_ops;
  };

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 1b3d7c65ea3f..98f867fcef24 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -353,8 +353,17 @@ int cxl_data_cache_flush(struct cxl *adapter)
u64 reg;
unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);

-   pr_devel("Flushing data cache\n");
+   /*
+* Do a datacache flush only if datacache is available.
+* In case of PSL9D the datacache is absent, hence the flush
+* operation would time out.
+*/
+   if (adapter->native->no_data_cache) {
+   pr_devel("No PSL data cache. Ignoring cache flush req.\n");
+   return 0;
+   }

+   pr_devel("Flushing data cache\n");
reg = cxl_p1_read(adapter, CXL_PSL_Control);
reg |= CXL_PSL_Control_Fr;
cxl_p1_write(adapter, CXL_PSL_Control, reg);
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 758842f65a1b..39ddf89c3c14 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -456,6 +456,7 @@ static int init_implementation_adapter_regs_psl9(struct cxl *adapter,
u64 chipid;
u32 phb_index;
u64 capp_unit_id;
+   u64 psl_debug;
int rc;

rc = cxl_calc_capp_routing(dev, &chipid, &phb_index, &capp_unit_id);
@@ -506,6 +507,16 @@ static int init_implementation_adapter_regs_psl9(struct cxl *adapter,
} else
cxl_p1_write(adapter, CXL_PSL9_DEBUG, 0x4000ULL);

+   /* Check if PSL has data-cache. We need to flush adapter datacache
+* when it is about to be removed. But data-cache flush is not
+* supported on P9-DD1 and
+*/
+   psl_debug = cxl_p1_read(adapter, CXL_PSL9_DEBUG);
+   if (cxl_is_power9_dd1() || (psl_debug & CXL_PSL_DEBUG_CDC)) {
+   dev_info(&dev->dev, "No data-cache present\n");


Doesn't dev_info() always show in the log? If so then it should be tuned 
down to dev_dbg(), as nobody cares.


Also, I wouldn't introduce any new code testing for dd1. It's dead code 
we're going to have to remove soon anyway.


  Fred



+   adapter->native->no_data_cache = true;
+   }
+
return 0;
  }

@@ -1449,10 +1460,8 @@ int cxl_pci_reset(struct cxl *adapter)

/*
 * The adapter is about to be reset, so ignore errors.
-* Not supported on P9 DD1
 */
-   if ((cxl_is_power8()) || (!(cxl_is_power9_dd1())))
-   cxl_data_cache_flush(adapter);
+   cxl_data_cache_flush(adapter);

/* pcie_warm_reset requests a fundamental pci reset which includes a
 * PERST assert/deassert.  PERST triggers a loading of the image
@@ -1936,10 +1945,8 @@ static void cxl_pci_remove_adapter(struct cxl *adapter)

/*
 * Flush adapter datacache as its about to be removed.
-* Not supported on P9 DD1.
 */
-   if ((cxl_is_power8()) || (!(cxl_is_power9_dd1())))
-   cxl_data_cache_flush(adapter);
+ 

Re: [PATCH v2] cxl: Remove function write_timebase_ctrl_psl9() for PSL9

2018-02-15 Thread Frederic Barrat



On 15/02/2018 at 07:19, Vaibhav Jain wrote:

The contents of the PSL_TB_CTLSTAT register have changed in PSL9 and
the whole register is now read-only. Hence we don't need an sl_ops
implementation of 'write_timebase_ctrl' to populate this register
for PSL9.

Hence this patch removes function write_timebase_ctrl_psl9() and its
references from the code.

Signed-off-by: Vaibhav Jain 
---


Thanks!

Acked-by: Frederic Barrat 
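
The approach described above, leaving the write_timebase_ctrl callback unset for PSL9 and testing the pointer before calling it, is the usual optional-callback pattern for ops tables. A minimal user-space sketch follows. Only the callback name write_timebase_ctrl and the NULL test mirror the patch; the struct layouts, the *_like_ops tables, setup_timebase() and the tb_ctrl_writes counter are hypothetical and exist just to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>

struct adapter;

/* Reduced ops table: the callback is optional and may be left NULL. */
struct service_layer_ops {
    void (*write_timebase_ctrl)(struct adapter *a);
};

/* Hypothetical adapter with a counter standing in for a register write. */
struct adapter {
    const struct service_layer_ops *ops;
    unsigned tb_ctrl_writes;
};

/* Stand-in for the PSL8 implementation, which does program the register. */
static void write_timebase_ctrl_psl8(struct adapter *a)
{
    a->tb_ctrl_writes++;
}

/* PSL8-like table provides the callback. */
static const struct service_layer_ops psl8_like_ops = {
    .write_timebase_ctrl = write_timebase_ctrl_psl8,
};

/* PSL9-like table omits it, since the register is read-only there. */
static const struct service_layer_ops psl9_like_ops = {
    .write_timebase_ctrl = NULL,
};

/* Common setup path: call the hook only when the service layer has one. */
static void setup_timebase(struct adapter *a)
{
    assert(a != NULL && a->ops != NULL);
    if (a->ops->write_timebase_ctrl)
        a->ops->write_timebase_ctrl(a);
}
```

The common path stays free of chip-specific conditionals: a service layer opts out simply by not filling in the member.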



Change-log:
v2 -> Updated the patch description to accurately reflect changes
   between PSL9 and PSL8. (Fred)
---
  drivers/misc/cxl/pci.c | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index c983f23cc2ed..9bc30c20b66b 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -572,12 +572,6 @@ static int init_implementation_adapter_regs_xsl(struct cxl *adapter, struct pci_
  /* For the PSL this is a multiple for 0 < n <= 7: */
  #define PSL_2048_250MHZ_CYCLES 1

-static void write_timebase_ctrl_psl9(struct cxl *adapter)
-{
-   cxl_p1_write(adapter, CXL_PSL9_TB_CTLSTAT,
-TBSYNC_CNT(2 * PSL_2048_250MHZ_CYCLES));
-}
-
  static void write_timebase_ctrl_psl8(struct cxl *adapter)
  {
cxl_p1_write(adapter, CXL_PSL_TB_CTLSTAT,
@@ -639,7 +633,8 @@ static void cxl_setup_psl_timebase(struct cxl *adapter, struct pci_dev *dev)
 * Setup PSL Timebase Control and Status register
 * with the recommended Timebase Sync Count value
 */
-   adapter->native->sl_ops->write_timebase_ctrl(adapter);
+   if (adapter->native->sl_ops->write_timebase_ctrl)
+   adapter->native->sl_ops->write_timebase_ctrl(adapter);

/* Enable PSL Timebase */
cxl_p1_write(adapter, CXL_PSL_Control, 0x);
@@ -1805,7 +1800,6 @@ static const struct cxl_service_layer_ops psl9_ops = {
.psl_irq_dump_registers = cxl_native_irq_dump_regs_psl9,
.err_irq_dump_registers = cxl_native_err_irq_dump_regs_psl9,
.debugfs_stop_trace = cxl_stop_trace_psl9,
-   .write_timebase_ctrl = write_timebase_ctrl_psl9,
.timebase_read = timebase_read_psl9,
.capi_mode = OPAL_PHB_CAPI_MODE_CAPI,
.needs_reset_before_disable = true,