Re: [PATCH] memblock: stop using implicit alignement to SMP_CACHE_BYTES

2018-10-10 Thread Mike Rapoport
On Fri, Oct 05, 2018 at 03:19:34PM -0700, Andrew Morton wrote:
> On Fri,  5 Oct 2018 00:07:04 +0300 Mike Rapoport  
> wrote:
> 
> > When memblock allocation APIs are called with align = 0, the alignment is
> > implicitly set to SMP_CACHE_BYTES.
> > 
> > Replace all such uses of memblock APIs with the 'align' parameter explicitly
> > set to SMP_CACHE_BYTES and stop implicit alignment assignment in the
> > memblock internal allocation functions.
> > 
> > For the case when memblock APIs are used via helper functions, e.g. like
> > iommu_arena_new_node() in Alpha, the helper functions were detected with
> > Coccinelle's help and then manually examined and updated where appropriate.
> > 
> > ...
> >
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -1298,9 +1298,6 @@ static phys_addr_t __init 
> > memblock_alloc_range_nid(phys_addr_t size,
> >  {
> > phys_addr_t found;
> >  
> > -   if (!align)
> > -   align = SMP_CACHE_BYTES;
> > -
> 
> Can we add a WARN_ON_ONCE(!align) here?  To catch unconverted code
> which sneaks in later on.

Here it goes:

From baec825c58e8bc11371433d3a4b20b2216877a50 Mon Sep 17 00:00:00 2001
From: Mike Rapoport 
Date: Mon, 8 Oct 2018 11:22:10 +0300
Subject: [PATCH] memblock: warn if zero alignment was requested

After updating all memblock users to explicitly specify SMP_CACHE_BYTES
alignment rather than use 0, it is still possible that unconverted users
may sneak in. Add a WARN_ON_ONCE for such cases.

Signed-off-by: Mike Rapoport 
---
 mm/memblock.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index 0bbae56..5fefc70 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1298,6 +1298,9 @@ static phys_addr_t __init 
memblock_alloc_range_nid(phys_addr_t size,
 {
phys_addr_t found;
 
+   if (WARN_ON_ONCE(!align))
+   align = SMP_CACHE_BYTES;
+
found = memblock_find_in_range_node(size, align, start, end, nid,
flags);
if (found && !memblock_reserve(found, size)) {
@@ -1420,6 +1423,9 @@ static void * __init memblock_alloc_internal(
if (WARN_ON_ONCE(slab_is_available()))
return kzalloc_node(size, GFP_NOWAIT, nid);
 
+   if (WARN_ON_ONCE(!align))
+   align = SMP_CACHE_BYTES;
+
if (max_addr > memblock.current_limit)
max_addr = memblock.current_limit;
 again:
-- 
2.7.4


-- 
Sincerely yours,
Mike.
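
The warn-and-fall-back pattern in the patch above can be tried out in
userspace. The following is only a minimal sketch: demo_warn_on_once(),
demo_fixup_align() and DEMO_CACHE_BYTES are hypothetical stand-ins for
WARN_ON_ONCE(), the memblock internals and SMP_CACHE_BYTES, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed stand-in for SMP_CACHE_BYTES; the real value is arch-specific. */
#define DEMO_CACHE_BYTES 64

/* Counts how many times the warning was actually printed. */
static int demo_warn_count;

/*
 * Hypothetical imitation of WARN_ON_ONCE(): returns whether the
 * condition held, but reports it only on the first hit.
 */
static bool demo_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		demo_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Returns the alignment to use, falling back when the caller passed 0. */
static size_t demo_fixup_align(size_t align)
{
	static bool warned;

	if (demo_warn_on_once(align == 0, &warned, "zero alignment requested"))
		align = DEMO_CACHE_BYTES;
	return align;
}
```

Every zero-alignment call still gets the fallback value, but only the first
one is reported, so an unconverted caller is flagged without flooding the log.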



Re: [PATCH 05.1/16] of:overlay: missing name, phandle, linux, phandle in new nodes

2018-10-10 Thread Frank Rowand
On 10/10/18 14:03, Frank Rowand wrote:
> On 10/10/18 13:40, Alan Tull wrote:
>> On Wed, Oct 10, 2018 at 1:49 AM Frank Rowand  wrote:
>>>
>>> On 10/09/18 23:04, frowand.l...@gmail.com wrote:
 From: Frank Rowand 


 "of: overlay: use prop add changeset entry for property in new nodes"
 fixed a problem where an 'update property' changeset entry was
 created for properties contained in nodes added by a changeset.
 The fix was to use an 'add property' changeset entry.

 This exposed more bugs in the apply overlay code.  The properties
 'name', 'phandle', and 'linux,phandle' were filtered out by
 add_changeset_property() as special properties.  Change the filter
 to be only for existing nodes, not newly added nodes.

 The second bug is that the 'name' property does not exist in the
 newest FDT version, and has to be constructed from the node's
 full_name.  Construct an 'add property' changeset entry for
 newly added nodes.

 Signed-off-by: Frank Rowand 
 ---


 Hi Alan,

 Thanks for reporting the problem with missing node names.

 I was able to replicate the problem, and have created this preliminary
 version of a patch to fix the problem.

 I have not extensively reviewed the patch yet, but would appreciate
 if you can confirm this fixes your problem.

 I created this patch as patch 17 of the series, but have also
 applied it as patch 05.1, immediately after patch 05/16, and
 built the kernel, booted, and verified name and phandle for
 one of the nodes in a unittest overlay for both cases.  So
 minimal testing so far on my part.

 I have not verified whether the series builds and boots after
 each of patches 06..16 if this patch is applied as patch 05.1.

 There is definitely more work needed for me to complete this
 patch because it allocates some more memory, but does not yet
 free it when the overlay is released.

 -Frank


  drivers/of/overlay.c | 72 
 
  1 file changed, 67 insertions(+), 5 deletions(-)

 diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
 index 0b0904f44bc7..9746cea2aa91 100644
 --- a/drivers/of/overlay.c
 +++ b/drivers/of/overlay.c
 @@ -301,10 +301,11 @@ static int add_changeset_property(struct 
 overlay_changeset *ovcs,
   struct property *new_prop = NULL, *prop;
   int ret = 0;

 - if (!of_prop_cmp(overlay_prop->name, "name") ||
 - !of_prop_cmp(overlay_prop->name, "phandle") ||
 - !of_prop_cmp(overlay_prop->name, "linux,phandle"))
 - return 0;
 + if (target->in_livetree)
 + if (!of_prop_cmp(overlay_prop->name, "name") ||
 + !of_prop_cmp(overlay_prop->name, "phandle") ||
 + !of_prop_cmp(overlay_prop->name, "linux,phandle"))
 + return 0;
>>>
>>> This is a big hammer patch.
>>>
>>> Nobody should waste time reviewing this patch.
>>
>> I wasn't clear if you still could use the testing so I did re-run my
>> test.  This patch adds back some of the missing properties, but the
>> the kobject names aren't set as dev_name() returns NULL:
>>
>> * without this patch some of_node properties don't show up in sysfs:
>> root@arria10:~# ls
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks  compatible  interrupt-parent  interrupts  reg
>>
>> * with this patch, the of_node properties phandle and name are back:
>> root@arria10:~#  ls
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
>> clocks  compatible  interrupt-parent  interrupts
>>  name  phandle   reg
> 
> Thanks for the testing.  I'll keep chasing after this problem today.
> 
> This is useful data for me as I was not looking under the /sys/bus/...
> tree that you reported, but was instead looking at /proc/device-tree/...
> which showed the same type of problem since the overlay I was using
> does not show up under /sys/bus/...
> 
> I'll have to create a useful overlay test case that will show up under
> /sys/bus/...
> 
> In the meantime, can you send me the base FDT and the overlay FDT for
> your test case?

I now have a test case that shows the problem under /sys/bus/... so I
no longer need the base FDT and overlay FDT for your test case.

I have determined the location that sets the name to "" but do
not have the fix yet.  Still working on that.

-Frank
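
The filter change quoted above (skip 'name', 'phandle' and 'linux,phandle'
only when the target node already exists in the live tree) can be sketched
in isolation. This is a userspace illustration only: the demo_* helpers are
hypothetical, and strcmp() is swapped in for of_prop_cmp().

```c
#include <stdbool.h>
#include <string.h>

/* True for the three properties the overlay code treats as special. */
static bool demo_is_special_prop(const char *name)
{
	return !strcmp(name, "name") ||
	       !strcmp(name, "phandle") ||
	       !strcmp(name, "linux,phandle");
}

/*
 * Skip special properties only when the target node already exists in
 * the live tree; for a newly added node they must be kept.
 */
static bool demo_skip_overlay_prop(const char *name, bool target_in_livetree)
{
	return target_in_livetree && demo_is_special_prop(name);
}
```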

> 
> Thanks,
> 
> Frank
> 
> 
>>
>> root@arria10:~# cat
>> /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node/name
>> freeze_controllerroot@arria10:~#  ("freeze_controller" w/o the \n so
>> the name is correct)
>>
>> * with or without the patch I see the behavior I reported yesterday,
>> kobj names are NULL.
>> root@arria10:~# ls /sys/bus/platform/driver

[PATCH v10 3/3] powerpc/cacheinfo: Report the correct shared_cpu_map on big-cores

2018-10-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently on POWER9 SMT8 core systems, in sysfs, we report the
shared_cpu_map for L1 caches (both data and instruction) to be the
cpu-ids of all the threads in the SMT8 core. This is incorrect since on
POWER9 SMT8 cores there are two groups of threads, each of which
shares its own L1 cache.

This patch addresses this by reporting the shared_cpu_map correctly in
sysfs for L1 caches.

Before the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

After the patch
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00aa
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00aa

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/cacheinfo.c | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
index a8f20e5..be57bd0 100644
--- a/arch/powerpc/kernel/cacheinfo.c
+++ b/arch/powerpc/kernel/cacheinfo.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "cacheinfo.h"
 
@@ -627,17 +629,48 @@ static ssize_t level_show(struct kobject *k, struct 
kobj_attribute *attr, char *
 static struct kobj_attribute cache_level_attr =
__ATTR(level, 0444, level_show, NULL);
 
+static unsigned int index_dir_to_cpu(struct cache_index_dir *index)
+{
+   struct kobject *index_dir_kobj = &index->kobj;
+   struct kobject *cache_dir_kobj = index_dir_kobj->parent;
+   struct kobject *cpu_dev_kobj = cache_dir_kobj->parent;
+   struct device *dev = kobj_to_dev(cpu_dev_kobj);
+
+   return dev->id;
+}
+
+/*
+ * On big-core systems, each core has two groups of CPUs each of which
+ * has its own L1-cache. The thread-siblings which share l1-cache with
+ * @cpu can be obtained via cpu_smallcore_mask().
+ */
+static const struct cpumask *get_big_core_shared_cpu_map(int cpu, struct cache 
*cache)
+{
+   if (cache->level == 1)
+   return cpu_smallcore_mask(cpu);
+
+   return &cache->shared_cpu_map;
+}
+
 static ssize_t shared_cpu_map_show(struct kobject *k, struct kobj_attribute 
*attr, char *buf)
 {
struct cache_index_dir *index;
struct cache *cache;
-   int ret;
+   const struct cpumask *mask;
+   int ret, cpu;
 
index = kobj_to_cache_index_dir(k);
cache = index->cache;
 
+   if (has_big_cores) {
+   cpu = index_dir_to_cpu(index);
+   mask = get_big_core_shared_cpu_map(cpu, cache);
+   } else {
+   mask  = &cache->shared_cpu_map;
+   }
+
ret = scnprintf(buf, PAGE_SIZE - 1, "%*pb\n",
-   cpumask_pr_args(&cache->shared_cpu_map));
+   cpumask_pr_args(mask));
buf[ret++] = '\n';
buf[ret] = '\0';
return ret;
-- 
1.9.4
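
The selection logic in shared_cpu_map_show() above can be modelled in
userspace as follows. This is a sketch under assumptions: struct demo_cache
and the demo_* functions are hypothetical, masks are plain 16-bit integers
instead of struct cpumask, and demo_smallcore_mask() hard-codes the
interleaved layout for the first SMT8 core only.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_cache {
	int level;               /* 1 for L1, 2 for L2, ... */
	uint16_t shared_cpu_map; /* sharing as recorded for the whole core */
};

/*
 * Threads sharing an L1 with @cpu, assuming interleaved thread ids and
 * that @cpu belongs to the first SMT8 core (cpus 0..7).
 */
static uint16_t demo_smallcore_mask(int cpu)
{
	return (cpu & 1) ? 0x00aa : 0x0055;
}

/* On big-core systems, L1 caches use the small-core mask instead. */
static uint16_t demo_shared_cpu_map(bool has_big_cores, int cpu,
				    const struct demo_cache *cache)
{
	if (has_big_cores && cache->level == 1)
		return demo_smallcore_mask(cpu);
	return cache->shared_cpu_map;
}
```

With an L1 entry whose recorded map is 0x00ff, this returns 0x0055 for cpu0
and 0x00aa for cpu1 when big cores are present, which matches the
before/after values in the commit message.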



[PATCH v10 0/3] powerpc: Detection and scheduler optimization for POWER9 bigcore

2018-10-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the tenth iteration of the patchset to add support for
big-core on POWER9. The patchset also optimizes the task placement on
such big-core systems.

The previous versions can be found here:
v9: https://lkml.org/lkml/2018/10/1/608
v8: https://lkml.org/lkml/2018/9/20/899
v7: https://lkml.org/lkml/2018/8/20/52
v6: https://lkml.org/lkml/2018/8/9/119
v5: https://lkml.org/lkml/2018/8/6/587
v4: https://lkml.org/lkml/2018/7/24/79
v3: https://lkml.org/lkml/2018/7/6/255
v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes :
v9 --> v10:
   - Rebased it on v4.19-rc7
   - Added a patch to report the correct shared_cpu_map for L1-caches
   on big-core systems.

Description:


IBM POWER9 SMT8 cores consist of two groups of small-cores where each
group has its own L1 cache, translation cache and instruction-data
flow. This can be discovered via the "ibm,thread-groups" CPU property
in the device tree. Furthermore, on POWER9 the thread-ids of such a
big-core are obtained by interleaving the thread-ids of the two
small-cores.

Eg: In an SMT8 core with thread ids {0,1,2,3,4,5,6,7}, the thread-ids
of the threads in the two small-cores will be {0,2,4,6} and {1,3,5,7}
respectively.

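          --------------------------
          |        L1 Cache        |
       -----------------------------
       |L2|     |     |     |      |
       |  |  0  |  2  |  4  |  6   |  Small Core0
       |C |     |     |     |      |
Big    |a --------------------------
Core   |c |     |     |     |      |
       |h |  1  |  3  |  5  |  7   |  Small Core1
       |e |     |     |     |      |
       -----------------------------
          |        L1 Cache        |
          --------------------------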
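
The interleaving described above can be checked with a few lines of C. This
is an illustrative sketch only: the demo_* names are hypothetical, and the
mask covers a single 8-thread big core rather than a real cpumask.

```c
#include <stdint.h>

#define DEMO_THREADS_PER_BIG_CORE 8
#define DEMO_NR_SMALL_CORES 2

/* Small core that owns thread @tid (0..7) of the big core. */
static unsigned int demo_small_core_of(unsigned int tid)
{
	return tid % DEMO_NR_SMALL_CORES;
}

/* Bitmask of the big-core threads that share an L1 cache with @tid. */
static uint8_t demo_smallcore_mask(unsigned int tid)
{
	uint8_t mask = 0;
	unsigned int t;

	for (t = 0; t < DEMO_THREADS_PER_BIG_CORE; t++)
		if (demo_small_core_of(t) == demo_small_core_of(tid))
			mask |= (uint8_t)(1u << t);
	return mask;
}
```

Threads {0,2,4,6} map to mask 0x55 and threads {1,3,5,7} to 0xaa.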

On such a big-core system, when multiple tasks are scheduled to run on
the big-core, we get the best performance when the tasks are spread
across the pair of small-cores.

Eg: Suppose 4 tasks {p1, p2, p3, p4} are run on a big core, then

An Example of Optimal Task placement:
           --------------------------
           |     |     |     |      |
           |  0  |  2  |  4  |  6   |   Small Core0
           | (p1)| (p2)|     |      |
Big Core   --------------------------
           |     |     |     |      |
           |  1  |  3  |  5  |  7   |   Small Core1
           |     | (p3)|     | (p4) |
           --------------------------

An example of Suboptimal Task placement:
           --------------------------
           |     |     |     |      |
           |  0  |  2  |  4  |  6   |   Small Core0
           | (p1)| (p2)|     |  (p4)|
Big Core   --------------------------
           |     |     |     |      |
           |  1  |  3  |  5  |  7   |   Small Core1
           |     | (p3)|     |      |
           --------------------------
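
One way to quantify the difference between the two placements is to count
tasks per small core. The sketch below is not how the scheduler measures
load; the demo_* names and the "imbalance" metric are assumptions made for
illustration, with the thread ids taken from the two diagrams.

```c
#include <stddef.h>

/* Thread ids hosting p1..p4 in the optimal and suboptimal diagrams. */
static const unsigned int demo_optimal[] = { 0, 2, 3, 7 };
static const unsigned int demo_suboptimal[] = { 0, 2, 3, 6 };

/* Small core of a thread id within the big core (interleaved ids). */
static unsigned int demo_core_of_thread(unsigned int tid)
{
	return tid & 1u;
}

/*
 * Difference between the number of tasks on the two small cores, given
 * the thread id each task runs on. 0 means the tasks are evenly spread.
 */
static unsigned int demo_placement_imbalance(const unsigned int *task_tid,
					     size_t nr_tasks)
{
	unsigned int count[2] = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_tasks; i++)
		count[demo_core_of_thread(task_tid[i])]++;

	return count[0] > count[1] ? count[0] - count[1] : count[1] - count[0];
}
```

The optimal placement has two tasks on each small core (imbalance 0), while
the suboptimal one has three tasks on Small Core0 and one on Small Core1
(imbalance 2).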

Currently on the big-core systems, the sched domain hierarchy is:

SMT   : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

Thus the scheduler doesn't distinguish between CPUs in the core that
share the L1-cache and the ones that don't, resulting in run-to-run
variance when multithreaded applications are run on an SMT8 core.

In this patch-set, we address this by defining the sched-domain on the
big-core systems to be:

SMT   : group of CPUs sharing the L1 cache
CACHE : group of CPUs in the SMT8 core.
DIE   : groups of CPUs on the same die.
NUMA  : all the CPUs in the system.

With this, the Linux Kernel load-balancer will ensure that the tasks
are spread across all the component small cores in the system, thereby
yielding optimum performance.

Furthermore, this solution works correctly across all SMT modes
(8,4,2), as the interleaved thread-ids ensure that when we go to
lower SMT modes (4,2) the threads are offlined in a descending order,
thereby leaving an equal number of threads from the component small
cores online.

This patchset contains three patches which, on detecting the presence
of big-cores, define the SMT level sched domain to correspond to the
threads of the small cores.

Patch 1: Adds support to detect the presence of big-cores and parses
the "ibm,thread-groups" device-tree property, using which it updates a
per-cpu mask named cpu_smallcore_mask.

Patch 2: Defines the SMT level sched domain to correspond to the
threads of the small cores.

Patch 3: Reports the correct shared_cpu_map for L1-caches on big-core
systems.

   Without patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index0/shared_cpu_map : 00ff
   /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map : 00ff

With patch 3:
   /sys/devices/system/cpu/cpu0/cache/index0/shared_cpu_map : 0055
   /sys/devices/system/cpu/cpu0/cache/index1/shared_cpu_map :

[PATCH v10 2/3] powerpc: Use cpu_smallcore_sibling_mask at SMT level on bigcores

2018-10-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

POWER9 SMT8 cores consist of two groups of threads, where threads in
each group share the L1-cache. The scheduler is not aware of this
distinction as the current sched-domain hierarchy has all the threads
of the core defined at the SMT domain.

SMT  [Thread siblings of the SMT8 core]
DIE  [CPUs in the same die]
NUMA [All the CPUs in the system]

Due to this, we can observe run-to-run variance when we run a
multi-threaded benchmark bound to a single core based on how the
scheduler spreads the software threads across the two groups in the
core.

We fix this in this patch by defining each group of threads which
share L1-cache to be the SMT level. The group of threads in the SMT8
core is defined to be the CACHE level. The sched-domain hierarchy
after this patch will be :

SMT [Thread siblings in the core that share L1 cache]
CACHE   [Thread siblings that are in the SMT8 core]
DIE [CPUs in the same die]
NUMA[All the CPUs in the system]

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/smp.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 22a14a9..356751e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1266,6 +1266,7 @@ static void add_cpu_to_masks(int cpu)
 void start_secondary(void *unused)
 {
unsigned int cpu = smp_processor_id();
+   struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
 
mmgrab(&init_mm);
current->active_mm = &init_mm;
@@ -1291,11 +1292,13 @@ void start_secondary(void *unused)
/* Update topology CPU masks */
add_cpu_to_masks(cpu);
 
+   if (has_big_cores)
+   sibling_mask = cpu_smallcore_mask;
/*
 * Check for any shared caches. Note that this must be done on a
 * per-core basis because one core in the pair might be disabled.
 */
-   if (!cpumask_equal(cpu_l2_cache_mask(cpu), cpu_sibling_mask(cpu)))
+   if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
shared_caches = true;
 
set_numa_node(numa_cpu_lookup_table[cpu]);
@@ -1362,6 +1365,13 @@ static const struct cpumask *shared_cache_mask(int cpu)
return cpu_l2_cache_mask(cpu);
 }
 
+#ifdef CONFIG_SCHED_SMT
+static const struct cpumask *smallcore_smt_mask(int cpu)
+{
+   return cpu_smallcore_mask(cpu);
+}
+#endif
+
 static struct sched_domain_topology_level power9_topology[] = {
 #ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
@@ -1389,6 +1399,13 @@ void __init smp_cpus_done(unsigned int max_cpus)
shared_proc_topology_init();
dump_numa_cpu_topology();
 
+#ifdef CONFIG_SCHED_SMT
+   if (has_big_cores) {
+   pr_info("Using small cores at SMT level\n");
+   power9_topology[0].mask = smallcore_smt_mask;
+   powerpc_topology[0].mask = smallcore_smt_mask;
+   }
+#endif
/*
 * If any CPU detects that it's sharing a cache with another CPU then
 * use the deeper topology that is aware of this sharing.
-- 
1.9.4
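
The boot-time override in smp_cpus_done() above boils down to swapping a
function pointer in a topology table. A minimal userspace sketch of that
idea follows; the demo_* types and functions are hypothetical and use 8-bit
masks for one big core instead of struct cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t (*demo_mask_fn)(int cpu);

struct demo_topology_level {
	demo_mask_fn mask;
	const char *name;
};

/* All 8 threads of the big core. */
static uint8_t demo_core_mask(int cpu)
{
	(void)cpu;
	return 0xff;
}

/* Threads sharing an L1 with @cpu, assuming interleaved thread ids. */
static uint8_t demo_smallcore_mask(int cpu)
{
	return (cpu & 1) ? 0xaa : 0x55;
}

/* Level 0 starts out spanning the whole core, as before the patch. */
static struct demo_topology_level demo_topology[] = {
	{ demo_core_mask, "SMT" },
	{ demo_core_mask, "CACHE" },
};

/* Narrow the SMT level to the small core once big cores are detected. */
static void demo_topology_init(bool has_big_cores)
{
	if (has_big_cores)
		demo_topology[0].mask = demo_smallcore_mask;
}
```

After demo_topology_init(true), level 0 spans only the L1-sharing siblings
while level 1 still spans the whole SMT8 core.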



[PATCH v10 1/3] powerpc: Detect the presence of big-cores via "ibm, thread-groups"

2018-10-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

On IBM POWER9, the device tree exposes a property array identified by
"ibm,thread-groups" which will indicate which groups of threads share
a particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch adds helper functions to parse the contents of
"ibm,thread-groups" and populate a per-cpu variable to cache
information about siblings of each CPU that share the L1, translation
cache and instruction data-flow.

It also defines a new global variable named "has_big_cores" which
indicates if the cores on this configuration have multiple groups of
threads that share L1 cache.

For each online CPU, it maintains a cpu_smallcore_mask, which
indicates the online siblings which share the L1-cache with it.

Signed-off-by: Gautham R. Shenoy 
---
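
The parsing step described in the commit message can be sketched in
userspace as below. The cell layout used here (property, nr_groups,
threads_per_group, followed by the thread ids of each group) is an
assumption inferred from the fields of struct thread_groups in this patch,
and all demo_* names and the sample array are hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_THREAD_LIST_SIZE 8
#define DEMO_THREAD_GROUP_SHARE_L1 1

struct demo_thread_groups {
	unsigned int property;
	unsigned int nr_groups;
	unsigned int threads_per_group;
	unsigned int thread_list[DEMO_MAX_THREAD_LIST_SIZE];
};

/* Sample cells: property 1, 2 groups of 4 threads, interleaved ids. */
static const unsigned int demo_cells[] = { 1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7 };

/*
 * Parse @cells into @tg. Returns 0 on success, -1 if the array is
 * malformed or describes a property other than @property.
 */
static int demo_parse_thread_groups(const unsigned int *cells, size_t nr_cells,
				    unsigned int property,
				    struct demo_thread_groups *tg)
{
	size_t i, total;

	if (nr_cells < 3 || cells[0] != property)
		return -1;

	total = (size_t)cells[1] * cells[2];
	if (total > DEMO_MAX_THREAD_LIST_SIZE || nr_cells < 3 + total)
		return -1;

	tg->property = cells[0];
	tg->nr_groups = cells[1];
	tg->threads_per_group = cells[2];
	for (i = 0; i < total; i++)
		tg->thread_list[i] = cells[3 + i];
	return 0;
}

/* Index of the group containing @tid, or -1 if it is not listed. */
static int demo_group_of(const struct demo_thread_groups *tg, unsigned int tid)
{
	size_t i, total = (size_t)tg->nr_groups * tg->threads_per_group;

	for (i = 0; i < total; i++)
		if (tg->thread_list[i] == tid)
			return (int)(i / tg->threads_per_group);
	return -1;
}
```

Threads that land in the same group index share the L1 cache, which is the
information a per-cpu small-core mask would be built from.
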
 arch/powerpc/include/asm/cputhreads.h |   2 +
 arch/powerpc/include/asm/smp.h|  11 ++
 arch/powerpc/kernel/smp.c | 222 ++
 3 files changed, 235 insertions(+)

diff --git a/arch/powerpc/include/asm/cputhreads.h 
b/arch/powerpc/include/asm/cputhreads.h
index d71a909..deb99fd 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
 #define threads_per_subcore1
 #define threads_shift  0
+#define has_big_cores  0
 #define threads_core_mask  (*get_cpu_mask(0))
 #endif
 
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 95b66a0..4169574 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -100,6 +100,7 @@ static inline void set_hard_smp_processor_id(int cpu, int 
phys)
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
 {
@@ -116,6 +117,11 @@ static inline struct cpumask *cpu_l2_cache_mask(int cpu)
return per_cpu(cpu_l2_cache_map, cpu);
 }
 
+static inline struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return per_cpu(cpu_smallcore_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
@@ -166,6 +172,11 @@ static inline const struct cpumask *cpu_sibling_mask(int 
cpu)
return cpumask_of(cpu);
 }
 
+static inline const struct cpumask *cpu_smallcore_mask(int cpu)
+{
+   return cpumask_of(cpu);
+}
+
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 61c1fad..22a14a9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -74,14 +74,32 @@
 #endif
 
 struct thread_info *secondary_ti;
+bool has_big_cores;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
 EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
+EXPORT_SYMBOL_GPL(has_big_cores);
+
+#define MAX_THREAD_LIST_SIZE   8
+#define THREAD_GROUP_SHARE_L1   1
+struct thread_groups {
+   unsigned int property;
+   unsigned int nr_groups;
+   unsigned int threads_per_group;
+   unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
+/*
+ * On big-cores system, cpu_l1_cache_map for each CPU corresponds to
+ * the set its siblings that share the L1-cache.
+ */
+DEFINE_PER_CPU(cpumask_var_t, cpu_l1_cache_map);
 
 /* SMP operations for this machine */
 struct smp_ops_t *smp_ops;
@@ -674,6 +692,185 @@ static void set_cpus_unrelated(int i, int j,
 }
 #endif
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *  property for the CPU device node @dn and stores
+ *  the parsed output in the thread_groups
+ *  structure @tg if the ibm,thread-groups[0]
+ *  matches @property.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ *  output of "ibm,thread-groups" is stored.
+ * @property: The property of the thread-group that the caller is
+ *interested in.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based on which the
+ * threads are being grouped together. If this value is 1, it implies
+ * that the threads in the same group share L1, translation c

[PATCH 5/5] arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW upgrade

2018-10-10 Thread Aneesh Kumar K.V
NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang").

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/hugetlb.h   |  2 +-
 arch/powerpc/mm/hugetlbpage.c| 35 
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h 
b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 5b0177733994..a12bde29a5f0 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -42,4 +42,12 @@ static inline bool gigantic_page_supported(void)
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS(0x8000UL)
 
+#define huge_ptep_modify_prot_start huge_ptep_modify_prot_start
+extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep);
+
+#define huge_ptep_modify_prot_commit huge_ptep_modify_prot_commit
+extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t old_pte, pte_t new_pte);
 #endif
diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index 2d00cc530083..60c1d37e446a 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -4,7 +4,6 @@
 
 #ifdef CONFIG_HUGETLB_PAGE
 #include 
-#include 
 
 extern struct kmem_cache *hugepte_cache;
 
@@ -176,6 +175,7 @@ static inline void arch_clear_hugepage_flags(struct page 
*page)
 {
 }
 
+#include 
 #else /* ! CONFIG_HUGETLB_PAGE */
 static inline void flush_hugetlb_page(struct vm_area_struct *vma,
  unsigned long vmaddr)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a7226ed9cae6..8b098bedaff5 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -913,3 +913,38 @@ int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned 
long addr,
 
return 1;
 }
+
+#ifdef CONFIG_PPC_BOOK3S_64
+pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+   unsigned long pte_val;
+   /*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+   pte_val = pte_update(vma->vm_mm, addr, ptep,
+_PAGE_PRESENT, _PAGE_INVALID, 1);
+
+   return __pte(pte_val);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_start);
+
+void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long 
addr,
+ pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+* To avoid NMMU hang while relaxing access we need to flush the tlb 
before
+* we set the new value.
+*/
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+   (atomic_read(&mm->context.copros) > 0))
+   flush_hugetlb_page(vma, addr);
+
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(huge_ptep_modify_prot_commit);
+#endif
-- 
2.17.1



[PATCH 4/5] mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update

2018-10-10 Thread Aneesh Kumar K.V
Signed-off-by: Aneesh Kumar K.V 
---
 include/linux/hugetlb.h | 18 ++
 mm/hugetlb.c|  8 +---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 087fd5f48c91..e2a3b0c854eb 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -543,6 +543,24 @@ static inline void set_huge_swap_pte_at(struct mm_struct 
*mm, unsigned long addr
set_huge_pte_at(mm, addr, ptep, pte);
 }
 #endif
+
+#ifndef huge_ptep_modify_prot_start
+static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep)
+{
+   return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+}
+#endif
+
+#ifndef huge_ptep_modify_prot_commit
+static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep,
+   pte_t old_pte, pte_t pte)
+{
+   set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
+}
+#endif
+
 #else  /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c390f5a5207..1f3a4df95b2e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4367,10 +4367,12 @@ unsigned long hugetlb_change_protection(struct 
vm_area_struct *vma,
continue;
}
if (!huge_pte_none(pte)) {
-   pte = huge_ptep_get_and_clear(mm, address, ptep);
-   pte = pte_mkhuge(huge_pte_modify(pte, newprot));
+   pte_t old_pte;
+
+   old_pte = huge_ptep_modify_prot_start(vma, address, 
ptep);
+   pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
pte = arch_make_huge_pte(pte, vma, NULL, 0);
-   set_huge_pte_at(mm, address, ptep, pte);
+   huge_ptep_modify_prot_commit(vma, address, ptep, 
old_pte, pte);
pages++;
}
spin_unlock(ptl);
-- 
2.17.1
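
The "#define name name" plus "#ifndef name" idiom used by this patch and by
the powerpc header in 5/5 can be shown in a single translation unit. This
is only a sketch: the demo_* functions operate on a plain int instead of a
pte_t, and the 0x100 "invalid" marker is a made-up value.

```c
/* "Arch" part: provide an override and announce it via a macro. */
#define demo_modify_prot_start demo_modify_prot_start
static int demo_modify_prot_start(int *pte)
{
	int old = *pte;

	*pte = old | 0x100; /* hypothetical "invalid" marker, not a clear */
	return old;
}

/* "Generic" part: defaults are compiled only when no override exists. */
#ifndef demo_modify_prot_start
static int demo_modify_prot_start(int *pte)
{
	int old = *pte;

	*pte = 0;
	return old;
}
#endif

#ifndef demo_modify_prot_commit
static void demo_modify_prot_commit(int *pte, int old_pte, int new_pte)
{
	(void)old_pte; /* the generic version has no use for the old value */
	*pte = new_pte;
}
#endif
```

Because the override defines a macro with its own name, the generic start
function is skipped, while the commit function falls back to the default.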



[PATCH 3/5] arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.

2018-10-10 Thread Aneesh Kumar K.V
NestMMU requires us to mark the pte invalid and flush the tlb when we do a
RW upgrade of pte. We fixed a variant of this in the fault path in commit
bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang").

Do the same for mprotect and autonuma upgrades.

Hugetlb is handled in the next patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 +++
 arch/powerpc/mm/pgtable-book3s64.c   | 34 
 2 files changed, 52 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f108e2ce7f64..c55468eaedc7 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1324,6 +1324,24 @@ static inline const int pud_pfn(pud_t pud)
BUILD_BUG();
return 0;
 }
+#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
+
+/*
+ * Returns true for a Read or Write upgrade of pte.
+ */
+static inline bool is_pte_upgrade(unsigned long old_val, unsigned long new_val)
+{
+   if ((!(old_val & _PAGE_READ)) && (new_val & _PAGE_READ))
+   return true;
+
+   if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
+   return true;
+
+   return false;
+}
 
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/mm/pgtable-book3s64.c 
b/arch/powerpc/mm/pgtable-book3s64.c
index 43e99e1d947b..43f71125249b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -481,3 +481,37 @@ void arch_report_meminfo(struct seq_file *m)
   atomic_long_read(&direct_pages_count[MMU_PAGE_1G]) << 20);
 }
 #endif /* CONFIG_PROC_FS */
+
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep)
+{
+   unsigned long pte_val;
+
+   /*
+* Clear the _PAGE_PRESENT so that no hardware parallel update is
+* possible. Also keep the pte_present true so that we don't take
+* wrong fault.
+*/
+   pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, 
_PAGE_INVALID, 0);
+
+   return __pte(pte_val);
+
+}
+EXPORT_SYMBOL(ptep_modify_prot_start);
+
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
+pte_t *ptep, pte_t old_pte, pte_t pte)
+{
+   struct mm_struct *mm = vma->vm_mm;
+
+   /*
+* To avoid NMMU hang while relaxing access we need to flush the tlb 
before
+* we set the new value.
+*/
+   if (is_pte_upgrade(pte_val(old_pte), pte_val(pte)) &&
+   (atomic_read(&mm->context.copros) > 0))
+   flush_tlb_page(vma, addr);
+
+   set_pte_at(mm, addr, ptep, pte);
+}
+EXPORT_SYMBOL(ptep_modify_prot_commit);
-- 
2.17.1
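
The flush decision in ptep_modify_prot_commit() above depends on two
inputs: whether the change relaxes access, and whether a coprocessor is
attached to the mm. A userspace sketch of that predicate follows; the bit
values and the demo_* names are hypothetical and do not match the real
_PAGE_READ/_PAGE_WRITE definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this demo. */
#define DEMO_PAGE_READ  0x1UL
#define DEMO_PAGE_WRITE 0x2UL

/* True when the new value grants read or write access the old one lacked. */
static bool demo_is_pte_upgrade(unsigned long old_val, unsigned long new_val)
{
	if (!(old_val & DEMO_PAGE_READ) && (new_val & DEMO_PAGE_READ))
		return true;
	if (!(old_val & DEMO_PAGE_WRITE) && (new_val & DEMO_PAGE_WRITE))
		return true;
	return false;
}

/* Flush only when access is relaxed and a coprocessor uses the mm. */
static bool demo_needs_flush(unsigned long old_val, unsigned long new_val,
			     int copros)
{
	return demo_is_pte_upgrade(old_val, new_val) && copros > 0;
}
```

A read-only to read-write change with one coprocessor attached needs the
flush; a downgrade, or an upgrade with no coprocessor, does not.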



[PATCH 2/5] mm: update ptep_modify_prot_commit to take old pte value as arg

2018-10-10 Thread Aneesh Kumar K.V
Architectures like ppc64 need to do a conditional tlb flush based on the old
and new value of pte. Enable that by passing the old pte value as an arg.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/s390/include/asm/pgtable.h | 3 ++-
 arch/s390/mm/pgtable.c  | 2 +-
 arch/x86/include/asm/paravirt.h | 2 +-
 fs/proc/task_mmu.c  | 8 +---
 include/asm-generic/pgtable.h   | 2 +-
 mm/memory.c | 8 
 mm/mprotect.c   | 6 +++---
 7 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 8e7f26dfedc6..626250436897 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1036,7 +1036,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm,
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, 
pte_t);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
+pte_t *, pte_t, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 29c0a21cd34a..b283b92722cc 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -322,7 +322,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, 
unsigned long addr,
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
 void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-pte_t *ptep, pte_t pte)
+pte_t *ptep, pte_t old_pte, pte_t pte)
 {
pgste_t pgste;
struct mm_struct *mm = vma->vm_mm;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c5d203a51e50..17214e074286 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -434,7 +434,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned
 }
 
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
struct mm_struct *mm = vma->vm_mm;
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 229df16e7ad0..505aa21d04df 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,12 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
pte_t ptent = *pte;
 
if (pte_present(ptent)) {
-   ptent = ptep_modify_prot_start(vma, addr, pte);
-   ptent = pte_wrprotect(ptent);
+   pte_t old_pte;
+
+   old_pte = ptep_modify_prot_start(vma, addr, pte);
+   ptent = pte_wrprotect(old_pte);
ptent = pte_clear_soft_dirty(ptent);
-   ptep_modify_prot_commit(vma, addr, pte, ptent);
+   ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 021b94cd3260..4e4723f6be5e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
  */
 static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
   unsigned long addr,
-  pte_t *ptep, pte_t pte)
+  pte_t *ptep, pte_t old_pte, pte_t pte)
 {
__ptep_modify_prot_commit(vma->vm_mm, addr, ptep, pte);
 }
diff --git a/mm/memory.c b/mm/memory.c
index 261d30f51499..211df764f232 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3786,7 +3786,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
int last_cpupid;
int target_nid;
bool migrated = false;
-   pte_t pte;
+   pte_t pte, old_pte;
bool was_writable = pte_savedwrite(vmf->orig_pte);
int flags = 0;
 
@@ -3806,12 +3806,12 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 * Make it present again, Depending on how arch implementes non
 * accessible ptes, some can allow access by kernel mode.
 */
-   pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
-   pte = pte_modify(pte, vma->vm_page_prot);
+   old_pte = ptep_modify_prot_start(vma, vmf->address, vmf->pte);
+   pte = pte_modify(old_pte, vma->vm_page_prot);
pte = pte_mkyoung(pte);
if (was_writable)
pte = pte_mkwrite(pte);
-   ptep_modify_prot_commit(vma, vmf->address, vmf->pte, pte);
+

[PATCH 1/5] mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg

2018-10-10 Thread Aneesh Kumar K.V
Some architectures may want to call flush_tlb_range from these helpers.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/s390/include/asm/pgtable.h | 4 ++--
 arch/s390/mm/pgtable.c  | 6 --
 arch/x86/include/asm/paravirt.h | 7 +--
 fs/proc/task_mmu.c  | 4 ++--
 include/asm-generic/pgtable.h   | 8 
 mm/memory.c | 4 ++--
 mm/mprotect.c   | 4 ++--
 7 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 0e7cb0dc9c33..8e7f26dfedc6 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1035,8 +1035,8 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 }
 
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-pte_t ptep_modify_prot_start(struct mm_struct *, unsigned long, pte_t *);
-void ptep_modify_prot_commit(struct mm_struct *, unsigned long, pte_t *, pte_t);
+pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
+void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, pte_t *, pte_t);
 
 #define __HAVE_ARCH_PTEP_CLEAR_FLUSH
 static inline pte_t ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index f2cc7da473e4..29c0a21cd34a 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -301,12 +301,13 @@ pte_t ptep_xchg_lazy(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_xchg_lazy);
 
-pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep)
 {
pgste_t pgste;
pte_t old;
int nodat;
+   struct mm_struct *mm = vma->vm_mm;
 
preempt_disable();
pgste = ptep_xchg_start(mm, addr, ptep);
@@ -320,10 +321,11 @@ pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(ptep_modify_prot_start);
 
-void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
 pte_t *ptep, pte_t pte)
 {
pgste_t pgste;
+   struct mm_struct *mm = vma->vm_mm;
 
if (!MACHINE_HAS_NX)
pte_val(pte) &= ~_PAGE_NOEXEC;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index e375d4266b53..c5d203a51e50 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -421,10 +421,11 @@ static inline pgdval_t pgd_val(pgd_t pgd)
 }
 
 #define  __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
-static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr,
+static inline pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
   pte_t *ptep)
 {
pteval_t ret;
+   struct mm_struct *mm = vma->vm_mm;
 
ret = PVOP_CALL3(pteval_t, pv_mmu_ops.ptep_modify_prot_start,
 mm, addr, ptep);
@@ -432,9 +433,11 @@ static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long a
return (pte_t) { .pte = ret };
 }
 
-static inline void ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr,
+static inline void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
   pte_t *ptep, pte_t pte)
 {
+   struct mm_struct *mm = vma->vm_mm;
+
if (sizeof(pteval_t) > sizeof(long))
/* 5 arg words */
pv_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5ea1d64cb0b4..229df16e7ad0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -938,10 +938,10 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
pte_t ptent = *pte;
 
if (pte_present(ptent)) {
-   ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
+   ptent = ptep_modify_prot_start(vma, addr, pte);
ptent = pte_wrprotect(ptent);
ptent = pte_clear_soft_dirty(ptent);
-   ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
+   ptep_modify_prot_commit(vma, addr, pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 88ebc6102c7c..021b94cd3260 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -606,22 +606,22 @@ static inline void __ptep_modify_prot_commit(struct mm_struct *mm,
  * queue the update to be done at some later time.  The update must be
  * actually committed before the pte lock is released, however.
  */
-static inline pte_t ptep_modify_prot_start(struct mm_s

[PATCH 0/5] NestMMU pte upgrade workaround for mprotect and autonuma

2018-10-10 Thread Aneesh Kumar K.V
We can upgrade pte access (R -> RW transition) via mprotect or autonuma. We need
to make sure we follow the recommended pte update sequence as outlined in
commit bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle
nest MMU hang") for such updates. This patch series does that.
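
As a rough illustration of the check this series relies on, here is a minimal
standalone sketch of deciding whether a pte change is an access upgrade that
needs a TLB flush before the new value is written. The DEMO_* bit values and
the demo_* function names are hypothetical and exist only for this example;
the real bit layout and helpers live in the powerpc headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits, for illustration only. */
#define DEMO_PAGE_READ  0x1u
#define DEMO_PAGE_WRITE 0x2u

/* True when the new value grants a read or write bit the old value lacked. */
static bool demo_is_pte_upgrade(uint64_t old_val, uint64_t new_val)
{
	uint64_t rw = DEMO_PAGE_READ | DEMO_PAGE_WRITE;

	return ((new_val & ~old_val) & rw) != 0;
}

/*
 * The flush is only needed when access is being relaxed and at least one
 * coprocessor (nest MMU user) is attached to the address space.
 */
static bool demo_needs_flush(uint64_t old_val, uint64_t new_val, int copros)
{
	assert(copros >= 0);
	return demo_is_pte_upgrade(old_val, new_val) && copros > 0;
}
```

The point of passing the old pte value down to the commit helper is that this
comparison needs both values at the place where the new pte is installed.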

Aneesh Kumar K.V (5):
  mm: Update ptep_modify_prot_start/commit to take vm_area_struct as arg
  mm: update ptep_modify_prot_commit to take old pte value as arg
  arch/powerpc/mm: Nest MMU workaround for mprotect/autonuma RW upgrade.
  mm/hugetlb: Add prot_modify_start/commit sequence for hugetlb update
  arch/powerpc/mm/hugetlb: NestMMU workaround for hugetlb mprotect RW
upgrade

 arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 ++
 arch/powerpc/include/asm/hugetlb.h   |  2 +-
 arch/powerpc/mm/hugetlbpage.c| 35 
 arch/powerpc/mm/pgtable-book3s64.c   | 34 +++
 arch/s390/include/asm/pgtable.h  |  5 +--
 arch/s390/mm/pgtable.c   |  8 +++--
 arch/x86/include/asm/paravirt.h  |  9 +++--
 fs/proc/task_mmu.c   |  8 +++--
 include/asm-generic/pgtable.h| 10 +++---
 include/linux/hugetlb.h  | 18 ++
 mm/hugetlb.c |  8 +++--
 mm/memory.c  |  8 ++---
 mm/mprotect.c|  6 ++--
 14 files changed, 150 insertions(+), 27 deletions(-)

-- 
2.17.1



[PATCH v3] powerpc/Makefile: Fix PPC_BOOK3S_64 ASFLAGS

2018-10-10 Thread Joel Stanley
Ever since commit 15a3204d24a3 ("powerpc/64s: Set assembler machine type
to POWER4") we force -mpower4 to be passed to the assembler irrespective
of the CFLAGS used.

When building a powerpc64 kernel with clang, clang will not add -many to
the assembler flags, so any instructions that the compiler has generated
that are not available on power4 will cause an error:

 /usr/bin/as -a64 -mppc64 -mlittle-endian -mpower8 \
  -I ./arch/powerpc/include -I ./arch/powerpc/include/generated \
  -I ./include -I ./arch/powerpc/include/uapi \
  -I ./arch/powerpc/include/generated/uapi -I ./include/uapi \
  -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc \
  -maltivec -mpower4 -o init/do_mounts.o /tmp/do_mounts-3b0a3d.s
 /tmp/do_mounts-51ce54.s:748: Error: unrecognized opcode: `isel'

GCC does include -many, so the GCC driven gas call will succeed:

  as -v -I ./arch/powerpc/include -I ./arch/powerpc/include/generated -I
  ./include -I ./arch/powerpc/include/uapi
  -I ./arch/powerpc/include/generated/uapi -I ./include/uapi
  -I ./include/generated/uapi -I arch/powerpc -I arch/powerpc
   -a64 -mpower8 -many -mlittle -maltivec -mpower4 -o init/do_mounts.o

Note that isel is power7 and above for IBM CPUs. GCC only generates it
for Power9 and above, but the above test was run against the clang
generated assembly.

Peter Bergner explains:

 > When using -many -mpower4, gas will first try and find a matching
 > power4 mnemonic and failing that, it will then allow any valid mnemonic
 > that gas knows about.  GCC's use of -many predates me though.
 >
 > IIRC, Alan looked at trying to remove it, but I forget why he didn't.
 > Could be either a gcc or gas issue at the time.  I'm not sure whether
 > issue still exists or not.  He and I have modified how gas works
 > internally a fair amount since he tried removing gcc use of -many
 >
 > I will also note that when using -many, gas will choose the first
 > mnemonic that matches in the mnemonic table and we have (mostly) sorted
 > the table so that server mnemonics show up earlier in the table than
 > other mnemonics, so they'll be seen/chosen first

By explicitly setting -many we can build with Clang and GCC while
retaining the -mpower4 option.

Signed-off-by: Joel Stanley 
---
Following up on:
https://lore.kernel.org/linuxppc-dev/20180914040649.1794-2-j...@jms.id.au/
https://lore.kernel.org/linuxppc-dev/20180821010203.23213-1-an...@ozlabs.org/

mpe, please trim the commit message if you think it's a bit verbose

We could test for these flags in case -many is removed in the future,
but if an assembler does not support -many then -mpower4 will probably
break it anyway.

 arch/powerpc/Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 81552c7b46eb..ae097fa9abe9 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -249,7 +249,10 @@ cpu-as-$(CONFIG_4xx)   += -Wa,-m405
 cpu-as-$(CONFIG_ALTIVEC)   += $(call as-option,-Wa$(comma)-maltivec)
 cpu-as-$(CONFIG_E200)  += -Wa,-me200
 cpu-as-$(CONFIG_E500)  += -Wa,-me500
-cpu-as-$(CONFIG_PPC_BOOK3S_64) += -Wa,-mpower4
+# When using -many -mpower4 gas will first try and find a matching power4
+# mnemonic and failing that it will allow any valid mnemonic that GAS knows
+# about. GCC will pass -many to GAS when assembling, clang does not
+cpu-as-$(CONFIG_PPC_BOOK3S_64) += -Wa,-mpower4 -Wa,-many
 cpu-as-$(CONFIG_PPC_E500MC)+= $(call as-option,-Wa$(comma)-me500mc)
 
 KBUILD_AFLAGS += $(cpu-as-y)
-- 
2.17.1



Re: [PATCH 4/4] powerpc: Add -Wimplicit-fallthrough to arch CFLAGS

2018-10-10 Thread Kees Cook
On Wed, Oct 10, 2018 at 5:32 PM, Michael Ellerman  wrote:
> Kees Cook  writes:
>> On Tue, Oct 9, 2018 at 10:13 PM, Michael Ellerman  wrote:
>>> Warn whenever a switch statement has a fallthrough without a comment
>>> annotating it.
>>>
>>> Signed-off-by: Michael Ellerman 
>>
>> Yes please. :)
>>
>> Reviewed-by: Kees Cook 
>
> Haha, thanks.
>
> It still pops a few errors, including in linux/signal.h & compat.h, so
> it's somewhat aspirational until we can get those fixed up :)

Gustavo, any chance you can target those two files? It could get us a
whole architecture using the flag. :)

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 4/4] powerpc: Add -Wimplicit-fallthrough to arch CFLAGS

2018-10-10 Thread Michael Ellerman
Kees Cook  writes:
> On Tue, Oct 9, 2018 at 10:13 PM, Michael Ellerman  wrote:
>> Warn whenever a switch statement has a fallthrough without a comment
>> annotating it.
>>
>> Signed-off-by: Michael Ellerman 
>
> Yes please. :)
>
> Reviewed-by: Kees Cook 

Haha, thanks.

It still pops a few errors, including in linux/signal.h & compat.h, so
it's somewhat aspirational until we can get those fixed up :)
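
For anyone following along, this is roughly the pattern the warning is after.
It is a made-up function (demo_count_steps is not from any kernel file), and
it assumes the compiler's default comment matching for
-Wimplicit-fallthrough, which accepts a "fall through" comment placed right
before the next case label:

```c
#include <assert.h>

/* Hypothetical example: count how many setup steps a level implies. */
static int demo_count_steps(int level)
{
	int steps = 0;

	assert(level >= 0);

	switch (level) {
	case 2:
		steps++;
		/* fall through */
	case 1:
		steps++;
		break;
	default:
		break;
	}

	return steps;
}
```

Without the comment, the fall from case 2 into case 1 is what would get
flagged.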

cheers


Re: [PATCH 3/4] powerpc: Add -Wvla to arch CFLAGS

2018-10-10 Thread Michael Ellerman
Kees Cook  writes:
> On Tue, Oct 9, 2018 at 10:13 PM, Michael Ellerman  wrote:
>> Upstream has declared that Variable Length Arrays (VLAs) are a bad
>> idea, and eventually -Wvla will be added to the top-level Makefile. We
>> can go one better and make sure we don't introduce any more by adding
>> it to the arch Makefile.
>>
>> Signed-off-by: Michael Ellerman 
>> ---
>>  arch/powerpc/Kbuild | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kbuild b/arch/powerpc/Kbuild
>> index 1625a06802ca..86b261d6bde5 100644
>> --- a/arch/powerpc/Kbuild
>> +++ b/arch/powerpc/Kbuild
>> @@ -1,4 +1,5 @@
>> -subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
>> +subdir-ccflags-y := $(call cc-option, -Wvla)
>> +subdir-ccflags-$(CONFIG_PPC_WERROR) += -Werror
>>
>>  obj-y += kernel/
>>  obj-y += mm/
>
> -Wvla will be going into the top-level Makefile in the merge window
> (see linux-next), so this will be redundant.

Thanks, yeah I saw that after I posted. Will drop this one.
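
For reference, a small sketch of the kind of change -Wvla pushes towards: a
fixed upper bound instead of a stack array sized at run time. The names
(DEMO_MAX_ITEMS, demo_sum_first) are invented for this example and do not
come from the tree.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ITEMS 16

/*
 * Sum up to DEMO_MAX_ITEMS entries via a bounded scratch buffer.
 * Declaring the buffer as "int tmp[n];" instead would be a VLA and is
 * what -Wvla reports.
 */
static int demo_sum_first(const int *src, size_t n)
{
	int tmp[DEMO_MAX_ITEMS];
	int sum = 0;
	size_t i;

	assert(src != NULL);

	if (n > DEMO_MAX_ITEMS)
		n = DEMO_MAX_ITEMS;

	for (i = 0; i < n; i++)
		tmp[i] = src[i];
	for (i = 0; i < n; i++)
		sum += tmp[i];

	return sum;
}
```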

cheers


Re: [PATCH 1/2] powerpc/boot: Expose Kconfig symbols to wrapper

2018-10-10 Thread Joel Stanley
On Thu, 11 Oct 2018 at 10:32, Michael Ellerman  wrote:
>
> Michael Ellerman  writes:
> > Joel Stanley  writes:
> >> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> >> index 0fb96c26136f..eeed74e0dfca 100644
> >> --- a/arch/powerpc/boot/Makefile
> >> +++ b/arch/powerpc/boot/Makefile
> >> @@ -197,9 +197,14 @@ $(obj)/empty.c:
> >>  $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: $(srctree)/$(src)/%.S
> >>  $(Q)cp $< $@
> >>
> >> +$(obj)/serial.c: $(obj)/autoconf.h
> >> +
> >> +$(obj)/autoconf.h: $(obj)/%: $(srctree)/include/generated/%
> >> +$(Q)cp $< $@
> >> +
> >
> > This gives me:
> >   make[2]: *** No rule to make target '../include/generated/autoconf.h', needed by 'arch/powerpc/boot/autoconf.h'.  Stop.
> >
> > The ../ is $(srctree).
>
> Seems autoconf.h is in objtree:
>
>   ~/linux$ make O=build prepare
>   ...
>   ~/linux$ find . -name autoconf.h
>   ./drivers/staging/rtl8723bs/include/autoconf.h
>   ./tools/testing/radix-tree/generated/autoconf.h
>   ./build/include/generated/autoconf.h

Ah. That's obvious now that you point it out. Evidently both 0day and I
do in-tree builds.

> So I'll fix that up.

Thanks!


Re: [PATCH 1/2] powerpc/boot: Expose Kconfig symbols to wrapper

2018-10-10 Thread Michael Ellerman
Michael Ellerman  writes:
> Joel Stanley  writes:
>> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
>> index 0fb96c26136f..eeed74e0dfca 100644
>> --- a/arch/powerpc/boot/Makefile
>> +++ b/arch/powerpc/boot/Makefile
>> @@ -197,9 +197,14 @@ $(obj)/empty.c:
>>  $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: $(srctree)/$(src)/%.S
>>  $(Q)cp $< $@
>>  
>> +$(obj)/serial.c: $(obj)/autoconf.h
>> +
>> +$(obj)/autoconf.h: $(obj)/%: $(srctree)/include/generated/%
>> +$(Q)cp $< $@
>> +
>
> This gives me:
>   make[2]: *** No rule to make target '../include/generated/autoconf.h', needed by 'arch/powerpc/boot/autoconf.h'.  Stop.
>
> The ../ is $(srctree).

Seems autoconf.h is in objtree:

  ~/linux$ make O=build prepare
  ...
  ~/linux$ find . -name autoconf.h
  ./drivers/staging/rtl8723bs/include/autoconf.h
  ./tools/testing/radix-tree/generated/autoconf.h
  ./build/include/generated/autoconf.h


So I'll fix that up.

cheers


Re: [PATCH 1/2] powerpc/boot: Disable vector instructions

2018-10-10 Thread Michael Ellerman
Segher Boessenkool  writes:
> On Thu, Oct 11, 2018 at 08:22:54AM +1030, Joel Stanley wrote:
>> On Wed, 10 Oct 2018 at 22:41, Michael Ellerman  wrote:
>> > Joel Stanley  writes:
>> > >  BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
>> > > -  -fno-strict-aliasing -Os -msoft-float -pipe \
>> > > -  -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
>> > > +  -fno-strict-aliasing -Os -msoft-float -mno-altivec -mno-vsx \
>> >
>> > That's going to break if the compiler doesn't understand -mno-vsx isn't it?
>> >
>> > I'm not sure if we "support" a compiler that old though.
>> 
>> Segher, the kernel mandates 4.6 as the minimum. Do we need to worry
>> about the compiler not supporting  -mno-altivec -mno-vsx?
>
> -mvsx is gcc 4.5 and later.
> https://www.gnu.org/software/gcc/gcc-4.5/changes.html
>
> -maltivec is...  Hrm, not so easy to find...  gcc 3.1 and later it seems.
> https://www.gnu.org/software/gcc/gcc-3.1/changes.html

Thanks.

cheers


Re: [PATCH 2/2] powerpc/pseries: Add driver for PAPR SCM regions

2018-10-10 Thread Michael Ellerman
Oliver O'Halloran  writes:

> Adds a driver that implements support for enabling and accessing PAPR
> SCM regions. Unfortunately due to how the PAPR interface works we can't
> use the existing of_pmem driver (yet) because:
>
>  a) The guest is required to use the H_SCM_BIND_MEM h-call to add
> the SCM region to its physical address space, and
>  b) There is currently no mechanism for relating a bare of_pmem region
> to the backing DIMM (or not-a-DIMM for our case).
>
> Both of these are easily handled by rolling the functionality into a
> separate driver so here we are...
>
> Signed-off-by: Oliver O'Halloran 
> ---
> The alternative implementation here is that we have the pseries code
> do the h-calls and craft a pmem-region@ node based on that.
> However, that doesn't solve b) and mpe has expressed his dislike of
> adding new stuff to the DT at runtime so I'd say that's a non-starter.

Hmm, from memory you yelled something at me across the office about
that, so my response may not have been entirely well considered.

I'm not quite sure what the state of the OF overlays support is, but
that would be The Right Way to do that sort of modification to the
device tree at runtime.

If we merged this and then later got the of_pmem driver to work for us
would there be any user-visible change?

cheers


[PATCH v2 2/2] powerpc/pseries: Add driver for PAPR SCM regions

2018-10-10 Thread Oliver O'Halloran
Adds a driver that implements support for enabling and accessing PAPR
SCM regions. Unfortunately due to how the PAPR interface works we can't
use the existing of_pmem driver (yet) because:

 a) The guest is required to use the H_SCM_BIND_MEM h-call to add
the SCM region to its physical address space, and
 b) There is currently no mechanism for relating a bare of_pmem region
to the backing DIMM (or not-a-DIMM for our case).

Both of these are easily handled by rolling the functionality into a
separate driver so here we are...

Signed-off-by: Oliver O'Halloran 
---
The alternative implementation here is that we have the pseries code
do the h-calls and craft a pmem-region@ node based on that.
However, that doesn't solve b) and mpe has expressed his dislike of
adding new stuff to the DT at runtime so I'd say that's a non-starter.
---
 arch/powerpc/platforms/pseries/Kconfig|   7 +
 arch/powerpc/platforms/pseries/Makefile   |   1 +
 arch/powerpc/platforms/pseries/papr_scm.c | 340 ++
 3 files changed, 348 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/papr_scm.c

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 0c698fd6d491..4b0fcb80efe5 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -140,3 +140,10 @@ config IBMEBUS
bool "Support for GX bus based adapters"
help
  Bus device driver for GX bus based adapters.
+
+config PAPR_SCM
+   depends on PPC_PSERIES && MEMORY_HOTPLUG
+   select LIBNVDIMM
+   tristate "Support for the PAPR Storage Class Memory interface"
+   help
+ Enable access to hypervisor provided storage class memory.
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 892b27ced973..a43ec843c8e2 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_IO_EVENT_IRQ)+= io_event_irq.o
 obj-$(CONFIG_LPARCFG)  += lparcfg.o
 obj-$(CONFIG_IBMVIO)   += vio.o
 obj-$(CONFIG_IBMEBUS)  += ibmebus.o
+obj-$(CONFIG_PAPR_SCM) += papr_scm.o
 
 ifdef CONFIG_PPC_PSERIES
 obj-$(CONFIG_SUSPEND)  += suspend.o
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
new file mode 100644
index ..87d4dbc18845
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -0,0 +1,340 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define pr_fmt(fmt) "papr-scm: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define BIND_ANY_ADDR (~0ul)
+
+#define PAPR_SCM_DIMM_CMD_MASK \
+   ((1ul << ND_CMD_GET_CONFIG_SIZE) | \
+(1ul << ND_CMD_GET_CONFIG_DATA) | \
+(1ul << ND_CMD_SET_CONFIG_DATA))
+
+struct papr_scm_priv {
+   struct platform_device *pdev;
+   struct device_node *dn;
+   uint32_t drc_index;
+   uint64_t blocks;
+   uint64_t block_size;
+   int metadata_size;
+
+   uint64_t bound_addr;
+
+   struct nvdimm_bus_descriptor bus_desc;
+   struct nvdimm_bus *bus;
+   struct nvdimm *nvdimm;
+   struct resource res;
+   struct nd_region *region;
+   struct nd_interleave_set nd_set;
+};
+
+static int drc_pmem_bind(struct papr_scm_priv *p)
+{
+   unsigned long ret[PLPAR_HCALL_BUFSIZE];
+   uint64_t rc, token;
+
+   /*
+* When the hypervisor cannot map all the requested memory in a single
+* hcall it returns H_BUSY and we call again with the token until
+* we get H_SUCCESS. Aborting the retry loop before getting H_SUCCESS
+* leaves the system in an undefined state, so we wait.
+*/
+   token = 0;
+
+   do {
+   rc = plpar_hcall(H_SCM_BIND_MEM, ret, p->drc_index, 0,
+   p->blocks, BIND_ANY_ADDR, token);
+   token = be64_to_cpu(ret[0]);
+   } while (rc == H_BUSY);
+
+   if (rc) {
+   dev_err(&p->pdev->dev, "bind err: %lld\n", rc);
+   return -ENXIO;
+   }
+
+   p->bound_addr = be64_to_cpu(ret[1]);
+
+   dev_dbg(&p->pdev->dev, "bound drc %x to %pR\n", p->drc_index, &p->res);
+
+   return 0;
+}
+
+static int drc_pmem_unbind(struct papr_scm_priv *p)
+{
+   unsigned long ret[PLPAR_HCALL_BUFSIZE];
+   uint64_t rc, token;
+
+   token = 0;
+
+   /* NB: unbind has the same retry requirements mentioned above */
+   do {
+   rc = plpar_hcall(H_SCM_UNBIND_MEM, ret, p->drc_index,
+   p->bound_addr, p->blocks, token);
+   token = be64_to_cpu(ret[0]);
+   } while (rc == H_BUSY);
+
+   if (rc)
+   dev_err(&p->pdev->dev, "unbind error: %lld\n", rc);
+
+   return !!rc;
+}
+
+static int papr_scm_meta_get(struct papr_scm_priv *p,
+   s
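
To make the retry requirement in drc_pmem_bind() easier to follow, here is a
small userspace sketch of the same loop shape against a fake hypervisor.
Everything prefixed demo_/DEMO_ is invented for this example (including the
return codes and the way the continuation token is produced); it only models
the "call again with the returned token until the call stops reporting busy"
pattern, not the real H_SCM_BIND_MEM semantics.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_H_SUCCESS 0
#define DEMO_H_BUSY    1

/* Fake hypervisor state: how many calls are left and the expected token. */
struct demo_hv {
	int calls_until_done;
	uint64_t next_token;
};

/* Fake bind call: reports busy with a new token until the work is done. */
static long demo_hcall_bind(struct demo_hv *hv, uint64_t token, uint64_t *ret)
{
	/* The caller must hand back the token from the previous call. */
	assert(token == hv->next_token);

	if (--hv->calls_until_done > 0) {
		ret[0] = ++hv->next_token;
		return DEMO_H_BUSY;
	}

	ret[0] = 0;
	ret[1] = 0x1000;	/* pretend bound address */
	return DEMO_H_SUCCESS;
}

/* Retry until the fake call stops reporting busy, then record the address. */
static int demo_bind(struct demo_hv *hv, uint64_t *bound_addr, int *ncalls)
{
	uint64_t ret[2] = { 0, 0 };
	uint64_t token = 0;
	long rc;

	*ncalls = 0;
	do {
		rc = demo_hcall_bind(hv, token, ret);
		token = ret[0];
		(*ncalls)++;
	} while (rc == DEMO_H_BUSY);

	if (rc != DEMO_H_SUCCESS)
		return -1;

	*bound_addr = ret[1];
	return 0;
}
```

Note that the token is read from the first element of the return buffer on
every iteration, which is also what the unbind path needs to do.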

[PATCH v2 1/2] powerpc/pseries: PAPR persistent memory support

2018-10-10 Thread Oliver O'Halloran
This patch implements support for discovering storage class memory
devices at boot and for handling hotplug of new regions via RTAS
hotplug events.

Signed-off-by: Oliver O'Halloran 
---
v2: Added missing pmem.c file
---
 arch/powerpc/include/asm/firmware.h   |   3 +-
 arch/powerpc/include/asm/hvcall.h |  10 +-
 arch/powerpc/include/asm/rtas.h   |   2 +
 arch/powerpc/kernel/rtasd.c   |   2 +
 arch/powerpc/platforms/pseries/Makefile   |   2 +-
 arch/powerpc/platforms/pseries/dlpar.c|   4 +
 arch/powerpc/platforms/pseries/firmware.c |   1 +
 arch/powerpc/platforms/pseries/pmem.c | 164 ++
 arch/powerpc/platforms/pseries/pseries.h  |   5 +
 arch/powerpc/platforms/pseries/ras.c  |   3 +-
 10 files changed, 192 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/platforms/pseries/pmem.c

diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index 7a051bd21f87..113c64d5d394 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -52,6 +52,7 @@
 #define FW_FEATURE_PRRN        ASM_CONST(0x0002)
 #define FW_FEATURE_DRMEM_V2    ASM_CONST(0x0004)
 #define FW_FEATURE_DRC_INFO    ASM_CONST(0x0008)
+#define FW_FEATURE_PAPR_SCM    ASM_CONST(0x0010)
 
 #ifndef __ASSEMBLY__
 
@@ -69,7 +70,7 @@ enum {
FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 |
-   FW_FEATURE_DRC_INFO,
+   FW_FEATURE_DRC_INFO | FW_FEATURE_PAPR_SCM,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
FW_FEATURE_POWERNV_ALWAYS = 0,
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index a0b17f9f1ea4..0e81ef83b35a 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -295,7 +295,15 @@
 #define H_INT_ESB   0x3C8
 #define H_INT_SYNC  0x3CC
 #define H_INT_RESET 0x3D0
-#define MAX_HCALL_OPCODE   H_INT_RESET
+#define H_SCM_READ_METADATA     0x3E4
+#define H_SCM_WRITE_METADATA    0x3E8
+#define H_SCM_BIND_MEM          0x3EC
+#define H_SCM_UNBIND_MEM        0x3F0
+#define H_SCM_QUERY_BLOCK_MEM_BINDING 0x3F4
+#define H_SCM_QUERY_LOGICAL_MEM_BINDING 0x3F8
+#define H_SCM_MEM_QUERY         0x3FC
+#define H_SCM_BLOCK_CLEAR       0x400
+#define MAX_HCALL_OPCODE   H_SCM_BLOCK_CLEAR
 
 /* H_VIOCTL functions */
 #define H_GET_VIOA_DUMP_SIZE   0x01
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 71e393c46a49..1e81f3d55457 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -125,6 +125,7 @@ struct rtas_suspend_me_data {
 #define RTAS_TYPE_INFO 0xE2
 #define RTAS_TYPE_DEALLOC  0xE3
 #define RTAS_TYPE_DUMP 0xE4
+#define RTAS_TYPE_HOTPLUG  0xE5
 /* I don't add PowerMGM events right now, this is a different topic */ 
 #define RTAS_TYPE_PMGM_POWER_SW_ON 0x60
 #define RTAS_TYPE_PMGM_POWER_SW_OFF 0x61
@@ -316,6 +317,7 @@ struct pseries_hp_errorlog {
 #define PSERIES_HP_ELOG_RESOURCE_MEM   2
 #define PSERIES_HP_ELOG_RESOURCE_SLOT  3
 #define PSERIES_HP_ELOG_RESOURCE_PHB   4
+#define PSERIES_HP_ELOG_RESOURCE_PMEM   6
 
 #define PSERIES_HP_ELOG_ACTION_ADD 1
 #define PSERIES_HP_ELOG_ACTION_REMOVE  2
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 6fafc82c04b0..fad0baddfcba 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -91,6 +91,8 @@ static char *rtas_event_type(int type)
return "Dump Notification Event";
case RTAS_TYPE_PRRN:
return "Platform Resource Reassignment Event";
+   case RTAS_TYPE_HOTPLUG:
+   return "Hotplug Event";
}
 
return rtas_type[0];
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 7e89d5c47068..892b27ced973 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_KEXEC_CORE)  += kexec.o
 obj-$(CONFIG_PSERIES_ENERGY)   += pseries_energy.o
 
 obj-$(CONFIG_HOTPLUG_CPU)  += hotplug-cpu.o
-obj-$(CONFIG_MEMORY_HOTPLUG)   += hotplug-memory.o
+obj-$(CONFIG_MEMORY_HOTPLUG)   += hotplug-memory.o pmem.o
 
 obj-$(CONFIG_HVC_CONSOLE)  += hvconsole.o
 obj-$(CONFIG_HVCS) += hvcserver.o
diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index a0b20c03f078..795996fefdb9 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -357,6 +357,10 @@ static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog)

Re: [PATCH 1/2] powerpc/pseries: PAPR persistent memory support

2018-10-10 Thread Oliver
On Thu, Oct 11, 2018 at 3:36 AM Nathan Fontenot
 wrote:
>
> On 10/10/2018 01:08 AM, Oliver O'Halloran wrote:
> > This patch implements support for discovering storage class memory
> > devices at boot and for handling hotplug of new regions via RTAS
> > hotplug events.
> >
> > Signed-off-by: Oliver O'Halloran 
> > ---
> >  arch/powerpc/include/asm/firmware.h   |  3 ++-
> >  arch/powerpc/include/asm/hvcall.h | 10 +-
> >  arch/powerpc/include/asm/rtas.h   |  2 ++
> >  arch/powerpc/kernel/rtasd.c   |  2 ++
> >  arch/powerpc/platforms/pseries/Makefile   |  2 +-
> >  arch/powerpc/platforms/pseries/dlpar.c|  4 
> >  arch/powerpc/platforms/pseries/firmware.c |  1 +
> >  arch/powerpc/platforms/pseries/pseries.h  |  5 +
> >  arch/powerpc/platforms/pseries/ras.c  |  3 ++-
> >  9 files changed, 28 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
> > index 7a051bd21f87..113c64d5d394 100644
> > --- a/arch/powerpc/include/asm/firmware.h
> > +++ b/arch/powerpc/include/asm/firmware.h
> > @@ -52,6 +52,7 @@
> >  #define FW_FEATURE_PRRN  ASM_CONST(0x0002)
> >  #define FW_FEATURE_DRMEM_V2  ASM_CONST(0x0004)
> >  #define FW_FEATURE_DRC_INFO  ASM_CONST(0x0008)
> > +#define FW_FEATURE_PAPR_SCM  ASM_CONST(0x0010)
> >
> >  #ifndef __ASSEMBLY__
> >
> > @@ -69,7 +70,7 @@ enum {
> >   FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
> >   FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
> >   FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 |
> > - FW_FEATURE_DRC_INFO,
> > + FW_FEATURE_DRC_INFO | FW_FEATURE_PAPR_SCM,
> >   FW_FEATURE_PSERIES_ALWAYS = 0,
> >   FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
> >   FW_FEATURE_POWERNV_ALWAYS = 0,
> > diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> > index a0b17f9f1ea4..0e81ef83b35a 100644
> > --- a/arch/powerpc/include/asm/hvcall.h
> > +++ b/arch/powerpc/include/asm/hvcall.h
> > @@ -295,7 +295,15 @@
> >  #define H_INT_ESB   0x3C8
> >  #define H_INT_SYNC  0x3CC
> >  #define H_INT_RESET 0x3D0
> > -#define MAX_HCALL_OPCODE H_INT_RESET
> > +#define H_SCM_READ_METADATA     0x3E4
> > +#define H_SCM_WRITE_METADATA    0x3E8
> > +#define H_SCM_BIND_MEM          0x3EC
> > +#define H_SCM_UNBIND_MEM        0x3F0
> > +#define H_SCM_QUERY_BLOCK_MEM_BINDING 0x3F4
> > +#define H_SCM_QUERY_LOGICAL_MEM_BINDING 0x3F8
> > +#define H_SCM_MEM_QUERY         0x3FC
> > +#define H_SCM_BLOCK_CLEAR       0x400
> > +#define MAX_HCALL_OPCODE H_SCM_BLOCK_CLEAR
> >
> >  /* H_VIOCTL functions */
> >  #define H_GET_VIOA_DUMP_SIZE 0x01
> > diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> > index 71e393c46a49..1e81f3d55457 100644
> > --- a/arch/powerpc/include/asm/rtas.h
> > +++ b/arch/powerpc/include/asm/rtas.h
> > @@ -125,6 +125,7 @@ struct rtas_suspend_me_data {
> >  #define RTAS_TYPE_INFO   0xE2
> >  #define RTAS_TYPE_DEALLOC    0xE3
> >  #define RTAS_TYPE_DUMP   0xE4
> > +#define RTAS_TYPE_HOTPLUG0xE5
> >  /* I don't add PowerMGM events right now, this is a different topic */
> >  #define RTAS_TYPE_PMGM_POWER_SW_ON   0x60
> >  #define RTAS_TYPE_PMGM_POWER_SW_OFF  0x61
> > @@ -316,6 +317,7 @@ struct pseries_hp_errorlog {
> >  #define PSERIES_HP_ELOG_RESOURCE_MEM  2
> >  #define PSERIES_HP_ELOG_RESOURCE_SLOT 3
> >  #define PSERIES_HP_ELOG_RESOURCE_PHB  4
> > +#define PSERIES_HP_ELOG_RESOURCE_PMEM 6
> >
> >  #define PSERIES_HP_ELOG_ACTION_ADD    1
> >  #define PSERIES_HP_ELOG_ACTION_REMOVE 2
> > diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
> > index 6fafc82c04b0..fad0baddfcba 100644
> > --- a/arch/powerpc/kernel/rtasd.c
> > +++ b/arch/powerpc/kernel/rtasd.c
> > @@ -91,6 +91,8 @@ static char *rtas_event_type(int type)
> >   return "Dump Notification Event";
> >   case RTAS_TYPE_PRRN:
> >   return "Platform Resource Reassignment Event";
> > + case RTAS_TYPE_HOTPLUG:
> > + return "Hotplug Event";
> >   }
> >
> >   return rtas_type[0];
> > diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
> > index 7e89d5c47068..892b27ced973 100644
> > --- a/arch/powerpc/platforms/pseries/Makefile
> > +++ b/arch/powerpc/platforms/pseries/Makefile
> > @@ -13,7 +13,7 @@ obj-$(CONFIG_KEXEC_CORE) += kexec.o
> >  obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o
> >
> >  obj-$(CONFIG_HOTPLUG_CPU) += hotplug-cpu.o
> > -obj-$(CONFIG_MEMORY_HOTPLUG) += hotplug-memory.o
> > +obj-$(CONFIG_MEMORY_HOTPLUG) += hotplug-memory.o pmem.o
> >
> >  obj-$(CONFIG_HVC_CONSOLE) += hvconsole.o
> >  obj-$(CONFIG_HVCS)

Re: [PATCH 1/2] powerpc/boot: Disable vector instructions

2018-10-10 Thread Segher Boessenkool
On Thu, Oct 11, 2018 at 08:22:54AM +1030, Joel Stanley wrote:
> On Wed, 10 Oct 2018 at 22:41, Michael Ellerman  wrote:
> > Joel Stanley  writes:
> > >  BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> > > -  -fno-strict-aliasing -Os -msoft-float -pipe \
> > > -  -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
> > > > +  -fno-strict-aliasing -Os -msoft-float -mno-altivec -mno-vsx \
> >
> > That's going to break if the compiler doesn't understand -mno-vsx isn't it?
> >
> > > I'm not sure if we "support" a compiler that old though.
> 
> Segher, the kernel mandates 4.6 as the minimum. Do we need to worry
> about the compiler not supporting  -mno-altivec -mno-vsx?

-mvsx is gcc 4.5 and later.
https://www.gnu.org/software/gcc/gcc-4.5/changes.html

-maltivec is...  Hrm, not so easy to find...  gcc 3.1 and later it seems.
https://www.gnu.org/software/gcc/gcc-3.1/changes.html

You should be fine.
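
A quick way to confirm the flags take effect with a given compiler is to look at the predefined macros. The sketch below is only illustrative and not part of the patch: it relies on GCC defining __ALTIVEC__ under -maltivec and __VSX__ under -mvsx, and the function name vector_isa_mask() is made up for this example.

```c
/*
 * Illustrative only: report which PowerPC vector ISAs the compiler
 * was allowed to use for this translation unit.
 * Bit 0 is set when __ALTIVEC__ is defined, bit 1 when __VSX__ is.
 */
static int vector_isa_mask(void)
{
	int mask = 0;

#ifdef __ALTIVEC__
	mask |= 1;	/* built with AltiVec enabled */
#endif
#ifdef __VSX__
	mask |= 2;	/* built with VSX enabled */
#endif
	return mask;
}
```

Built with -mno-altivec -mno-vsx, as BOOTCFLAGS would do after this patch, the expectation is that neither macro is defined and the mask is 0.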


Segher


Re: [PATCH 1/2] powerpc/boot: Disable vector instructions

2018-10-10 Thread Joel Stanley
On Wed, 10 Oct 2018 at 22:41, Michael Ellerman  wrote:
>
> Joel Stanley  writes:
>
> > This will avoid auto-vectorisation when building with higher
> > optimisation levels.
> >
> > We don't know if the machine can support VSX and even if it's present
> > it's probably not going to be enabled at this point in boot.
> >
> > Signed-off-by: Joel Stanley 
> > ---
> >  arch/powerpc/boot/Makefile | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> > index 0fb96c26136f..739ef8d43b91 100644
> > --- a/arch/powerpc/boot/Makefile
> > +++ b/arch/powerpc/boot/Makefile
> > @@ -32,8 +32,8 @@ else
> >  endif
> >
> >  BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> > -  -fno-strict-aliasing -Os -msoft-float -pipe \
> > -  -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
> > +  -fno-strict-aliasing -Os -msoft-float -mno-altivec -mno-vsx \
>
> That's going to break if the compiler doesn't understand -mno-vsx isn't it?
>
> I'm not sure if we "support" a compiler that old though.

Segher, the kernel mandates 4.6 as the minimum. Do we need to worry
about the compiler not supporting  -mno-altivec -mno-vsx?


Re: [PATCH 05.1/16] of:overlay: missing name, phandle, linux,phandle in new nodes

2018-10-10 Thread Frank Rowand
On 10/10/18 13:40, Alan Tull wrote:
> On Wed, Oct 10, 2018 at 1:49 AM Frank Rowand  wrote:
>>
>> On 10/09/18 23:04, frowand.l...@gmail.com wrote:
>>> From: Frank Rowand 
>>>
>>>
>>> "of: overlay: use prop add changeset entry for property in new nodes"
>>> fixed a problem where an 'update property' changeset entry was
>>> created for properties contained in nodes added by a changeset.
>>> The fix was to use an 'add property' changeset entry.
>>>
>>> This exposed more bugs in the apply overlay code.  The properties
>>> 'name', 'phandle', and 'linux,phandle' were filtered out by
>>> add_changeset_property() as special properties.  Change the filter
>>> to be only for existing nodes, not newly added nodes.
>>>
>>> The second bug is that the 'name' property does not exist in the
>>> newest FDT version, and has to be constructed from the node's
>>> full_name.  Construct an 'add property' changeset entry for
>>> newly added nodes.
>>>
>>> Signed-off-by: Frank Rowand 
>>> ---
>>>
>>>
>>> Hi Alan,
>>>
>>> Thanks for reporting the problem with missing node names.
>>>
>>> I was able to replicate the problem, and have created this preliminary
>>> version of a patch to fix the problem.
>>>
>>> I have not extensively reviewed the patch yet, but would appreciate
>>> if you can confirm this fixes your problem.
>>>
>>> I created this patch as patch 17 of the series, but have also
>>> applied it as patch 05.1, immediately after patch 05/16, and
>>> built the kernel, booted, and verified name and phandle for
>>> one of the nodes in a unittest overlay for both cases.  So
>>> minimal testing so far on my part.
>>>
>>> I have not verified whether the series builds and boots after
>>> each of patches 06..16 if this patch is applied as patch 05.1.
>>>
>>> There is definitely more work needed for me to complete this
>>> patch because it allocates some more memory, but does not yet
>>> free it when the overlay is released.
>>>
>>> -Frank
>>>
>>>
>>>  drivers/of/overlay.c | 72 
>>> 
>>>  1 file changed, 67 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>> index 0b0904f44bc7..9746cea2aa91 100644
>>> --- a/drivers/of/overlay.c
>>> +++ b/drivers/of/overlay.c
>>> @@ -301,10 +301,11 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>   struct property *new_prop = NULL, *prop;
>>>   int ret = 0;
>>>
>>> - if (!of_prop_cmp(overlay_prop->name, "name") ||
>>> - !of_prop_cmp(overlay_prop->name, "phandle") ||
>>> - !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>> - return 0;
>>> + if (target->in_livetree)
>>> + if (!of_prop_cmp(overlay_prop->name, "name") ||
>>> + !of_prop_cmp(overlay_prop->name, "phandle") ||
>>> + !of_prop_cmp(overlay_prop->name, "linux,phandle"))
>>> + return 0;
>>
>> This is a big hammer patch.
>>
>> Nobody should waste time reviewing this patch.
> 
> I wasn't clear if you still could use the testing so I did re-run my
> test.  This patch adds back some of the missing properties, but
> the kobject names aren't set as dev_name() returns NULL:
> 
> * without this patch some of_node properties don't show up in sysfs:
> root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
> clocks  compatible  interrupt-parent  interrupts  reg
> 
> * with this patch, the of_node properties phandle and name are back:
> root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
> clocks  compatible  interrupt-parent  interrupts  name  phandle  reg

Thanks for the testing.  I'll keep chasing after this problem today.

This is useful data for me as I was not looking under the /sys/bus/...
tree that you reported, but was instead looking at /proc/device-tree/...
which showed the same type of problem since the overlay I was using
does not show up under /sys/bus/...

I'll have to create a useful overlay test case that will show up under
/sys/bus/...

In the meantime, can you send me the base FDT and the overlay FDT for
your test case?

Thanks,

Frank


> 
> root@arria10:~# cat /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node/name
> freeze_controllerroot@arria10:~#  ("freeze_controller" w/o the \n so
> the name is correct)
> 
> * with or without the patch I see the behavior I reported yesterday,
> kobj names are NULL.
> root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/
> bind ff200450.  uevent   unbind
> 
> root@arria10:~# ls /sys/bus/platform/drivers/altera_gpio/
> bind ff200010.  ff200020.  ff200030.
> uevent   unbind
> 
> Alan
> 
>>
>> The following part should not be needed (though the above section might have
>> to become _slightly_ more complex).
>>
>> -Frank
>>>
>>>   if (target->in_livetree)
>>>   

Re: [PATCH 05.1/16] of:overlay: missing name, phandle, linux,phandle in new nodes

2018-10-10 Thread Alan Tull
On Wed, Oct 10, 2018 at 1:49 AM Frank Rowand  wrote:
>
> On 10/09/18 23:04, frowand.l...@gmail.com wrote:
> > From: Frank Rowand 
> >
> >
> > "of: overlay: use prop add changeset entry for property in new nodes"
> > fixed a problem where an 'update property' changeset entry was
> > created for properties contained in nodes added by a changeset.
> > The fix was to use an 'add property' changeset entry.
> >
> > This exposed more bugs in the apply overlay code.  The properties
> > 'name', 'phandle', and 'linux,phandle' were filtered out by
> > add_changeset_property() as special properties.  Change the filter
> > to be only for existing nodes, not newly added nodes.
> >
> > The second bug is that the 'name' property does not exist in the
> > newest FDT version, and has to be constructed from the node's
> > full_name.  Construct an 'add property' changeset entry for
> > newly added nodes.
> >
> > Signed-off-by: Frank Rowand 
> > ---
> >
> >
> > Hi Alan,
> >
> > Thanks for reporting the problem with missing node names.
> >
> > I was able to replicate the problem, and have created this preliminary
> > version of a patch to fix the problem.
> >
> > I have not extensively reviewed the patch yet, but would appreciate
> > if you can confirm this fixes your problem.
> >
> > I created this patch as patch 17 of the series, but have also
> > applied it as patch 05.1, immediately after patch 05/16, and
> > built the kernel, booted, and verified name and phandle for
> > one of the nodes in a unittest overlay for both cases.  So
> > minimal testing so far on my part.
> >
> > I have not verified whether the series builds and boots after
> > each of patches 06..16 if this patch is applied as patch 05.1.
> >
> > There is definitely more work needed for me to complete this
> > patch because it allocates some more memory, but does not yet
> > free it when the overlay is released.
> >
> > -Frank
> >
> >
> >  drivers/of/overlay.c | 72 
> > 
> >  1 file changed, 67 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > index 0b0904f44bc7..9746cea2aa91 100644
> > --- a/drivers/of/overlay.c
> > +++ b/drivers/of/overlay.c
> > @@ -301,10 +301,11 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
> >   struct property *new_prop = NULL, *prop;
> >   int ret = 0;
> >
> > - if (!of_prop_cmp(overlay_prop->name, "name") ||
> > - !of_prop_cmp(overlay_prop->name, "phandle") ||
> > - !of_prop_cmp(overlay_prop->name, "linux,phandle"))
> > - return 0;
> > + if (target->in_livetree)
> > + if (!of_prop_cmp(overlay_prop->name, "name") ||
> > + !of_prop_cmp(overlay_prop->name, "phandle") ||
> > + !of_prop_cmp(overlay_prop->name, "linux,phandle"))
> > + return 0;
>
> This is a big hammer patch.
>
> Nobody should waste time reviewing this patch.

I wasn't clear if you still could use the testing so I did re-run my
test.  This patch adds back some of the missing properties, but
the kobject names aren't set as dev_name() returns NULL:

* without this patch some of_node properties don't show up in sysfs:
root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
clocks  compatible  interrupt-parent  interrupts  reg

* with this patch, the of_node properties phandle and name are back:
root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node
clocks  compatible  interrupt-parent  interrupts  name  phandle  reg

root@arria10:~# cat /sys/bus/platform/drivers/altera_freeze_br/ff200450.\/of_node/name
freeze_controllerroot@arria10:~#  ("freeze_controller" w/o the \n so
the name is correct)

* with or without the patch I see the behavior I reported yesterday,
kobj names are NULL.
root@arria10:~# ls /sys/bus/platform/drivers/altera_freeze_br/
bind ff200450.  uevent   unbind

root@arria10:~# ls /sys/bus/platform/drivers/altera_gpio/
bind ff200010.  ff200020.  ff200030.
uevent   unbind

Alan

>
> The following part should not be needed (though the above section might have
> to become _slightly_ more complex).
>
> -Frank
> >
> >   if (target->in_livetree)
> >   prop = of_find_property(target->np, overlay_prop->name, NULL);
> > @@ -443,10 +444,13 @@ static int build_changeset_next_level(struct overlay_changeset *ovcs,
> >   struct target *target, const struct device_node *overlay_node)
> >  {
> >   struct device_node *child;
> > - struct property *prop;
> > + struct property *prop, *name_prop;
> > + bool has_name = false;
> >   int ret;
> >
> >   for_each_property_of_node(overlay_node, prop) {
> > + if (!strcmp(prop->name, "name"))
> > + has_name = true;
> >   ret = add_changeset_property(ovcs

[RFC PATCH for 4.21 09/16] powerpc: Wire up cpu_opv system call

2018-10-10 Thread Mathieu Desnoyers
Signed-off-by: Mathieu Desnoyers 
CC: Benjamin Herrenschmidt 
CC: Paul Mackerras 
CC: Michael Ellerman 
CC: Boqun Feng 
CC: Peter Zijlstra 
CC: "Paul E. McKenney" 
CC: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/include/asm/systbl.h  | 1 +
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 01b5171ea189..8f58710f5e8b 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -394,3 +394,4 @@ SYSCALL(pkey_free)
 SYSCALL(pkey_mprotect)
 SYSCALL(rseq)
 COMPAT_SYS(io_pgetevents)
+SYSCALL(cpu_opv)
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 985534d0b448..112e2c54750a 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -400,5 +400,6 @@
 #define __NR_pkey_mprotect 386
 #define __NR_rseq  387
 #define __NR_io_pgetevents 388
+#define __NR_cpu_opv   389
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
2.11.0



Re: [PATCH 13/36] dt-bindings: arm: Convert PMU binding to json-schema

2018-10-10 Thread Rob Herring
On Wed, Oct 10, 2018 at 11:50 AM Will Deacon  wrote:
>
> On Tue, Oct 09, 2018 at 01:14:02PM -0500, Rob Herring wrote:
> > On Tue, Oct 9, 2018 at 6:57 AM Will Deacon  wrote:
> > >
> > > Hi Rob,
> > >
> > > On Fri, Oct 05, 2018 at 11:58:25AM -0500, Rob Herring wrote:
> > > > Convert ARM PMU binding to DT schema format using json-schema.
> > > >
> > > > Cc: Will Deacon 
> > > > Cc: Mark Rutland 
> > > > Cc: linux-arm-ker...@lists.infradead.org
> > > > Cc: devicet...@vger.kernel.org
> > > > Signed-off-by: Rob Herring 
> > > > ---
> > > >  Documentation/devicetree/bindings/arm/pmu.txt | 70 --
> > > >  .../devicetree/bindings/arm/pmu.yaml  | 96 +++
> > > >  2 files changed, 96 insertions(+), 70 deletions(-)
> > > >  delete mode 100644 Documentation/devicetree/bindings/arm/pmu.txt
> > > >  create mode 100644 Documentation/devicetree/bindings/arm/pmu.yaml
> > >
> > > [...]
> > >
> > > > -- interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
> > > > -   interrupt (PPI) then 1 interrupt should be specified.
> > >
> > > [...]
> > >
> > > > +  interrupts:
> > > > +oneOf:
> > > > +  - maxItems: 1
> > > > +  - minItems: 2
> > > > +maxItems: 8
> > > > +description: 1 interrupt per core.
> > > > +
> > > > +  interrupts-extended:
> > > > +$ref: '#/properties/interrupts'
> > >
> > > This seems like a semantic difference between the two representations, or am
> > > I missing something here? Specifically, both the introduction of
> > > interrupts-extended and also dropping any mention of using a single per-cpu
> > > interrupt (the single combined case is no longer supported by Linux; not sure
> > > if you want to keep it in the binding).
> >
> > 'interrupts-extended' was implied before as it is always supported and
> > outside the scope of the binding. But now it is needed to validate
> > bindings. There must be some use of it and that's why I added it.
> > However, thinking some more about this, I think it may be better to
> > have the tools add this in automatically whenever we have an
> > interrupts property.
>
> To be honest, if you'd included that in the commit message I'd have been
> happy :)
>
> > I guess the single interrupt case is less obvious now with no
> > description (it's the first list item of 'oneOf'). If the
> > single interrupt is not supported, then we can drop it here.
>
> Well the description says "1 interrupt per core" which is incorrect.

You are reading the schema wrong. There are 2 cases supported as
defined by each '-'. The 2nd case is all the keywords until the
indentation decreases. So 'description' is just description of the 2nd
case. The first case is just "maxItems: 1". I probably didn't put a
description because why write in free form text what the schema says
(other than of course no one knows json-schema...).

YAML combines the best of Makefiles and Python. You can't have tabs
and indentation is significant. :)
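
To make the reading above concrete, here is a small sketch of what the 'oneOf' amounts to for the number of entries in 'interrupts'. It is written in C purely as an illustration; the function names are made up, and real validation is done by the json-schema tooling, not by code like this.

```c
#include <stdbool.h>
#include <stddef.h>

/* First '-' item of the oneOf: "maxItems: 1". */
static bool branch_single(size_t n)
{
	return n <= 1;
}

/* Second '-' item: "minItems: 2" and "maxItems: 8" (the description
 * "1 interrupt per core" belongs to this branch only). */
static bool branch_per_core(size_t n)
{
	return n >= 2 && n <= 8;
}

/* oneOf passes when exactly one branch accepts the value. */
static bool interrupts_count_valid(size_t n)
{
	return (branch_single(n) + branch_per_core(n)) == 1;
}
```

Note that "maxItems: 1" on its own also accepts an empty list; the sketch keeps that behaviour rather than guessing at what other constraints apply.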

> I also
> don't understand why maxItems is 8.

Hmm, I probably just made that up based on GICv2 limitations. What
should it be? If there's not any inherent maximum, can we put something
reasonable? There's not really any way to express that it should match
the number of cores in the system.

Rob


Re: [PATCH v04 3/4] migration/memory: Evaluate LMB assoc changes

2018-10-10 Thread Michael Bringmann
On 10/10/2018 12:24 PM, Nathan Fontenot wrote:
> On 10/09/2018 03:37 PM, Michael Bringmann wrote:

>
>> +static void pseries_update_ala_memory_aai(int aa_index)
>> +{
>> +struct drmem_lmb *lmb;
>> +
>> +/* Readd all LMBs which were previously using the
>> + * specified aa_index value.
>> + */
>> +for_each_drmem_lmb(lmb) {
>> +if ((lmb->aa_index == aa_index) &&
>> +(lmb->flags & DRCONF_MEM_ASSIGNED)) {
>> +drmem_mark_lmb_update(lmb);
>> +dlpar_memory_pmt_changes_set();
>> +}
>> +}
>> +}
>> +
>> +struct assoc_arrays {
>> +u32 n_arrays;
>> +u32 array_sz;
>> +const __be32 *arrays;
>> +};
> 
> This struct is also defined in arch/powerpc/mm/numa.c. It may be a good
> idea to move the definition to a common place.

Moving to topology.h in arch/powerpc/include/asm.

> 
>> +
>> +static int pseries_update_ala_memory(struct of_reconfig_data *pr)
>> +{
>> +struct assoc_arrays new_ala, old_ala;
>> +__be32 *p;
>> +int i, lim;
>> +
>> +if (rtas_hp_event)
>> +return 0;
>> +
>> +/*
>> + * The layout of the ibm,associativity-lookup-arrays
>> + * property is a number N indicating the number of
>> + * associativity arrays, followed by a number M
>> + * indicating the size of each associativity array,
>> + * followed by a list of N associativity arrays.
>> + */
>> +
>> +p = (__be32 *) pr->old_prop->value;
>> +if (!p)
>> +return -EINVAL;
>> +old_ala.n_arrays = of_read_number(p++, 1);
>> +old_ala.array_sz = of_read_number(p++, 1);
>> +old_ala.arrays = p;
>> +
>> +p = (__be32 *) pr->prop->value;
>> +if (!p)
>> +return -EINVAL;
>> +new_ala.n_arrays = of_read_number(p++, 1);
>> +new_ala.array_sz = of_read_number(p++, 1);
>> +new_ala.arrays = p;
>> +
>> +lim = (new_ala.n_arrays > old_ala.n_arrays) ? old_ala.n_arrays :
>> +new_ala.n_arrays;
>> +
>> +if (old_ala.array_sz == new_ala.array_sz) {
>> +
>> +/* Reset any entries where the old and new rows
>> + * the array have changed.
> 
> Small nit, the wording in that comment could be clearer.

Right.
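
For reference, the layout described in the quoted comment (N, then M, then N arrays of M cells) and the row-by-row comparison can be sketched in plain C as below. This is a userspace illustration under stated assumptions, not the patch code: read_be32(), struct ala_view, parse_ala() and changed_rows() are names invented here, and since the quoted hunk is cut off before the case where the array sizes differ, the sketch simply reports nothing in that case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of an ibm,associativity-lookup-arrays value. */
struct ala_view {
	uint32_t n_arrays;	/* N: number of associativity arrays */
	uint32_t array_sz;	/* M: cells per array */
	const uint8_t *rows;	/* start of the N * M big-endian cells */
};

/* Read one 32-bit big-endian cell. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Split the raw property value into N, M and the row data. */
static void parse_ala(const uint8_t *prop, struct ala_view *out)
{
	out->n_arrays = read_be32(prop);
	out->array_sz = read_be32(prop + 4);
	out->rows = prop + 8;
}

/*
 * Record the indexes of rows that differ between the old and new
 * property, up to 'max' entries. Only rows present in both are compared.
 * Returns the number of indexes written to 'changed'.
 */
static uint32_t changed_rows(const struct ala_view *old_ala,
			     const struct ala_view *new_ala,
			     uint32_t *changed, uint32_t max)
{
	uint32_t lim, i, n = 0;
	size_t row_bytes;

	if (old_ala->array_sz != new_ala->array_sz)
		return 0;	/* assumption: not handled in this sketch */

	lim = old_ala->n_arrays < new_ala->n_arrays ?
	      old_ala->n_arrays : new_ala->n_arrays;
	row_bytes = (size_t)old_ala->array_sz * 4;

	for (i = 0; i < lim && n < max; i++) {
		if (memcmp(old_ala->rows + i * row_bytes,
			   new_ala->rows + i * row_bytes, row_bytes))
			changed[n++] = i;
	}
	return n;
}
```

Each index returned this way corresponds to an aa_index whose assigned LMBs would need to be marked for re-add.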

> 
> -Nathan

Michael

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [PATCH v04 3/5] migration/memory: Add hotplug READD_MULTIPLE

2018-10-10 Thread Michael Bringmann
On 10/10/2018 11:59 AM, Nathan Fontenot wrote:
> On 10/09/2018 03:36 PM, Michael Bringmann wrote:
>> migration/memory: This patch adds a new pseries hotplug action
>> for CPU and memory operations, PSERIES_HP_ELOG_ACTION_READD_MULTIPLE.
>> This is a variant of the READD operation which performs the action
>> upon multiple instances of the resource at one time.  The operation
>> is to be triggered by device-tree analysis of updates by RTAS events
>> analyzed by 'migration_store' during post-migration processing.  It
>> will be used for memory updates, initially.
>>
>> Signed-off-by: Michael Bringmann 
>> ---
>> Changes in v04:
>>   -- Move init of 'lmb->internal_flags' in init_drmem_v2_lmbs to
>>  previous patch.
>>   -- Pull in implementation of dlpar_memory_readd_multiple() to go
>>  with operation flag.
>> ---
>>  arch/powerpc/include/asm/rtas.h |1 +
>>  arch/powerpc/platforms/pseries/hotplug-memory.c |   31 
>> +++
>>  2 files changed, 32 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
>> index 0183e95..cc00451 100644
>> --- a/arch/powerpc/include/asm/rtas.h
>> +++ b/arch/powerpc/include/asm/rtas.h
>> @@ -333,6 +333,7 @@ struct pseries_hp_errorlog {
>>  #define PSERIES_HP_ELOG_ACTION_ADD             1
>>  #define PSERIES_HP_ELOG_ACTION_REMOVE          2
>>  #define PSERIES_HP_ELOG_ACTION_READD           3
>> +#define PSERIES_HP_ELOG_ACTION_READD_MULTIPLE  4
>>
>>  #define PSERIES_HP_ELOG_ID_DRC_NAME            1
>>  #define PSERIES_HP_ELOG_ID_DRC_INDEX           2
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index 9a15d39..bf2420a 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -546,6 +546,30 @@ static int dlpar_memory_readd_by_index(u32 drc_index)
>>  return rc;
>>  }
>>
>> +static int dlpar_memory_readd_multiple(void)
>> +{
>> +struct drmem_lmb *lmb;
>> +int rc;
>> +
>> +pr_info("Attempting to update multiple LMBs\n");
>> +
>> +for_each_drmem_lmb(lmb) {
>> +if (drmem_lmb_update(lmb)) {
>> +rc = dlpar_remove_lmb(lmb);
>> +
>> +if (!rc) {
>> +rc = dlpar_add_lmb(lmb);
>> +if (rc)
>> +dlpar_release_drc(lmb->drc_index);
>> +}
> 
> The work you're doing here is essentially the same that is done in
> dlpar_memory_readd_by_index(). Perhaps pull the common bits of both
> routines into a helper routine. This could include the success/failure
> messages in dlpar_memory_readd_by_index().

Really, only the interior of the loop is common to the two functions.
Creating a helper that incorporated the loop would mean either several
helper functions customized to each path (and a lot more code), or a
common helper function that does everything for both paths and would be
harder to understand/maintain.

It would be a lot cleaner to put the common loop interior into a helper
function, and retain the other two functions with their unique loop +
test + extra operations.  I will update with this method.
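
Roughly what I have in mind, as a standalone sketch rather than the real patch. The struct and the *_stub functions stand in for struct drmem_lmb, dlpar_remove_lmb(), dlpar_add_lmb() and dlpar_release_drc(); the names readd_one_lmb() and readd_marked_lmbs() are placeholders. Whether the multiple-LMB path should stop at the first failure is not settled here; the sketch keeps going and returns the first error it saw.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct drmem_lmb, with fields to drive the stubs. */
struct lmb_stub {
	uint32_t drc_index;
	int update;	/* marked for re-add, like drmem_lmb_update() */
	int remove_rc;	/* result the remove stub should report */
	int add_rc;	/* result the add stub should report */
	int released;	/* set when the DRC is released after a failed add */
};

static int remove_lmb_stub(struct lmb_stub *lmb) { return lmb->remove_rc; }
static int add_lmb_stub(struct lmb_stub *lmb) { return lmb->add_rc; }
static void release_drc_stub(struct lmb_stub *lmb) { lmb->released = 1; }

/* Common loop interior: remove the LMB, then add it back. */
static int readd_one_lmb(struct lmb_stub *lmb)
{
	int rc = remove_lmb_stub(lmb);

	if (!rc) {
		rc = add_lmb_stub(lmb);
		if (rc)
			release_drc_stub(lmb);
	}
	return rc;
}

/* Multiple-LMB path: its own loop and test around the shared helper. */
static int readd_marked_lmbs(struct lmb_stub *lmbs, size_t n)
{
	int rc = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int one;

		if (!lmbs[i].update)
			continue;
		one = readd_one_lmb(&lmbs[i]);
		if (one && !rc)
			rc = one;	/* remember the first failure */
		lmbs[i].update = 0;	/* clear the mark either way */
	}
	return rc;
}
```

The by-index path would call readd_one_lmb() for the single matching LMB and keep its own success/failure messages.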

> 
> -Nathan

Michael

> 
>> +
>> +drmem_remove_lmb_update(lmb);
>> +}
>> +}
>> +
>> +return rc;
>> +}
>> +
>>  static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>>  {
>>  struct drmem_lmb *lmb, *start_lmb, *end_lmb;
>> @@ -646,6 +670,10 @@ static int dlpar_memory_readd_by_index(u32 drc_index)
>>  {
>>  return -EOPNOTSUPP;
>>  }
>> +static int dlpar_memory_readd_multiple(void)
>> +{
>> +return -EOPNOTSUPP;
>> +}
>>
>>  static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>>  {
>> @@ -923,6 +951,9 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
>>  drc_index = hp_elog->_drc_u.drc_index;
>>  rc = dlpar_memory_readd_by_index(drc_index);
>>  break;
>> +case PSERIES_HP_ELOG_ACTION_READD_MULTIPLE:
>> +rc = dlpar_memory_readd_multiple();
>> +break;
>>  default:
>>  pr_err("Invalid action (%d) specified\n", hp_elog->action);
>>  rc = -EINVAL;
>>
> 
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [PATCH v04 1/5] powerpc/drmem: Export 'dynamic-memory' loader

2018-10-10 Thread Michael Bringmann
On 10/10/2018 11:54 AM, Nathan Fontenot wrote:
> On 10/09/2018 03:36 PM, Michael Bringmann wrote:
>> powerpc/drmem: Export many of the functions of DRMEM to parse
>> "ibm,dynamic-memory" and "ibm,dynamic-memory-v2" during hotplug
>> operations and for Post Migration events.
>>
>> Also modify the DRMEM initialization code to allow it to,
>>
>> * Be called after system initialization
>> * Provide a separate user copy of the LMB array that it produces
>> * Free the user copy upon request
>>
>> In addition, a couple of changes were made to make the creation
>> of additional copies of the LMB array more useful including,
>>
>> * Add new iterator to work through a pair of drmem_info arrays.
>> * Modify DRMEM code to replace usages of dt_root_addr_cells, and
>>   dt_mem_next_cell, as these are only available at first boot.
>>
>> Signed-off-by: Michael Bringmann 
>> ---
>>  arch/powerpc/include/asm/drmem.h |   15 
>>  arch/powerpc/mm/drmem.c  |   75 
>> --
>>  2 files changed, 70 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
>> index 7c1d8e7..1fbb684 100644
>> --- a/arch/powerpc/include/asm/drmem.h
>> +++ b/arch/powerpc/include/asm/drmem.h
>> @@ -35,6 +35,18 @@ struct drmem_lmb_info {
>>  &drmem_info->lmbs[0],   \
>>  &drmem_info->lmbs[drmem_info->n_lmbs - 1])
>>
>> +#define for_each_dinfo_lmb(dinfo, lmb)  \
>> +for_each_drmem_lmb_in_range((lmb),  \
>> +&dinfo->lmbs[0],\
>> +&dinfo->lmbs[dinfo->n_lmbs - 1])
>> +
>> +#define for_each_pair_dinfo_lmb(dinfo1, lmb1, dinfo2, lmb2) \
>> +for ((lmb1) = (&dinfo1->lmbs[0]),   \
>> + (lmb2) = (&dinfo2->lmbs[0]);   \
>> + ((lmb1) <= (&dinfo1->lmbs[dinfo1->n_lmbs - 1])) && \
>> + ((lmb2) <= (&dinfo2->lmbs[dinfo2->n_lmbs - 1]));   \
>> + (lmb1)++, (lmb2)++)
>> +
> 
> The macros for traversing seem to be getting a bit unwieldy with these
> updates. I wonder if we should move to just using a walk routine
> for all traversing of the drmem lmbs.

We can do that.  One new routine + one API - several macros + 2 files changed.
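
As a rough idea of the shape, here is a standalone sketch of a walk routine that replaces the paired-iterator macro with a callback. Nothing here is the final API: the struct names, walk_dinfo_lmb_pairs() and the example callback are placeholders, and the real code would operate on struct drmem_lmb_info and struct drmem_lmb.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for struct drmem_lmb and struct drmem_lmb_info. */
struct lmb_stub {
	uint32_t drc_index;
	uint32_t aa_index;
};

struct dinfo_stub {
	size_t n_lmbs;
	struct lmb_stub *lmbs;
};

typedef int (*lmb_pair_fn)(struct lmb_stub *a, struct lmb_stub *b, void *data);

/*
 * Walk two LMB arrays in step and call fn on each pair. Stops at the
 * shorter array, or as soon as fn returns non-zero.
 */
static int walk_dinfo_lmb_pairs(struct dinfo_stub *d1, struct dinfo_stub *d2,
				lmb_pair_fn fn, void *data)
{
	size_t n = d1->n_lmbs < d2->n_lmbs ? d1->n_lmbs : d2->n_lmbs;
	size_t i;
	int rc;

	for (i = 0; i < n; i++) {
		rc = fn(&d1->lmbs[i], &d2->lmbs[i], data);
		if (rc)
			return rc;
	}
	return 0;
}

/* Example callback: count pairs whose aa_index changed. */
static int count_aa_changes(struct lmb_stub *a, struct lmb_stub *b, void *data)
{
	if (a->aa_index != b->aa_index)
		(*(unsigned int *)data)++;
	return 0;
}
```

A single-array walk would look the same with one dinfo argument, so both new iterator macros could go away.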

> 
>>  /*
>>   * The of_drconf_cell_v1 struct defines the layout of the LMB data
>>   * specified in the ibm,dynamic-memory device tree property.
>> @@ -94,6 +106,9 @@ void __init walk_drmem_lmbs(struct device_node *dn,
>>  void (*func)(struct drmem_lmb *, const __be32 **));
>>  int drmem_update_dt(void);
>>
>> +struct drmem_lmb_info *drmem_lmbs_init(struct property *prop);
>> +void drmem_lmbs_free(struct drmem_lmb_info *dinfo);
>> +
>>  #ifdef CONFIG_PPC_PSERIES
>>  void __init walk_drmem_lmbs_early(unsigned long node,
>>  void (*func)(struct drmem_lmb *, const __be32 **));
>> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
>> index 3f18036..13d2abb 100644
>> --- a/arch/powerpc/mm/drmem.c
>> +++ b/arch/powerpc/mm/drmem.c
>> @@ -20,6 +20,7 @@
>>
>>  static struct drmem_lmb_info __drmem_info;
>>  struct drmem_lmb_info *drmem_info = &__drmem_info;
>> +static int n_root_addr_cells;
>>
>>  u64 drmem_lmb_memory_max(void)
>>  {
>> @@ -193,12 +194,13 @@ int drmem_update_dt(void)
>>  return rc;
>>  }
>>
>> -static void __init read_drconf_v1_cell(struct drmem_lmb *lmb,
>> +static void read_drconf_v1_cell(struct drmem_lmb *lmb,
>> const __be32 **prop)
>>  {
>>  const __be32 *p = *prop;
>>
>> -lmb->base_addr = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +lmb->base_addr = of_read_number(p, n_root_addr_cells);
>> +p += n_root_addr_cells;
> 
> Any reason this can't just be
>   lmb->base_addr= dt_mem_next_cell(n_root_addr_cells, &p);

Probably not.  I will rebuild/retest with this.
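
For what it's worth, the two forms should read the same value and leave the pointer in the same place. The sketch below is a userspace model of the two helpers, not the kernel code: read_number() and next_cell() are stand-ins for of_read_number() and dt_mem_next_cell(), working on raw bytes instead of __be32 cells.

```c
#include <stdint.h>

/* Read one 32-bit big-endian cell from raw bytes. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Model of of_read_number(): fold 'size' cells into one 64-bit value. */
static uint64_t read_number(const uint8_t *cell, int size)
{
	uint64_t r = 0;

	for (; size--; cell += 4)
		r = (r << 32) | be32_at(cell);
	return r;
}

/* Model of dt_mem_next_cell(): read 's' cells and advance the cursor. */
static uint64_t next_cell(int s, const uint8_t **cellp)
{
	const uint8_t *p = *cellp;

	*cellp = p + 4 * s;
	return read_number(p, s);
}
```

One thing to check before switching: the patch removes __init from these readers, so any helper they call must also be usable after boot.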

> 
>>  lmb->drc_index = of_read_number(p++, 1);
>>
>>  p++; /* skip reserved field */
>> @@ -209,7 +211,7 @@ static void __init read_drconf_v1_cell(struct drmem_lmb *lmb,
>>  *prop = p;
>>  }
>>
>> -static void __init __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
>> +static void __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
>>  void (*func)(struct drmem_lmb *, const __be32 **))
>>  {
>>  struct drmem_lmb lmb;
>> @@ -225,13 +227,14 @@ static void __init __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
>>  }
>>  }
>>
>> -static void __init read_drconf_v2_cell(struct of_drconf_cell_v2 *dr_cell,
>> +static void read_drconf_v2_cell(struct of_drconf_cell_v2 *dr_cell,
>> const __be32 **prop)
>>  {
>>  const __be32 *p = *prop;
>>
>>  dr_cell->seq_lmbs = of_read_number(p++, 1);
>> -dr_cell->base_addr = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +dr_cell->base_addr = of_read_number(p, n_root_addr_ce

Re: [PATCH v04 3/4] migration/memory: Evaluate LMB assoc changes

2018-10-10 Thread Nathan Fontenot
On 10/09/2018 03:37 PM, Michael Bringmann wrote:
> migration/memory: This patch adds code that recognizes changes to
> the associativity of memory blocks described by the device-tree
> properties in order to drive equivalent 'hotplug' operations to
> update local and general kernel data structures to reflect those
> changes.  These differences may include:
> 
> * Evaluate 'ibm,dynamic-memory' properties when processing the
>   updated device-tree properties of the system during Post Migration
>   events (migration_store).  The new functionality looks for changes
>   to the aa_index values for each drc_index/LMB to identify any memory
>   blocks that should be readded.
> 
> * In an LPAR migration scenario, the "ibm,associativity-lookup-arrays"
>   property may change.  In the event that a row of the array differs,
>   locate all assigned memory blocks with that 'aa_index' and 're-add'
>   them to the system memory block data structures.  In the process of
>   the 're-add', the system routines will update the corresponding entry
>   for the memory in the LMB structures and any other relevant kernel
>   data structures.
> 
> A number of previous extensions made to the DRMEM code for scanning
> device-tree properties and creating LMB arrays are used here to
> ensure that the resulting code is simpler and more usable:
> 
> * Use new paired list iterator for the DRMEM LMB info arrays to find
>   differences in old and new versions of properties.
> * Use new iterator for copies of the DRMEM info arrays to evaluate
>   completely new structures.
> * Combine common code for parsing and evaluating memory description
>   properties based on the DRMEM LMB array model to greatly simplify
>   extension from the older property 'ibm,dynamic-memory' to the new
>   property model of 'ibm,dynamic-memory-v2'.
> 
> For support, add a new pseries hotplug action for DLPAR operations,
> PSERIES_HP_ELOG_ACTION_READD_MULTIPLE.  It is a variant of the READD
> operation which performs the action upon multiple instances of the
> resource at one time.  The operation is to be triggered by device-tree
> analysis of updates by RTAS events analyzed by 'migration_store' during
> post-migration processing.  It will be used for memory updates,
> initially.
> 
> Signed-off-by: Michael Bringmann 
> ---
> Changes in v04:
>   -- Move dlpar_memory_readd_multiple() function definition and use
>  into previous patch along with action constant definition.
>   -- Correct spacing in patch
> Changes in v03:
>   -- Modify the code that parses the memory affinity attributes to
>  mark relevant DRMEM LMB array entries using the internal_flags
>  mechanism instead of generating unique hotplug actions for each
>  memory block to be readded.  The change is intended to both
>  simplify the code, and to require fewer resources on systems
>  with huge amounts of memory.
>   -- Save up notice about any all LMB entries until the end of the
>  'migration_store' operation at which point a single action is
>  queued to scan the entire DRMEM array.
>   -- Add READD_MULTIPLE function for memory that scans the DRMEM
>  array to identify multiple entries that were marked previously.
>  The corresponding memory blocks are to be readded to the system
>  to update relevant data structures outside of the powerpc-
>  specific code.
>   -- Change dlpar_memory_pmt_changes_action to directly queue worker
>  to pseries work queue.
> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c |  189 
> +++
>  arch/powerpc/platforms/pseries/mobility.c   |4 
>  arch/powerpc/platforms/pseries/pseries.h|4 
>  3 files changed, 163 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index bf2420a..a7ca22e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -534,8 +534,11 @@ static int dlpar_memory_readd_by_index(u32 drc_index)
>   }
>   }
> 
> - if (!lmb_found)
> - rc = -EINVAL;
> + if (!lmb_found) {
> + pr_info("Failed to update memory for drc index %lx\n",
> + (unsigned long) drc_index);
> + return -EINVAL;
> + }
> 
>   if (rc)
>   pr_info("Failed to update memory at %llx\n",
> @@ -1002,13 +1005,43 @@ static int pseries_add_mem_node(struct device_node 
> *np)
>   return (ret < 0) ? -EINVAL : 0;
>  }
> 
> -static int pseries_update_drconf_memory(struct of_reconfig_data *pr)
> +static int pmt_changes = 0;
> +
> +void dlpar_memory_pmt_changes_set(void)
> +{
> + pmt_changes = 1;
> +}
> +
> +void dlpar_memory_pmt_changes_clear(void)
> +{
> + pmt_changes = 0;
> +}
> +
> +int dlpar_memory_pmt_changes(void)
> +{
> + return pmt_changes;
> +}
> +
> +void dlpar_memory_pmt_changes_action(void)
> +{
> + if (dlpar_memory_pmt_chan

Re: [PATCH 4/4] powerpc: Add -Wimplicit-fallthrough to arch CFLAGS

2018-10-10 Thread kbuild test robot
Hi Michael,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc7 next-20181010]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Michael-Ellerman/powerpc-Move-core-kernel-logic-into-arch-powerpc-Kbuild/20181010-205834
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/kernel/signal_32.c:32:0:
   include/linux/compat.h: In function 'put_compat_sigset':
>> include/linux/compat.h:494:51: error: this statement may fall through 
>> [-Werror=implicit-fallthrough=]
 case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
 ~^
   include/linux/compat.h:495:2: note: here
 case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
 ^~~~
   include/linux/compat.h:495:51: error: this statement may fall through 
[-Werror=implicit-fallthrough=]
 case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
 ~^
   include/linux/compat.h:496:2: note: here
 case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
 ^~~~
   include/linux/compat.h:496:51: error: this statement may fall through 
[-Werror=implicit-fallthrough=]
 case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
 ~^
   include/linux/compat.h:497:2: note: here
 case 1: v.sig[1] = (set->sig[0] >> 32); v.sig[0] = set->sig[0];
 ^~~~
   cc1: all warnings being treated as errors
--
   arch/powerpc/kernel/nvram_64.c: In function 'dev_nvram_ioctl':
>> arch/powerpc/kernel/nvram_64.c:811:3: error: this statement may fall through 
>> [-Werror=implicit-fallthrough=]
  printk(KERN_WARNING "nvram: Using obsolete PMAC_NVRAM_GET_OFFSET 
ioctl\n");
  ^~
   arch/powerpc/kernel/nvram_64.c:812:2: note: here
 case IOC_NVRAM_GET_OFFSET: {
 ^~~~
   cc1: all warnings being treated as errors
--
   In file included from include/linux/kvm_host.h:14:0,
from arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:21:
   include/linux/signal.h: In function 'sigemptyset':
>> include/linux/signal.h:180:22: error: this statement may fall through 
>> [-Werror=implicit-fallthrough=]
 case 2: set->sig[1] = 0;
 ^~~
   include/linux/signal.h:181:2: note: here
 case 1: set->sig[0] = 0;
 ^~~~
   cc1: all warnings being treated as errors
--
   arch/powerpc/platforms/powermac/feature.c: In function 'g5_i2s_enable':
>> arch/powerpc/platforms/powermac/feature.c:1477:6: error: this statement may 
>> fall through [-Werror=implicit-fallthrough=]
  if (macio->type == macio_shasta)
 ^
   arch/powerpc/platforms/powermac/feature.c:1479:2: note: here
 default:
 ^~~
   cc1: all warnings being treated as errors
--
   arch/powerpc/xmon/xmon.c: In function 'do_spu_cmd':
>> arch/powerpc/xmon/xmon.c:4023:24: error: this statement may fall through 
>> [-Werror=implicit-fallthrough=]
  if (isxdigit(subcmd) || subcmd == '\n')
   arch/powerpc/xmon/xmon.c:4025:2: note: here
 case 'f':
 ^~~~
   cc1: all warnings being treated as errors
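
For readers unfamiliar with the diagnostic, the errors above all have the same shape: one case body runs into the next without an annotation. A minimal userspace sketch of the usual remedy follows, modelled loosely on the sigemptyset() hit. The struct and function names are hypothetical stand-ins, not kernel code. GCC 7 and later accept a comment such as "fall through" at the default warning level; __attribute__((fallthrough)) is the alternative.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's sigset_t; not the real layout. */
struct demo_sigset {
	unsigned long sig[4];
};

/*
 * Clear the first nwords words of the set. Cases 2 and 1 deliberately
 * share code, so the fall-through is annotated for the compiler.
 */
static void demo_sigemptyset(struct demo_sigset *set, int nwords)
{
	assert(set != NULL);

	switch (nwords) {
	default:
		memset(set, 0, sizeof(*set));
		break;
	case 2:
		set->sig[1] = 0;
		/* fall through */
	case 1:
		set->sig[0] = 0;
		break;
	}
}
```

Whether each reported site is an intentional fall-through or a missing break has to be decided per site; the sketch only shows the intentional case.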

vim +494 include/linux/compat.h

fde9fc76 Matt Redfearn 2018-02-19  481  
fde9fc76 Matt Redfearn 2018-02-19  482  /*
fde9fc76 Matt Redfearn 2018-02-19  483   * Defined inline such that size can be 
compile time constant, which avoids
fde9fc76 Matt Redfearn 2018-02-19  484   * CONFIG_HARDENED_USERCOPY complaining 
about copies from task_struct
fde9fc76 Matt Redfearn 2018-02-19  485   */
fde9fc76 Matt Redfearn 2018-02-19  486  static inline int
fde9fc76 Matt Redfearn 2018-02-19  487  put_compat_sigset(compat_sigset_t 
__user *compat, const sigset_t *set,
fde9fc76 Matt Redfearn 2018-02-19  488unsigned int size)
fde9fc76 Matt Redfearn 2018-02-19  489  {
fde9fc76 Matt Redfearn 2018-02-19  490  /* size <= 
sizeof(compat_sigset_t) <= sizeof(sigset_t) */
fde9fc76 Matt Redfearn 2018-02-19  491  #ifdef __BIG_ENDIAN
fde9fc76 Matt Redfearn 2018-02-19  492  compat_sigset_t v;
fde9f

Re: [PATCH v04 3/5] migration/memory: Add hotplug READD_MULTIPLE

2018-10-10 Thread Nathan Fontenot
On 10/09/2018 03:36 PM, Michael Bringmann wrote:
> migration/memory: This patch adds a new pseries hotplug action
> for CPU and memory operations, PSERIES_HP_ELOG_ACTION_READD_MULTIPLE.
> This is a variant of the READD operation which performs the action
> upon multiple instances of the resource at one time.  The operation
> is to be triggered by device-tree analysis of updates by RTAS events
> analyzed by 'migration_store' during post-migration processing.  It
> will be used for memory updates, initially.
> 
> Signed-off-by: Michael Bringmann 
> ---
> Changes in v04:
>   -- Move init of 'lmb->internal_flags' in init_drmem_v2_lmbs to
>  previous patch.
>   -- Pull in implementation of dlpar_memory_readd_multiple() to go
>  with operation flag.
> ---
>  arch/powerpc/include/asm/rtas.h |1 +
>  arch/powerpc/platforms/pseries/hotplug-memory.c |   31 
> +++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index 0183e95..cc00451 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -333,6 +333,7 @@ struct pseries_hp_errorlog {
>  #define PSERIES_HP_ELOG_ACTION_ADD   1
>  #define PSERIES_HP_ELOG_ACTION_REMOVE2
>  #define PSERIES_HP_ELOG_ACTION_READD 3
> +#define PSERIES_HP_ELOG_ACTION_READD_MULTIPLE4
> 
>  #define PSERIES_HP_ELOG_ID_DRC_NAME  1
>  #define PSERIES_HP_ELOG_ID_DRC_INDEX 2
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 9a15d39..bf2420a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -546,6 +546,30 @@ static int dlpar_memory_readd_by_index(u32 drc_index)
>   return rc;
>  }
> 
> +static int dlpar_memory_readd_multiple(void)
> +{
> + struct drmem_lmb *lmb;
> + int rc;
> +
> + pr_info("Attempting to update multiple LMBs\n");
> +
> + for_each_drmem_lmb(lmb) {
> + if (drmem_lmb_update(lmb)) {
> + rc = dlpar_remove_lmb(lmb);
> +
> + if (!rc) {
> + rc = dlpar_add_lmb(lmb);
> + if (rc)
> + dlpar_release_drc(lmb->drc_index);
> + }

The work you're doing here is essentially the same as what is done in
dlpar_memory_readd_by_index(). Perhaps pull the common bits of both
routines into a helper routine. This could include the success/failure
messages in dlpar_memory_readd_by_index().

-Nathan
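
A rough userspace sketch of what such a helper could look like is below. It is only an illustration under stated assumptions: struct demo_lmb and every demo_* function are hypothetical stubs standing in for struct drmem_lmb, dlpar_remove_lmb(), dlpar_add_lmb() and dlpar_release_drc(), and none of this is the code that was eventually merged. The sketch also initialises rc, since the quoted function returns it unassigned when no LMB is marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for struct drmem_lmb. */
struct demo_lmb {
	unsigned int drc_index;
	bool needs_update;	/* plays the role of drmem_lmb_update() */
	bool fail_add;		/* test knob: make the add step fail */
	bool present;
	bool drc_released;
};

/* Stubs standing in for the real remove/add/release primitives. */
static int demo_remove_lmb(struct demo_lmb *lmb)
{
	lmb->present = false;
	return 0;
}

static int demo_add_lmb(struct demo_lmb *lmb)
{
	if (lmb->fail_add)
		return -1;
	lmb->present = true;
	return 0;
}

static void demo_release_drc(struct demo_lmb *lmb)
{
	lmb->drc_released = true;
}

/* Shared remove-then-add sequence that both readd paths could call. */
static int demo_readd_lmb(struct demo_lmb *lmb)
{
	int rc;

	assert(lmb != NULL);

	rc = demo_remove_lmb(lmb);
	if (rc)
		return rc;

	rc = demo_add_lmb(lmb);
	if (rc)
		demo_release_drc(lmb);

	return rc;
}

/* Multi-LMB variant: re-add every marked entry, keep the last result. */
static int demo_readd_multiple(struct demo_lmb *lmbs, size_t n)
{
	int rc = 0;	/* initialised so a pass with no marked LMBs succeeds */
	size_t i;

	for (i = 0; i < n; i++) {
		if (!lmbs[i].needs_update)
			continue;
		rc = demo_readd_lmb(&lmbs[i]);
		lmbs[i].needs_update = false;
	}

	return rc;
}
```

As in the quoted loop, a later success overwrites an earlier failure in rc; whether that is the desired reporting is a separate question.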

> +
> + drmem_remove_lmb_update(lmb);
> + }
> + }
> +
> + return rc;
> +}
> +
>  static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>  {
>   struct drmem_lmb *lmb, *start_lmb, *end_lmb;
> @@ -646,6 +670,10 @@ static int dlpar_memory_readd_by_index(u32 drc_index)
>  {
>   return -EOPNOTSUPP;
>  }
> +static int dlpar_memory_readd_multiple(void)
> +{
> + return -EOPNOTSUPP;
> +}
> 
>  static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>  {
> @@ -923,6 +951,9 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
>   drc_index = hp_elog->_drc_u.drc_index;
>   rc = dlpar_memory_readd_by_index(drc_index);
>   break;
> + case PSERIES_HP_ELOG_ACTION_READD_MULTIPLE:
> + rc = dlpar_memory_readd_multiple();
> + break;
>   default:
>   pr_err("Invalid action (%d) specified\n", hp_elog->action);
>   rc = -EINVAL;
> 



Re: [PATCH 4/4] powerpc: Add -Wimplicit-fallthrough to arch CFLAGS

2018-10-10 Thread kbuild test robot
Hi Michael,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc7 next-20181010]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Michael-Ellerman/powerpc-Move-core-kernel-logic-into-arch-powerpc-Kbuild/20181010-205834
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-socrates_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/align.c: In function 'emulate_spe':
>> arch/powerpc/kernel/align.c:183:8: error: this statement may fall through 
>> [-Werror=implicit-fallthrough=]
   ret |= __get_user_inatomic(temp.v[3], p++);
   ^~
   arch/powerpc/kernel/align.c:184:3: note: here
  case 4:
  ^~~~
   arch/powerpc/kernel/align.c:186:8: error: this statement may fall through 
[-Werror=implicit-fallthrough=]
   ret |= __get_user_inatomic(temp.v[5], p++);
   ^~
   arch/powerpc/kernel/align.c:187:3: note: here
  case 2:
  ^~~~
   arch/powerpc/kernel/align.c:266:8: error: this statement may fall through 
[-Werror=implicit-fallthrough=]
   ret |= __put_user_inatomic(data.v[3], p++);
   ^~
   arch/powerpc/kernel/align.c:267:3: note: here
  case 4:
  ^~~~
   arch/powerpc/kernel/align.c:269:8: error: this statement may fall through 
[-Werror=implicit-fallthrough=]
   ret |= __put_user_inatomic(data.v[5], p++);
   ^~
   arch/powerpc/kernel/align.c:270:3: note: here
  case 2:
  ^~~~
   cc1: all warnings being treated as errors

vim +183 arch/powerpc/kernel/align.c

26caeb2e Kumar Gala 2007-08-24  104  
26caeb2e Kumar Gala 2007-08-24  105  /*
26caeb2e Kumar Gala 2007-08-24  106   * Emulate SPE loads and 
stores.
26caeb2e Kumar Gala 2007-08-24  107   * Only Book-E has these 
instructions, and it does true little-endian,
26caeb2e Kumar Gala 2007-08-24  108   * so we don't need the 
address swizzling.
26caeb2e Kumar Gala 2007-08-24  109   */
26caeb2e Kumar Gala 2007-08-24  110  static int emulate_spe(struct 
pt_regs *regs, unsigned int reg,
26caeb2e Kumar Gala 2007-08-24  111unsigned 
int instr)
26caeb2e Kumar Gala 2007-08-24  112  {
f626190d Anton Blanchard2013-09-23  113 int ret;
26caeb2e Kumar Gala 2007-08-24  114 union {
26caeb2e Kumar Gala 2007-08-24  115 u64 ll;
26caeb2e Kumar Gala 2007-08-24  116 u32 w[2];
26caeb2e Kumar Gala 2007-08-24  117 u16 h[4];
26caeb2e Kumar Gala 2007-08-24  118 u8 v[8];
26caeb2e Kumar Gala 2007-08-24  119 } data, temp;
26caeb2e Kumar Gala 2007-08-24  120 unsigned char __user 
*p, *addr;
26caeb2e Kumar Gala 2007-08-24  121 unsigned long *evr = 
&current->thread.evr[reg];
26caeb2e Kumar Gala 2007-08-24  122 unsigned int nb, flags;
26caeb2e Kumar Gala 2007-08-24  123  
26caeb2e Kumar Gala 2007-08-24  124 instr = (instr >> 1) & 
0x1f;
26caeb2e Kumar Gala 2007-08-24  125  
26caeb2e Kumar Gala 2007-08-24  126 /* DAR has the operand 
effective address */
26caeb2e Kumar Gala 2007-08-24  127 addr = (unsigned char 
__user *)regs->dar;
26caeb2e Kumar Gala 2007-08-24  128  
26caeb2e Kumar Gala 2007-08-24  129 nb = 
spe_aligninfo[instr].len;
26caeb2e Kumar Gala 2007-08-24  130 flags = 
spe_aligninfo[instr].flags;
26caeb2e Kumar Gala 2007-08-24  131  
26caeb2e Kumar Gala 2007-08-24  132 /* Verify the address 
of the operand */
26caeb2e Kumar Gala 2007-08-24  133 if 
(unlikely(user_mode(regs) &&
26caeb2e Kumar Gala 2007-08-24  134  
!access_ok((flags & ST ? VERIFY_WRITE : VERIFY_READ),
26caeb2e Kumar Gala 2007-08-24  135 
addr, nb)))
26caeb2e Kumar Gala 2007-08-24  136 return -EFAULT;
26caeb2e Kumar Gala 2007-08-24  137  
26caeb2e Kumar Gala 2007-08-24  138 /* userland only */
26caeb2e Kumar Gala 2007-08-24  139 if 
(unlikely(!user_mode(regs)))
26caeb2e Kumar Gala 2007-08-24  140 return 0;
26c

Re: [PATCH v04 1/5] powerpc/drmem: Export 'dynamic-memory' loader

2018-10-10 Thread Nathan Fontenot
On 10/09/2018 03:36 PM, Michael Bringmann wrote:
> powerpc/drmem: Export many of the functions of DRMEM to parse
> "ibm,dynamic-memory" and "ibm,dynamic-memory-v2" during hotplug
> operations and for Post Migration events.
> 
> Also modify the DRMEM initialization code to allow it to,
> 
> * Be called after system initialization
> * Provide a separate user copy of the LMB array that it produces
> * Free the user copy upon request
> 
> In addition, a couple of changes were made to make the creation
> of additional copies of the LMB array more useful including,
> 
> * Add new iterator to work through a pair of drmem_info arrays.
> * Modify DRMEM code to replace usages of dt_root_addr_cells, and
>   dt_mem_next_cell, as these are only available at first boot.
> 
> Signed-off-by: Michael Bringmann 
> ---
>  arch/powerpc/include/asm/drmem.h |   15 
>  arch/powerpc/mm/drmem.c  |   75 
> --
>  2 files changed, 70 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/drmem.h 
> b/arch/powerpc/include/asm/drmem.h
> index 7c1d8e7..1fbb684 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -35,6 +35,18 @@ struct drmem_lmb_info {
>   &drmem_info->lmbs[0],   \
>   &drmem_info->lmbs[drmem_info->n_lmbs - 1])
> 
> +#define for_each_dinfo_lmb(dinfo, lmb)   \
> + for_each_drmem_lmb_in_range((lmb),  \
> + &dinfo->lmbs[0],\
> + &dinfo->lmbs[dinfo->n_lmbs - 1])
> +
> +#define for_each_pair_dinfo_lmb(dinfo1, lmb1, dinfo2, lmb2)  \
> + for ((lmb1) = (&dinfo1->lmbs[0]),   \
> +  (lmb2) = (&dinfo2->lmbs[0]);   \
> +  ((lmb1) <= (&dinfo1->lmbs[dinfo1->n_lmbs - 1])) && \
> +  ((lmb2) <= (&dinfo2->lmbs[dinfo2->n_lmbs - 1]));   \
> +  (lmb1)++, (lmb2)++)
> +

The macros for traversing seem to be getting a bit unwieldy with these
updates. I wonder if we should move to just using a walk routine
for all traversing of the drmem lmbs.
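
To make the walk-routine idea concrete, here is a small userspace sketch of a paired walk with a callback. All names (demo_lmb, demo_lmb_info, demo_walk_lmb_pairs) are hypothetical and the structs are reduced stand-ins for the drmem types. Like the quoted for_each_pair_dinfo_lmb() macro, it stops as soon as either array is exhausted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced stand-ins for struct drmem_lmb / drmem_lmb_info. */
struct demo_lmb {
	unsigned long long base_addr;
	unsigned int aa_index;
};

struct demo_lmb_info {
	struct demo_lmb *lmbs;
	int n_lmbs;
};

typedef void (*demo_pair_fn)(struct demo_lmb *old_lmb,
			     struct demo_lmb *new_lmb, void *data);

/*
 * Visit the entries of both arrays in lock step, stopping at the end of
 * the shorter one. Returns the number of pairs visited.
 */
static int demo_walk_lmb_pairs(struct demo_lmb_info *a,
			       struct demo_lmb_info *b,
			       demo_pair_fn fn, void *data)
{
	int i, n;

	assert(a != NULL && b != NULL && fn != NULL);

	n = a->n_lmbs < b->n_lmbs ? a->n_lmbs : b->n_lmbs;
	for (i = 0; i < n; i++)
		fn(&a->lmbs[i], &b->lmbs[i], data);

	return n;
}

/* Example callback: count pairs whose associativity index differs. */
static void demo_count_aa_changes(struct demo_lmb *old_lmb,
				  struct demo_lmb *new_lmb, void *data)
{
	if (old_lmb->aa_index != new_lmb->aa_index)
		(*(int *)data)++;
}
```

A function of this kind evaluates its arguments once and copes with an empty array, which the pointer arithmetic in the macro form does not obviously do.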

>  /*
>   * The of_drconf_cell_v1 struct defines the layout of the LMB data
>   * specified in the ibm,dynamic-memory device tree property.
> @@ -94,6 +106,9 @@ void __init walk_drmem_lmbs(struct device_node *dn,
>   void (*func)(struct drmem_lmb *, const __be32 **));
>  int drmem_update_dt(void);
> 
> +struct drmem_lmb_info *drmem_lmbs_init(struct property *prop);
> +void drmem_lmbs_free(struct drmem_lmb_info *dinfo);
> +
>  #ifdef CONFIG_PPC_PSERIES
>  void __init walk_drmem_lmbs_early(unsigned long node,
>   void (*func)(struct drmem_lmb *, const __be32 **));
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 3f18036..13d2abb 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -20,6 +20,7 @@
> 
>  static struct drmem_lmb_info __drmem_info;
>  struct drmem_lmb_info *drmem_info = &__drmem_info;
> +static int n_root_addr_cells;
> 
>  u64 drmem_lmb_memory_max(void)
>  {
> @@ -193,12 +194,13 @@ int drmem_update_dt(void)
>   return rc;
>  }
> 
> -static void __init read_drconf_v1_cell(struct drmem_lmb *lmb,
> +static void read_drconf_v1_cell(struct drmem_lmb *lmb,
>  const __be32 **prop)
>  {
>   const __be32 *p = *prop;
> 
> - lmb->base_addr = dt_mem_next_cell(dt_root_addr_cells, &p);
> + lmb->base_addr = of_read_number(p, n_root_addr_cells);
> + p += n_root_addr_cells;

Any reason this can't just be the following?
lmb->base_addr = dt_mem_next_cell(n_root_addr_cells, &p);
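
The two spellings are functionally the same, as the userspace sketch below tries to show. The demo_* functions are imitations written for this note, not the kernel's of_read_number() or dt_mem_next_cell(). One possible reason for the open-coded form, not stated in the thread, is that dt_mem_next_cell() appears to be an __init function in kernels of this period, while the patch removes __init from its callers.

```c
#include <assert.h>
#include <stdint.h>

/* Store a value as one big-endian 32-bit cell, whatever the host order. */
static uint32_t demo_cpu_to_be32(uint32_t v)
{
	uint32_t out;
	unsigned char *b = (unsigned char *)&out;

	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)v;
	return out;
}

/* Read one big-endian 32-bit cell, whatever the host order. */
static uint32_t demo_be32_to_cpu(const uint32_t *cell)
{
	const unsigned char *b = (const unsigned char *)cell;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Imitation of of_read_number(): fold `size` cells into one value. */
static uint64_t demo_read_number(const uint32_t *cell, int size)
{
	uint64_t r = 0;

	for (; size--; cell++)
		r = (r << 32) | demo_be32_to_cpu(cell);
	return r;
}

/* Imitation of dt_mem_next_cell(): read, then advance the cursor. */
static uint64_t demo_next_cell(int size, const uint32_t **cellp)
{
	const uint32_t *p = *cellp;

	assert(size > 0);
	*cellp = p + size;
	return demo_read_number(p, size);
}
```

Reading with demo_read_number() and then adding the cell count to the pointer leaves both the value and the cursor in the same state as a single demo_next_cell() call.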

>   lmb->drc_index = of_read_number(p++, 1);
> 
>   p++; /* skip reserved field */
> @@ -209,7 +211,7 @@ static void __init read_drconf_v1_cell(struct drmem_lmb 
> *lmb,
>   *prop = p;
>  }
> 
> -static void __init __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 
> *usm,
> +static void __walk_drmem_v1_lmbs(const __be32 *prop, const __be32 *usm,
>   void (*func)(struct drmem_lmb *, const __be32 **))
>  {
>   struct drmem_lmb lmb;
> @@ -225,13 +227,14 @@ static void __init __walk_drmem_v1_lmbs(const __be32 
> *prop, const __be32 *usm,
>   }
>  }
> 
> -static void __init read_drconf_v2_cell(struct of_drconf_cell_v2 *dr_cell,
> +static void read_drconf_v2_cell(struct of_drconf_cell_v2 *dr_cell,
>  const __be32 **prop)
>  {
>   const __be32 *p = *prop;
> 
>   dr_cell->seq_lmbs = of_read_number(p++, 1);
> - dr_cell->base_addr = dt_mem_next_cell(dt_root_addr_cells, &p);
> + dr_cell->base_addr = of_read_number(p, n_root_addr_cells);
> + p += n_root_addr_cells;

Same here.

-Nathan

>   dr_cell->drc_index = of_read_number(p++, 1);
>   dr_cell->aa_index = of_read_number(p++, 1);
>   dr_cell->flags = of_read_number(p++, 1);
> @@ -239,7 +242,7 @@ static void __init read_d

Re: [PATCH 13/36] dt-bindings: arm: Convert PMU binding to json-schema

2018-10-10 Thread Will Deacon
On Tue, Oct 09, 2018 at 01:14:02PM -0500, Rob Herring wrote:
> On Tue, Oct 9, 2018 at 6:57 AM Will Deacon  wrote:
> >
> > Hi Rob,
> >
> > On Fri, Oct 05, 2018 at 11:58:25AM -0500, Rob Herring wrote:
> > > Convert ARM PMU binding to DT schema format using json-schema.
> > >
> > > Cc: Will Deacon 
> > > Cc: Mark Rutland 
> > > Cc: linux-arm-ker...@lists.infradead.org
> > > Cc: devicet...@vger.kernel.org
> > > Signed-off-by: Rob Herring 
> > > ---
> > >  Documentation/devicetree/bindings/arm/pmu.txt | 70 --
> > >  .../devicetree/bindings/arm/pmu.yaml  | 96 +++
> > >  2 files changed, 96 insertions(+), 70 deletions(-)
> > >  delete mode 100644 Documentation/devicetree/bindings/arm/pmu.txt
> > >  create mode 100644 Documentation/devicetree/bindings/arm/pmu.yaml
> >
> > [...]
> >
> > > -- interrupts : 1 combined interrupt or 1 per core. If the interrupt is a 
> > > per-cpu
> > > -   interrupt (PPI) then 1 interrupt should be specified.
> >
> > [...]
> >
> > > +  interrupts:
> > > +oneOf:
> > > +  - maxItems: 1
> > > +  - minItems: 2
> > > +maxItems: 8
> > > +description: 1 interrupt per core.
> > > +
> > > +  interrupts-extended:
> > > +$ref: '#/properties/interrupts'
> >
> > > This seems like a semantic difference between the two representations, or am
> > > I missing something here? Specifically, both the introduction of
> > > interrupts-extended and also dropping any mention of using a single per-cpu
> > > interrupt (the single combined case is no longer supported by Linux; not sure
> > > if you want to keep it in the binding).
> 
> 'interrupts-extended' was implied before as it is always supported and
> outside the scope of the binding. But now it is needed to validate
> bindings. There must be some use of it and that's why I added it.
> However, thinking some more about this, I think it may be better to
> have the tools add this in automatically whenever we have an
> interrupts property.

To be honest, if you'd included that in the commit message I'd have been
happy :)

> I guess the single interrupt case is less obvious now with no
> description (it's the first list item of 'oneOf'). The schema If the
> single interrupt is not supported, then we can drop it here.

Well the description says "1 interrupt per core" which is incorrect. I also
don't understand why maxItems is 8.

Will


Re: [PATCH] powerpc/pseries: Export maximum memory value

2018-10-10 Thread Naveen N. Rao

Nathan Fontenot wrote:
> On 10/10/2018 05:22 AM, Aravinda Prasad wrote:
> > This patch exports the maximum possible amount of memory
> > configured on the system via /proc/powerpc/lparcfg.
> > 
> > Signed-off-by: Aravinda Prasad 
> > ---
> >  arch/powerpc/platforms/pseries/lparcfg.c |   13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/lparcfg.c 
> > b/arch/powerpc/platforms/pseries/lparcfg.c
> > index 7c872dc..aa82f55 100644
> > --- a/arch/powerpc/platforms/pseries/lparcfg.c
> > +++ b/arch/powerpc/platforms/pseries/lparcfg.c
> > @@ -26,6 +26,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -36,6 +37,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > 
> >  #include "pseries.h"
> > 
> > @@ -433,6 +435,16 @@ static void parse_em_data(struct seq_file *m)
> > seq_printf(m, "power_mode_data=%016lx\n", retbuf[0]);
> >  }
> > 
> > +static void maxmem_data(struct seq_file *m)
> > +{
> > +   unsigned long maxmem = 0;
> > +
> > +   maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
> > +   maxmem += hugetlb_total_pages() * PAGE_SIZE;
> > +
> > +   seq_printf(m, "MaxMem=%ld\n", maxmem);
> 
> Should this be MaxPossibleMem?
> 
> At least for the drmem memory the value calculated is the maximum possible
> memory. I wonder if calling it MaxMem would lead users to think they have
> that much memory available to them.

That's a good point. This seems to be referred to as just 'maximum 
memory' in the LPAR configuration as well as in the lparstat 
documentation, but it shouldn't hurt to rename it here.


- Naveen
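
Separately from the naming question, two details of the quoted maxmem_data() hunk may be worth a look: the value is an unsigned long printed with %ld, and the product n_lmbs * lmb_size is formed before it is widened. The userspace sketch below assumes lmb_size is a 32-bit field, which the thread does not confirm, and uses hypothetical demo_* names throughout. The MaxPossibleMem key is simply Nathan's suggested name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reduced copy of the two drmem_info fields used by the patch. */
struct demo_drmem_info {
	int n_lmbs;
	uint32_t lmb_size;	/* assumed to be 32 bits wide for this sketch */
};

/* Widen one operand first so the product is computed in 64 bits. */
static uint64_t demo_max_drmem(const struct demo_drmem_info *info)
{
	assert(info != NULL && info->n_lmbs >= 0);
	return (uint64_t)info->n_lmbs * info->lmb_size;
}

/* Format the value with an unsigned conversion; returns snprintf()'s result. */
static int demo_format_maxmem(char *buf, size_t len, uint64_t maxmem)
{
	return snprintf(buf, len, "MaxPossibleMem=%llu\n",
			(unsigned long long)maxmem);
}
```

With 64 LMBs of 256 MiB each the widened product is 16 GiB, whereas the same multiplication done in 32 bits wraps to zero. If lmb_size is already a 64-bit type in the tree the patch targets, the cast is harmless and only the format specifier matters.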




Re: [PATCH 1/2] powerpc/pseries: PAPR persistent memory support

2018-10-10 Thread Nathan Fontenot
On 10/10/2018 01:08 AM, Oliver O'Halloran wrote:
> This patch implements support for discovering storage class memory
> devices at boot and for handling hotplug of new regions via RTAS
> hotplug events.
> 
> Signed-off-by: Oliver O'Halloran 
> ---
>  arch/powerpc/include/asm/firmware.h   |  3 ++-
>  arch/powerpc/include/asm/hvcall.h | 10 +-
>  arch/powerpc/include/asm/rtas.h   |  2 ++
>  arch/powerpc/kernel/rtasd.c   |  2 ++
>  arch/powerpc/platforms/pseries/Makefile   |  2 +-
>  arch/powerpc/platforms/pseries/dlpar.c|  4 
>  arch/powerpc/platforms/pseries/firmware.c |  1 +
>  arch/powerpc/platforms/pseries/pseries.h  |  5 +
>  arch/powerpc/platforms/pseries/ras.c  |  3 ++-
>  9 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/firmware.h 
> b/arch/powerpc/include/asm/firmware.h
> index 7a051bd21f87..113c64d5d394 100644
> --- a/arch/powerpc/include/asm/firmware.h
> +++ b/arch/powerpc/include/asm/firmware.h
> @@ -52,6 +52,7 @@
>  #define FW_FEATURE_PRRN  ASM_CONST(0x0002)
>  #define FW_FEATURE_DRMEM_V2  ASM_CONST(0x0004)
>  #define FW_FEATURE_DRC_INFO  ASM_CONST(0x0008)
> +#define FW_FEATURE_PAPR_SCM  ASM_CONST(0x0010)
> 
>  #ifndef __ASSEMBLY__
> 
> @@ -69,7 +70,7 @@ enum {
>   FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
>   FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
>   FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 |
> - FW_FEATURE_DRC_INFO,
> + FW_FEATURE_DRC_INFO | FW_FEATURE_PAPR_SCM,
>   FW_FEATURE_PSERIES_ALWAYS = 0,
>   FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
>   FW_FEATURE_POWERNV_ALWAYS = 0,
> diff --git a/arch/powerpc/include/asm/hvcall.h 
> b/arch/powerpc/include/asm/hvcall.h
> index a0b17f9f1ea4..0e81ef83b35a 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -295,7 +295,15 @@
>  #define H_INT_ESB   0x3C8
>  #define H_INT_SYNC  0x3CC
>  #define H_INT_RESET 0x3D0
> -#define MAX_HCALL_OPCODE H_INT_RESET
> +#define H_SCM_READ_METADATA 0x3E4
> +#define H_SCM_WRITE_METADATA0x3E8
> +#define H_SCM_BIND_MEM  0x3EC
> +#define H_SCM_UNBIND_MEM0x3F0
> +#define H_SCM_QUERY_BLOCK_MEM_BINDING 0x3F4
> +#define H_SCM_QUERY_LOGICAL_MEM_BINDING 0x3F8
> +#define H_SCM_MEM_QUERY  0x3FC
> +#define H_SCM_BLOCK_CLEAR   0x400
> +#define MAX_HCALL_OPCODE H_SCM_BLOCK_CLEAR
> 
>  /* H_VIOCTL functions */
>  #define H_GET_VIOA_DUMP_SIZE 0x01
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index 71e393c46a49..1e81f3d55457 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -125,6 +125,7 @@ struct rtas_suspend_me_data {
>  #define RTAS_TYPE_INFO   0xE2
>  #define RTAS_TYPE_DEALLOC0xE3
>  #define RTAS_TYPE_DUMP   0xE4
> +#define RTAS_TYPE_HOTPLUG0xE5
>  /* I don't add PowerMGM events right now, this is a different topic */ 
>  #define RTAS_TYPE_PMGM_POWER_SW_ON   0x60
>  #define RTAS_TYPE_PMGM_POWER_SW_OFF  0x61
> @@ -316,6 +317,7 @@ struct pseries_hp_errorlog {
>  #define PSERIES_HP_ELOG_RESOURCE_MEM 2
>  #define PSERIES_HP_ELOG_RESOURCE_SLOT3
>  #define PSERIES_HP_ELOG_RESOURCE_PHB 4
> +#define PSERIES_HP_ELOG_RESOURCE_PMEM   6
> 
>  #define PSERIES_HP_ELOG_ACTION_ADD   1
>  #define PSERIES_HP_ELOG_ACTION_REMOVE2
> diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
> index 6fafc82c04b0..fad0baddfcba 100644
> --- a/arch/powerpc/kernel/rtasd.c
> +++ b/arch/powerpc/kernel/rtasd.c
> @@ -91,6 +91,8 @@ static char *rtas_event_type(int type)
>   return "Dump Notification Event";
>   case RTAS_TYPE_PRRN:
>   return "Platform Resource Reassignment Event";
> + case RTAS_TYPE_HOTPLUG:
> + return "Hotplug Event";
>   }
> 
>   return rtas_type[0];
> diff --git a/arch/powerpc/platforms/pseries/Makefile 
> b/arch/powerpc/platforms/pseries/Makefile
> index 7e89d5c47068..892b27ced973 100644
> --- a/arch/powerpc/platforms/pseries/Makefile
> +++ b/arch/powerpc/platforms/pseries/Makefile
> @@ -13,7 +13,7 @@ obj-$(CONFIG_KEXEC_CORE)+= kexec.o
>  obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o
> 
>  obj-$(CONFIG_HOTPLUG_CPU)+= hotplug-cpu.o
> -obj-$(CONFIG_MEMORY_HOTPLUG) += hotplug-memory.o
> +obj-$(CONFIG_MEMORY_HOTPLUG) += hotplug-memory.o pmem.o
> 
>  obj-$(CONFIG_HVC_CONSOLE)+= hvconsole.o
>  obj-$(CONFIG_HVCS)   += hvcserver.o
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
> b/arch/powerpc/platforms/pseries/dlpar.c
> index a0b20c03f078..795996fefdb9 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -357,6 +357,10 @@ static int hand

Re: [PATCH 0/2] sriov enablement on s390

2018-10-10 Thread Bjorn Helgaas
On Wed, Oct 10, 2018 at 02:55:07PM +0200, Sebastian Ott wrote:
> Hello Bjorn,
> 
> On Wed, 12 Sep 2018, Bjorn Helgaas wrote:
> > On Wed, Sep 12, 2018 at 02:34:09PM +0200, Sebastian Ott wrote:
> > > On s390 we currently handle SRIOV within firmware. Which means
> > > that the PF is under firmware control and not visible to operating
> > > systems. SRIOV enablement happens within firmware and VFs are
> > > passed through to logical partitions.
> > > 
> > > > I'm working on a new mode where the PF is under operating system
> > > control (including SRIOV enablement). However we still need
> > > firmware support to access the VFs. The way this is supposed
> > > to work is that when firmware traps the SRIOV enablement it
> > > will present machine checks to the logical partition that
> > > triggered the SRIOV enablement and provide the VFs via hotplug
> > > events.
> > > 
> > > The problem I'm faced with is that the VF detection code in
> > > sriov_enable leads to unusable functions in s390.
> > 
> > We're moving away from the weak function implementation style.  Can
> > you take a look at Arnd's work here, which uses pci_host_bridge
> > callbacks instead?
> > 
> >   https://lkml.kernel.org/r/20180817102645.3839621-1-a...@arndb.de
> 
> What's the status of Arnd's patches - will they go upstream in the next
> couple of versions?

I hope so [1].  IIRC Arnd mentioned doing some minor updates, so I'm
waiting on that.

> What about my patches that I rebased on Arnd's branch,
> will they be considered?

Definitely.  From my point of view they're just lined up behind Arnd's
patches.

[1] 
https://lore.kernel.org/linux-pci/20181002205903.gd120...@bhelgaas-glaptop.roam.corp.google.com


Re: [PATCH v02] powerpc/mobility: Extend start/stop topology update scope

2018-10-10 Thread Nathan Fontenot
On 10/09/2018 03:12 PM, Michael Bringmann wrote:
> The PPC mobility code may receive RTAS requests to perform PRRN
> topology changes at any time, including during LPAR migration
> operations.  In some configurations where the affinity of CPUs
> or memory is being changed on that platform, the PRRN requests
> may apply or refer to outdated information prior to the complete
> update of the device-tree.  This patch changes the duration for
> which topology updates are suppressed during LPAR migrations from
> just the rtas_ibm_suspend_me / 'ibm,suspend-me' call(s) to cover
> the entire 'migration_store' operation to allow all changes to
> the device-tree to be applied prior to accepting and applying any
> PRRN requests.
> 
> For tracking purposes, pr_info notices are added to the functions
> start_topology_update() and stop_topology_update() of 'numa.c'.
> 
> Signed-off-by: Michael Bringmann 

Reviewed-by: Nathan Fontenot 

> ---
> Changes in v02:
>   -- Rebase to latest powerpc next tree.
> ---
>  arch/powerpc/kernel/rtas.c|2 --
>  arch/powerpc/mm/numa.c|6 ++
>  arch/powerpc/platforms/pseries/mobility.c |5 +
>  3 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 2c7ed31..e02ac37 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -982,7 +982,6 @@ int rtas_ibm_suspend_me(u64 handle)
>   }
> 
>   cpu_hotplug_disable();
> - stop_topology_update();
> 
>   /* Call function on all CPUs.  One of us will make the
>* rtas call
> @@ -995,7 +994,6 @@ int rtas_ibm_suspend_me(u64 handle)
>   if (atomic_read(&data.error) != 0)
>   printk(KERN_ERR "Error doing global join\n");
> 
> - start_topology_update();
>   cpu_hotplug_enable();
> 
>   /* Take down CPUs not online prior to suspend */
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index b5a71ba..0ade0a1 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1518,6 +1518,10 @@ int start_topology_update(void)
>   }
>   }
> 
> + pr_info("Starting topology update%s%s\n",
> + (prrn_enabled ? " prrn_enabled" : ""),
> + (vphn_enabled ? " vphn_enabled" : ""));
> +
>   return rc;
>  }
> 
> @@ -1539,6 +1543,8 @@ int stop_topology_update(void)
>   rc = del_timer_sync(&topology_timer);
>   }
> 
> + pr_info("Stopping topology update\n");
> +
>   return rc;
>  }
> 
> diff --git a/arch/powerpc/platforms/pseries/mobility.c 
> b/arch/powerpc/platforms/pseries/mobility.c
> index 2f0f512..7da222d 100644
> --- a/arch/powerpc/platforms/pseries/mobility.c
> +++ b/arch/powerpc/platforms/pseries/mobility.c
> @@ -367,6 +367,8 @@ static ssize_t migration_store(struct class *class,
>   if (rc)
>   return rc;
> 
> + stop_topology_update();
> +
>   do {
>   rc = rtas_ibm_suspend_me(streamid);
>   if (rc == -EAGAIN)
> @@ -377,6 +379,9 @@ static ssize_t migration_store(struct class *class,
>   return rc;
> 
>   post_mobility_fixup();
> +
> + start_topology_update();
> +
>   return count;
>  }
> 
> 
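
One thing that stands out when reading the migration_store() hunk as quoted: stop_topology_update() is called before the suspend loop, but the early "return rc" after the loop is reached without a matching start_topology_update(). Whether that is intended is not discussed in the thread. The userspace sketch below shows the usual single-exit pattern that keeps the pair balanced. Every demo_* name is a hypothetical stub, and the -EAGAIN retry loop is left out for brevity.

```c
#include <assert.h>

/* Flag standing in for the effect of stop/start_topology_update(). */
static int demo_topology_stopped;

static void demo_stop_topology_update(void)
{
	demo_topology_stopped = 1;
}

static void demo_start_topology_update(void)
{
	demo_topology_stopped = 0;
}

/* Hypothetical stand-in for rtas_ibm_suspend_me(); returns a scripted code. */
static int demo_suspend_rc;

static int demo_suspend_me(void)
{
	return demo_suspend_rc;
}

/* Counts how often the post-migration fixup ran. */
static int demo_fixups_run;

static void demo_post_mobility_fixup(void)
{
	demo_fixups_run++;
}

/*
 * Returns count on success or a negative error code. Topology updates
 * are restarted on every path, including the failure path.
 */
static long demo_migration_store(long count)
{
	long ret = count;
	int rc;

	assert(count >= 0);

	demo_stop_topology_update();

	rc = demo_suspend_me();
	if (rc) {
		ret = rc;
		goto out;
	}

	demo_post_mobility_fixup();
out:
	demo_start_topology_update();
	return ret;
}
```

If leaving updates stopped after a failed suspend is the desired behaviour, the sketch does not apply.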



Re: [PATCH] powerpc/pseries: Export maximum memory value

2018-10-10 Thread Nathan Fontenot
On 10/10/2018 05:22 AM, Aravinda Prasad wrote:
> This patch exports the maximum possible amount of memory
> configured on the system via /proc/powerpc/lparcfg.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/platforms/pseries/lparcfg.c |   13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
> index 7c872dc..aa82f55 100644
> --- a/arch/powerpc/platforms/pseries/lparcfg.c
> +++ b/arch/powerpc/platforms/pseries/lparcfg.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -36,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "pseries.h"
> 
> @@ -433,6 +435,16 @@ static void parse_em_data(struct seq_file *m)
>   seq_printf(m, "power_mode_data=%016lx\n", retbuf[0]);
>  }
> 
> +static void maxmem_data(struct seq_file *m)
> +{
> + unsigned long maxmem = 0;
> +
> + maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
> + maxmem += hugetlb_total_pages() * PAGE_SIZE;
> +
> + seq_printf(m, "MaxMem=%ld\n", maxmem);

Should this be MaxPossibleMem?

At least for the drmem memory the value calculated is the maximum possible
memory. I wonder if calling it MaxMem would lead users to think they have
that much memory available to them.

-Nathan

> +}
> +
>  static int pseries_lparcfg_data(struct seq_file *m, void *v)
>  {
>   int partition_potential_processors;
> @@ -491,6 +503,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
>   seq_printf(m, "slb_size=%d\n", mmu_slb_size);
>  #endif
>   parse_em_data(m);
> + maxmem_data(m);
> 
>   return 0;
>  }
> 
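
Following the naming suggestion above, the calculation could be sketched in user space roughly as below. This is only an illustration: struct mem_layout and both helper functions are hypothetical stand-ins for drmem_info and hugetlb_total_pages(), not kernel interfaces. The sketch also prints with %lu because the value is an unsigned long; that is an assumption about what a later revision might want, not something stated in the thread.

```c
#include <stdio.h>

/* Hypothetical stand-ins for drmem_info fields and hugetlb_total_pages(). */
struct mem_layout {
	unsigned long n_lmbs;        /* number of logical memory blocks */
	unsigned long lmb_size;      /* size of one LMB in bytes */
	unsigned long hugetlb_pages; /* total huge pages, counted in base pages */
	unsigned long page_size;     /* base page size in bytes */
};

/* Maximum possible memory: all LMBs plus the hugetlb pool. */
static unsigned long max_possible_mem(const struct mem_layout *l)
{
	return l->n_lmbs * l->lmb_size + l->hugetlb_pages * l->page_size;
}

/* Format the value the way a "key=value" lparcfg line would look. */
static int format_max_possible_mem(char *buf, size_t len,
				   const struct mem_layout *l)
{
	/* %lu matches unsigned long; %ld would misprint values above LONG_MAX */
	return snprintf(buf, len, "MaxPossibleMem=%lu\n", max_possible_mem(l));
}
```

The key name makes it explicit that the number is an upper bound, which is the concern raised in the reply.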



Re: [PATCH v3 -next] powerpc/pseries/memory-hotplug: Fix return value type of find_aa_index

2018-10-10 Thread Nathan Fontenot
On 10/09/2018 08:59 AM, YueHaibing wrote:
> 'aa_index' is defined as an unsigned value, but find_aa_index
> may return -1 when dlpar_clone_property fails. So change the
> find_aa_index return type to bool, which indicates whether
> 'aa_index' was found or not.
> 
> Fixes: c05a5a40969e ("powerpc/pseries: Dynamic add entires to associativity lookup array")
> Signed-off-by: YueHaibing 

Reviewed-by: Nathan Fontenot <nf...@linux.vnet.ibm.com>
 
> ---
> v3: change find_aa_index return type to bool
> v2: use 'rc' track the validation of aa_index
> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 61 
> -
>  1 file changed, 28 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index d26a771..4db510f 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -101,11 +101,12 @@ static struct property *dlpar_clone_property(struct property *prop,
>   return new_prop;
>  }
> 
> -static u32 find_aa_index(struct device_node *dr_node,
> -  struct property *ala_prop, const u32 *lmb_assoc)
> +static bool find_aa_index(struct device_node *dr_node,
> +  struct property *ala_prop,
> +  const u32 *lmb_assoc, u32 *aa_index)
>  {
> - u32 *assoc_arrays;
> - u32 aa_index;
> + u32 *assoc_arrays, new_prop_size;
> + struct property *new_prop;
>   int aa_arrays, aa_array_entries, aa_array_sz;
>   int i, index;
> 
> @@ -121,46 +122,39 @@ static u32 find_aa_index(struct device_node *dr_node,
>   aa_array_entries = be32_to_cpu(assoc_arrays[1]);
>   aa_array_sz = aa_array_entries * sizeof(u32);
> 
> - aa_index = -1;
>   for (i = 0; i < aa_arrays; i++) {
>   index = (i * aa_array_entries) + 2;
> 
>   if (memcmp(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz))
>   continue;
> 
> - aa_index = i;
> - break;
> + *aa_index = i;
> + return true;
>   }
> 
> - if (aa_index == -1) {
> - struct property *new_prop;
> - u32 new_prop_size;
> -
> - new_prop_size = ala_prop->length + aa_array_sz;
> - new_prop = dlpar_clone_property(ala_prop, new_prop_size);
> - if (!new_prop)
> - return -1;
> -
> - assoc_arrays = new_prop->value;
> + new_prop_size = ala_prop->length + aa_array_sz;
> + new_prop = dlpar_clone_property(ala_prop, new_prop_size);
> + if (!new_prop)
> + return false;
> 
> - /* increment the number of entries in the lookup array */
> - assoc_arrays[0] = cpu_to_be32(aa_arrays + 1);
> + assoc_arrays = new_prop->value;
> 
> - /* copy the new associativity into the lookup array */
> - index = aa_arrays * aa_array_entries + 2;
> - memcpy(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz);
> + /* increment the number of entries in the lookup array */
> + assoc_arrays[0] = cpu_to_be32(aa_arrays + 1);
> 
> - of_update_property(dr_node, new_prop);
> + /* copy the new associativity into the lookup array */
> + index = aa_arrays * aa_array_entries + 2;
> + memcpy(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz);
> 
> - /*
> -  * The associativity lookup array index for this lmb is
> -  * number of entries - 1 since we added its associativity
> -  * to the end of the lookup array.
> -  */
> - aa_index = be32_to_cpu(assoc_arrays[0]) - 1;
> - }
> + of_update_property(dr_node, new_prop);
> 
> - return aa_index;
> + /*
> +  * The associativity lookup array index for this lmb is
> +  * number of entries - 1 since we added its associativity
> +  * to the end of the lookup array.
> +  */
> + *aa_index = be32_to_cpu(assoc_arrays[0]) - 1;
> + return true;
>  }
> 
>  static int update_lmb_associativity_index(struct drmem_lmb *lmb)
> @@ -169,6 +163,7 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
>   struct property *ala_prop;
>   const u32 *lmb_assoc;
>   u32 aa_index;
> + bool is_found;
> 
>   parent = of_find_node_by_path("/");
>   if (!parent)
> @@ -200,11 +195,11 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
>   return -ENODEV;
>   }
> 
> - aa_index = find_aa_index(dr_node, ala_prop, lmb_assoc);
> + is_found = find_aa_index(dr_node, ala_prop, lmb_assoc, &aa_index);
> 
>   dlpar_free_cc_nodes(lmb_node);
> 
> - if (aa_index < 0) {
> + if (!is_found) {
>   pr_err("Could not find LMB associativity\n");
>   return -1;
>   }
> 
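
The underlying problem is one of signedness: a u32 can never compare below zero, so a -1 returned through a u32 is invisible to an "aa_index < 0" check. A minimal user-space sketch of the two calling conventions is given below. The names find_index_sentinel() and find_index() and the flat lookup table are illustrative assumptions, not the kernel code, and the sketch models only the lookup, not the path that appends a new entry to the property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Old style: the "not found" sentinel -1 is converted to UINT32_MAX on
 * return, so a caller testing "result < 0" can never see the failure.
 */
static uint32_t find_index_sentinel(const uint32_t *table, size_t n,
				    uint32_t key)
{
	for (size_t i = 0; i < n; i++)
		if (table[i] == key)
			return (uint32_t)i;
	return (uint32_t)-1;
}

/* New style: success is the return value, the index is an out-parameter. */
static bool find_index(const uint32_t *table, size_t n, uint32_t key,
		       uint32_t *index)
{
	assert(index != NULL);

	for (size_t i = 0; i < n; i++) {
		if (table[i] == key) {
			*index = (uint32_t)i;
			return true;
		}
	}
	return false; /* *index is left untouched on failure */
}
```

With the second form the caller tests the boolean and reads the index only on success, so no in-band error value is needed.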



Re: [PATCH 4/4] powerpc: Add -Wimplicit-fallthrough to arch CFLAGS

2018-10-10 Thread Kees Cook
On Tue, Oct 9, 2018 at 10:13 PM, Michael Ellerman  wrote:
> Warn whenever a switch statement has a fallthrough without a comment
> annotating it.
>
> Signed-off-by: Michael Ellerman 

Yes please. :)

Reviewed-by: Kees Cook 

-Kees

> ---
>  arch/powerpc/Kbuild | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/Kbuild b/arch/powerpc/Kbuild
> index 86b261d6bde5..ef625f1db576 100644
> --- a/arch/powerpc/Kbuild
> +++ b/arch/powerpc/Kbuild
> @@ -1,4 +1,5 @@
>  subdir-ccflags-y := $(call cc-option, -Wvla)
> +subdir-ccflags-y += $(call cc-option, -Wimplicit-fallthrough)
>  subdir-ccflags-$(CONFIG_PPC_WERROR) += -Werror
>
>  obj-y += kernel/
> --
> 2.17.1
>



-- 
Kees Cook
Pixel Security
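
For readers unfamiliar with the warning, here is a small sketch of the kind of annotation it expects. The enum and the function are made up for illustration. With GCC's default level for -Wimplicit-fallthrough, a comment such as the one below placed directly before the next case label is understood to be accepted as an annotation; removing it should make the first case trigger the warning.

```c
/* Hypothetical command set, used only to demonstrate the annotation. */
enum cmd { CMD_RESET, CMD_START, CMD_STOP };

/* Count how many setup steps a command performs. */
static int steps_for(enum cmd c)
{
	int steps = 0;

	switch (c) {
	case CMD_RESET:
		steps++;	/* a reset does its own step, then starts */
		/* fall through */
	case CMD_START:
		steps++;
		break;
	case CMD_STOP:
		break;
	}
	return steps;
}
```

Here CMD_RESET deliberately continues into CMD_START, and the comment records that the missing break is intentional.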


Re: [PATCH 3/4] powerpc: Add -Wvla to arch CFLAGS

2018-10-10 Thread Kees Cook
On Tue, Oct 9, 2018 at 10:13 PM, Michael Ellerman  wrote:
> Upstream has declared that Variable Length Array's (VLAs) are a bad
> idea, and eventually -Wvla will be added to the top-level Makefile. We
> can go one better and make sure we don't introduce any more by adding
> it to the arch Makefile.
>
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/Kbuild | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kbuild b/arch/powerpc/Kbuild
> index 1625a06802ca..86b261d6bde5 100644
> --- a/arch/powerpc/Kbuild
> +++ b/arch/powerpc/Kbuild
> @@ -1,4 +1,5 @@
> -subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
> +subdir-ccflags-y := $(call cc-option, -Wvla)
> +subdir-ccflags-$(CONFIG_PPC_WERROR) += -Werror
>
>  obj-y += kernel/
>  obj-y += mm/

-Wvla will be going into the top-level Makefile in the merge window
(see linux-next), so this will be redundant.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 0/2] sriov enablement on s390

2018-10-10 Thread Sebastian Ott
Hello Bjorn,

On Wed, 12 Sep 2018, Bjorn Helgaas wrote:
> On Wed, Sep 12, 2018 at 02:34:09PM +0200, Sebastian Ott wrote:
> > On s390 we currently handle SRIOV within firmware. Which means
> > that the PF is under firmware control and not visible to operating
> > systems. SRIOV enablement happens within firmware and VFs are
> > passed through to logical partitions.
> > 
> > I'm working on a new mode where the PF is under operating system
> > control (including SRIOV enablement). However we still need
> > firmware support to access the VFs. The way this is supposed
> > to work is that when firmware traps the SRIOV enablement it
> > will present machine checks to the logical partition that
> > triggered the SRIOV enablement and provide the VFs via hotplug
> > events.
> > 
> > The problem I'm faced with is that the VF detection code in
> > sriov_enable leads to unusable functions in s390.
> 
> We're moving away from the weak function implementation style.  Can
> you take a look at Arnd's work here, which uses pci_host_bridge
> callbacks instead?
> 
>   https://lkml.kernel.org/r/20180817102645.3839621-1-a...@arndb.de

What's the status of Arnd's patches - will they go upstream in the next
couple of versions? What about my patches that I rebased on Arnd's branch,
will they be considered?

Regards,
Sebastian



Re: [PATCH 1/2] powerpc/boot: Expose Kconfig symbols to wrapper

2018-10-10 Thread Michael Ellerman
Joel Stanley  writes:
> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> index 0fb96c26136f..eeed74e0dfca 100644
> --- a/arch/powerpc/boot/Makefile
> +++ b/arch/powerpc/boot/Makefile
> @@ -197,9 +197,14 @@ $(obj)/empty.c:
>  $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: $(srctree)/$(src)/%.S
>   $(Q)cp $< $@
>  
> +$(obj)/serial.c: $(obj)/autoconf.h
> +
> +$(obj)/autoconf.h: $(obj)/%: $(srctree)/include/generated/%
> + $(Q)cp $< $@
> +

This gives me:
  make[2]: *** No rule to make target '../include/generated/autoconf.h', needed by 'arch/powerpc/boot/autoconf.h'.  Stop.

The ../ is $(srctree).

cheers


Re: [PATCH 1/2] powerpc/boot: Disable vector instructions

2018-10-10 Thread Michael Ellerman
Joel Stanley  writes:

> This will avoid auto-vectorisation when building with higher
> optimisation levels.
>
> We don't know if the machine can support VSX and even if it's present
> it's probably not going to be enabled at this point in boot.
>
> Signed-off-by: Joel Stanley 
> ---
>  arch/powerpc/boot/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
> index 0fb96c26136f..739ef8d43b91 100644
> --- a/arch/powerpc/boot/Makefile
> +++ b/arch/powerpc/boot/Makefile
> @@ -32,8 +32,8 @@ else
>  endif
>  
>  BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> -  -fno-strict-aliasing -Os -msoft-float -pipe \
> -  -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
> +  -fno-strict-aliasing -Os -msoft-float -mno-altivec -mno-vsx \

That's going to break if the compiler doesn't understand -mno-vsx, isn't it?

I'm not sure if we "support" a compiler that old though.

cheers


Re: [PATCH] memblock: stop using implicit alignement to SMP_CACHE_BYTES

2018-10-10 Thread Michael Ellerman
Mike Rapoport  writes:

> When a memblock allocation APIs are called with align = 0, the alignment is
> implicitly set to SMP_CACHE_BYTES.
>
> Replace all such uses of memblock APIs with the 'align' parameter explicitly
> set to SMP_CACHE_BYTES and stop implicit alignment assignment in the
> memblock internal allocation functions.
>
> For the case when memblock APIs are used via helper functions, e.g. like
> iommu_arena_new_node() in Alpha, the helper functions were detected with
> Coccinelle's help and then manually examined and updated where appropriate.
>
> The direct memblock APIs users were updated using the semantic patch below:
>
> @@
> expression size, min_addr, max_addr, nid;
> @@
> (
> |
> - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
> nid)
> |
> - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
> nid)
> |
> - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
> |
> - memblock_alloc(size, 0)
> + memblock_alloc(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_raw(size, 0)
> + memblock_alloc_raw(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_from(size, 0, min_addr)
> + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
> |
> - memblock_alloc_nopanic(size, 0)
> + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_low(size, 0)
> + memblock_alloc_low(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_low_nopanic(size, 0)
> + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_from_nopanic(size, 0, min_addr)
> + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
> |
> - memblock_alloc_node(size, 0, nid)
> + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
> )
>
> Suggested-by: Michal Hocko 
> Signed-off-by: Mike Rapoport 
> ---
...
>  arch/powerpc/kernel/pci_32.c  |  3 ++-
>  arch/powerpc/lib/alloc.c  |  2 +-
>  arch/powerpc/mm/mmu_context_nohash.c  |  7 +++---
>  arch/powerpc/platforms/powermac/nvram.c   |  2 +-
>  arch/powerpc/platforms/powernv/pci-ioda.c |  6 ++---
>  arch/powerpc/sysdev/msi_bitmap.c  |  2 +-

The powerpc changes all look fine.

I'm not quite clear on how SMP_CACHE_BYTES is getting included.

I think it's: memblock.h -> mm.h -> mmzone.h -> cache.h

So that's probably fine.

Acked-by: Michael Ellerman  (powerpc)


cheers


[PATCH] powerpc/pseries: Export maximum memory value

2018-10-10 Thread Aravinda Prasad
This patch exports the maximum possible amount of memory
configured on the system via /proc/powerpc/lparcfg.

Signed-off-by: Aravinda Prasad 
---
 arch/powerpc/platforms/pseries/lparcfg.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
index 7c872dc..aa82f55 100644
--- a/arch/powerpc/platforms/pseries/lparcfg.c
+++ b/arch/powerpc/platforms/pseries/lparcfg.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -36,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "pseries.h"
 
@@ -433,6 +435,16 @@ static void parse_em_data(struct seq_file *m)
seq_printf(m, "power_mode_data=%016lx\n", retbuf[0]);
 }
 
+static void maxmem_data(struct seq_file *m)
+{
+   unsigned long maxmem = 0;
+
+   maxmem += drmem_info->n_lmbs * drmem_info->lmb_size;
+   maxmem += hugetlb_total_pages() * PAGE_SIZE;
+
+   seq_printf(m, "MaxMem=%ld\n", maxmem);
+}
+
 static int pseries_lparcfg_data(struct seq_file *m, void *v)
 {
int partition_potential_processors;
@@ -491,6 +503,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
seq_printf(m, "slb_size=%d\n", mmu_slb_size);
 #endif
parse_em_data(m);
+   maxmem_data(m);
 
return 0;
 }



Re: [PATCH 32/36] dt-bindings: arm: Convert ST STi board/soc bindings to json-schema

2018-10-10 Thread Patrice CHOTARD
Hi Rob

On 10/05/2018 06:58 PM, Rob Herring wrote:
> Convert ST STi SoC bindings to DT schema format using json-schema.
> 
> Cc: Patrice Chotard 
> Cc: Mark Rutland 
> Cc: devicet...@vger.kernel.org
> Signed-off-by: Rob Herring 
> ---
>  Documentation/devicetree/bindings/arm/sti.txt | 23 ---
>  .../devicetree/bindings/arm/sti.yaml  | 23 +++
>  2 files changed, 23 insertions(+), 23 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/arm/sti.txt
>  create mode 100644 Documentation/devicetree/bindings/arm/sti.yaml
> 
> diff --git a/Documentation/devicetree/bindings/arm/sti.txt b/Documentation/devicetree/bindings/arm/sti.txt
> deleted file mode 100644
> index 8d27f6b084c7..
> --- a/Documentation/devicetree/bindings/arm/sti.txt
> +++ /dev/null
> @@ -1,23 +0,0 @@
> -ST STi Platforms Device Tree Bindings
> 
> -
> -Boards with the ST STiH415 SoC shall have the following properties:
> -Required root node property:
> -compatible = "st,stih415";
> -
> -Boards with the ST STiH416 SoC shall have the following properties:
> -Required root node property:
> -compatible = "st,stih416";
> -
> -Boards with the ST STiH407 SoC shall have the following properties:
> -Required root node property:
> -compatible = "st,stih407";
> -
> -Boards with the ST STiH410 SoC shall have the following properties:
> -Required root node property:
> -compatible = "st,stih410";
> -
> -Boards with the ST STiH418 SoC shall have the following properties:
> -Required root node property:
> -compatible = "st,stih418";
> -
> diff --git a/Documentation/devicetree/bindings/arm/sti.yaml b/Documentation/devicetree/bindings/arm/sti.yaml
> new file mode 100644
> index ..10814334cfc9
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/sti.yaml
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: None
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/bindings/arm/sti.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: ST STi Platforms Device Tree Bindings
> +
> +maintainers:
> +  - Maxime Coquelin 

Maxime has left STMicroelectronics, you can replace his email with mine
Patrice Chotard 


Thanks

Patrice

> +
> +properties:
> +  $nodename:
> +const: '/'
> +  compatible:
> +items:
> +  - enum:
> +  - st,stih415
> +  - st,stih416
> +  - st,stih407
> +  - st,stih410
> +  - st,stih418
> +...
> 

Re: [PATCH] memblock: stop using implicit alignement to SMP_CACHE_BYTES

2018-10-10 Thread Michal Hocko
On Fri 05-10-18 00:07:04, Mike Rapoport wrote:
> When a memblock allocation APIs are called with align = 0, the alignment is
> implicitly set to SMP_CACHE_BYTES.

I would add something like
"
Implicit alignment is done deep in the memblock allocator and it can
come as a surprise. Not that such an alignment would be wrong even when
used incorrectly but it is better to be explicit for the sake of clarity
and the principle of least surprise.
"

> Replace all such uses of memblock APIs with the 'align' parameter explicitly
> set to SMP_CACHE_BYTES and stop implicit alignment assignment in the
> memblock internal allocation functions.
> 
> For the case when memblock APIs are used via helper functions, e.g. like
> iommu_arena_new_node() in Alpha, the helper functions were detected with
> Coccinelle's help and then manually examined and updated where appropriate.
> 
> The direct memblock APIs users were updated using the semantic patch below:
> 
> @@
> expression size, min_addr, max_addr, nid;
> @@
> (
> |
> - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
> nid)
> |
> - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
> nid)
> |
> - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
> + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
> |
> - memblock_alloc(size, 0)
> + memblock_alloc(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_raw(size, 0)
> + memblock_alloc_raw(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_from(size, 0, min_addr)
> + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
> |
> - memblock_alloc_nopanic(size, 0)
> + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_low(size, 0)
> + memblock_alloc_low(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_low_nopanic(size, 0)
> + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
> |
> - memblock_alloc_from_nopanic(size, 0, min_addr)
> + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
> |
> - memblock_alloc_node(size, 0, nid)
> + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
> )
> 
> Suggested-by: Michal Hocko 
> Signed-off-by: Mike Rapoport 

I do agree that this is an improvement. I would also add WARN_ON_ONCE on
0 alignment to catch some leftovers. If we ever grow a user which
would explicitly require the zero alignment (I would be surprised) then
we can remove the warning.

Acked-by: Michal Hocko 
-- 
Michal Hocko
SUSE Labs
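
As a rough illustration of the warn-once idea discussed in this thread, the following user-space sketch warns the first time a zero alignment is seen and then falls back to a default. DEFAULT_ALIGN is a hypothetical stand-in for SMP_CACHE_BYTES, and checked_align() and align_up() are illustrative names, not memblock functions; the counter exists only so the behaviour can be observed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEFAULT_ALIGN 64u /* hypothetical stand-in for SMP_CACHE_BYTES */

/* Number of warnings actually emitted; at most one by construction. */
static unsigned int zero_align_warnings;

/* Return a usable alignment, warning only on the first zero request. */
static size_t checked_align(size_t align)
{
	static bool warned;

	if (align == 0) {
		if (!warned) {
			warned = true;
			zero_align_warnings++;
			fprintf(stderr, "warning: zero alignment requested\n");
		}
		return DEFAULT_ALIGN;
	}
	return align;
}

/* Round addr up to a multiple of align (align must be 0 or a power of two). */
static size_t align_up(size_t addr, size_t align)
{
	align = checked_align(align);
	return (addr + align - 1) & ~(align - 1);
}
```

Unconverted callers keep working with the old default, while the single warning points at the call site that still needs an explicit alignment.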