Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64

2017-05-23 Thread Oliver O'Halloran
On Tue, May 23, 2017 at 8:40 PM, Balbir Singh  wrote:
> On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
>> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
>> is used to differentiate device backed memory from transparent huge
>> pages since they are handled in more or less the same manner by the core
>> mm code.
>>
>> Cc: Aneesh Kumar K.V 
>> Signed-off-by: Oliver O'Halloran 
>> ---
>> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
>> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
>> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>>
>> Aneesh, this has been fleshed out substantially since v1. Can you
>> re-review it? Also no explicit gup support is required in this patch
>> since devmap support was added to generic GUP as a part of making x86 use
>> the generic version.
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 
>> ++-
>>  arch/powerpc/include/asm/book3s/64/radix.h|  2 +-
>>  arch/powerpc/mm/hugetlbpage.c |  2 +-
>>  arch/powerpc/mm/pgtable-book3s64.c|  4 +--
>>  arch/powerpc/mm/pgtable-hash64.c  |  4 ++-
>>  arch/powerpc/mm/pgtable-radix.c   |  3 ++-
>>  arch/powerpc/mm/pgtable_64.c  |  2 +-
>>  8 files changed, 47 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
>> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> index 9732837aaae8..eaaf613c5347 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char 
>> *hpte_slot_array,
>>   */
>>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>>  {
>> -   return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
>> +   return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | 
>> _PAGE_DEVMAP)) ==
>>   (_PAGE_PTE | H_PAGE_THP_HUGE));
>>  }
>
> Like Aneesh suggested, I think we can probably skip this check here
>
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
>> b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 85bc9875c3be..24634e92dd0b 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -79,6 +79,9 @@
>>
>>  #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software dirty 
>> tracking */
>>  #define _PAGE_SPECIAL  _RPAGE_SW2 /* software: special page */
>> +#define _PAGE_DEVMAP   _RPAGE_SW1
>> +#define __HAVE_ARCH_PTE_DEVMAP
>> +
>>  /*
>>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>>   * Instead of fixing all of them, add an alternate define which
>> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>> return pte;
>>  }
>>
>> +static inline pte_t pte_mkdevmap(pte_t pte)
>> +{
>> +   return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
>> +}
>> +
>> +static inline int pte_devmap(pte_t pte)
>> +{
>> +   return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
>> +}
>> +
>>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>>  {
>> /* FIXME!! check whether this need to be a conditional */
>> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>>  #define pmd_mk_savedwrite(pmd) pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>>  #define pmd_clear_savedwrite(pmd)  
>> pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>>
>> +#define pud_pfn(...) (0)
>> +#define pgd_pfn(...) (0)
>> +
>
> Don't get these bits.. why are they zero?

I think that was just hacking stuff until it worked. pud_pfn() needs
to exist for the kernel to build when __HAVE_ARCH_PTE_DEVMAP is set,
but we don't need it to do anything (yet) since pud_pfn() is only used
for handling devmap PUD faults. We currently don't support those, so we
will never hit that code path. pgd_pfn() can die though.

>>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>>  #define pmd_soft_dirty(pmd)pte_soft_dirty(pmd_pte(pmd))
>>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
>> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct 
>> spinlock *new_pmd_ptl,
>> return true;
>>  }
>>
>> -
>>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>>  static inline bool arch_needs_pgtable_deposit(void)
>>  {
>> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>> return true;
>>  }
>>
>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>> +{
>> +   return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
>> +}
>> +
>> +static inline int pmd_devmap(pmd_t pmd)
>> +{
>> +   return pte_devmap(pmd_pte(pmd));
>> +}
>
> This should be defined only if #ifdef __HAVE_ARCH_PTE_DEVMAP

ok

>
> The rest looks OK
>
> Balbir 
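
For readers following the v1 -> v2 note above, here is a minimal user-space
sketch of why the devmap bit has to be part of the mask in the THP check.
The bit positions and every sk_ name are made up for illustration; the real
definitions live in the book3s/64 headers touched by the patch. It also
assumes that a devmap PMD carries the huge bit, which is what the v1
behaviour described in the changelog implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define SK_PAGE_PTE      (UINT64_C(1) << 62)
#define SK_PAGE_THP_HUGE (UINT64_C(1) << 1)
#define SK_PAGE_DEVMAP   (UINT64_C(1) << 2)

/* v1-style check: the devmap bit is ignored, so a devmap PMD that
 * also has the huge bit set is reported as a THP. */
static int sk_pmd_trans_huge_v1(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

/* v2-style check: the devmap bit is in the mask, so it must be clear. */
static int sk_pmd_trans_huge_v2(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE | SK_PAGE_DEVMAP)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

static int sk_pmd_devmap(uint64_t pmd)
{
	return !!(pmd & SK_PAGE_DEVMAP);
}

/* The invariant the core mm relies on: never both at once. */
static void sk_assert_exclusive(uint64_t pmd)
{
	assert(!(sk_pmd_trans_huge_v2(pmd) && sk_pmd_devmap(pmd)));
}
```

With an entry that has all three bits set, the v1-style check and
sk_pmd_devmap() both return true, while the v2-style check returns false.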

Re: SPU not working for kernel 4.9, 4.10, 4.11a and 4.12

2017-05-23 Thread Michael Ellerman
Jeremy Kerr  writes:

> Hi all,
>
>
>> Looks like this also happens with the simple spu_run test:
>> 
>>  
>> https://github.com/jk-ozlabs/spufs-testsuite/blob/master/tests/03-spu_run/01-spu_run.c
>> 
>> ... might need some debugging here, I'll update if I find anything.
>
> And it appears we're stuck in the POLL_WHILE_FALSE() loop in
> wait_tag_complete() called from restore_lscsa().
>
> Because spufs itself has been fairly static, I suspect some other change
> has meant that MFC DMAs aren't working; so at this point, a bisect might
> be the best way forward. Do you see the exact same behaviour on 4.9 (but
> 4.8 works?)

I can bisect it here, I have everything setup to autoboot etc.

cheers


[PATCH] powerpc/lib: Split xor_vmx file to guarantee instruction ordering

2017-05-23 Thread Matt Brown
The xor_vmx.c file is used for the RAID5 xor operations. In these functions
altivec is enabled to run the operation and then disabled. However, due to
compiler instruction reordering, altivec instructions are being run before
enable_kernel_altivec() and after disable_kernel_altivec().

This patch splits the non-altivec code into xor_vmx_glue.c which calls the
altivec functions in xor_vmx.c. By compiling xor_vmx_glue.c without
-maltivec we can guarantee that altivec instructions will not be reordered
outside of the enable/disable block.

Signed-off-by: Matt Brown 
---
 arch/powerpc/lib/Makefile   |  2 +-
 arch/powerpc/lib/xor_vmx.c  | 53 ---
 arch/powerpc/lib/xor_vmx.h  | 20 +
 arch/powerpc/lib/xor_vmx_glue.c | 62 +
 4 files changed, 94 insertions(+), 43 deletions(-)
 create mode 100644 arch/powerpc/lib/xor_vmx.h
 create mode 100644 arch/powerpc/lib/xor_vmx_glue.c

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 309361e8..a448464 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -31,7 +31,7 @@ obj-$(CONFIG_PPC_LIB_RHEAP) += rheap.o
 
 obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
 
-obj-$(CONFIG_ALTIVEC)  += xor_vmx.o
+obj-$(CONFIG_ALTIVEC)  += xor_vmx.o xor_vmx_glue.o
 CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
 
 obj-$(CONFIG_PPC64) += $(obj64-y)
diff --git a/arch/powerpc/lib/xor_vmx.c b/arch/powerpc/lib/xor_vmx.c
index f9de69a..4df240a 100644
--- a/arch/powerpc/lib/xor_vmx.c
+++ b/arch/powerpc/lib/xor_vmx.c
@@ -29,10 +29,7 @@
 #define vector __attribute__((vector_size(16)))
 #endif
 
-#include 
-#include 
-#include 
-#include 
+#include "xor_vmx.h"
 
 typedef vector signed char unative_t;
 
@@ -64,16 +61,13 @@ typedef vector signed char unative_t;
V1##_3 = vec_xor(V1##_3, V2##_3);   \
} while (0)
 
-void xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
-  unsigned long *v2_in)
+void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
+unsigned long *v2_in)
 {
DEFINE(v1);
DEFINE(v2);
unsigned long lines = bytes / (sizeof(unative_t)) / 4;
 
-   preempt_disable();
-   enable_kernel_altivec();
-
do {
LOAD(v1);
LOAD(v2);
@@ -83,23 +77,16 @@ void xor_altivec_2(unsigned long bytes, unsigned long 
*v1_in,
v1 += 4;
v2 += 4;
} while (--lines > 0);
-
-   disable_kernel_altivec();
-   preempt_enable();
 }
-EXPORT_SYMBOL(xor_altivec_2);
 
-void xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
-  unsigned long *v2_in, unsigned long *v3_in)
+void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
+unsigned long *v2_in, unsigned long *v3_in)
 {
DEFINE(v1);
DEFINE(v2);
DEFINE(v3);
unsigned long lines = bytes / (sizeof(unative_t)) / 4;
 
-   preempt_disable();
-   enable_kernel_altivec();
-
do {
LOAD(v1);
LOAD(v2);
@@ -112,15 +99,11 @@ void xor_altivec_3(unsigned long bytes, unsigned long 
*v1_in,
v2 += 4;
v3 += 4;
} while (--lines > 0);
-
-   disable_kernel_altivec();
-   preempt_enable();
 }
-EXPORT_SYMBOL(xor_altivec_3);
 
-void xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
-  unsigned long *v2_in, unsigned long *v3_in,
-  unsigned long *v4_in)
+void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
+unsigned long *v2_in, unsigned long *v3_in,
+unsigned long *v4_in)
 {
DEFINE(v1);
DEFINE(v2);
@@ -128,9 +111,6 @@ void xor_altivec_4(unsigned long bytes, unsigned long 
*v1_in,
DEFINE(v4);
unsigned long lines = bytes / (sizeof(unative_t)) / 4;
 
-   preempt_disable();
-   enable_kernel_altivec();
-
do {
LOAD(v1);
LOAD(v2);
@@ -146,15 +126,11 @@ void xor_altivec_4(unsigned long bytes, unsigned long 
*v1_in,
v3 += 4;
v4 += 4;
} while (--lines > 0);
-
-   disable_kernel_altivec();
-   preempt_enable();
 }
-EXPORT_SYMBOL(xor_altivec_4);
 
-void xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
-  unsigned long *v2_in, unsigned long *v3_in,
-  unsigned long *v4_in, unsigned long *v5_in)
+void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
+unsigned long *v2_in, unsigned long *v3_in,
+unsigned long *v4_in, unsigned long *v5_in)
 {
DEFINE(v1);
DEFINE(v2);
@@ -163,9 +139,6 @@ void xor_altivec_5(unsigned long bytes, unsigned long 
*v1_in,
DEFINE(v5);
unsigned long lines = bytes / (sizeof(unative_t)) / 4;
 
-   preempt_disable();
-
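
(The diff above is cut off in this archive.) As a rough picture of the split
the commit message describes, here is a user-space sketch of the wrapper
pattern. All sk_ names are hypothetical stand-ins, not the kernel's
functions, and the plain C loop stands in for the vec_xor() code. In the
actual patch the guarantee comes from building the two halves as separate
objects with different compiler flags, which a single-file sketch cannot
reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's preempt and altivec enable/disable helpers. */
static int sk_altivec_on;
static void sk_preempt_disable(void) {}
static void sk_preempt_enable(void) {}
static void sk_enable_kernel_altivec(void)  { sk_altivec_on = 1; }
static void sk_disable_kernel_altivec(void) { sk_altivec_on = 0; }

/* Plays the role of __xor_altivec_2() in xor_vmx.c (built with -maltivec). */
static void sk_inner_xor_2(size_t bytes, unsigned long *v1,
			   const unsigned long *v2)
{
	size_t i;

	assert(sk_altivec_on);	/* vector code may only run while enabled */
	for (i = 0; i < bytes / sizeof(*v1); i++)
		v1[i] ^= v2[i];
}

/* Plays the role of xor_altivec_2() in xor_vmx_glue.c (no -maltivec). */
static void sk_xor_2(size_t bytes, unsigned long *v1, const unsigned long *v2)
{
	sk_preempt_disable();
	sk_enable_kernel_altivec();
	sk_inner_xor_2(bytes, v1, v2);
	sk_disable_kernel_altivec();
	sk_preempt_enable();
}
```

Callers only ever see sk_xor_2(); the inner function is never entered with
the unit disabled, which is the property the assert checks.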

Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Michael Bringmann


On 05/23/2017 04:49 PM, Reza Arbab wrote:
> On Tue, May 23, 2017 at 03:05:08PM -0500, Michael Bringmann wrote:
>> On 05/23/2017 10:52 AM, Reza Arbab wrote:
>>> On Tue, May 23, 2017 at 10:15:44AM -0500, Michael Bringmann wrote:
>>>> +static void setup_nodes(void)
>>>> +{
>>>> +int i, l = 32 /* MAX_NUMNODES */;
>>>> +
>>>> +for (i = 0; i < l; i++) {
>>>> +if (!node_possible(i)) {
>>>> +setup_node_data(i, 0, 0);
>>>> +node_set(i, node_possible_map);
>>>> +}
>>>> +}
>>>> +}
>>>
>>> This seems to be a workaround for 3af229f2071f ("powerpc/numa: Reset 
>>> node_possible_map to only node_online_map").
>>
>> They may be related, but that commit is not a replacement.  The above patch 
>> ensures that
>> there are enough of the nodes initialized at startup to allow for memory 
>> hot-add into a
>> node that was not used at boot.  (See 'setup_node_data' function in 
>> 'numa.c'.)  That and
>> recording that the node was initialized.
> 
> Is it really necessary to preinitialize these empty nodes using 
> setup_node_data()? When you do memory hotadd into a node that was not used at 
> boot, the node data already gets set up by
> 
> add_memory
>  add_memory_resource
>hotadd_new_pgdat
>  arch_alloc_nodedata <-- allocs the pg_data_t
>  ...
>  free_area_init_node <-- sets NODE_DATA(nid)->node_id, etc.
> 
> Removing setup_node_data() from that loop leaves only the call to node_set(). 
> If 3af229f2071f (which reduces node_possible_map) was reverted, you wouldn't 
> need to do that either.

With or without 3af229f2071f, we would still need to add something, somewhere
to add new bits to the 'node_possible_map'.  That is not being done.

> 
>> I didn't see where any part of commit 3af229f2071f would touch the 
>> 'node_possible_map'
>> which is needed by 'numa.c' and 'workqueue.c'.  The nodemask created and 
>> updated by
>> 'mem_cgroup_may_update_nodemask()' does not appear to be the same mask.
> 
> Are you sure you're looking at 3af229f2071f? It only adds one line of code; 
> the reduction of node_possible_map.
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Michael Bringmann


On 05/23/2017 04:49 PM, Reza Arbab wrote:
> On Tue, May 23, 2017 at 03:05:08PM -0500, Michael Bringmann wrote:
>> On 05/23/2017 10:52 AM, Reza Arbab wrote:
>>> On Tue, May 23, 2017 at 10:15:44AM -0500, Michael Bringmann wrote:
>>>> +static void setup_nodes(void)
>>>> +{
>>>> +int i, l = 32 /* MAX_NUMNODES */;
>>>> +
>>>> +for (i = 0; i < l; i++) {
>>>> +if (!node_possible(i)) {
>>>> +setup_node_data(i, 0, 0);
>>>> +node_set(i, node_possible_map);
>>>> +}
>>>> +}
>>>> +}
>>>
>>> This seems to be a workaround for 3af229f2071f ("powerpc/numa: Reset 
>>> node_possible_map to only node_online_map").
>>
>> They may be related, but that commit is not a replacement.  The above patch 
>> ensures that
>> there are enough of the nodes initialized at startup to allow for memory 
>> hot-add into a
>> node that was not used at boot.  (See 'setup_node_data' function in 
>> 'numa.c'.)  That and
>> recording that the node was initialized.
> 
> Is it really necessary to preinitialize these empty nodes using 
> setup_node_data()? When you do memory hotadd into a node that was not used at 
> boot, the node data already gets set up by
> 
> add_memory
>  add_memory_resource
>hotadd_new_pgdat
>  arch_alloc_nodedata <-- allocs the pg_data_t
>  ...
>  free_area_init_node <-- sets NODE_DATA(nid)->node_id, etc.

I see that code now, but for some reason it did not work when I hot-added
memory.

> 
> Removing setup_node_data() from that loop leaves only the call to node_set(). 
> If 3af229f2071f (which reduces node_possible_map) was reverted, you wouldn't 
> need to do that either.
> 
>> I didn't see where any part of commit 3af229f2071f would touch the 
>> 'node_possible_map'
>> which is needed by 'numa.c' and 'workqueue.c'.  The nodemask created and 
>> updated by
>> 'mem_cgroup_may_update_nodemask()' does not appear to be the same mask.
> 
> Are you sure you're looking at 3af229f2071f? It only adds one line of code; 
> the reduction of node_possible_map.
> 

The 3rd file in the patch set removes,

-   nodes_and(node_possible_map, node_possible_map, node_online_map);

I need to add bits to 'node_possible_map' -- bits which may not be used
for the memory at boot, but which would be used when memory is hot-added
later.  I haven't found anything outside of the boot code that adds bits
to the 'possible' mask.
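
To make the mask arithmetic in this exchange concrete, here is a small
user-space model. It is only a sketch: the 32-bit word and all sk_ names are
assumptions for illustration, not the kernel's nodemask API.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MAX_NODES 32

static uint32_t sk_possible_map;	/* models node_possible_map */
static uint32_t sk_online_map;		/* models node_online_map */

static int sk_node_possible(int nid)
{
	assert(nid >= 0 && nid < SK_MAX_NODES);
	return (sk_possible_map >> nid) & 1u;
}

/* Models nodes_and(node_possible_map, node_possible_map, node_online_map). */
static void sk_restrict_possible_to_online(void)
{
	sk_possible_map &= sk_online_map;
}

/* Models the node_set() half of the proposed setup_nodes() loop. */
static void sk_setup_nodes(void)
{
	int i;

	for (i = 0; i < SK_MAX_NODES; i++)
		if (!sk_node_possible(i))
			sk_possible_map |= UINT32_C(1) << i;
}
```

Starting from a full possible map with only nodes 0 and 1 online, the
restriction step clears every other bit, so a later hot-add into node 5 finds
it not possible. Running sk_setup_nodes() afterwards sets those bits again.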

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Reza Arbab

On Tue, May 23, 2017 at 03:05:08PM -0500, Michael Bringmann wrote:
> On 05/23/2017 10:52 AM, Reza Arbab wrote:
>> On Tue, May 23, 2017 at 10:15:44AM -0500, Michael Bringmann wrote:
>>> +static void setup_nodes(void)
>>> +{
>>> +int i, l = 32 /* MAX_NUMNODES */;
>>> +
>>> +for (i = 0; i < l; i++) {
>>> +if (!node_possible(i)) {
>>> +setup_node_data(i, 0, 0);
>>> +node_set(i, node_possible_map);
>>> +}
>>> +}
>>> +}
>>
>> This seems to be a workaround for 3af229f2071f ("powerpc/numa: Reset
>> node_possible_map to only node_online_map").
>
> They may be related, but that commit is not a replacement.  The above patch
> ensures that there are enough of the nodes initialized at startup to allow
> for memory hot-add into a node that was not used at boot.  (See
> 'setup_node_data' function in 'numa.c'.)  That and recording that the node
> was initialized.


Is it really necessary to preinitialize these empty nodes using 
setup_node_data()? When you do memory hotadd into a node that was not 
used at boot, the node data already gets set up by


add_memory
 add_memory_resource
   hotadd_new_pgdat
 arch_alloc_nodedata <-- allocs the pg_data_t
 ...
 free_area_init_node <-- sets NODE_DATA(nid)->node_id, etc.

Removing setup_node_data() from that loop leaves only the call to 
node_set(). If 3af229f2071f (which reduces node_possible_map) was 
reverted, you wouldn't need to do that either.



> I didn't see where any part of commit 3af229f2071f would touch the
> 'node_possible_map' which is needed by 'numa.c' and 'workqueue.c'.  The
> nodemask created and updated by 'mem_cgroup_may_update_nodemask()' does not
> appear to be the same mask.


Are you sure you're looking at 3af229f2071f? It only adds one line of 
code; the reduction of node_possible_map.


--
Reza Arbab



Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Michael Bringmann


On 05/23/2017 10:52 AM, Reza Arbab wrote:
> On Tue, May 23, 2017 at 10:15:44AM -0500, Michael Bringmann wrote:
>> +static void setup_nodes(void)
>> +{
>> +int i, l = 32 /* MAX_NUMNODES */;
>> +
>> +for (i = 0; i < l; i++) {
>> +if (!node_possible(i)) {
>> +setup_node_data(i, 0, 0);
>> +node_set(i, node_possible_map);
>> +}
>> +}
>> +}
> 
> This seems to be a workaround for 3af229f2071f ("powerpc/numa: Reset 
> node_possible_map to only node_online_map").

They may be related, but that commit is not a replacement.  The above patch 
ensures that
there are enough of the nodes initialized at startup to allow for memory 
hot-add into a
node that was not used at boot.  (See 'setup_node_data' function in 'numa.c'.)  
That and
recording that the node was initialized.

I didn't see where any part of commit 3af229f2071f would touch the 
'node_possible_map'
which is needed by 'numa.c' and 'workqueue.c'.  The nodemask created and 
updated by
'mem_cgroup_may_update_nodemask()' does not appear to be the same mask.

> 
> Balbir, you have a patchset which reverts it. Do you think that will be 
> getting merged?
> 
> http://lkml.kernel.org/r/1479253501-26261-1-git-send-email-bsinghar...@gmail.com
> (see patch 3/3)
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Reza Arbab

On Tue, May 23, 2017 at 10:15:44AM -0500, Michael Bringmann wrote:

> +static void setup_nodes(void)
> +{
> +   int i, l = 32 /* MAX_NUMNODES */;
> +
> +   for (i = 0; i < l; i++) {
> +   if (!node_possible(i)) {
> +   setup_node_data(i, 0, 0);
> +   node_set(i, node_possible_map);
> +   }
> +   }
> +}


This seems to be a workaround for 3af229f2071f ("powerpc/numa: Reset 
node_possible_map to only node_online_map").


Balbir, you have a patchset which reverts it. Do you think that will be 
getting merged?


http://lkml.kernel.org/r/1479253501-26261-1-git-send-email-bsinghar...@gmail.com
(see patch 3/3)

--
Reza Arbab



[Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc

2017-05-23 Thread Michael Bringmann

Removing or adding memory via the PowerPC hotplug interface shows
anomalies in the association between memory and nodes.  The code
was updated to initialize more possible nodes to make them available
to subsequent DLPAR hotplug-memory operations, even if they are not
needed at boot time.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/mm/numa.c |   44 
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 15c2dd5..3d58c1f 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -870,7 +870,7 @@ void __init dump_numa_cpu_topology(void)
 }
 
 /* Initialize NODE_DATA for a node on the local memory */
-static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
+static void setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
 {
u64 spanned_pages = end_pfn - start_pfn;
const size_t nd_size = roundup(sizeof(pg_data_t), SMP_CACHE_BYTES);
@@ -878,23 +878,41 @@ static void __init setup_node_data(int nid, u64 
start_pfn, u64 end_pfn)
void *nd;
int tnid;
 
-   nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
-   nd = __va(nd_pa);
+   if (!node_data[nid]) {
+   nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
+   nd = __va(nd_pa);
 
-   /* report and initialize */
-   pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
-   nd_pa, nd_pa + nd_size - 1);
-   tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
-   if (tnid != nid)
-   pr_info("NODE_DATA(%d) on node %d\n", nid, tnid);
+   node_data[nid] = nd;
+   memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
+   NODE_DATA(nid)->node_id = nid;
+
+   /* report and initialize */
+   pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
+   nd_pa, nd_pa + nd_size - 1);
+   tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
+   if (tnid != nid)
+   pr_info("NODE_DATA(%d) on node %d\n", nid, tnid);
+   } else {
+   nd_pa = (u64) node_data[nid];
+   nd = __va(nd_pa);
+   }
 
-   node_data[nid] = nd;
-   memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
-   NODE_DATA(nid)->node_id = nid;
NODE_DATA(nid)->node_start_pfn = start_pfn;
NODE_DATA(nid)->node_spanned_pages = spanned_pages;
 }
 
+static void setup_nodes(void)
+{
+   int i, l = 32 /* MAX_NUMNODES */;
+
+   for (i = 0; i < l; i++) {
+   if (!node_possible(i)) {
+   setup_node_data(i, 0, 0);
+   node_set(i, node_possible_map);
+   }
+   }
+}
+
 void __init initmem_init(void)
 {
int nid, cpu;
@@ -914,6 +932,8 @@ void __init initmem_init(void)
 */
nodes_and(node_possible_map, node_possible_map, node_online_map);
 
+   setup_nodes();
+
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn;
 



[PATCH 1/2] powerpc/numa: Update CPU topology when VPHN enabled

2017-05-23 Thread Michael Bringmann

powerpc/numa: Correct the currently broken capability to set the
topology for shared CPUs in LPARs.  At boot time for shared CPU
LPARs, the topology for each shared CPU is set to node zero. However,
this is now updated correctly using the Virtual Processor Home Node
(VPHN) capabilities information provided by the pHyp. The VPHN handling
in Linux is disabled if PRRN handling is present.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/mm/numa.c   |   19 ++-
 arch/powerpc/platforms/pseries/dlpar.c   |2 ++
 arch/powerpc/platforms/pseries/hotplug-cpu.c |3 ++-
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 371792e..15c2dd5 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +43,8 @@
 #include 
 
 static int numa_enabled = 1;
+static int topology_inited;
+static int topology_update_needed;
 
 static char *cmdline __initdata;
 
@@ -1321,8 +1324,11 @@ int arch_update_cpu_topology(void)
struct device *dev;
int weight, new_nid, i = 0;
 
-   if (!prrn_enabled && !vphn_enabled)
+   if (!prrn_enabled && !vphn_enabled) {
+   if (!topology_inited)
+   topology_update_needed = 1;
return 0;
+   }
 
weight = cpumask_weight(_associativity_changes_mask);
if (!weight)
@@ -1361,6 +1367,8 @@ int arch_update_cpu_topology(void)
cpumask_andnot(_associativity_changes_mask,
_associativity_changes_mask,
cpu_sibling_mask(cpu));
+   pr_info("Assoc chg gives same node %d for cpu%d\n",
+   new_nid, cpu);
cpu = cpu_last_thread_sibling(cpu);
continue;
}
@@ -1377,6 +1385,9 @@ int arch_update_cpu_topology(void)
cpu = cpu_last_thread_sibling(cpu);
}
 
+   if (i)
+   updates[i-1].next = NULL;
+
pr_debug("Topology update for the following CPUs:\n");
if (cpumask_weight(_cpus)) {
for (ud = [0]; ud; ud = ud->next) {
@@ -1423,6 +1434,7 @@ int arch_update_cpu_topology(void)
 
 out:
kfree(updates);
+   topology_update_needed = 0;
return changed;
 }
 
@@ -1600,6 +1612,11 @@ static int topology_update_init(void)
if (!proc_create("powerpc/topology_updates", 0644, NULL, _ops))
return -ENOMEM;
 
+   topology_inited = 1;
+   if (topology_update_needed)
+   bitmap_fill(cpumask_bits(_associativity_changes_mask),
+   nr_cpumask_bits);
+
return 0;
 }
 device_initcall(topology_update_init);
diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
b/arch/powerpc/platforms/pseries/dlpar.c
index bda18d8..5106263 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -592,6 +592,8 @@ static ssize_t dlpar_show(struct class *class, struct 
class_attribute *attr,
 
 static int __init pseries_dlpar_init(void)
 {
+   arch_update_cpu_topology();
+
pseries_hp_wq = alloc_workqueue("pseries hotplug workqueue",
WQ_UNBOUND, 1);
return sysfs_create_file(kernel_kobj, _attr_dlpar.attr);
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 7bc0e91..b5eff35 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index)
}
 
rc = dlpar_cpu_remove(dn, drc_index);
-   of_node_put(dn);
+   if (rc)
+   of_node_put(dn);
return rc;
 }
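
The deferral that the numa.c hunks add can be pictured with a small
user-space model. This is a simplified sketch under assumptions: the sk_
names are hypothetical, a 32-bit word stands in for the CPU mask, and one
flag stands in for both prrn_enabled and vphn_enabled.

```c
#include <assert.h>
#include <stdint.h>

static int sk_vphn_enabled;		/* stand-in for prrn/vphn enabled */
static int sk_topology_inited;
static int sk_topology_update_needed;
static uint32_t sk_changes_mask;	/* stand-in for the changes cpumask */

/* Returns 1 if an update pass ran, 0 otherwise. */
static int sk_arch_update_cpu_topology(void)
{
	if (!sk_vphn_enabled) {
		if (!sk_topology_inited)
			sk_topology_update_needed = 1;	/* remember early call */
		return 0;
	}
	if (!sk_changes_mask)
		return 0;
	sk_changes_mask = 0;		/* pretend every CPU was processed */
	sk_topology_update_needed = 0;
	return 1;
}

static void sk_topology_update_init(void)
{
	assert(!sk_topology_inited);	/* init is expected to run once */
	sk_topology_inited = 1;
	if (sk_topology_update_needed)
		sk_changes_mask = UINT32_MAX;	/* like bitmap_fill(): all CPUs */
}
```

An early call (before init, with VPHN off) only records that an update is
wanted. Init then marks every CPU as changed, so the first call made after
VPHN is enabled does the real work.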
 



[PATCH 0/2] powerpc/dlpar: Correct display of hot-add/hot-remove CPUs and memory

2017-05-23 Thread Michael Bringmann

powerpc/numa: Correct the currently broken capability to set the
topology for shared CPUs in LPARs.  At boot time for shared CPU
LPARs, the topology for each shared CPU is set to node zero. However,
this is now updated correctly using the Virtual Processor Home Node
(VPHN) capabilities information provided by the pHyp. The VPHN handling
in Linux is disabled if PRRN handling is present.

powerpc/hotplug-memory: Removing or adding memory via the PowerPC
hotplug interface shows anomalies in the association between memory
and nodes.  The code was updated to better take advantage of defined
nodes in order to associate memory to nodes not needed at boot time,
but relevant to dynamically added memory.

Signed-off-by: Michael Bringmann 

Michael Bringmann (2):
  powerpc/numa: Update CPU topology when VPHN enabled
  powerpc/hotplug-memory: Fix hot-add memory node assoc
---



Re: [PATCH v4 19/20] powerpc/83xx: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
Hello Rob,

On Tue, May 23, 2017 at 3:42 PM, Rob Herring  wrote:
> On Mon, May 22, 2017 at 9:02 AM, Javier Martinez Canillas
>  wrote:
>> The at24 driver allows to register I2C EEPROM chips using different vendor
>> and devices, but the I2C subsystem does not take the vendor into account
>> when matching using the I2C table since it only has device entries.
>>
>> But when matching using an OF table, both the vendor and device has to be
>> taken into account so the driver defines only a set of compatible strings
>> using the "atmel" vendor as a generic fallback for compatible I2C devices.
>>
>> So add this generic fallback to the device node compatible string to make
>> the device to match the driver using the OF device ID table.
>>
>> Signed-off-by: Javier Martinez Canillas 
>>
>> ---
>>
>> Changes in v4:
>> - Only use the atmel manufacturer in the compatible string instead of
>>   keeping the deprecated ones (Rob Herring).
>>
>> Changes in v3: None
>> Changes in v2: None
>>
>>  arch/powerpc/boot/dts/mpc8308_p1m.dts  | 2 +-
>>  arch/powerpc/boot/dts/mpc8349emitx.dts | 4 ++--
>>  arch/powerpc/boot/dts/mpc8377_rdb.dts  | 2 +-
>>  arch/powerpc/boot/dts/mpc8377_wlan.dts | 2 +-
>>  arch/powerpc/boot/dts/mpc8378_rdb.dts  | 2 +-
>>  arch/powerpc/boot/dts/mpc8379_rdb.dts  | 2 +-
>>  6 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/powerpc/boot/dts/mpc8308_p1m.dts 
>> b/arch/powerpc/boot/dts/mpc8308_p1m.dts
>> index 57f86cdf9f36..702ab4fc5b4a 100644
>> --- a/arch/powerpc/boot/dts/mpc8308_p1m.dts
>> +++ b/arch/powerpc/boot/dts/mpc8308_p1m.dts
>> @@ -123,7 +123,7 @@
>> interrupt-parent = <>;
>> dfsrr;
>> fram@50 {
>> -   compatible = "ramtron,24c64";
>> +   compatible = "atmel,24c64";
>
> This should be '"ramtron,24c64", "atmel,24c64"'
>

Yes, I (hopefully) fixed all the occurrences in the v5 that I posted
today, you are cc'ed on that series too.

Again, sorry for misunderstanding your comment on v3.

Best regards,
Javier


Re: [PATCH v4 19/20] powerpc/83xx: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Rob Herring
On Mon, May 22, 2017 at 9:02 AM, Javier Martinez Canillas
 wrote:
> The at24 driver allows to register I2C EEPROM chips using different vendor
> and devices, but the I2C subsystem does not take the vendor into account
> when matching using the I2C table since it only has device entries.
>
> But when matching using an OF table, both the vendor and device has to be
> taken into account so the driver defines only a set of compatible strings
> using the "atmel" vendor as a generic fallback for compatible I2C devices.
>
> So add this generic fallback to the device node compatible string to make
> the device to match the driver using the OF device ID table.
>
> Signed-off-by: Javier Martinez Canillas 
>
> ---
>
> Changes in v4:
> - Only use the atmel manufacturer in the compatible string instead of
>   keeping the deprecated ones (Rob Herring).
>
> Changes in v3: None
> Changes in v2: None
>
>  arch/powerpc/boot/dts/mpc8308_p1m.dts  | 2 +-
>  arch/powerpc/boot/dts/mpc8349emitx.dts | 4 ++--
>  arch/powerpc/boot/dts/mpc8377_rdb.dts  | 2 +-
>  arch/powerpc/boot/dts/mpc8377_wlan.dts | 2 +-
>  arch/powerpc/boot/dts/mpc8378_rdb.dts  | 2 +-
>  arch/powerpc/boot/dts/mpc8379_rdb.dts  | 2 +-
>  6 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/boot/dts/mpc8308_p1m.dts 
> b/arch/powerpc/boot/dts/mpc8308_p1m.dts
> index 57f86cdf9f36..702ab4fc5b4a 100644
> --- a/arch/powerpc/boot/dts/mpc8308_p1m.dts
> +++ b/arch/powerpc/boot/dts/mpc8308_p1m.dts
> @@ -123,7 +123,7 @@
> interrupt-parent = <>;
> dfsrr;
> fram@50 {
> -   compatible = "ramtron,24c64";
> +   compatible = "atmel,24c64";

This should be '"ramtron,24c64", "atmel,24c64"'

> reg = <0x50>;
> };
> };
> diff --git a/arch/powerpc/boot/dts/mpc8349emitx.dts 
> b/arch/powerpc/boot/dts/mpc8349emitx.dts
> index 90aed3ac2f69..f49d1cffd927 100644
> --- a/arch/powerpc/boot/dts/mpc8349emitx.dts
> +++ b/arch/powerpc/boot/dts/mpc8349emitx.dts
> @@ -92,7 +92,7 @@
> dfsrr;
>
> eeprom: at24@50 {
> -   compatible = "st,24c256";
> +   compatible = "atmel,24c256";

Similar for this one.

> reg = <0x50>;
> };
>
> @@ -130,7 +130,7 @@
> };
>
> spd: at24@51 {
> -   compatible = "at24,spd";
> +   compatible = "atmel,spd";

This is fine because at24 is not a vendor.

Rob
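
To illustrate the review comments above, here is a user-space sketch of how
a compatible list with a generic fallback can match a driver table. It is a
simplification, not the kernel's OF matching code: the sk_ names and the
three-entry table in the usage note are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first entry in the node's compatible list that
 * appears in the driver's table, or -1 if none does. */
static int sk_of_match(const char *const *compat, size_t n_compat,
		       const char *const *table, size_t n_table)
{
	size_t i, j;

	assert(compat && table);
	for (i = 0; i < n_compat; i++)
		for (j = 0; j < n_table; j++)
			if (strcmp(compat[i], table[j]) == 0)
				return (int)i;
	return -1;
}

/* I2C-table style view: drop everything up to and including the comma,
 * so only the device part is compared. */
static const char *sk_strip_vendor(const char *compat)
{
	const char *comma = strchr(compat, ',');

	return comma ? comma + 1 : compat;
}
```

Given a table of "atmel,24c64", "atmel,24c256" and "atmel,spd", the list
"ramtron,24c64", "atmel,24c64" matches on its second entry, while
"ramtron,24c64" alone does not match at all. Keeping the specific string
first and the generic one second preserves the real vendor in the device
tree while still letting the driver bind.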


[PATCH v5 20/20] powerpc/44x: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.

But when matching using an OF table, both the vendor and device have to be
taken into account, so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.

So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.

Signed-off-by: Javier Martinez Canillas 

---

Changes in v5: None
Changes in v4:
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3: None
Changes in v2: None

 arch/powerpc/boot/dts/warp.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/dts/warp.dts b/arch/powerpc/boot/dts/warp.dts
index e576ee85c42f..ea9053ef4819 100644
--- a/arch/powerpc/boot/dts/warp.dts
+++ b/arch/powerpc/boot/dts/warp.dts
@@ -238,7 +238,7 @@
 
/* This will create 52 and 53 */
at24@52 {
-   compatible = "at,24c04";
+   compatible = "atmel,24c04";
reg = <0x52>;
};
};
-- 
2.9.3



[PATCH v5 19/20] powerpc/83xx: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.

But when matching using an OF table, both the vendor and device have to be
taken into account, so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.

So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.

Signed-off-by: Javier Martinez Canillas 

---

Changes in v5:
- Only replace atmel variant but keep other EEPROM vendors (Geert Uytterhoeven).

Changes in v4:
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3: None
Changes in v2: None

 arch/powerpc/boot/dts/mpc8308_p1m.dts  | 2 +-
 arch/powerpc/boot/dts/mpc8349emitx.dts | 4 ++--
 arch/powerpc/boot/dts/mpc8377_rdb.dts  | 2 +-
 arch/powerpc/boot/dts/mpc8377_wlan.dts | 2 +-
 arch/powerpc/boot/dts/mpc8378_rdb.dts  | 2 +-
 arch/powerpc/boot/dts/mpc8379_rdb.dts  | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/boot/dts/mpc8308_p1m.dts 
b/arch/powerpc/boot/dts/mpc8308_p1m.dts
index 57f86cdf9f36..cab933b3957a 100644
--- a/arch/powerpc/boot/dts/mpc8308_p1m.dts
+++ b/arch/powerpc/boot/dts/mpc8308_p1m.dts
@@ -123,7 +123,7 @@
interrupt-parent = <>;
dfsrr;
fram@50 {
-   compatible = "ramtron,24c64";
+   compatible = "ramtron,24c64", "atmel,24c64";
reg = <0x50>;
};
};
diff --git a/arch/powerpc/boot/dts/mpc8349emitx.dts 
b/arch/powerpc/boot/dts/mpc8349emitx.dts
index 90aed3ac2f69..648a85858eb5 100644
--- a/arch/powerpc/boot/dts/mpc8349emitx.dts
+++ b/arch/powerpc/boot/dts/mpc8349emitx.dts
@@ -92,7 +92,7 @@
dfsrr;
 
eeprom: at24@50 {
-   compatible = "st,24c256";
+   compatible = "st,24c256", "atmel,24c256";
reg = <0x50>;
};
 
@@ -130,7 +130,7 @@
};
 
spd: at24@51 {
-   compatible = "at24,spd";
+   compatible = "atmel,spd";
reg = <0x51>;
};
 
diff --git a/arch/powerpc/boot/dts/mpc8377_rdb.dts 
b/arch/powerpc/boot/dts/mpc8377_rdb.dts
index e32613963ab0..5e85d8c93bca 100644
--- a/arch/powerpc/boot/dts/mpc8377_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8377_rdb.dts
@@ -150,7 +150,7 @@
};
 
at24@50 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/mpc8377_wlan.dts 
b/arch/powerpc/boot/dts/mpc8377_wlan.dts
index c0c790168b96..fee15fcbb46f 100644
--- a/arch/powerpc/boot/dts/mpc8377_wlan.dts
+++ b/arch/powerpc/boot/dts/mpc8377_wlan.dts
@@ -135,7 +135,7 @@
dfsrr;
 
at24@50 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/mpc8378_rdb.dts 
b/arch/powerpc/boot/dts/mpc8378_rdb.dts
index 71842fcd621f..e973d61956b9 100644
--- a/arch/powerpc/boot/dts/mpc8378_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8378_rdb.dts
@@ -150,7 +150,7 @@
};
 
at24@50 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/mpc8379_rdb.dts 
b/arch/powerpc/boot/dts/mpc8379_rdb.dts
index e442a29b2fe0..ed5d12ff2ee0 100644
--- a/arch/powerpc/boot/dts/mpc8379_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc8379_rdb.dts
@@ -148,7 +148,7 @@
};
 
at24@50 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x50>;
};
 
-- 
2.9.3



[PATCH v5 18/20] powerpc/512x: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.

But when matching using an OF table, both the vendor and device have to be
taken into account, so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.

So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.

Signed-off-by: Javier Martinez Canillas 

---

Changes in v5: None
Changes in v4:
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3: None
Changes in v2: None

 arch/powerpc/boot/dts/mpc5121ads.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/dts/mpc5121ads.dts 
b/arch/powerpc/boot/dts/mpc5121ads.dts
index 75888ce2c792..fcaa9bad4bda 100644
--- a/arch/powerpc/boot/dts/mpc5121ads.dts
+++ b/arch/powerpc/boot/dts/mpc5121ads.dts
@@ -94,7 +94,7 @@
};
 
eeprom@50 {
-   compatible = "at,24c32";
+   compatible = "atmel,24c32";
reg = <0x50>;
};
 
-- 
2.9.3



[PATCH v5 17/20] powerpc/fsl: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.

But when matching using an OF table, both the vendor and device have to be
taken into account, so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.

So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.

Signed-off-by: Javier Martinez Canillas 

---

Changes in v5:
- Only replace atmel variant but keep other EEPROM vendors (Geert Uytterhoeven).

Changes in v4:
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3: None
Changes in v2: None

 arch/powerpc/boot/dts/fsl/b4qds.dtsi|  8 
 arch/powerpc/boot/dts/fsl/c293pcie.dts  |  2 +-
 arch/powerpc/boot/dts/fsl/p1010rdb.dtsi |  2 +-
 arch/powerpc/boot/dts/fsl/p1023rdb.dts  |  2 +-
 arch/powerpc/boot/dts/fsl/p2041rdb.dts  |  4 ++--
 arch/powerpc/boot/dts/fsl/p3041ds.dts   |  4 ++--
 arch/powerpc/boot/dts/fsl/p4080ds.dts   |  4 ++--
 arch/powerpc/boot/dts/fsl/p5020ds.dts   |  4 ++--
 arch/powerpc/boot/dts/fsl/p5040ds.dts   |  4 ++--
 arch/powerpc/boot/dts/fsl/t208xqds.dtsi |  8 
 arch/powerpc/boot/dts/fsl/t4240qds.dts  | 12 ++--
 arch/powerpc/boot/dts/fsl/t4240rdb.dts  |  6 +++---
 12 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/b4qds.dtsi 
b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
index 3785ef826d07..999efd3bc167 100644
--- a/arch/powerpc/boot/dts/fsl/b4qds.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4qds.dtsi
@@ -166,19 +166,19 @@
reg = <0>;
 
eeprom@50 {
-   compatible = "at24,24c64";
+   compatible = "atmel,24c64";
reg = <0x50>;
};
eeprom@51 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x51>;
};
eeprom@53 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x53>;
};
eeprom@57 {
-   compatible = "at24,24c256";
+   compatible = "atmel,24c256";
reg = <0x57>;
};
rtc@68 {
diff --git a/arch/powerpc/boot/dts/fsl/c293pcie.dts 
b/arch/powerpc/boot/dts/fsl/c293pcie.dts
index 66709788429d..5e905e0857cf 100644
--- a/arch/powerpc/boot/dts/fsl/c293pcie.dts
+++ b/arch/powerpc/boot/dts/fsl/c293pcie.dts
@@ -153,7 +153,7 @@
  {
i2c@3000 {
eeprom@50 {
-   compatible = "st,24c1024";
+   compatible = "st,24c1024", "atmel,24c1024";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/fsl/p1010rdb.dtsi 
b/arch/powerpc/boot/dts/fsl/p1010rdb.dtsi
index a8e4ba070104..2ca9cee2ddeb 100644
--- a/arch/powerpc/boot/dts/fsl/p1010rdb.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p1010rdb.dtsi
@@ -89,7 +89,7 @@
 _soc {
i2c@3000 {
eeprom@50 {
-   compatible = "st,24c256";
+   compatible = "st,24c256", "atmel,24c256";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/fsl/p1023rdb.dts 
b/arch/powerpc/boot/dts/fsl/p1023rdb.dts
index 9716ca64651c..ead928364beb 100644
--- a/arch/powerpc/boot/dts/fsl/p1023rdb.dts
+++ b/arch/powerpc/boot/dts/fsl/p1023rdb.dts
@@ -79,7 +79,7 @@
 
i2c@3000 {
eeprom@53 {
-   compatible = "at24,24c04";
+   compatible = "atmel,24c04";
reg = <0x53>;
};
 
diff --git a/arch/powerpc/boot/dts/fsl/p2041rdb.dts 
b/arch/powerpc/boot/dts/fsl/p2041rdb.dts
index e50fea95a853..950816b9d6e1 100644
--- a/arch/powerpc/boot/dts/fsl/p2041rdb.dts
+++ b/arch/powerpc/boot/dts/fsl/p2041rdb.dts
@@ -127,7 +127,7 @@
reg = <0x48>;
};
eeprom@50 {
- 

[PATCH v5 16/20] powerpc/5200: Add generic compatible string for I2C EEPROM

2017-05-23 Thread Javier Martinez Canillas
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.

But when matching using an OF table, both the vendor and device have to be
taken into account, so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.

So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.

Signed-off-by: Javier Martinez Canillas 

---

Changes in v5:
- Only replace atmel variant but keep other EEPROM vendors (Geert Uytterhoeven).

Changes in v4:
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3: None
Changes in v2: None

 arch/powerpc/boot/dts/digsy_mtc.dts | 2 +-
 arch/powerpc/boot/dts/pcm030.dts| 2 +-
 arch/powerpc/boot/dts/pcm032.dts| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/boot/dts/digsy_mtc.dts 
b/arch/powerpc/boot/dts/digsy_mtc.dts
index 955bff629df3..c280e75c86bf 100644
--- a/arch/powerpc/boot/dts/digsy_mtc.dts
+++ b/arch/powerpc/boot/dts/digsy_mtc.dts
@@ -73,7 +73,7 @@
 
i2c@3d00 {
eeprom@50 {
-   compatible = "at,24c08";
+   compatible = "atmel,24c08";
reg = <0x50>;
};
 
diff --git a/arch/powerpc/boot/dts/pcm030.dts b/arch/powerpc/boot/dts/pcm030.dts
index 192e66af0001..836e47cc4bed 100644
--- a/arch/powerpc/boot/dts/pcm030.dts
+++ b/arch/powerpc/boot/dts/pcm030.dts
@@ -71,7 +71,7 @@
reg = <0x51>;
};
eeprom@52 {
-   compatible = "catalyst,24c32";
+   compatible = "catalyst,24c32", "atmel,24c32";
reg = <0x52>;
pagesize = <32>;
};
diff --git a/arch/powerpc/boot/dts/pcm032.dts b/arch/powerpc/boot/dts/pcm032.dts
index 96b139bf50e9..576249bf2fb9 100644
--- a/arch/powerpc/boot/dts/pcm032.dts
+++ b/arch/powerpc/boot/dts/pcm032.dts
@@ -75,7 +75,7 @@
reg = <0x51>;
};
eeprom@52 {
-   compatible = "catalyst,24c32";
+   compatible = "catalyst,24c32", "atmel,24c32";
reg = <0x52>;
pagesize = <32>;
};
-- 
2.9.3



[PATCH v5 00/20] eeprom: at24: Add OF device ID table

2017-05-23 Thread Javier Martinez Canillas
Hello Wolfram,

This series is a follow-up to patch [0] that added an OF device ID table
to the at24 EEPROM driver. As you suggested [1], instead of adding entries
for every used (vendor, device) tuple, this version only adds a single
entry for each chip type using the "atmel" vendor as a generic fallback.

The first patch documents in the DT binding which vendor is the correct one
to use and which ones are being deprecated. The second one adds the OF
device ID table for the at24 driver, and the next patches use this vendor
in the compatible string of each DTS that defines a compatible I2C EEPROM
device node.

Patches can be applied independently since the DTS changes without driver
changes are a no-op and the OF table won't be used without the DTS changes.

[0]: https://lkml.org/lkml/2017/3/14/589
[1]: https://lkml.org/lkml/2017/3/15/99

Best regards,
Javier

Changes in v5:
- Only deprecate the atmel variants at25 and at (Geert Uytterhoeven).
- Only replace atmel variant but keep other EEPROM vendors (Geert Uytterhoeven).

Changes in v4:
- Document the manufacturers that have been deprecated (Rob Herring).
- Only use the atmel manufacturer in the compatible string instead of
  keeping the deprecated ones (Rob Herring).

Changes in v3:
- Fix wrong .data values for "atmel,24c02" and "atmel,24c64" entries.
- Add Geert Uytterhoeven reviewed-by tag.

Changes in v2:
- Only add a single OF device ID entry for each device type (Wolfram Sang).

Javier Martinez Canillas (20):
  dt-bindings: i2c: eeprom: Document vendor to be used and deprecated
ones
  eeprom: at24: Add OF device ID table
  ARM: dts: omap: Add generic compatible string for I2C EEPROM
  ARM: dts: turris-omnia: Add generic compatible string for I2C EEPROM
  ARM: dts: efm32: Add generic compatible string for I2C EEPROM
  ARM: dts: imx: Add generic compatible string for I2C EEPROM
  ARM: dts: keystone: Add generic compatible string for I2C EEPROM
  ARM: dts: lpc18xx: Add generic compatible string for I2C EEPROM
  ARM: dts: r7s72100: Add generic compatible string for I2C EEPROM
  ARM: dts: koelsch: Add generic compatible string for I2C EEPROM
  ARM: dts: socfpga: Add generic compatible string for I2C EEPROM
  ARM: dts: uniphier: Add generic compatible string for I2C EEPROM
  ARM: dts: zynq: Add generic compatible string for I2C EEPROM
  arm64: dts: 

RE: RFC: better timer interface

2017-05-23 Thread Thomas Gleixner
On Tue, 23 May 2017, David Laight wrote:
> From: Thomas Gleixner
> > Sent: 23 May 2017 12:59
> > On Tue, 23 May 2017, David Laight wrote:
> > 
> > > From: Thomas Gleixner
> > > > Sent: 21 May 2017 19:15
> > > ...
> > > > > timer_start(timer, ms, abs)
> > > >
> > > > I'm not even sure, whether we need absolute timer wheel timers at
> > > > all, because most use cases are relative to now.
> > >
> > > Posix requires absolute timers for some userspace calls
> > > (annoying because the code often wants relative).
> > 
> > Posix is completely irrelevant here. These timers are purely kernel
> > internal.
> 
> Somehow pthread_cond_timedwait() has to be implemented.
> Doing so without kernel timers that use absolute 'wall clock' time is tricky.

Oh well. The timer wheel timers are NOT used to implement any posix
interface. That's all handled by hrtimers and they are not debated here.

So still nothing to see here.

Thanks,

tglx


RE: RFC: better timer interface

2017-05-23 Thread David Laight
From: Thomas Gleixner
> Sent: 23 May 2017 12:59
> On Tue, 23 May 2017, David Laight wrote:
> 
> > From: Thomas Gleixner
> > > Sent: 21 May 2017 19:15
> > ...
> > > > timer_start(timer, ms, abs)
> > >
> > > I'm not even sure, whether we need absolute timer wheel timers at
> > > all, because most use cases are relative to now.
> >
> > Posix requires absolute timers for some userspace calls
> > (annoying because the code often wants relative).
> 
> Posix is completely irrelevant here. These timers are purely kernel
> internal.

Somehow pthread_cond_timedwait() has to be implemented.
Doing so without kernel timers that use absolute 'wall clock' time is tricky.

David



RE: RFC: better timer interface

2017-05-23 Thread Thomas Gleixner
On Tue, 23 May 2017, David Laight wrote:

> From: Thomas Gleixner
> > Sent: 21 May 2017 19:15
> ...
> > > timer_start(timer, ms, abs)
> > 
> > I'm not even sure, whether we need absolute timer wheel timers at
> > all, because most use cases are relative to now.
> 
> Posix requires absolute timers for some userspace calls
> (annoying because the code often wants relative).

Posix is completely irrelevant here. These timers are purely kernel
internal.

Thanks,

tglx


RE: RFC: better timer interface

2017-05-23 Thread David Laight
From: Thomas Gleixner
> Sent: 21 May 2017 19:15
...
> > timer_start(timer, ms, abs)
> 
> I'm not even sure, whether we need absolute timer wheel timers at
> all, because most use cases are relative to now.

Posix requires absolute timers for some userspace calls
(annoying because the code often wants relative).

OTOH how much conditional code is there for the 'abs' argument?
And is there any code that doesn't pass a constant?

Certainly worth a separate timer_start_abs(timer, wall_time)
function since you can't correctly map a wall_time timer
to a jiffies one.

David
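
To illustrate the point about wall-time timers, here is a small userspace
simulation (not kernel code) of what happens when a wall-clock deadline is
converted once into a tick-based expiry and the wall clock is stepped
afterwards. Everything in it is an assumption made for the example: the
struct sim_clock type, the helper names, and the choice of 1 tick == 1 ms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated clocks: 'ticks' is monotonic, 'wall_ms' may be stepped. */
struct sim_clock {
	uint64_t ticks;		/* monotonic counter, 1 tick == 1 ms (assumed) */
	int64_t wall_ms;	/* wall-clock time in ms, can jump */
};

/* Convert a wall-clock deadline into an absolute tick value, once. */
static uint64_t wall_to_ticks(const struct sim_clock *clk,
			      int64_t deadline_wall_ms)
{
	int64_t delta;

	assert(clk != NULL);
	delta = deadline_wall_ms - clk->wall_ms;
	return clk->ticks + (delta > 0 ? (uint64_t)delta : 0);
}

/* Let time pass: both clocks advance by the same amount. */
static void sim_advance(struct sim_clock *clk, uint64_t ms)
{
	assert(clk != NULL);
	clk->ticks += ms;
	clk->wall_ms += (int64_t)ms;
}

/* Step only the wall clock, as a manual or NTP clock step would. */
static void sim_step_wall(struct sim_clock *clk, int64_t ms)
{
	assert(clk != NULL);
	clk->wall_ms += ms;
}

/* Wall-clock time at which a tick-based expiry will fire. */
static int64_t wall_at_expiry(const struct sim_clock *clk,
			      uint64_t expiry_ticks)
{
	uint64_t left;

	assert(clk != NULL);
	left = expiry_ticks > clk->ticks ? expiry_ticks - clk->ticks : 0;
	return clk->wall_ms + (int64_t)left;
}
```

If the wall clock is stepped back by 3000 ms after the conversion, the
tick-based expiry fires 3000 ms before the requested wall time in this
model, because the tick value was computed from the old wall clock.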



Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64

2017-05-23 Thread Balbir Singh
On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
> is used to differentiate device backed memory from transparent huge
> pages since they are handled in more or less the same manner by the core
> mm code.
>
> Cc: Aneesh Kumar K.V 
> Signed-off-by: Oliver O'Halloran 
> ---
> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>
> Aneesh, this has been fleshed out substantially since v1. Can you
> re-review it? Also no explicit gup support is required in this patch
> since devmap support was added to generic GUP as a part of making x86 use
> the generic version.
> ---
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 
> ++-
>  arch/powerpc/include/asm/book3s/64/radix.h|  2 +-
>  arch/powerpc/mm/hugetlbpage.c |  2 +-
>  arch/powerpc/mm/pgtable-book3s64.c|  4 +--
>  arch/powerpc/mm/pgtable-hash64.c  |  4 ++-
>  arch/powerpc/mm/pgtable-radix.c   |  3 ++-
>  arch/powerpc/mm/pgtable_64.c  |  2 +-
>  8 files changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837aaae8..eaaf613c5347 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char 
> *hpte_slot_array,
>   */
>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>  {
> -   return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
> +   return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | 
> _PAGE_DEVMAP)) ==
>   (_PAGE_PTE | H_PAGE_THP_HUGE));
>  }

Like Aneesh suggested, I think we can probably skip this check here.

>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 85bc9875c3be..24634e92dd0b 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -79,6 +79,9 @@
>
>  #define _PAGE_SOFT_DIRTY   _RPAGE_SW3 /* software: software dirty 
> tracking */
>  #define _PAGE_SPECIAL  _RPAGE_SW2 /* software: special page */
> +#define _PAGE_DEVMAP   _RPAGE_SW1
> +#define __HAVE_ARCH_PTE_DEVMAP
> +
>  /*
>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>   * Instead of fixing all of them, add an alternate define which
> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
> return pte;
>  }
>
> +static inline pte_t pte_mkdevmap(pte_t pte)
> +{
> +   return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> +}
> +
> +static inline int pte_devmap(pte_t pte)
> +{
> +   return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
> +}
> +
>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>  {
> /* FIXME!! check whether this need to be a conditional */
> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mk_savedwrite(pmd) pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>  #define pmd_clear_savedwrite(pmd)  
> pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>
> +#define pud_pfn(...) (0)
> +#define pgd_pfn(...) (0)
> +

I don't get these bits. Why are they zero?

>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct 
> spinlock *new_pmd_ptl,
> return true;
>  }
>
> -
>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>  static inline bool arch_needs_pgtable_deposit(void)
>  {
> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
> return true;
>  }
>
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +   return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
> +}
> +
> +static inline int pmd_devmap(pmd_t pmd)
> +{
> +   return pte_devmap(pmd_pte(pmd));
> +}

This should be defined only under #ifdef __HAVE_ARCH_PTE_DEVMAP.

The rest looks OK

Balbir Singh.
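
The "v1 -> v2" note in the quoted patch is about keeping pmd_trans_huge()
and pmd_devmap() mutually exclusive. The masked comparison from the
hash__pmd_trans_huge() hunk can be sketched in userspace as follows. The bit
positions are placeholders chosen for the example, not the real powerpc
values, the sketch_ helpers are hypothetical, and the sketch assumes a devmap
PMD also carries the THP bit.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, chosen only for this example. */
#define SK_PAGE_PTE		(UINT64_C(1) << 62)
#define SK_PAGE_THP_HUGE	(UINT64_C(1) << 10)
#define SK_PAGE_DEVMAP		(UINT64_C(1) << 7)

static_assert((SK_PAGE_PTE & SK_PAGE_THP_HUGE) == 0 &&
	      (SK_PAGE_PTE & SK_PAGE_DEVMAP) == 0 &&
	      (SK_PAGE_THP_HUGE & SK_PAGE_DEVMAP) == 0,
	      "example bits must not overlap");

/* Shape of the removed line: the devmap bit is not part of the mask. */
static int sketch_trans_huge_old(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

/* Shape of the added line: devmap is in the mask, so it must be clear. */
static int sketch_trans_huge_new(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE | SK_PAGE_DEVMAP)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

/* Devmap test: true whenever the devmap bit is set. */
static int sketch_devmap(uint64_t pmd)
{
	return !!(pmd & SK_PAGE_DEVMAP);
}
```

With the wider mask, an entry that has the devmap bit set can no longer
satisfy the trans-huge test, so the two predicates cannot both be true for
the same value.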


Re: [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv

2017-05-23 Thread Balbir Singh
On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
> From: Anton Blanchard 
>
> Adds support for removing bolted (i.e. kernel linear mapping) mappings on
> powernv. This is needed to support memory hot unplug operations which
> are required for the teardown of DAX/PMEM devices.
>
> Reviewed-by: Rashmica Gupta 
> Signed-off-by: Anton Blanchard 
> Signed-off-by: Oliver O'Halloran 
> ---
> v1 -> v2: Fixed the commit author
>   Added VM_WARN_ON() if we attempt to remove an unbolted hpte
> ---
>  arch/powerpc/mm/hash_native_64.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/arch/powerpc/mm/hash_native_64.c 
> b/arch/powerpc/mm/hash_native_64.c
> index 65bb8f33b399..b534d041cfe8 100644
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -407,6 +407,38 @@ static void native_hpte_updateboltedpp(unsigned long 
> newpp, unsigned long ea,
> tlbie(vpn, psize, psize, ssize, 0);
>  }
>
> +/*
> + * Remove a bolted kernel entry. Memory hotplug uses this.
> + *
> + * No need to lock here because we should be the only user.
> + */
> +static int native_hpte_removebolted(unsigned long ea, int psize, int ssize)
> +{
> +   unsigned long vpn;
> +   unsigned long vsid;
> +   long slot;
> +   struct hash_pte *hptep;
> +
> +   vsid = get_kernel_vsid(ea, ssize);
> +   vpn = hpt_vpn(ea, vsid, ssize);
> +
> +   slot = native_hpte_find(vpn, psize, ssize);
> +   if (slot == -1)
> +   return -ENOENT;
> +
> +   hptep = htab_address + slot;
> +
> +   VM_WARN_ON(!(be64_to_cpu(hptep->v) & HPTE_V_BOLTED));
> +
> +   /* Invalidate the hpte */
> +   hptep->v = 0;
> +
> +   /* Invalidate the TLB */
> +   tlbie(vpn, psize, psize, ssize, 0);
> +   return 0;
> +}
> +

Reviewed-by: Balbir Singh 

Balbir Singh.


Re: [PATCH 3/6] powerpc/vmemmap: Add altmap support

2017-05-23 Thread Balbir Singh
On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
> Adds support to powerpc for the altmap feature of ZONE_DEVICE memory. An
> altmap is a driver provided region that is used to provide the backing
> storage for the struct pages of ZONE_DEVICE memory. In situations where
> a large amount of ZONE_DEVICE memory is being added to the system, the
> altmap reduces pressure on main system memory by allowing the mm/
> metadata to be stored on the device itself rather than in main memory.
>
> Signed-off-by: Oliver O'Halloran 
> ---
>  arch/powerpc/mm/init_64.c | 15 +--
>  arch/powerpc/mm/mem.c | 16 +---
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 8851e4f5dbab..225fbb8034e6 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -44,6 +44,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -171,13 +172,17 @@ int __meminit vmemmap_populate(unsigned long start, 
> unsigned long end, int node)
> pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node);
>
> for (; start < end; start += page_size) {
> +   struct vmem_altmap *altmap;
> void *p;
> int rc;
>
> if (vmemmap_populated(start, page_size))
> continue;
>
> -   p = vmemmap_alloc_block(page_size, node);
> +   /* altmap lookups only work at section boundaries */
> +   altmap = to_vmem_altmap(SECTION_ALIGN_DOWN(start));
> +
> +   p =  __vmemmap_alloc_block_buf(page_size, node, altmap);
> if (!p)
> return -ENOMEM;
>
> @@ -242,6 +247,8 @@ void __ref vmemmap_free(unsigned long start, unsigned 
> long end)
>
> for (; start < end; start += page_size) {
> unsigned long nr_pages, addr;
> +   struct vmem_altmap *altmap;
> +   struct page *section_base;
> struct page *page;
>
> /*
> @@ -257,9 +264,13 @@ void __ref vmemmap_free(unsigned long start, unsigned 
> long end)
> continue;
>
> page = pfn_to_page(addr >> PAGE_SHIFT);
> +   section_base = pfn_to_page(vmemmap_section_start(start));
> nr_pages = 1 << page_order;
>
> -   if (PageReserved(page)) {
> +   altmap = to_vmem_altmap((unsigned long) section_base);
> +   if (altmap) {
> +   vmem_altmap_free(altmap, nr_pages);
> +   } else if (PageReserved(page)) {
> /* allocated from bootmem */
> if (page_size < PAGE_SIZE) {
> /*
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 9ee536ec0739..2c0c16f11eee 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -159,11 +160,20 @@ int arch_remove_memory(u64 start, u64 size)
>  {
> unsigned long start_pfn = start >> PAGE_SHIFT;
> unsigned long nr_pages = size >> PAGE_SHIFT;
> -   struct zone *zone;
> +   struct vmem_altmap *altmap;
> +   struct page *page;
> int ret;
>
> -   zone = page_zone(pfn_to_page(start_pfn));
> -   ret = __remove_pages(zone, start_pfn, nr_pages);
> +   /*
> +* If we have an altmap then we need to skip over any reserved PFNs
> +* when querying the zone.
> +*/
> +   page = pfn_to_page(start_pfn);
> +   altmap = to_vmem_altmap((unsigned long) page);
> +   if (altmap)
> +   page += vmem_altmap_offset(altmap);
> +
> +   ret = __remove_pages(page_zone(page), start_pfn, nr_pages);
> if (ret)
> return ret;

Reviewed-by: Balbir Singh 

Balbir
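
As a rough worked example of the memory pressure the altmap is meant to
relieve, the sketch below estimates how much struct page metadata is needed
to describe a given amount of device memory. It assumes one 64-byte struct
page per base page, which is a common but configuration-dependent figure;
the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of struct page; the real value depends on kernel config. */
#define SK_STRUCT_PAGE_SIZE	UINT64_C(64)

/* Bytes of struct page metadata needed to describe 'mem_bytes' of memory. */
static uint64_t sketch_memmap_bytes(uint64_t mem_bytes, uint64_t page_size)
{
	assert(page_size != 0 && mem_bytes % page_size == 0);
	return (mem_bytes / page_size) * SK_STRUCT_PAGE_SIZE;
}
```

Under these assumptions, 1 TiB of device memory needs 1 GiB of struct pages
with 64 KiB base pages and 16 GiB with 4 KiB base pages. With an altmap,
that space is carved out of the device instead of main memory.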


Re: [PATCH 6/6] powerpc/mm: Enable ZONE_DEVICE on powerpc

2017-05-23 Thread Balbir Singh
On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
> Flip the switch. Running around and screaming "IT'S ALIVE" is optional,
> but recommended.
>
> Signed-off-by: Oliver O'Halloran 
> ---
>  arch/powerpc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index f7c8f9972f61..bf3365c34244 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -138,6 +138,7 @@ config PPC
> select ARCH_HAS_SG_CHAIN
> select ARCH_HAS_TICK_BROADCAST  if 
> GENERIC_CLOCKEVENTS_BROADCAST
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> +   select ARCH_HAS_ZONE_DEVICE if PPC64

Does this work for Book E as well?

Balbir Singh.


Re: [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE

2017-05-23 Thread Balbir Singh
On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran  wrote:
> Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
> will get unwieldy as new platforms get ZONE_DEVICE support. Move it
> to an arch-selected Kconfig option to save us some trouble in the
> future.
>
> Cc: x...@kernel.org
> Signed-off-by: Oliver O'Halloran 
> ---
>  arch/x86/Kconfig | 1 +
>  mm/Kconfig   | 5 -
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cd18994a9555..acbb15234562 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -59,6 +59,7 @@ config X86
> select ARCH_HAS_STRICT_KERNEL_RWX
> select ARCH_HAS_STRICT_MODULE_RWX
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> +   select ARCH_HAS_ZONE_DEVICE if X86_64
> select ARCH_HAVE_NMI_SAFE_CMPXCHG
> select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
> select ARCH_MIGHT_HAVE_PC_PARPORT
> diff --git a/mm/Kconfig b/mm/Kconfig
> index beb7a455915d..2d38a4abe957 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -683,12 +683,15 @@ config IDLE_PAGE_TRACKING
>
>   See Documentation/vm/idle_page_tracking.txt for more details.
>
> +config ARCH_HAS_ZONE_DEVICE
> +   def_bool n
> +
>  config ZONE_DEVICE
> bool "Device memory (pmem, etc...) hotplug support"
> depends on MEMORY_HOTPLUG
> depends on MEMORY_HOTREMOVE
> depends on SPARSEMEM_VMEMMAP
> -   depends on X86_64 #arch_add_memory() comprehends device memory
> +   depends on ARCH_HAS_ZONE_DEVICE
>
> help
>   Device memory hotplug support allows for establishing pmem,

Acked-by: Balbir Singh 


Re: SPU not working for kernel 4.9, 4.10, 4.11a and 4.12

2017-05-23 Thread Jeremy Kerr
Hi all,


> Looks like this also happens with the simple spu_run test:
> 
>  
> https://github.com/jk-ozlabs/spufs-testsuite/blob/master/tests/03-spu_run/01-spu_run.c
> 
> ... might need some debugging here, I'll update if I find anything.

And it appears we're stuck in the POLL_WHILE_FALSE() loop in
wait_tag_complete() called from restore_lscsa().

Because spufs itself has been fairly static, I suspect some other change
has meant that MFC DMAs aren't working; so at this point, a bisect might
be the best way forward. Do you see the exact same behaviour on 4.9 (but
4.8 works)?

Cheers,


Jeremy


[PATCH] powerpc/fadump: add reschedule point while releasing memory

2017-05-23 Thread Hari Bathini
Around 95% of memory is reserved by fadump/capture kernel. All this
memory is freed, one page at a time, on writing '1' to the node
/sys/kernel/fadump_release_mem. On systems with large memory, this
can take a long time to complete, leading to soft lockup warning
messages. To avoid this, add reschedule points at regular intervals.

Suggested-by: Michael Ellerman 
Signed-off-by: Hari Bathini 
---
 arch/powerpc/kernel/fadump.c |   60 ++
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 466569e..0babefc 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1046,28 +1046,66 @@ void fadump_cleanup(void)
}
 }
 
+/* Time to wait before a reschedule point */
+#define RELEASE_TIME_LIMIT 500 /* in milliseconds */
+
+/* Release memory in batches of 'N' pages each */
+#define RELEASE_PAGES_BATCH	(1 << 22)
+
+static void fadump_free_reserved_memory(unsigned long start, unsigned long end)
+{
+	unsigned long pfn, start_pfn, end_pfn;
+	unsigned int remaining = end > start ? (end - start) >> PAGE_SHIFT : 0;
+	unsigned long time_limit = jiffies +
+			msecs_to_jiffies(RELEASE_TIME_LIMIT);
+
+	while (remaining) {
+		/* A reschedule point for every 'X' milliseconds */
+		if (time_after_eq(jiffies, time_limit)) {
+			cond_resched();
+			time_limit = jiffies +
+				msecs_to_jiffies(RELEASE_TIME_LIMIT);
+		}
+
+		/* release memory in batches of 'N' pages */
+		start_pfn = start >> PAGE_SHIFT;
+		if (remaining > RELEASE_PAGES_BATCH) {
+			end_pfn = start_pfn + RELEASE_PAGES_BATCH;
+			remaining -= RELEASE_PAGES_BATCH;
+		} else {
+			end_pfn = end >> PAGE_SHIFT;
+			remaining = 0;
+		}
+
+		for (pfn = start_pfn; pfn < end_pfn; pfn++)
+			free_reserved_page(pfn_to_page(pfn));
+
+		start = end_pfn << PAGE_SHIFT;
+	}
+}
+
 /*
  * Release the memory that was reserved in early boot to preserve the memory
  * contents. The released memory will be available for general use.
  */
 static void fadump_release_memory(unsigned long begin, unsigned long end)
 {
-	unsigned long addr;
 	unsigned long ra_start, ra_end;
 
 	ra_start = fw_dump.reserve_dump_area_start;
 	ra_end = ra_start + fw_dump.reserve_dump_area_size;
 
-	for (addr = begin; addr < end; addr += PAGE_SIZE) {
-		/*
-		 * exclude the dump reserve area. Will reuse it for next
-		 * fadump registration.
-		 */
-		if (addr <= ra_end && ((addr + PAGE_SIZE) > ra_start))
-			continue;
-
-		free_reserved_page(pfn_to_page(addr >> PAGE_SHIFT));
-	}
+	/*
+	 * exclude the dump reserve area. Will reuse it for next
+	 * fadump registration.
+	 */
+	if (begin < ra_end && end > ra_start) {
+		if (begin < ra_start)
+			fadump_free_reserved_memory(begin, ra_start);
+		if (end > ra_end)
+			fadump_free_reserved_memory(ra_end, end);
+	} else
+		fadump_free_reserved_memory(begin, end);
 }
 
 static void fadump_invalidate_release_mem(void)
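
For readers following the range handling in the patch above, here is a
minimal userspace sketch of the same splitting logic. It is not kernel
code: struct range, split_around_reserved() and batch_count() are
hypothetical names made up for this illustration, and addresses are plain
integers. It assumes half-open ranges and, unlike the patch, simply skips
an empty input range rather than passing it on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical half-open address range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Split [begin, end) around the reserved area [ra_start, ra_end),
 * following the branch structure of the patched fadump_release_memory().
 * Writes up to two ranges into out[] and returns how many were written.
 */
static size_t split_around_reserved(unsigned long begin, unsigned long end,
				    unsigned long ra_start, unsigned long ra_end,
				    struct range out[2])
{
	size_t n = 0;

	assert(begin <= end && ra_start <= ra_end);

	if (begin < ra_end && end > ra_start) {
		/* Overlap: keep only the parts outside the reserved area. */
		if (begin < ra_start)
			out[n++] = (struct range){ begin, ra_start };
		if (end > ra_end)
			out[n++] = (struct range){ ra_end, end };
	} else if (begin < end) {
		/* No overlap: the whole range can be released. */
		out[n++] = (struct range){ begin, end };
	}
	return n;
}

/* Number of batches needed to release 'pages' pages, 'batch' at a time. */
static unsigned long batch_count(unsigned long pages, unsigned long batch)
{
	assert(batch > 0);
	return (pages + batch - 1) / batch;
}
```

Each range returned by split_around_reserved() corresponds to one call to
fadump_free_reserved_memory() in the patch, which then walks it in batches
and checks the time limit once per batch.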



Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64

2017-05-23 Thread Oliver O'Halloran
On Tue, May 23, 2017 at 2:23 PM, Aneesh Kumar K.V
 wrote:
> Oliver O'Halloran  writes:
>
>> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
>> is used to differentiate device backed memory from transparent huge
>> pages since they are handled in more or less the same manner by the core
>> mm code.
>>
>> Cc: Aneesh Kumar K.V 
>> Signed-off-by: Oliver O'Halloran 
>> ---
>> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
>> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
>> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>>
>> Aneesh, this has been fleshed out substantially since v1. Can you
>> re-review it? Also no explicit gup support is required in this patch
>> since devmap support was added to generic GUP as a part of making x86 use
>> the generic version.
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 
>> ++-
>>  arch/powerpc/include/asm/book3s/64/radix.h|  2 +-
>>  arch/powerpc/mm/hugetlbpage.c |  2 +-
>>  arch/powerpc/mm/pgtable-book3s64.c|  4 +--
>>  arch/powerpc/mm/pgtable-hash64.c  |  4 ++-
>>  arch/powerpc/mm/pgtable-radix.c   |  3 ++-
>>  arch/powerpc/mm/pgtable_64.c  |  2 +-
>>  8 files changed, 47 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
>> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> index 9732837aaae8..eaaf613c5347 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char 
>> *hpte_slot_array,
>>   */
>>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>>  {
>> - return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
>> + return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | 
>> _PAGE_DEVMAP)) ==
>> (_PAGE_PTE | H_PAGE_THP_HUGE));
>>  }
>
> _PAGE_DEVMAP is not really needed here. We will set H_PAGE_THP_HUGE only
> for thp hugepage w.r.t hash. But putting it here also makes it clear
> that devmap entries are not considered trans huge.

Good point. I'll remove it.
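
To illustrate the point about the mask, here is a small userspace sketch
comparing the two forms of the check. The SK_* bit positions and the sk_*
function names are hypothetical, picked only for this example; they are
not the real powerpc PTE bit assignments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SK_PAGE_PTE		(UINT64_C(1) << 62)
#define SK_PAGE_THP_HUGE	(UINT64_C(1) << 10)
#define SK_PAGE_DEVMAP		(UINT64_C(1) << 7)

/* v2 form: the mask includes the devmap bit, the expected value does not. */
static int sk_trans_huge_with_devmap(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE | SK_PAGE_DEVMAP)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}

/* Form without the devmap bit in the mask. */
static int sk_trans_huge_plain(uint64_t pmd)
{
	return (pmd & (SK_PAGE_PTE | SK_PAGE_THP_HUGE)) ==
	       (SK_PAGE_PTE | SK_PAGE_THP_HUGE);
}
```

The two functions only disagree for an entry that has both the THP and
devmap bits set. If, as noted above, the THP bit is only ever set for real
THP entries on hash, that combination never occurs and the extra bit in
the mask is redundant.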

>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
>> b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 85bc9875c3be..24634e92dd0b 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -79,6 +79,9 @@
>>
>>  #define _PAGE_SOFT_DIRTY _RPAGE_SW3 /* software: software dirty 
>> tracking */
>>  #define _PAGE_SPECIAL _RPAGE_SW2 /* software: special page */
>> +#define _PAGE_DEVMAP _RPAGE_SW1
>> +#define __HAVE_ARCH_PTE_DEVMAP
>> +
>>  /*
>>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>>   * Instead of fixing all of them, add an alternate define which
>> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>>   return pte;
>>  }
>>
>> +static inline pte_t pte_mkdevmap(pte_t pte)
>> +{
>> + return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
>> +}
>> +
>> +static inline int pte_devmap(pte_t pte)
>> +{
>> + return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
>> +}
>> +
>>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>>  {
>>   /* FIXME!! check whether this need to be a conditional */
>> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>>  #define pmd_mk_savedwrite(pmd)   
>> pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>>  #define pmd_clear_savedwrite(pmd)
>> pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>>
>> +#define pud_pfn(...) (0)
>> +#define pgd_pfn(...) (0)
>> +
>>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>>  #define pmd_soft_dirty(pmd)pte_soft_dirty(pmd_pte(pmd))
>>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
>> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct 
>> spinlock *new_pmd_ptl,
>>   return true;
>>  }
>>
>> -
>>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>>  static inline bool arch_needs_pgtable_deposit(void)
>>  {
>> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>>   return true;
>>  }
>>
>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>> +{
>> + return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
>> +}
>
>
> We avoided setting _PAGE_SPECIAL on pmd entries. This will set that, we
> may want to check if it is ok.  IIRC, we overloaded _PAGE_SPECIAL at some 
> point to indicate thp splitting. But good to double check.

I took a cursory look in arch/powerpc/ and mm/ for usages and didn't
see any usages of _PAGE_SPECIAL for pmds. There's no good reason to
set the flag though so I'll remove it.

>> +
>> +static inline int pmd_devmap(pmd_t pmd)
>> +{
>> + 

Re: [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE

2017-05-23 Thread Ingo Molnar

* Oliver O'Halloran  wrote:

> Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
> will get unwieldy as new platforms get ZONE_DEVICE support. Move it
> to an arch-selected Kconfig option to save us some trouble in the
> future.
> 
> Cc: x...@kernel.org
> Signed-off-by: Oliver O'Halloran 

Acked-by: Ingo Molnar 

Thanks,

Ingo


Re: [PowerPC][next-20170324][kselftest] kernel Oops when running tm/tm-signal-context-chk-vsx

2017-05-23 Thread Abdul Haleem
On Mon, 2017-04-03 at 14:33 +0530, Abdul Haleem wrote:
> On Mon, 2017-04-03 at 14:28 +0530, Abdul Haleem wrote:
> > On Tue, 2017-03-28 at 21:00 +1100, Michael Ellerman wrote:
> > > Abdul Haleem  writes:
> > > 
> > > > Hi,
> > > >
> > > > While running kernel self tests on ppc64, the tm/tm-signal-context-chk-vsx
> > > > test fails with an Oops message.
> > > >
> > > > I was able to reproduce it only twice out of 20 runs, and only on
> > > > next-20170324, so it is difficult to bisect the commit causing the issue.
> > > 
> > > Can you try mainline as of this commit:
> > > 
> > > 605df8d674ac ("selftests/powerpc: Replace stxvx and lxvx with 
> > > stxvd2x/lxvd2x")
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=605df8d674ac65e044a0bf4998b28c2f350b7f9e
> > > 
> > > cheers
> > > 
> > 
> > 
> > Trace is not reproducible on mainline with above commit.
> > 
> > Cyril was able to reproduce it and is working on it.
> > 
> > 
> > 
> > 
> cc Cyril Bur 
> 

Hi Cyril,

I see a similar trace, but with the 'tm-signal-stack' test, on mainline
kernel 4.12.0-rc1 on a PowerVM LPAR.

tm-signal-msr-r[7669]: bad frame in rt_sigreturn: 7fffe8a6a6c0 nip
7fff8335f094 lr 7fff835104d8
tm-signal-stack[7675]: bad frame in setup_rt_frame:  nip
1d44 lr 1d28
Bad kernel stack pointer 7fffd8b33530 at c000b660
Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=2048 
NUMA 
pSeries
Dumping ftrace buffer: 
   (ftrace buffer empty)
Modules linked in: vmx_crypto(E) pseries_rng(E) rtc_generic(E)
autofs4(E) [last unloaded: torture]
CPU: 8 PID: 8014 Comm: tm-signal-conte Tainted: GE
4.12.0-rc1-autotest #1
task: c007742ac000 task.stack: c00773b84000
NIP: c000b660 LR: 10001af0 CTR: 
REGS: cec23d40 TRAP: 0700   Tainted: GE
(4.12.0-rc1-autotest)
MSR: 800102a03031 
  CR: 42000822  XER:   
CFAR: c000b5b4 SOFTE: 0 
GPR00: 0025 7fffd8b33530 10028200
 
GPR04: 000a 10020010 
 
GPR08: 00f8  
 
GPR12:  7fff8b6ac440 
 
GPR16:   
 
GPR20:   
 
GPR24:   
7fff8b69f948 
GPR28: 10020010 10020304 1f4e
000333ce 
NIP [c000b660] fast_exception_return+0x90/0x98
LR [10001af0] 0x10001af0
Call Trace:
Instruction dump:
7c40e3a6 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070 
e8410080 e8610088 e8810090 e8210078 <4c24> 4800 e8610178
88ed02bb 
---[ end trace a10b71ed348d921f ]---


-- 
Regards,

Abdul Haleem
IBM Linux Technology Centre