Re: [PATCH v3 5/5] mm: enable CONFIG_MOVABLE_NODE on powerpc

2016-09-26 Thread Aneesh Kumar K.V
Reza Arbab <ar...@linux.vnet.ibm.com> writes:

> To create a movable node, we need to hotplug all of its memory into
> ZONE_MOVABLE.
>
> Note that to do this, auto_online_blocks should be off. Since the memory
> will first be added to the default zone, we must explicitly use
> online_movable to online.
>
> Because such a node contains no normal memory, can_online_high_movable()
> will only allow us to do the onlining if CONFIG_MOVABLE_NODE is set.
> Enable the use of this config option on PPC64 platforms.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

> Signed-off-by: Reza Arbab <ar...@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt | 2 +-
>  mm/Kconfig  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index a4f4d69..3d8460d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   that the amount of memory usable for all allocations
>   is not too small.
>
> - movable_node[KNL,X86] Boot-time switch to enable the effects
> + movable_node[KNL,X86,PPC] Boot-time switch to enable the effects
>   of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>
>   MTD_Partition=  [MTD]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index be0ee11..4b19cd3 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -153,7 +153,7 @@ config MOVABLE_NODE
>   bool "Enable to assign a node which has only movable memory"
>   depends on HAVE_MEMBLOCK
>   depends on NO_BOOTMEM
> - depends on X86_64
> + depends on X86_64 || PPC64
>   depends on NUMA
>   default n
>   help
> -- 
> 1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] mm: enable CONFIG_MOVABLE_NODE on powerpc

2016-09-21 Thread Aneesh Kumar K.V
Reza Arbab <ar...@linux.vnet.ibm.com> writes:

> On Wed, Sep 21, 2016 at 12:39:51PM +0530, Aneesh Kumar K.V wrote:
>>What I was checking was how will one mark a node movable in ppc64 ? I
>>don't see ppc64 code doing the equivalent of memblock_mark_hotplug().
>
> Post boot, the marking mechanism is not necessary. You can create a 
> movable node by putting all of the node's memory into ZONE_MOVABLE 
> during the hotplug.
>
>>So when you say "Onlining memory into ZONE_MOVABLE requires
>>CONFIG_MOVABLE_NODE" where is that restriction ?. IIUC,
>>should_add_memory_movable() will only return ZONE_MOVABLE only if it is
>>non empty and MOVABLE_NODE will create a ZONE_MOVABLE zone by default
>>only if it finds a memblock marked hotpluggable. So wondering if we
>>are not calling memblock_mark_hotplug() how is it working. Or am I
>>missing something ?
>
> You are looking at the addition step of hotplug. You're correct there, 
> the memory is added to the default zone, not ZONE_MOVABLE. The 
> transition to ZONE_MOVABLE takes place during the onlining step. In 
> online_pages():
>
>   zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
>
> The reason we need CONFIG_MOVABLE_NODE is right before that:
>
>   if ((zone_idx(zone) > ZONE_NORMAL ||
>   online_type == MMOP_ONLINE_MOVABLE) &&
>   !can_online_high_movable(zone))
>   return -EINVAL;
>

So we are looking at two step online process here. The above explained
the details nicely. Can you capture these details in the commit message. ie,
to say that when using 'echo online-movable > state' we allow the move from
normal to movable only if movable node is set. Also you may want to
mention that we still don't support the auto-online to movable.


> where can_online_high_movable() is defined like this:
>
>   #ifdef CONFIG_MOVABLE_NODE
>   /*
>* When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't 
> have
>* normal memory.
>*/
>   static bool can_online_high_movable(struct zone *zone)
>   {
>   return true;
>   }
>   #else /* CONFIG_MOVABLE_NODE */
>   /* ensure every online node has NORMAL memory */
>   static bool can_online_high_movable(struct zone *zone)
>   {
>   return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
>   }
>   #endif /* CONFIG_MOVABLE_NODE */
>
> To be more clear, I can change the commit log to say "Onlining all of a 
> node's memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE".
>
> -- 
> Reza Arbab

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] mm: enable CONFIG_MOVABLE_NODE on powerpc

2016-09-19 Thread Aneesh Kumar K.V
Reza Arbab  writes:

> Onlining memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE. Enable
> the use of this config option on PPC64 platforms.
>
> Signed-off-by: Reza Arbab 
> ---
>  Documentation/kernel-parameters.txt | 2 +-
>  mm/Kconfig  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index a4f4d69..3d8460d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   that the amount of memory usable for all allocations
>   is not too small.
>  
> - movable_node[KNL,X86] Boot-time switch to enable the effects
> + movable_node[KNL,X86,PPC] Boot-time switch to enable the effects
>   of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>  

Movable node also does.
memblock_set_bottom_up(true);
What is the impact of that. Do we need changes equivalent to that ? Also
where are we marking the nodes which can be hotplugged, ie where do we
do memblock_mark_hotplug() ?
   
>   MTD_Partition=  [MTD]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index be0ee11..4b19cd3 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -153,7 +153,7 @@ config MOVABLE_NODE
>   bool "Enable to assign a node which has only movable memory"
>   depends on HAVE_MEMBLOCK
>   depends on NO_BOOTMEM
> - depends on X86_64
> + depends on X86_64 || PPC64
>   depends on NUMA
>   default n
>   help

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 26/62] powerpc: Program HPTE key protection bits

2017-07-20 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> Map the PTE protection key bits to the HPTE key protection bits,
> while creating HPTE  entries.
>
Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |5 +
>  arch/powerpc/include/asm/pkeys.h  |   12 
>  arch/powerpc/mm/hash_utils_64.c   |4 
>  3 files changed, 21 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 6981a52..f7a6ed3 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -90,6 +90,8 @@
>  #define HPTE_R_PP0   ASM_CONST(0x8000)
>  #define HPTE_R_TSASM_CONST(0x4000)
>  #define HPTE_R_KEY_HIASM_CONST(0x3000)
> +#define HPTE_R_KEY_BIT0  ASM_CONST(0x2000)
> +#define HPTE_R_KEY_BIT1  ASM_CONST(0x1000)
>  #define HPTE_R_RPN_SHIFT 12
>  #define HPTE_R_RPN   ASM_CONST(0x0000)
>  #define HPTE_R_RPN_3_0   ASM_CONST(0x01fff000)
> @@ -104,6 +106,9 @@
>  #define HPTE_R_C ASM_CONST(0x0080)
>  #define HPTE_R_R ASM_CONST(0x0100)
>  #define HPTE_R_KEY_LOASM_CONST(0x0e00)
> +#define HPTE_R_KEY_BIT2  ASM_CONST(0x0800)
> +#define HPTE_R_KEY_BIT3  ASM_CONST(0x0400)
> +#define HPTE_R_KEY_BIT4  ASM_CONST(0x0200)
>
>  #define HPTE_V_1TB_SEG   ASM_CONST(0x4000)
>  #define HPTE_V_VRMA_MASK ASM_CONST(0x4001ff00)
> diff --git a/arch/powerpc/include/asm/pkeys.h 
> b/arch/powerpc/include/asm/pkeys.h
> index ad39db0..bbb5d85 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -41,6 +41,18 @@ static inline u64 vmflag_to_page_pkey_bits(u64 vm_flags)
>   ((vm_flags & VM_PKEY_BIT4) ? H_PAGE_PKEY_BIT0 : 0x0UL));
>  }
>
> +static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
> +{
> + if (!pkey_inited)
> + return 0x0UL;
> +
> + return (((pteflags & H_PAGE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
> +}
> +
>  static inline int vma_pkey(struct vm_area_struct *vma)
>  {
>   if (!pkey_inited)
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index f88423b..1e74529 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -231,6 +231,10 @@ unsigned long htab_convert_pte_flags(unsigned long 
> pteflags)
>*/
>   rflags |= HPTE_R_M;
>
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> + rflags |= pte_to_hpte_pkey_bits(pteflags);
> +#endif
> +
>   return rflags;
>  }
>
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 01/62] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages

2017-07-19 Thread Aneesh Kumar K.V

.

>   /*
> @@ -116,8 +104,8 @@ int __hash_page_4K(unsigned long ea, unsigned long 
> access, unsigned long vsid,
>* On hash insert failure we use old pte value and we don't
>* want slot information there if we have a insert failure.
>*/
> - old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> - new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> + old_pte &= ~(H_PAGE_HASHPTE);
> + new_pte &= ~(H_PAGE_HASHPTE);
>   goto htab_insert_hpte;
>   }

With the current path order and above hunk we will breaks the bisect I guess. 
With the above, when
we convert a 64k hpte to 4khpte, since this is the first patch, we
should clear that H_PAGE_F_GIX and H_PAGE_F_SECOND. We still use them
for 64k. I guess you should move this hunk to second patch.


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 03/62] powerpc: introduce pte_set_hash_slot() helper

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> Introduce pte_set_hash_slot().It  sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX)
> bits at  the   appropriate   location   in   the   PTE  of  4K  PTE.  For
> 64K PTE, it  sets  the  bits  in  the  second  part  of  the  PTE. Though
> the implementation  for the former just needs the slot parameter, it does
> take some additional parameters to keep the prototype consistent.
>
> This function  will  be  handy  as  we   work   towards  re-arranging the
> bits in the later patches.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |   15 +++
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 
> +
>  2 files changed, 40 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
> b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index d2cf949..dc153c6 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -53,6 +53,21 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
>  }
>  #endif
>
> +/*
> + * 4k pte format is  different  from  64k  pte  format.  Saving  the
> + * hash_slot is just a matter of returning the pte bits that need to
> + * be modified. On 64k pte, things are a  little  more  involved and
> + * hence  needs   many   more  parameters  to  accomplish  the  same.
> + * However we  want  to abstract this out from the caller by keeping
> + * the prototype consistent across the two formats.
> + */
> +static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
> + unsigned int subpg_index, unsigned long slot)
> +{
> + return (slot << H_PAGE_F_GIX_SHIFT) &
> + (H_PAGE_F_SECOND | H_PAGE_F_GIX);
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>
>  static inline char *get_hpte_slot_array(pmd_t *pmdp)
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index c281f18..89ef5a9 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -67,6 +67,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t 
> rpte, unsigned long index)
>   return ((rpte.hidx >> (index<<2)) & 0xfUL);
>  }
>
> +/*
> + * Commit the hash slot and return pte bits that needs to be modified.
> + * The caller is expected to modify the pte bits accordingly and
> + * commit the pte to memory.
> + */
> +static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
> + unsigned int subpg_index, unsigned long slot)
> +{
> + unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> +
> + rpte.hidx &= ~(0xfUL << (subpg_index << 2));
> + *hidxp = rpte.hidx  | (slot << (subpg_index << 2));
> + /*
> +  * Commit the hidx bits to memory before returning.
> +  * Anyone reading  pte  must  ensure hidx bits are
> +  * read  only  after  reading the pte by using the
> +  * read-side  barrier  smp_rmb(). __real_pte() can
> +  * help ensure that.
> +  */
> + smp_wmb();
> +
> + /* no pte bits to be modified, return 0x0UL */
> + return 0x0UL;
> +}
> +
>  #define __rpte_to_pte(r) ((r).pte)
>  extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
>  /*
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 27/62] powerpc: helper to validate key-access permissions of a pte

2017-07-20 Thread Aneesh Kumar K.V
Ram Pai  writes:

> helper function that checks if the read/write/execute is allowed
> on the pte.
>
> Signed-off-by: Ram Pai 
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h |4 +++
>  arch/powerpc/include/asm/pkeys.h |   12 +
>  arch/powerpc/mm/pkeys.c  |   33 
> ++
>  3 files changed, 49 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 30d7f55..0056e58 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -472,6 +472,10 @@ static inline void write_uamor(u64 value)
>   mtspr(SPRN_UAMOR, value);
>  }
>
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
>  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  unsigned long addr, pte_t *ptep)
> diff --git a/arch/powerpc/include/asm/pkeys.h 
> b/arch/powerpc/include/asm/pkeys.h
> index bbb5d85..7a9aade 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -53,6 +53,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
>   ((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
>  }
>
> +static inline u16 pte_to_pkey_bits(u64 pteflags)
> +{
> + if (!pkey_inited)
> + return 0x0UL;

Do we really need that above check ? We should always find it
peky_inited to be set. 

> +
> + return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
> + ((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
> +}
> +


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 05/62] powerpc: capture the PTE format changes in the dump pte report

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> The H_PAGE_F_SECOND,H_PAGE_F_GIX are not in the 64K main-PTE.
> capture these changes in the dump pte report.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  arch/powerpc/mm/dump_linuxpagetables.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/mm/dump_linuxpagetables.c 
> b/arch/powerpc/mm/dump_linuxpagetables.c
> index 44fe483..5627edd 100644
> --- a/arch/powerpc/mm/dump_linuxpagetables.c
> +++ b/arch/powerpc/mm/dump_linuxpagetables.c
> @@ -213,7 +213,7 @@ struct flag_info {
>   .val= H_PAGE_4K_PFN,
>   .set= "4K_pfn",
>   }, {
> -#endif
> +#else /* CONFIG_PPC_64K_PAGES */
>   .mask   = H_PAGE_F_GIX,
>   .val= H_PAGE_F_GIX,
>   .set= "f_gix",
> @@ -224,6 +224,7 @@ struct flag_info {
>   .val= H_PAGE_F_SECOND,
>   .set= "f_second",
>   }, {
> +#endif /* CONFIG_PPC_64K_PAGES */
>  #endif
>   .mask   = _PAGE_SPECIAL,
>   .val= _PAGE_SPECIAL,
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 06/62] powerpc: use helper functions in __hash_page_64K() for 64K PTE

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> replace redundant code in __hash_page_64K() with helper
> functions pte_get_hash_gslot() and pte_set_hash_slot()
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>


> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  arch/powerpc/mm/hash64_64k.c |   24 
>  1 files changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 0012618..645f621 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -244,7 +244,6 @@ int __hash_page_64K(unsigned long ea, unsigned long 
> access,
>   unsigned long flags, int ssize)
>  {
>   real_pte_t rpte;
> - unsigned long *hidxp;
>   unsigned long hpte_group;
>   unsigned long rflags, pa;
>   unsigned long old_pte, new_pte;
> @@ -289,18 +288,12 @@ int __hash_page_64K(unsigned long ea, unsigned long 
> access,
>
>   vpn  = hpt_vpn(ea, vsid, ssize);
>   if (unlikely(old_pte & H_PAGE_HASHPTE)) {
> - unsigned long hash, slot, hidx;
> -
> - hash = hpt_hash(vpn, shift, ssize);
> - hidx = __rpte_to_hidx(rpte, 0);
> - if (hidx & _PTEIDX_SECONDARY)
> - hash = ~hash;
> - slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> - slot += hidx & _PTEIDX_GROUP_IX;
> + unsigned long gslot;
>   /*
>* There MIGHT be an HPTE for this pte
>*/
> - if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
> + gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
> + if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,
>  MMU_PAGE_64K, ssize,
>  flags) == -1)
>   old_pte &= ~_PAGE_HPTEFLAGS;
> @@ -350,17 +343,8 @@ int __hash_page_64K(unsigned long ea, unsigned long 
> access,
>   return -1;
>   }
>
> - /*
> -  * Insert slot number & secondary bit in PTE second half.
> -  */
> - hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> - rpte.hidx &= ~(0xfUL);
> - *hidxp = rpte.hidx  | (slot & 0xfUL);
> - /*
> -  * check __real_pte for details on matching smp_rmb()
> -  */
> - smp_wmb();
>   new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
> + new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
>   }
>   *ptep = __pte(new_pte & ~H_PAGE_BUSY);
>   return 0;
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 07/62] powerpc: use helper functions in __hash_page_huge() for 64K PTE

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai  writes:

> replace redundant code in __hash_page_huge() with helper
> functions pte_get_hash_gslot() and pte_set_hash_slot()
>

Can you fold all the helper function usage into one patch ?


> Signed-off-by: Ram Pai 
> ---
>  arch/powerpc/mm/hugetlbpage-hash64.c |   24 
>  1 files changed, 4 insertions(+), 20 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c 
> b/arch/powerpc/mm/hugetlbpage-hash64.c
> index 6f7aee3..e6dcd50 100644
> --- a/arch/powerpc/mm/hugetlbpage-hash64.c
> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
> @@ -23,7 +23,6 @@ int __hash_page_huge(unsigned long ea, unsigned long 
> access, unsigned long vsid,
>int ssize, unsigned int shift, unsigned int mmu_psize)
>  {
>   real_pte_t rpte;
> - unsigned long *hidxp;
>   unsigned long vpn;
>   unsigned long old_pte, new_pte;
>   unsigned long rflags, pa, sz;
> @@ -74,16 +73,10 @@ int __hash_page_huge(unsigned long ea, unsigned long 
> access, unsigned long vsid,
>   /* Check if pte already has an hpte (case 2) */
>   if (unlikely(old_pte & H_PAGE_HASHPTE)) {
>   /* There MIGHT be an HPTE for this pte */
> - unsigned long hash, slot, hidx;
> + unsigned long gslot;
>
> - hash = hpt_hash(vpn, shift, ssize);
> - hidx = __rpte_to_hidx(rpte, 0);
> - if (hidx & _PTEIDX_SECONDARY)
> - hash = ~hash;
> - slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> - slot += hidx & _PTEIDX_GROUP_IX;
> -
> - if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
> + gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
> + if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,
>  mmu_psize, ssize, flags) == -1)
>   old_pte &= ~_PAGE_HPTEFLAGS;
>   }
> @@ -110,16 +103,7 @@ int __hash_page_huge(unsigned long ea, unsigned long 
> access, unsigned long vsid,
>   return -1;
>   }
>
> - /*
> -  * Insert slot number & secondary bit in PTE second half.
> -  */
> - hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> - rpte.hidx &= ~(0xfUL);
> - *hidxp = rpte.hidx  | (slot & 0xfUL);
> - /*
> -  * check __real_pte for details on matching smp_rmb()
> -  */
> - smp_wmb();
> + new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
>   }
>
>   /*
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 04/62] powerpc: introduce pte_get_hash_gslot() helper

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> Introduce pte_get_hash_gslot()() which returns the slot number of the
> HPTE in the global hash table.
>
> This function will come in handy as we work towards re-arranging the
> PTE bits in the later patches.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash.h |3 +++
>  arch/powerpc/mm/hash_utils_64.c   |   18 ++
>  2 files changed, 21 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
> b/arch/powerpc/include/asm/book3s/64/hash.h
> index d27f885..277158c 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -156,6 +156,9 @@ static inline int hash__pte_none(pte_t pte)
>   return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
>  }
>
> +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> + int ssize, real_pte_t rpte, unsigned int subpg_index);
> +
>  /* This low level function performs the actual PTE insertion
>   * Setting the PTE depends on the MMU type and other factors. It's
>   * an horrible mess that I'm not going to try to clean up now but
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 1b494d0..d3604da 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
>  }
>  #endif
>
> +/*
> + * return the global hash slot, corresponding to the given
> + * pte, which contains the hpte.
> + */
> +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> + int ssize, real_pte_t rpte, unsigned int subpg_index)
> +{
> + unsigned long hash, slot, hidx;
> +
> + hash = hpt_hash(vpn, shift, ssize);
> + hidx = __rpte_to_hidx(rpte, subpg_index);
> + if (hidx & _PTEIDX_SECONDARY)
> + hash = ~hash;
> + slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> + slot += hidx & _PTEIDX_GROUP_IX;
> + return slot;
> +}
> +
>  /* WARNING: This is called from hash_low_64.S, if you change this prototype,
>   *  do not forget to update the assembly call site !
>   */
> -- 
> 1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 02/62] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages

2017-07-19 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
> in the 64K backed HPTE pages. This along with the earlier
> patch will  entirely free  up the four bits from 64K PTE.
> The bit numbers are  big-endian as defined in the  ISA3.0
>
> This patch  does  the  following change to 64K PTE backed
> by 64K HPTE.
>
> H_PAGE_F_SECOND (S) which  occupied  bit  4  moves to the
>   second part of the pte to bit 60.
> H_PAGE_F_GIX (G,I,X) which  occupied  bit 5, 6 and 7 also
>   moves  to  the   second part of the pte to bit 61,
>   62, 63, 64 respectively
>
> since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
> bit  9  to  bit  7.
>
> The second part of the PTE will hold
> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
> NOTE: None of the bits in the secondary PTE were not used
> by 64k-HPTE backed PTE.
>
> Before the patch, the 64K HPTE backed 64k PTE format was
> as follows
>
>  0 1 2 3 4  5  6  7  8 9 10...63
>  : : : : :  :  :  :  : : ::
>  v v v v v  v  v  v  v v vv
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,--,-,-,-,
> |x|x|x| |S |G |I |X |x|B| |x|x||x|x|x|x| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_''_'_'_'_'
> | | | | |  |  |  |  | | | | |..| | | | | <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__'_'_'_'_'
>
> After the patch, the 64k HPTE backed 64k PTE format is
> as follows
>
>  0 1 2 3 4  5  6  7  8 9 10...63
>  : : : : :  :  :  :  : : ::
>  v v v v v  v  v  v  v v vv
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,--,-,-,-,
> |x|x|x| |  |  |  |B |x| | |x|x||.|.|.|.| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_''_'_'_'_'
> | | | | |  |  |  |  | | | | |..|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__'_'_'_'_'
>
> The above PTE changes is applicable to hugetlbpages aswell.
>
> The patch does the following code changes:
>
> a) moves  the  H_PAGE_F_SECOND and  H_PAGE_F_GIX to 4k PTE
>   header   since it is no more needed b the 64k PTEs.
> b) abstracts  out __real_pte() and __rpte_to_hidx() so the
>   caller  need not know the bit location of the slot.
> c) moves the slot bits the secondary pte.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
With changes suggested for the first patch.

> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 11/62] powerpc: initial pkey plumbing

2017-07-20 Thread Aneesh Kumar K.V
Ram Pai  writes:

> basic setup to initialize the pkey system. Only 64K kernel in HPT
> mode, enables the pkey system.
>
> Signed-off-by: Ram Pai 
> ---
>  arch/powerpc/Kconfig   |   16 ++
>  arch/powerpc/include/asm/mmu_context.h |5 +++
>  arch/powerpc/include/asm/pkeys.h   |   51 
> 
>  arch/powerpc/kernel/setup_64.c |4 ++
>  arch/powerpc/mm/Makefile   |1 +
>  arch/powerpc/mm/hash_utils_64.c|1 +
>  arch/powerpc/mm/pkeys.c|   18 +++
>  7 files changed, 96 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/pkeys.h
>  create mode 100644 arch/powerpc/mm/pkeys.c
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index bf4391d..5c60fd6 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -855,6 +855,22 @@ config SECCOMP
>
> If unsure, say Y. Only embedded should say N here.
>
> +config PPC64_MEMORY_PROTECTION_KEYS
> + prompt "PowerPC Memory Protection Keys"
> + def_bool y
> + # Note: only available in 64-bit mode
> + depends on PPC64 && PPC_64K_PAGES
> + select ARCH_USES_HIGH_VMA_FLAGS
> + select ARCH_HAS_PKEYS
> + ---help---
> +   Memory Protection Keys provides a mechanism for enforcing
> +   page-based protections, but without requiring modification of the
> +   page tables when an application changes protection domains.
> +
> +   For details, see Documentation/vm/protection-keys.txt
> +
> +   If unsure, say y.
> +
>  endmenu


Shouldn't this come as the last patch ? Or can we enable this config by
this patch ? If so what does it do ? Did you test boot each of this
patch to make sure we don't break git bisect ?

>
>  config ISA_DMA_API
> diff --git a/arch/powerpc/include/asm/mmu_context.h 
> b/arch/powerpc/include/asm/mmu_context.h
> index da7e943..4b93547 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -181,5 +181,10 @@ static inline bool arch_vma_access_permitted(struct 
> vm_area_struct *vma,
>   /* by default, allow everything */


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v6 27/62] powerpc: helper to validate key-access permissions of a pte

2017-07-21 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> On Thu, Jul 20, 2017 at 12:12:47PM +0530, Aneesh Kumar K.V wrote:
>> Ram Pai <linux...@us.ibm.com> writes:
>> 
>> > helper function that checks if the read/write/execute is allowed
>> > on the pte.
>> >
>> > Signed-off-by: Ram Pai <linux...@us.ibm.com>
>> > ---
>> >  arch/powerpc/include/asm/book3s/64/pgtable.h |4 +++
>> >  arch/powerpc/include/asm/pkeys.h |   12 +
>> >  arch/powerpc/mm/pkeys.c  |   33 
>> > ++
>> >  3 files changed, 49 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
>> > b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> > index 30d7f55..0056e58 100644
>> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> > @@ -472,6 +472,10 @@ static inline void write_uamor(u64 value)
>> >mtspr(SPRN_UAMOR, value);
>> >  }
>> >
>> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
>> > +extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
>> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>> > +
>> >  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
>> >  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>> >   unsigned long addr, pte_t *ptep)
>> > diff --git a/arch/powerpc/include/asm/pkeys.h 
>> > b/arch/powerpc/include/asm/pkeys.h
>> > index bbb5d85..7a9aade 100644
>> > --- a/arch/powerpc/include/asm/pkeys.h
>> > +++ b/arch/powerpc/include/asm/pkeys.h
>> > @@ -53,6 +53,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
>> >((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
>> >  }
>> >
>> > +static inline u16 pte_to_pkey_bits(u64 pteflags)
>> > +{
>> > +  if (!pkey_inited)
>> > +  return 0x0UL;
>> 
>> Do we really need that above check ? We should always find it
>> peky_inited to be set. 
>
> Yes. there are cases where pkey_inited is not enabled. 
> a) if the MMU is radix.
That should be be a feature check

> b) if the PAGE size is 4k.

That is a kernel config change

> c) if the device tree says the feature is not available
> d) if the CPU is of a older generation. P6 and older.

Both feature check.

how about doing something like

static inline u16 pte_to_pkey_bits(u64 pteflags)
{
if (!(pteflags & H_PAGE_KEY_MASK))
return 0x0UL;

return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
}

We still have to look at the code generated to see it is really saving
something.

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v4 09/17] powerpc: call the hash functions with the correct pkey value

2017-06-27 Thread Aneesh Kumar K.V



On Tuesday 27 June 2017 03:41 PM, Ram Pai wrote:

Pass the correct protection key value to the hash functions on
page fault.

Signed-off-by: Ram Pai 
---
  arch/powerpc/include/asm/pkeys.h | 11 +++
  arch/powerpc/mm/hash_utils_64.c  |  4 
  arch/powerpc/mm/mem.c|  6 ++
  3 files changed, 21 insertions(+)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index ef1c601..1370b3f 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -74,6 +74,17 @@ static inline bool mm_pkey_is_allocated(struct mm_struct 
*mm, int pkey)
  }

  /*
+ * return the protection key of the vma corresponding to the
+ * given effective address @ea.
+ */
+static inline int mm_pkey(struct mm_struct *mm, unsigned long ea)
+{
+   struct vm_area_struct *vma = find_vma(mm, ea);
+   int pkey = vma ? vma_pkey(vma) : 0;
+   return pkey;
+}
+
+/*



That is not going to work in hash fault path right ? We can't do a 
find_vma there without holding the mmap_sem


-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 01/27] mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled

2018-01-21 Thread Aneesh Kumar K.V
Ram Pai <linux...@us.ibm.com> writes:

> VM_PKEY_BITx are defined only if CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
> is enabled. Powerpc also needs these bits. Hence lets define the
> VM_PKEY_BITx bits for any architecture that enables
> CONFIG_ARCH_HAS_PKEYS.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>

> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
>  fs/proc/task_mmu.c |4 ++--
>  include/linux/mm.h |9 +
>  2 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 339e4c1..b139617 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -674,13 +674,13 @@ static void show_smap_vma_flags(struct seq_file *m, 
> struct vm_area_struct *vma)
>   [ilog2(VM_MERGEABLE)]   = "mg",
>   [ilog2(VM_UFFD_MISSING)]= "um",
>   [ilog2(VM_UFFD_WP)] = "uw",
> -#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
> +#ifdef CONFIG_ARCH_HAS_PKEYS
>   /* These come out via ProtectionKey: */
>   [ilog2(VM_PKEY_BIT0)]   = "",
>   [ilog2(VM_PKEY_BIT1)]   = "",
>   [ilog2(VM_PKEY_BIT2)]   = "",
>   [ilog2(VM_PKEY_BIT3)]   = "",
> -#endif
> +#endif /* CONFIG_ARCH_HAS_PKEYS */
>   };
>   size_t i;
>  
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ea818ff..01381d3 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -228,15 +228,16 @@ extern int overcommit_kbytes_handler(struct ctl_table 
> *, int, void __user *,
>  #define VM_HIGH_ARCH_4   BIT(VM_HIGH_ARCH_BIT_4)
>  #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
>  
> -#if defined(CONFIG_X86)
> -# define VM_PAT  VM_ARCH_1   /* PAT reserves whole VMA at 
> once (x86) */
> -#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
> +#ifdef CONFIG_ARCH_HAS_PKEYS
>  # define VM_PKEY_SHIFT   VM_HIGH_ARCH_BIT_0
>  # define VM_PKEY_BIT0VM_HIGH_ARCH_0  /* A protection key is a 4-bit 
> value */
>  # define VM_PKEY_BIT1VM_HIGH_ARCH_1
>  # define VM_PKEY_BIT2VM_HIGH_ARCH_2
>  # define VM_PKEY_BIT3VM_HIGH_ARCH_3
> -#endif
> +#endif /* CONFIG_ARCH_HAS_PKEYS */
> +
> +#if defined(CONFIG_X86)
> +# define VM_PAT  VM_ARCH_1   /* PAT reserves whole VMA at 
> once (x86) */
>  #elif defined(CONFIG_PPC)
>  # define VM_SAO  VM_ARCH_1   /* Strong Access Ordering 
> (powerpc) */
>  #elif defined(CONFIG_PARISC)
> -- 
> 1.7.1
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html