Re: [PATCH v8 4/6] powerpc/tlb: Add local flush for page given mm_struct and psize

2022-10-23 Thread Benjamin Gray
On Mon, 2022-10-24 at 14:30 +1100, Russell Currey wrote:
> On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> > Adds a local TLB flush operation that works given an mm_struct, VA
> > to
> > flush, and page size representation.
> > 
> > This removes the need to create a vm_area_struct, which the
> > temporary
> > patching mm work does not need.
> > 
> > Signed-off-by: Benjamin Gray 
> > ---
> >  arch/powerpc/include/asm/book3s/32/tlbflush.h  | 9 +
> >  arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 +
> >  arch/powerpc/include/asm/book3s/64/tlbflush.h  | 8 
> >  arch/powerpc/include/asm/nohash/tlbflush.h | 1 +
> >  4 files changed, 23 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> > b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> > index ba1743c52b56..e5a688cebf69 100644
> > --- a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> > +++ b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> > @@ -2,6 +2,8 @@
> >  #ifndef _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
> >  #define _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
> >  
> > +#include 
> > +
> >  #define MMU_NO_CONTEXT  (0)
> >  /*
> >   * TLB flushing for "classic" hash-MMU 32-bit CPUs, 6xx, 7xx, 7xxx
> > @@ -74,6 +76,13 @@ static inline void local_flush_tlb_page(struct
> > vm_area_struct *vma,
> >  {
> > flush_tlb_page(vma, vmaddr);
> >  }
> > +
> > +static inline void local_flush_tlb_page_psize(struct mm_struct
> > *mm,
> > unsigned long vmaddr, int psize)
> > +{
> > +   BUILD_BUG_ON(psize != MMU_PAGE_4K);
> 
> Is there any utility in adding this for 32bit if the following
> patches
> are only for Radix?

It needs some kind of definition to avoid #ifdef's. I figured I may as
well provide a correct implementation, given the functions around it
are implemented. The BUILD_BUG_ON specifically is just defensive in
case my assumptions are wrong. I don't know anything about these
machines, just what the kernel defines. I can remove the check, or
replace the whole implementation with a BUILD_BUG?
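
For reference, the BUILD_BUG alternative mentioned above would look roughly
like this (an illustrative sketch only, not part of the posted patch):

static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
					      unsigned long vmaddr, int psize)
{
	/* 32-bit Book3S is not expected to need this; fail the build if used */
	BUILD_BUG();
}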


Re: [PATCH v8 5/6] powerpc/code-patching: Use temporary mm for Radix MMU

2022-10-23 Thread Benjamin Gray
On Mon, 2022-10-24 at 14:45 +1100, Russell Currey wrote:
> On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> > From: "Christopher M. Riedl" 
> > 
> > x86 supports the notion of a temporary mm which restricts access to
> > temporary PTEs to a single CPU. A temporary mm is useful for
> > situations
> > where a CPU needs to perform sensitive operations (such as patching
> > a
> > STRICT_KERNEL_RWX kernel) requiring temporary mappings without
> > exposing
> > said mappings to other CPUs. Another benefit is that other CPU TLBs
> > do
> > not need to be flushed when the temporary mm is torn down.
> > 
> > Mappings in the temporary mm can be set in the userspace portion of
> > the
> > address-space.
> > 
> > Interrupts must be disabled while the temporary mm is in use. HW
> > breakpoints, which may have been set by userspace as watchpoints on
> > addresses now within the temporary mm, are saved and disabled when
> > loading the temporary mm. The HW breakpoints are restored when
> > unloading
> > the temporary mm. All HW breakpoints are indiscriminately disabled
> > while
> > the temporary mm is in use - this may include breakpoints set by
> > perf.
> > 
> > Use the `poking_init` init hook to prepare a temporary mm and
> > patching
> > address. Initialize the temporary mm by copying the init mm. Choose
> > a
> > randomized patching address inside the temporary mm userspace
> > address
> > space. The patching address is randomized between PAGE_SIZE and
> > DEFAULT_MAP_WINDOW-PAGE_SIZE.
> > 
> > Bits of entropy with 64K page size on BOOK3S_64:
> > 
> > bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 /
> > PAGE_SIZE)
> > 
> > PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> > bits of entropy = log2(128TB / 64K)
> > bits of entropy = 31
> > 
> > The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash
> > MMU
> > operates - by default the space above DEFAULT_MAP_WINDOW is not
> > available. Currently the Hash MMU does not use a temporary mm so
> > technically this upper limit isn't necessary; however, a larger
> > randomization range does not further "harden" this overall approach
> > and
> > future work may introduce patching with a temporary mm on Hash as
> > well.
> > 
> > Randomization occurs only once during initialization for each CPU
> > as
> > it
> > comes online.
> > 
> > The patching page is mapped with PAGE_KERNEL to set EAA[0] for the
> > PTE
> > which ignores the AMR (so no need to unlock/lock KUAP) according to
> > PowerISA v3.0b Figure 35 on Radix.
> > 
> > Based on x86 implementation:
> > 
> > commit 4fc19708b165
> > ("x86/alternatives: Initialize temporary mm for patching")
> > 
> > and:
> > 
> > commit b3fd8e83ada0
> > ("x86/alternatives: Use temporary mm for text poking")
> > 
> > ---
> 
> Is the section following the --- your addendum to Chris' patch?  That
> cuts it off from git, including your signoff.  It'd be better to have
> it together as one commit message and note the bits you contributed
> below the --- after your signoff.
> 
> Commits where you're modifying someone else's previous work should
> include their signoff above yours, as well.

Addendum to his wording, to break it off from the "From..." section
(which is me splicing together his comments from previous patches with
some minor changes to account for the patch changes). I found out
earlier today that Git will treat it as a comment :(

I'll add the Signed-off-by back; I wasn't sure whether to leave it
there after making changes (same in patch 2).
 
> > +static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
> > +{
> > +   int err;
> > +   u32 *patch_addr;
> > +   unsigned long text_poke_addr;
> > +   pte_t *pte;
> > +   unsigned long pfn = get_patch_pfn(addr);
> > +   struct mm_struct *patching_mm;
> > +   struct temp_mm_state prev;
> 
> Reverse christmas tree?  If we care

Currently it's mirroring the __do_patch_instruction declarations, with
extra ones at the bottom.
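
For reference, a reverse-christmas-tree ordering of the quoted declarations
(longest line first) would look like this, purely as an illustration:

	unsigned long pfn = get_patch_pfn(addr);
	struct mm_struct *patching_mm;
	unsigned long text_poke_addr;
	struct temp_mm_state prev;
	u32 *patch_addr;
	pte_t *pte;
	int err;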


Re: [PATCH v1 0/2] cleanup stackprotector canary generation

2022-10-23 Thread Greg Kroah-Hartman
On Sun, Oct 23, 2022 at 10:32:06PM +0200, Jason A. Donenfeld wrote:
> Stack canary generation currently lives partially in random.h, where it
> doesn't belong, and is in general a bit overcomplicated. This small
> patchset fixes up both issues. I'll take these in my tree, unless
> somebody else prefers to do so.
> 
> Cc: Albert Ou 
> Cc: Boris Ostrovsky 
> Cc: Borislav Petkov 
> Cc: Catalin Marinas 
> Cc: Chris Zankel 
> Cc: Christophe Leroy 
> Cc: Dave Hansen 
> Cc: Greg Kroah-Hartman 
> Cc: Guo Ren 
> Cc: H. Peter Anvin 
> Cc: Ingo Molnar 
> Cc: Juergen Gross 
> Cc: Max Filippov 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: Palmer Dabbelt 
> Cc: Paul Walmsley 
> Cc: Rich Felker 
> Cc: Russell King 
> Cc: Thomas Bogendoerfer 
> Cc: Thomas Gleixner 
> Cc: Will Deacon 
> Cc: Yoshinori Sato 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-c...@vger.kernel.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux...@vger.kernel.org
> Cc: linux-xte...@linux-xtensa.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: x...@kernel.org

Acked-by: Greg Kroah-Hartman 


Re: [PATCH] powerpc: replace ternary operator with min()

2022-10-23 Thread Russell Currey
On Sun, 2022-10-23 at 20:44 +0800, KaiLong Wang wrote:
> Fix the following coccicheck warning:
> 
> arch/powerpc/xmon/xmon.c:2987: WARNING opportunity for min()
> arch/powerpc/xmon/xmon.c:2583: WARNING opportunity for min()
> 
> Signed-off-by: KaiLong Wang 

Hello,

This fails to compile on some platforms/compilers since n is a long and
16 is an int, expanding to:

r = __builtin_choose_expr(
((!!(sizeof((typeof(n) *)1 == (typeof(16) *)1))) &&
 ((sizeof(int) ==
   sizeof(*(8 ? ((void *)((long)(n)*0l)) : (int *)8))) &&
  (sizeof(int) ==
   sizeof(*(8 ? ((void *)((long)(16) * 0l)) :
(int *)8))))),
((n) < (16) ? (n) : (16)), ({
typeof(n) __UNIQUE_ID___x0 = (n);
typeof(16) __UNIQUE_ID___y1 = (16);
((__UNIQUE_ID___x0) < (__UNIQUE_ID___y1) ?
 (__UNIQUE_ID___x0) :
 (__UNIQUE_ID___y1));
}));

Here's the full build failure as found by snowpatch:
https://github.com/ruscur/linux-ci/actions/runs/3308880562/jobs/5461579048#step:4:89

You should use min_t(long, n, 16) instead.
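
That is, each of the two hunks would become something like:

-   r = n < 16? n: 16;
+   r = min_t(long, n, 16);

min_t() casts both operands to the named type, so the long/int mismatch that
trips min()'s type check above goes away.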

- Russell

> ---
>  arch/powerpc/xmon/xmon.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index f51c882bf902..a7751cd2cc9d 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -2580,7 +2580,7 @@ static void xmon_rawdump (unsigned long adrs,
> long ndump)
> unsigned char temp[16];
>  
> for (n = ndump; n > 0;) {
> -   r = n < 16? n: 16;
> +   r = min(n, 16);
> nr = mread(adrs, temp, r);
> adrs += nr;
> for (m = 0; m < r; ++m) {
> @@ -2984,7 +2984,7 @@ prdump(unsigned long adrs, long ndump)
> for (n = ndump; n > 0;) {
> printf(REG, adrs);
> putchar(' ');
> -   r = n < 16? n: 16;
> +   r = min(n, 16);
> nr = mread(adrs, temp, r);
> adrs += nr;
> for (m = 0; m < r; ++m) {



Re: [PATCH v8 4/6] powerpc/tlb: Add local flush for page given mm_struct and psize

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> Adds a local TLB flush operation that works given an mm_struct, VA to
> flush, and page size representation.
> 
> This removes the need to create a vm_area_struct, which the temporary
> patching mm work does not need.
> 
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/book3s/32/tlbflush.h  | 9 +
>  arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 +
>  arch/powerpc/include/asm/book3s/64/tlbflush.h  | 8 
>  arch/powerpc/include/asm/nohash/tlbflush.h | 1 +
>  4 files changed, 23 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> index ba1743c52b56..e5a688cebf69 100644
> --- a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> @@ -2,6 +2,8 @@
>  #ifndef _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
>  #define _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
>  
> +#include 
> +
>  #define MMU_NO_CONTEXT  (0)
>  /*
>   * TLB flushing for "classic" hash-MMU 32-bit CPUs, 6xx, 7xx, 7xxx
> @@ -74,6 +76,13 @@ static inline void local_flush_tlb_page(struct
> vm_area_struct *vma,
>  {
> flush_tlb_page(vma, vmaddr);
>  }
> +
> +static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
> unsigned long vmaddr, int psize)
> +{
> +   BUILD_BUG_ON(psize != MMU_PAGE_4K);
> +   flush_range(mm, vmaddr, vmaddr + PAGE_SIZE);
> +}
> +
>  static inline void local_flush_tlb_mm(struct mm_struct *mm)
>  {
> flush_tlb_mm(mm);
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> index fab8332fe1ad..8fd9dc49b2a1 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> @@ -94,6 +94,11 @@ static inline void
> hash__local_flush_tlb_page(struct vm_area_struct *vma,
>  {
>  }
>  
> +static inline void hash__local_flush_tlb_page_psize(struct mm_struct
> *mm,
> +   unsigned long
> vmaddr, int psize)
> +{
> +}
> +
>  static inline void hash__flush_tlb_page(struct vm_area_struct *vma,
>     unsigned long vmaddr)
>  {
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> index 67655cd60545..2d839dd5c08c 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> @@ -92,6 +92,14 @@ static inline void local_flush_tlb_page(struct
> vm_area_struct *vma,
> return hash__local_flush_tlb_page(vma, vmaddr);
>  }
>  
> +static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
> + unsigned long vmaddr,
> int psize)
> +{
> +   if (radix_enabled())
> +   return radix__local_flush_tlb_page_psize(mm, vmaddr,
> psize);
> +   return hash__local_flush_tlb_page_psize(mm, vmaddr, psize);
> +}
> +
>  static inline void local_flush_all_mm(struct mm_struct *mm)
>  {
> if (radix_enabled())
> diff --git a/arch/powerpc/include/asm/nohash/tlbflush.h
> b/arch/powerpc/include/asm/nohash/tlbflush.h
> index bdaf34ad41ea..59bce0ebdcf4 100644
> --- a/arch/powerpc/include/asm/nohash/tlbflush.h
> +++ b/arch/powerpc/include/asm/nohash/tlbflush.h
> @@ -58,6 +58,7 @@ static inline void flush_tlb_kernel_range(unsigned
> long start, unsigned long end
>  extern void flush_tlb_kernel_range(unsigned long start, unsigned
> long end);
>  extern void local_flush_tlb_mm(struct mm_struct *mm);
>  extern void local_flush_tlb_page(struct vm_area_struct *vma,
> unsigned long vmaddr);
> +extern void local_flush_tlb_page_psize(struct mm_struct *mm,
> unsigned long vmaddr, int psize);

This misses a definition for PPC_8xx which leads to a build failure as
found by snowpatch here:
https://github.com/ruscur/linux-ci/actions/runs/3295033018/jobs/5433162658#step:4:116
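
One way to satisfy the 8xx build would be a stub beside the existing 8xx
local_flush_tlb_page(), along these lines (an illustrative sketch only, not
a requirement on the next revision):

static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
					      unsigned long vmaddr, int psize)
{
	asm volatile ("tlbie %0; sync" : : "r" (vmaddr) : "memory");
}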

>  
>  extern void __local_flush_tlb_page(struct mm_struct *mm, unsigned
> long vmaddr,
>    int tsize, int ind);



[PATCH] powerpc/8xx: Fix warning in hw_breakpoint_handler()

2022-10-23 Thread Russell Currey
In hw_breakpoint_handler(), ea is set by wp_get_instr_detail() except
for 8xx, leading the variable to be passed uninitialised to
wp_check_constraints().  This is safe as wp_check_constraints() returns
early without using ea, so just set it to make the compiler happy.

Signed-off-by: Russell Currey 
---
 arch/powerpc/kernel/hw_breakpoint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 8db1a15d7acb..e1b4e70c8fd0 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -646,7 +646,7 @@ int hw_breakpoint_handler(struct die_args *args)
ppc_inst_t instr = ppc_inst(0);
int type = 0;
int size = 0;
-   unsigned long ea;
+   unsigned long ea = 0;
 
/* Disable breakpoints during exception handling */
hw_breakpoint_disable();
-- 
2.37.3



Re: [PATCH v8 5/6] powerpc/code-patching: Use temporary mm for Radix MMU

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> From: "Christopher M. Riedl" 
> 
> x86 supports the notion of a temporary mm which restricts access to
> temporary PTEs to a single CPU. A temporary mm is useful for
> situations
> where a CPU needs to perform sensitive operations (such as patching a
> STRICT_KERNEL_RWX kernel) requiring temporary mappings without
> exposing
> said mappings to other CPUs. Another benefit is that other CPU TLBs
> do
> not need to be flushed when the temporary mm is torn down.
> 
> Mappings in the temporary mm can be set in the userspace portion of
> the
> address-space.
> 
> Interrupts must be disabled while the temporary mm is in use. HW
> breakpoints, which may have been set by userspace as watchpoints on
> addresses now within the temporary mm, are saved and disabled when
> loading the temporary mm. The HW breakpoints are restored when
> unloading
> the temporary mm. All HW breakpoints are indiscriminately disabled
> while
> the temporary mm is in use - this may include breakpoints set by
> perf.
> 
> Use the `poking_init` init hook to prepare a temporary mm and
> patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
> 
> Bits of entropy with 64K page size on BOOK3S_64:
> 
> bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
> 
> PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> bits of entropy = log2(128TB / 64K)
> bits of entropy = 31
> 
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash
> MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach
> and
> future work may introduce patching with a temporary mm on Hash as
> well.
> 
> Randomization occurs only once during initialization for each CPU as
> it
> comes online.
> 
> The patching page is mapped with PAGE_KERNEL to set EAA[0] for the
> PTE
> which ignores the AMR (so no need to unlock/lock KUAP) according to
> PowerISA v3.0b Figure 35 on Radix.
> 
> Based on x86 implementation:
> 
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
> 
> and:
> 
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
> 
> ---

Is the section following the --- your addendum to Chris' patch?  That
cuts it off from git, including your signoff.  It'd be better to have
it together as one commit message and note the bits you contributed
below the --- after your signoff.

Commits where you're modifying someone else's previous work should
include their signoff above yours, as well.

> Synchronisation is done according to Book 3 Chapter 13

might want to mention the ISA version alongside this, since chapter
numbering can change

> "Synchronization
> Requirements for Context Alterations". Switching the mm is a change
> to
> the PID, which requires a context synchronising instruction before
> and
> after the change, and a hwsync between the last instruction that
> performs address translation for an associated storage access.
> 
> Instruction fetch is an associated storage access, but the
> instruction
> address mappings are not being changed, so it should not matter which
> context they use. We must still perform a hwsync to guard arbitrary
> prior code that may have accessed a userspace address.
> 
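As a rough illustration of the ordering being described (this is not the
patch's actual code, and the PID value shown is hypothetical):

	mb();		/* hwsync: order prior storage accesses vs. the change */
	isync();	/* context synchronising instruction before the change */
	mtspr(SPRN_PID, patching_mm_pid);	/* hypothetical PID of the temp mm */
	isync();	/* context synchronising instruction after the change */
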
> TLB invalidation is local and VA specific. Local because only this
> core
> used the patching mm, and VA specific because we only care that the
> writable mapping is purged. Leaving the other mappings intact is more
> efficient, especially when performing many code patches in a row
> (e.g.,
> as ftrace would).
> 
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/lib/code-patching.c | 226
> ++-
>  1 file changed, 221 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/lib/code-patching.c
> b/arch/powerpc/lib/code-patching.c
> index 9b9eba574d7e..eabdd74a26c0 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -4,12 +4,17 @@
>   */
>  
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -42,11 +47,59 @@ int raw_patch_instruction(u32 *addr, ppc_inst_t
> instr)
>  }
>  
>  #ifdef CONFIG_STRICT_KERNEL_RWX
> +
>  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
> +static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(pte_t *, cpu_patching_pte);
>  
>  static int map_patch_area(void *addr, unsigned long 

Re: [PATCH v8 4/6] powerpc/tlb: Add local flush for page given mm_struct and psize

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> Adds a local TLB flush operation that works given an mm_struct, VA to
> flush, and page size representation.
> 
> This removes the need to create a vm_area_struct, which the temporary
> patching mm work does not need.
> 
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/book3s/32/tlbflush.h  | 9 +
>  arch/powerpc/include/asm/book3s/64/tlbflush-hash.h | 5 +
>  arch/powerpc/include/asm/book3s/64/tlbflush.h  | 8 
>  arch/powerpc/include/asm/nohash/tlbflush.h | 1 +
>  4 files changed, 23 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> index ba1743c52b56..e5a688cebf69 100644
> --- a/arch/powerpc/include/asm/book3s/32/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/32/tlbflush.h
> @@ -2,6 +2,8 @@
>  #ifndef _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
>  #define _ASM_POWERPC_BOOK3S_32_TLBFLUSH_H
>  
> +#include 
> +
>  #define MMU_NO_CONTEXT  (0)
>  /*
>   * TLB flushing for "classic" hash-MMU 32-bit CPUs, 6xx, 7xx, 7xxx
> @@ -74,6 +76,13 @@ static inline void local_flush_tlb_page(struct
> vm_area_struct *vma,
>  {
> flush_tlb_page(vma, vmaddr);
>  }
> +
> +static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
> unsigned long vmaddr, int psize)
> +{
> +   BUILD_BUG_ON(psize != MMU_PAGE_4K);

Is there any utility in adding this for 32bit if the following patches
are only for Radix?

> +   flush_range(mm, vmaddr, vmaddr + PAGE_SIZE);
> +}
> +
>  static inline void local_flush_tlb_mm(struct mm_struct *mm)
>  {
> flush_tlb_mm(mm);
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> index fab8332fe1ad..8fd9dc49b2a1 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> @@ -94,6 +94,11 @@ static inline void
> hash__local_flush_tlb_page(struct vm_area_struct *vma,
>  {
>  }
>  
> +static inline void hash__local_flush_tlb_page_psize(struct mm_struct
> *mm,
> +   unsigned long
> vmaddr, int psize)
> +{
> +}
> +
>  static inline void hash__flush_tlb_page(struct vm_area_struct *vma,
>     unsigned long vmaddr)
>  {
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> index 67655cd60545..2d839dd5c08c 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> @@ -92,6 +92,14 @@ static inline void local_flush_tlb_page(struct
> vm_area_struct *vma,
> return hash__local_flush_tlb_page(vma, vmaddr);
>  }
>  
> +static inline void local_flush_tlb_page_psize(struct mm_struct *mm,
> + unsigned long vmaddr,
> int psize)
> +{
> +   if (radix_enabled())
> +   return radix__local_flush_tlb_page_psize(mm, vmaddr,
> psize);
> +   return hash__local_flush_tlb_page_psize(mm, vmaddr, psize);
> +}
> +
>  static inline void local_flush_all_mm(struct mm_struct *mm)
>  {
> if (radix_enabled())
> diff --git a/arch/powerpc/include/asm/nohash/tlbflush.h
> b/arch/powerpc/include/asm/nohash/tlbflush.h
> index bdaf34ad41ea..59bce0ebdcf4 100644
> --- a/arch/powerpc/include/asm/nohash/tlbflush.h
> +++ b/arch/powerpc/include/asm/nohash/tlbflush.h
> @@ -58,6 +58,7 @@ static inline void flush_tlb_kernel_range(unsigned
> long start, unsigned long end
>  extern void flush_tlb_kernel_range(unsigned long start, unsigned
> long end);
>  extern void local_flush_tlb_mm(struct mm_struct *mm);
>  extern void local_flush_tlb_page(struct vm_area_struct *vma,
> unsigned long vmaddr);
> +extern void local_flush_tlb_page_psize(struct mm_struct *mm,
> unsigned long vmaddr, int psize);
>  
>  extern void __local_flush_tlb_page(struct mm_struct *mm, unsigned
> long vmaddr,
>    int tsize, int ind);



Re: [PATCH v8 3/6] powerpc/code-patching: Verify instruction patch succeeded

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> Verifies that if the instruction patching did not return an error
> then
> the value stored at the given address to patch is now equal to the
> instruction we patched it to.
> 
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/lib/code-patching.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/lib/code-patching.c
> b/arch/powerpc/lib/code-patching.c
> index 34fc7ac34d91..9b9eba574d7e 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -186,6 +186,8 @@ static int do_patch_instruction(u32 *addr,
> ppc_inst_t instr)
> err = __do_patch_instruction(addr, instr);
> local_irq_restore(flags);
>  
> +   WARN_ON(!err && !ppc_inst_equal(instr, ppc_inst_read(addr)));
> +

As a side note, I had a look at test-code-patching.c and it doesn't
look like we have a test for ppc_inst_equal() with prefixed
instructions.  We should fix that.
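
Something along these lines might do; a rough sketch only (the constructors
ppc_inst()/ppc_inst_prefix() and PPC_RAW_NOP()/OP_PREFIX are existing
helpers, but the test-harness wiring is omitted and the encodings are just
for illustration):

	ppc_inst_t word = ppc_inst(PPC_RAW_NOP());
	ppc_inst_t prefixed = ppc_inst_prefix(OP_PREFIX << 26, PPC_RAW_NOP());

	/* equal prefixed instructions must compare equal... */
	WARN_ON(!ppc_inst_equal(prefixed, ppc_inst_prefix(OP_PREFIX << 26, PPC_RAW_NOP())));
	/* ...and a prefixed instruction must not equal a plain word instruction */
	WARN_ON(ppc_inst_equal(prefixed, word));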

> return err;
>  }
>  #else /* !CONFIG_STRICT_KERNEL_RWX */



Re: [PATCH v8 2/6] powerpc/code-patching: Use WARN_ON and fix check in poking_init

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> From: "Christopher M. Riedl" 
> 
> The latest kernel docs list BUG_ON() as 'deprecated' and that they
> should be replaced with WARN_ON() (or pr_warn()) when possible. The
> BUG_ON() in poking_init() warrants a WARN_ON() rather than a
> pr_warn()
> since the error condition is deemed "unreachable".
> 
> Also take this opportunity to fix the failure check in the WARN_ON():
> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, ...) returns a positive
> integer
> on success and a negative integer on failure.
> 
> Signed-off-by: Benjamin Gray 

Reviewed-by: Russell Currey 


Re: [PATCH v8 1/6] powerpc: Allow clearing and restoring registers independent of saved breakpoint state

2022-10-23 Thread Russell Currey
On Fri, 2022-10-21 at 16:22 +1100, Benjamin Gray wrote:
> From: Jordan Niethe 

Hi Ben,

> For the coming temporary mm used for instruction patching, the
> breakpoint registers need to be cleared to prevent them from
> accidentally being triggered. As soon as the patching is done, the
> breakpoints will be restored. The breakpoint state is stored in the
> per
> cpu variable current_brk[]. Add a pause_breakpoints() function which
> will
> clear the breakpoint registers without touching the state in
> current_bkr[]. Add a pair function unpause_breakpoints() which will
 
typo here ^

> move
> the state in current_brk[] back to the registers.
> 
> Signed-off-by: Jordan Niethe 
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/debug.h |  2 ++
>  arch/powerpc/kernel/process.c    | 36 +-
> --
>  2 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/debug.h
> b/arch/powerpc/include/asm/debug.h
> index 86a14736c76c..83f2dc3785e8 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -46,6 +46,8 @@ static inline int debugger_fault_handler(struct
> pt_regs *regs) { return 0; }
>  #endif
>  
>  void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> +void pause_breakpoints(void);
> +void unpause_breakpoints(void);

Nitpick, would (clear/suspend)/restore be clearer than pause/unpause?

>  bool ppc_breakpoint_available(void);
>  #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>  extern void do_send_trap(struct pt_regs *regs, unsigned long
> address,
> diff --git a/arch/powerpc/kernel/process.c
> b/arch/powerpc/kernel/process.c
> index 67da147fe34d..7aee1b30e73c 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -685,6 +685,7 @@ DEFINE_INTERRUPT_HANDLER(do_break)
>  
>  static DEFINE_PER_CPU(struct arch_hw_breakpoint,
> current_brk[HBP_NUM_MAX]);
>  
> +

some bonus whitespace here

>  #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>  /*
>   * Set the debug registers back to their default "safe" values.
> @@ -862,10 +863,8 @@ static inline int set_breakpoint_8xx(struct
> arch_hw_breakpoint *brk)
> return 0;
>  }
>  
> -void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> +static void set_breakpoint(int nr, struct arch_hw_breakpoint
> *brk)

Is there a way to refactor this?  The quad underscore is pretty cursed.

>  {
> -   memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
> -
> if (dawr_enabled())
> // Power8 or later
> set_dawr(nr, brk);
> @@ -879,6 +878,12 @@ void __set_breakpoint(int nr, struct
> arch_hw_breakpoint *brk)
> WARN_ON_ONCE(1);
>  }
>  
> +void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> +{
> +   memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
> +   set_breakpoint(nr, brk);
> +}
> +
>  /* Check if we have DAWR or DABR hardware */
>  bool ppc_breakpoint_available(void)
>  {
> @@ -891,6 +896,31 @@ bool ppc_breakpoint_available(void)
>  }
>  EXPORT_SYMBOL_GPL(ppc_breakpoint_available);
>  
> +/* Disable the breakpoint in hardware without touching current_brk[]
> */
> +void pause_breakpoints(void)
> +{
> +   struct arch_hw_breakpoint brk = {0};
> +   int i;
> +
> +   if (!ppc_breakpoint_available())
> +   return;
> +
> +   for (i = 0; i < nr_wp_slots(); i++)
> +   set_breakpoint(i, &brk);
> +}
> +
> +/* Re-enable the breakpoint in hardware from current_brk[] */
> +void unpause_breakpoints(void)
> +{
> +   int i;
> +
> +   if (!ppc_breakpoint_available())
> +   return;
> +
> +   for (i = 0; i < nr_wp_slots(); i++)
> +   set_breakpoint(i, this_cpu_ptr(&current_brk[i]));
> +}
> +
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  
>  static inline bool tm_enabled(struct task_struct *tsk)



[PATCH] powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults

2022-10-23 Thread Nicholas Piggin
This option increases the number of hash misses by limiting the number
of kernel HPT entries, by keeping a per-CPU record of the last kernel
HPTEs installed, and removing that from the hash table on the next hash
insertion. A timer round-robins CPUs removing remaining kernel HPTEs and
clearing the TLB (in the case of bare metal) to increase and slightly
randomise kernel fault activity.

Signed-off-by: Nicholas Piggin 
---

I tried this a while ago and it had some lockup issues, I think I fixed
them. Deadlock if the same hpte group is used as the one to be cleared,
solved by deferring the removal in that case. Livelock if CPUs are
concurrently faulting in HPTEs and removing them, solved by limiting
HPTE remove iterations. And thrashing / practical livelock due to
insufficient TLB entries on pseries, solved by keeping the last 16
translations cached. Also added a periodic tlb flush timer to add a
bit more randomness to faults.
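
For readers unfamiliar with the scheme, here is a rough conceptual sketch of
the eviction policy described above (all names are hypothetical and not taken
from the patch, which operates on real HPTE groups):

/* Conceptual sketch only -- not the patch's implementation. */
#define STRESS_HPT_LAST_N 16	/* keep the last N kernel translations cached */

struct stress_hpt_state {
	unsigned long last[STRESS_HPT_LAST_N];	/* recently installed kernel HPTE slots */
	int next;				/* ring index */
};

static void stress_hpt_record_and_evict(struct stress_hpt_state *s,
					unsigned long new_slot,
					void (*remove_hpte)(unsigned long slot))
{
	unsigned long victim = s->last[s->next];

	/* remember the new insertion, overwriting the oldest record */
	s->last[s->next] = new_slot;
	s->next = (s->next + 1) % STRESS_HPT_LAST_N;

	/*
	 * Evict the oldest recorded kernel HPTE so the kernel's working set
	 * in the hash table stays small; the real patch defers this when the
	 * victim shares the group just inserted into, to avoid the deadlock
	 * mentioned above.
	 */
	if (victim)
		remove_hpte(victim);
}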

This catches the irq return bug that the stress_code_patching.sh
test was triggering.

Thanks,
Nick

 .../admin-guide/kernel-parameters.txt |   5 +
 arch/powerpc/mm/book3s64/hash_4k.c|   5 +
 arch/powerpc/mm/book3s64/hash_64k.c   |  10 ++
 arch/powerpc/mm/book3s64/hash_utils.c | 129 +-
 arch/powerpc/mm/book3s64/internal.h   |  11 ++
 5 files changed, 159 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index a465d5242774..9f3d256529d0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1042,6 +1042,11 @@
them frequently to increase the rate of SLB faults
on kernel addresses.
 
+   stress_hpt  [PPC]
+   Limits the number of kernel HPT entries in the hash
+   page table to increase the rate of hash page table
+   faults on kernel addresses.
+
disable=[IPV6]
See Documentation/networking/ipv6.rst.
 
diff --git a/arch/powerpc/mm/book3s64/hash_4k.c 
b/arch/powerpc/mm/book3s64/hash_4k.c
index 7de1a8a0c62a..02acbfd05b46 100644
--- a/arch/powerpc/mm/book3s64/hash_4k.c
+++ b/arch/powerpc/mm/book3s64/hash_4k.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 
+#include "internal.h"
+
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
   pte_t *ptep, unsigned long trap, unsigned long flags,
   int ssize, int subpg_prot)
@@ -118,6 +120,9 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
}
new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
new_pte |= pte_set_hidx(ptep, rpte, 0, slot, PTRS_PER_PTE);
+
+   if (stress_hpt())
+   hpt_do_stress(ea, hpte_group);
}
*ptep = __pte(new_pte & ~H_PAGE_BUSY);
return 0;
diff --git a/arch/powerpc/mm/book3s64/hash_64k.c 
b/arch/powerpc/mm/book3s64/hash_64k.c
index 998c6817ed47..954af420f358 100644
--- a/arch/powerpc/mm/book3s64/hash_64k.c
+++ b/arch/powerpc/mm/book3s64/hash_64k.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 
+#include "internal.h"
+
 /*
  * Return true, if the entry has a slot value which
  * the software considers as invalid.
@@ -216,6 +218,9 @@ int __hash_page_4K(unsigned long ea, unsigned long access, 
unsigned long vsid,
new_pte |= pte_set_hidx(ptep, rpte, subpg_index, slot, PTRS_PER_PTE);
new_pte |= H_PAGE_HASHPTE;
 
+   if (stress_hpt())
+   hpt_do_stress(ea, hpte_group);
+
*ptep = __pte(new_pte & ~H_PAGE_BUSY);
return 0;
 }
@@ -327,7 +332,12 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
new_pte |= pte_set_hidx(ptep, rpte, 0, slot, PTRS_PER_PTE);
+
+   if (stress_hpt())
+   hpt_do_stress(ea, hpte_group);
}
+
*ptep = __pte(new_pte & ~H_PAGE_BUSY);
+
return 0;
 }
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
b/arch/powerpc/mm/book3s64/hash_utils.c
index df008edf7be0..8440e524e57b 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -471,7 +471,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long 
vend,
return ret;
 }
 
-static bool disable_1tb_segments = false;
+static bool disable_1tb_segments __ro_after_init;
 
 static int __init parse_disable_1tb_segments(char *p)
 {
@@ -480,6 +480,40 @@ static int __init parse_disable_1tb_segments(char *p)
 }
 early_param("disable_1tb_segments", parse_disable_1tb_segments);
 
+bool stress_hpt_enabled __initdata;
+
+static int __init parse_stress_hpt(char *p)
+{
+   stress_hpt_enabled = true;
+   return 0;
+}
+early_param("stress_hpt", parse_stress_hpt);
+

Re: [PATCH v1 2/2] stackprotector: actually use get_random_canary()

2022-10-23 Thread Guo Ren
On Mon, Oct 24, 2022 at 4:32 AM Jason A. Donenfeld  wrote:
>
> The RNG always mixes in the Linux version extremely early in boot. It
> also always includes a cycle counter, not only during early boot, but
> each and every time it is invoked prior to being fully initialized.
> Together, this means that the use of additional xors inside of the
> various stackprotector.h files is superfluous and over-complicated.
> Instead, we can get exactly the same thing, but better, by just calling
> `get_random_canary()`.
>
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/arm/include/asm/stackprotector.h |  9 +
>  arch/arm64/include/asm/stackprotector.h   |  9 +
>  arch/csky/include/asm/stackprotector.h| 10 +-
>  arch/mips/include/asm/stackprotector.h|  9 +
>  arch/powerpc/include/asm/stackprotector.h | 10 +-
>  arch/riscv/include/asm/stackprotector.h   | 10 +-
>  arch/sh/include/asm/stackprotector.h  | 10 +-
>  arch/x86/include/asm/stackprotector.h | 14 +-
>  arch/xtensa/include/asm/stackprotector.h  |  7 +--
>  9 files changed, 9 insertions(+), 79 deletions(-)
>
> diff --git a/arch/arm/include/asm/stackprotector.h 
> b/arch/arm/include/asm/stackprotector.h
> index 088d03161be5..0bd4979759f1 100644
> --- a/arch/arm/include/asm/stackprotector.h
> +++ b/arch/arm/include/asm/stackprotector.h
> @@ -15,9 +15,6 @@
>  #ifndef _ASM_STACKPROTECTOR_H
>  #define _ASM_STACKPROTECTOR_H 1
>
> -#include 
> -#include 
> -
>  #include 
>
>  extern unsigned long __stack_chk_guard;
> @@ -30,11 +27,7 @@ extern unsigned long __stack_chk_guard;
>   */
>  static __always_inline void boot_init_stack_canary(void)
>  {
> -   unsigned long canary;
> -
> -   /* Try to get a semi random initial value. */
> -   get_random_bytes(&canary, sizeof(canary));
> -   canary ^= LINUX_VERSION_CODE;
> +   unsigned long canary = get_random_canary();
>
> current->stack_canary = canary;
>  #ifndef CONFIG_STACKPROTECTOR_PER_TASK
> diff --git a/arch/arm64/include/asm/stackprotector.h 
> b/arch/arm64/include/asm/stackprotector.h
> index 33f1bb453150..ae3ad80f51fe 100644
> --- a/arch/arm64/include/asm/stackprotector.h
> +++ b/arch/arm64/include/asm/stackprotector.h
> @@ -13,8 +13,6 @@
>  #ifndef __ASM_STACKPROTECTOR_H
>  #define __ASM_STACKPROTECTOR_H
>
> -#include 
> -#include 
>  #include 
>
>  extern unsigned long __stack_chk_guard;
> @@ -28,12 +26,7 @@ extern unsigned long __stack_chk_guard;
>  static __always_inline void boot_init_stack_canary(void)
>  {
>  #if defined(CONFIG_STACKPROTECTOR)
> -   unsigned long canary;
> -
> -   /* Try to get a semi random initial value. */
> -   get_random_bytes(&canary, sizeof(canary));
> -   canary ^= LINUX_VERSION_CODE;
> -   canary &= CANARY_MASK;
> +   unsigned long canary = get_random_canary();
>
> current->stack_canary = canary;
> if (!IS_ENABLED(CONFIG_STACKPROTECTOR_PER_TASK))
> diff --git a/arch/csky/include/asm/stackprotector.h 
> b/arch/csky/include/asm/stackprotector.h
> index d7cd4e51edd9..d23747447166 100644
> --- a/arch/csky/include/asm/stackprotector.h
> +++ b/arch/csky/include/asm/stackprotector.h
> @@ -2,9 +2,6 @@
>  #ifndef _ASM_STACKPROTECTOR_H
>  #define _ASM_STACKPROTECTOR_H 1
>
> -#include 
> -#include 
> -
>  extern unsigned long __stack_chk_guard;
>
>  /*
> @@ -15,12 +12,7 @@ extern unsigned long __stack_chk_guard;
>   */
>  static __always_inline void boot_init_stack_canary(void)
>  {
> -   unsigned long canary;
> -
> -   /* Try to get a semi random initial value. */
> -   get_random_bytes(&canary, sizeof(canary));
> -   canary ^= LINUX_VERSION_CODE;
> -   canary &= CANARY_MASK;
> +   unsigned long canary = get_random_canary();
Acked-by: Guo Ren  #csky part

>
> current->stack_canary = canary;
> __stack_chk_guard = current->stack_canary;
> diff --git a/arch/mips/include/asm/stackprotector.h 
> b/arch/mips/include/asm/stackprotector.h
> index 68d4be9e1254..518c192ad982 100644
> --- a/arch/mips/include/asm/stackprotector.h
> +++ b/arch/mips/include/asm/stackprotector.h
> @@ -15,9 +15,6 @@
>  #ifndef _ASM_STACKPROTECTOR_H
>  #define _ASM_STACKPROTECTOR_H 1
>
> -#include 
> -#include 
> -
>  extern unsigned long __stack_chk_guard;
>
>  /*
> @@ -28,11 +25,7 @@ extern unsigned long __stack_chk_guard;
>   */
>  static __always_inline void boot_init_stack_canary(void)
>  {
> -   unsigned long canary;
> -
> -   /* Try to get a semi random initial value. */
> -   get_random_bytes(&canary, sizeof(canary));
> -   canary ^= LINUX_VERSION_CODE;
> +   unsigned long canary = get_random_canary();
>
> current->stack_canary = canary;
> __stack_chk_guard = current->stack_canary;
> diff --git a/arch/powerpc/include/asm/stackprotector.h 
> b/arch/powerpc/include/asm/stackprotector.h
> index 1c8460e23583..283c34647856 100644
> --- a/arch/powerpc/include/asm/stackprotector.h
> +++ 

[PATCH v3 1/3] powerpc: Add common pud_pfn stub for all platforms

2022-10-23 Thread Rohan McLure
Prior to this commit, pud_pfn was implemented with BUILD_BUG as the inline
function for 64-bit Book3S systems but is never included, as its
invocations in generic code are guarded by calls to pud_devmap which return
zero on such systems. A future patch will provide support for page table
checks, the generic code for which depends on a pud_pfn stub being
implemented, even while the patch will not interact with puds directly.

Remove the 64-bit Book3S stub and define pud_pfn to warn on all
platforms. pud_pfn may be defined properly on a per-platform basis
should it grow real usages in future.

Signed-off-by: Rohan McLure 
---
V2: Remove conditional BUILD_BUG and BUG. Instead warn on usage.
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 10 --
 arch/powerpc/include/asm/pgtable.h   | 14 ++
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 486902aff040..f9aefa492df0 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1413,16 +1413,6 @@ static inline int pgd_devmap(pgd_t pgd)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline int pud_pfn(pud_t pud)
-{
-   /*
-* Currently all calls to pud_pfn() are gated around a pud_devmap()
-* check so this should never be used. If it grows another user we
-* want to know about it.
-*/
-   BUILD_BUG();
-   return 0;
-}
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
 void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 33f4bf8d22b0..36956fb440e1 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -158,6 +158,20 @@ struct seq_file;
 void arch_report_meminfo(struct seq_file *m);
 #endif /* CONFIG_PPC64 */
 
+/*
+ * Currently only consumed by page_table_check_pud_{set,clear}. Since clears
+ * and sets to page table entries at any level are done through
+ * page_table_check_pte_{set,clear}, provide stub implementation.
+ */
+#ifndef pud_pfn
+#define pud_pfn pud_pfn
+static inline int pud_pfn(pud_t pud)
+{
+   WARN(1, "pud: platform does not use pud entries directly");
+   return 0;
+}
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.34.1



[PATCH v3 2/3] powerpc: mm: add p{te,md,ud}_user_accessible_page helpers

2022-10-23 Thread Rohan McLure
Add the following helpers for detecting whether a page table entry
is a leaf and is accessible to user space.

 * pte_user_accessible_page
 * pmd_user_accessible_page
 * pud_user_accessible_page

Also implement missing pud_user definitions for both Book3S/nohash 64-bit
systems, and pmd_user for Book3S/nohash 32-bit systems.

Signed-off-by: Rohan McLure 
---
V2: Provide missing pud_user implementations, use p{u,m}d_is_leaf.
V3: Provide missing pmd_user implementations as stubs in 32-bit.
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  4 
 arch/powerpc/include/asm/book3s/64/pgtable.h | 10 ++
 arch/powerpc/include/asm/nohash/32/pgtable.h |  1 +
 arch/powerpc/include/asm/nohash/64/pgtable.h | 10 ++
 arch/powerpc/include/asm/pgtable.h   | 15 +++
 5 files changed, 40 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 40041ac713d9..8bf1c538839a 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -531,6 +531,10 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
 }
 
+static inline bool pmd_user(pmd_t pmd)
+{
+   return 0;
+}
 
 
 /* This low level function performs the actual PTE insertion
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f9aefa492df0..3083111f9d0a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -621,6 +621,16 @@ static inline bool pte_user(pte_t pte)
return !(pte_raw(pte) & cpu_to_be64(_PAGE_PRIVILEGED));
 }
 
+static inline bool pmd_user(pmd_t pmd)
+{
+   return !(pmd_raw(pmd) & cpu_to_be64(_PAGE_PRIVILEGED));
+}
+
+static inline bool pud_user(pud_t pud)
+{
+   return !(pud_raw(pud) & cpu_to_be64(_PAGE_PRIVILEGED));
+}
+
 #define pte_access_permitted pte_access_permitted
 static inline bool pte_access_permitted(pte_t pte, bool write)
 {
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 9091e4904a6b..b92044d9d778 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -354,6 +354,7 @@ static inline int pte_young(pte_t pte)
 #endif
 
 #define pmd_page(pmd)  pfn_to_page(pmd_pfn(pmd))
+#define pmd_user(pmd)  0
 /*
  * Encode and decode a swap entry.
  * Note that the bits we use in a PTE for representing a swap entry
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 599921cc257e..23c5135178d1 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -123,6 +123,11 @@ static inline pte_t pmd_pte(pmd_t pmd)
return __pte(pmd_val(pmd));
 }
 
+static inline bool pmd_user(pmd_t pmd)
+{
+   return (pmd_val(pmd) & _PAGE_USER) == _PAGE_USER;
+}
+
 #define pmd_none(pmd)  (!pmd_val(pmd))
#define pmd_bad(pmd)   (!is_kernel_addr(pmd_val(pmd)) \
 || (pmd_val(pmd) & PMD_BAD_BITS))
@@ -158,6 +163,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline bool pud_user(pud_t pud)
+{
+   return (pud_val(pud) & _PAGE_USER) == _PAGE_USER;
+}
+
 static inline pud_t pte_pud(pte_t pte)
 {
return __pud(pte_val(pte));
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 36956fb440e1..3cb5de9f1aa4 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -172,6 +172,21 @@ static inline int pud_pfn(pud_t pud)
 }
 #endif
 
+static inline bool pte_user_accessible_page(pte_t pte)
+{
+   return pte_present(pte) && pte_user(pte);
+}
+
+static inline bool pmd_user_accessible_page(pmd_t pmd)
+{
+   return pmd_is_leaf(pmd) && pmd_present(pmd) && pmd_user(pmd);
+}
+
+static inline bool pud_user_accessible_page(pud_t pud)
+{
+   return pud_is_leaf(pud) && pud_present(pud) && pud_user(pud);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.34.1



[PATCH v3 3/3] powerpc: mm: support page table check

2022-10-23 Thread Rohan McLure
On creation and clearing of a page table mapping, instrument such calls
by invoking page_table_check_pte_set and page_table_check_pte_clear
respectively. These calls serve as a sanity check against illegal
mappings.

Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all ppc64, and 32-bit
platforms implementing Book3S.

Change pud_pfn to be a runtime bug rather than a build bug as it is
consumed by page_table_check_pud_{clear,set} which are not called.

See also:

riscv support in commit 3fee229a8eb9 ("riscv/mm: enable
ARCH_SUPPORTS_PAGE_TABLE_CHECK")
arm64 in commit 42b2547137f5 ("arm64/mm: enable
ARCH_SUPPORTS_PAGE_TABLE_CHECK")
x86_64 in commit d283d422c6c4 ("x86: mm: add x86_64 support for page table
check")

Signed-off-by: Rohan McLure 
---
V2: Update spacing and types assigned to pte_update calls.
V3: Update one last pte_update call to remove __pte invocation.
---
 arch/powerpc/Kconfig |  1 +
 arch/powerpc/include/asm/book3s/32/pgtable.h |  9 -
 arch/powerpc/include/asm/book3s/64/pgtable.h | 18 +++---
 arch/powerpc/include/asm/nohash/32/pgtable.h |  7 ++-
 arch/powerpc/include/asm/nohash/64/pgtable.h |  8 ++--
 arch/powerpc/include/asm/nohash/pgtable.h|  1 +
 6 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4c466acdc70d..6c213ac46a92 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -149,6 +149,7 @@ config PPC
select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEBUG_PAGEALLOC   if PPC_BOOK3S || PPC_8xx || 40x
+   select ARCH_SUPPORTS_PAGE_TABLE_CHECK
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF if PPC64
select ARCH_USE_MEMTEST
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 8bf1c538839a..6a592426b935 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -53,6 +53,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include 
+
 static inline bool pte_user(pte_t pte)
 {
return pte_val(pte) & _PAGE_USER;
@@ -353,7 +355,11 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
   pte_t *ptep)
 {
-   return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
+   pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
+
+   page_table_check_pte_clear(mm, addr, old_pte);
+
+   return old_pte;
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -545,6 +551,7 @@ static inline bool pmd_user(pmd_t pmd)
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
 {
+   page_table_check_pte_set(mm, addr, ptep, pte);
 #if defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use 
the
 * helper pte_update() which does an atomic update. We need to do that
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 3083111f9d0a..b5c5718d9b90 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -181,6 +181,8 @@
 #define PAGE_AGP   (PAGE_KERNEL_NC)
 
 #ifndef __ASSEMBLY__
+#include 
+
 /*
  * page table defines
  */
@@ -484,8 +486,11 @@ static inline void huge_ptep_set_wrprotect(struct 
mm_struct *mm,
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
 {
-   unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
-   return __pte(old);
+   pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
+
+   page_table_check_pte_clear(mm, addr, old_pte);
+
+   return old_pte;
 }
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
@@ -494,11 +499,16 @@ static inline pte_t ptep_get_and_clear_full(struct 
mm_struct *mm,
pte_t *ptep, int full)
 {
if (full && radix_enabled()) {
+   pte_t old_pte;
+
/*
 * We know that this is a full mm pte clear and
 * hence can be sure there is no parallel set_pte.
 */
-   return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+   old_pte = radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+   page_table_check_pte_clear(mm, addr, old_pte);
+
+   return old_pte;
}
return ptep_get_and_clear(mm, addr, ptep);
 }
@@ -884,6 +894,8 @@ static inline void __set_pte_at(struct mm_struct *mm, 
unsigned long addr,
 */
pte = __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
 
+   

[PATCH] powerpc: replace ternary operator with min()

2022-10-23 Thread KaiLong Wang
Fix the following coccicheck warning:

arch/powerpc/xmon/xmon.c:2987: WARNING opportunity for min()
arch/powerpc/xmon/xmon.c:2583: WARNING opportunity for min()

Signed-off-by: KaiLong Wang 
---
 arch/powerpc/xmon/xmon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index f51c882bf902..a7751cd2cc9d 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2580,7 +2580,7 @@ static void xmon_rawdump (unsigned long adrs, long ndump)
unsigned char temp[16];
 
for (n = ndump; n > 0;) {
-   r = n < 16? n: 16;
+   r = min(n, 16);
nr = mread(adrs, temp, r);
adrs += nr;
for (m = 0; m < r; ++m) {
@@ -2984,7 +2984,7 @@ prdump(unsigned long adrs, long ndump)
for (n = ndump; n > 0;) {
printf(REG, adrs);
putchar(' ');
-   r = n < 16? n: 16;
+   r = min(n, 16);
nr = mread(adrs, temp, r);
adrs += nr;
for (m = 0; m < r; ++m) {
-- 
2.25.1


Re: [PATCH v1 0/5] convert tree to get_random_u32_{below,above,between}()

2022-10-23 Thread Theodore Ts'o
On Fri, Oct 21, 2022 at 11:03:22PM -0700, Jakub Kicinski wrote:
> On Sat, 22 Oct 2022 07:47:06 +0200 Jason A. Donenfeld wrote:
> > On Fri, Oct 21, 2022 at 10:32:42PM -0700, Jakub Kicinski wrote:
> > > But whatever. I mean - hopefully there aren't any conflicts in the ~50
> > > networking files you touch. I just wish that people didn't pipe up with
> > > the tree wide changes right after the merge window. Feels like the
> > > worst possible timing.  
> > 
> > Oh, if the timing is what makes this especially worrisome, I have
> > no qualms about rebasing much later, and reposting this series then.
> > I'll do that.
> 
> Cool, thanks! I promise to not be grumpy if you repost around rc6 :)

One way of making things less painful for the stable branch and for
the upstream branch is to *add* new helpers instead of playing
replacement games like s/prandom_u32_max/get_random_u32_below/.  This
is what causes the patch conflict problems.

One advantage of at least adding the new functions to the stable
branches, even if we don't do the wholesale replacement, is that it
makes it much less likely that a backported patch which uses the new
API will fail to compile --- and of course, the major headache case
is one where it's not noticed at first because it didn't get picked up
in people's test compiles until after the Linux x.y.z release has been
pushed out.

Whether it's worth doing the wholesale replacement is a different
question, but separating "add the new functions with one or two use
cases so the functions are actually _used_" from the "convert the
world to use the new functions" from the "remove the old functions",
might make life easier.

- Ted


[PATCH v1 2/2] stackprotector: actually use get_random_canary()

2022-10-23 Thread Jason A. Donenfeld
The RNG always mixes in the Linux version extremely early in boot. It
also always includes a cycle counter, not only during early boot, but
each and every time it is invoked prior to being fully initialized.
Together, this means that the use of additional xors inside of the
various stackprotector.h files is superfluous and over-complicated.
Instead, we can get exactly the same thing, but better, by just calling
`get_random_canary()`.

Signed-off-by: Jason A. Donenfeld 
---
 arch/arm/include/asm/stackprotector.h |  9 +
 arch/arm64/include/asm/stackprotector.h   |  9 +
 arch/csky/include/asm/stackprotector.h| 10 +-
 arch/mips/include/asm/stackprotector.h|  9 +
 arch/powerpc/include/asm/stackprotector.h | 10 +-
 arch/riscv/include/asm/stackprotector.h   | 10 +-
 arch/sh/include/asm/stackprotector.h  | 10 +-
 arch/x86/include/asm/stackprotector.h | 14 +-
 arch/xtensa/include/asm/stackprotector.h  |  7 +--
 9 files changed, 9 insertions(+), 79 deletions(-)

diff --git a/arch/arm/include/asm/stackprotector.h 
b/arch/arm/include/asm/stackprotector.h
index 088d03161be5..0bd4979759f1 100644
--- a/arch/arm/include/asm/stackprotector.h
+++ b/arch/arm/include/asm/stackprotector.h
@@ -15,9 +15,6 @@
 #ifndef _ASM_STACKPROTECTOR_H
 #define _ASM_STACKPROTECTOR_H 1
 
-#include 
-#include 
-
 #include 
 
 extern unsigned long __stack_chk_guard;
@@ -30,11 +27,7 @@ extern unsigned long __stack_chk_guard;
  */
 static __always_inline void boot_init_stack_canary(void)
 {
-   unsigned long canary;
-
-   /* Try to get a semi random initial value. */
-   get_random_bytes(&canary, sizeof(canary));
-   canary ^= LINUX_VERSION_CODE;
+   unsigned long canary = get_random_canary();
 
current->stack_canary = canary;
 #ifndef CONFIG_STACKPROTECTOR_PER_TASK
diff --git a/arch/arm64/include/asm/stackprotector.h 
b/arch/arm64/include/asm/stackprotector.h
index 33f1bb453150..ae3ad80f51fe 100644
--- a/arch/arm64/include/asm/stackprotector.h
+++ b/arch/arm64/include/asm/stackprotector.h
@@ -13,8 +13,6 @@
 #ifndef __ASM_STACKPROTECTOR_H
 #define __ASM_STACKPROTECTOR_H
 
-#include 
-#include 
 #include 
 
 extern unsigned long __stack_chk_guard;
@@ -28,12 +26,7 @@ extern unsigned long __stack_chk_guard;
 static __always_inline void boot_init_stack_canary(void)
 {
 #if defined(CONFIG_STACKPROTECTOR)
-   unsigned long canary;
-
-   /* Try to get a semi random initial value. */
-   get_random_bytes(&canary, sizeof(canary));
-   canary ^= LINUX_VERSION_CODE;
-   canary &= CANARY_MASK;
+   unsigned long canary = get_random_canary();
 
current->stack_canary = canary;
if (!IS_ENABLED(CONFIG_STACKPROTECTOR_PER_TASK))
diff --git a/arch/csky/include/asm/stackprotector.h 
b/arch/csky/include/asm/stackprotector.h
index d7cd4e51edd9..d23747447166 100644
--- a/arch/csky/include/asm/stackprotector.h
+++ b/arch/csky/include/asm/stackprotector.h
@@ -2,9 +2,6 @@
 #ifndef _ASM_STACKPROTECTOR_H
 #define _ASM_STACKPROTECTOR_H 1
 
-#include 
-#include 
-
 extern unsigned long __stack_chk_guard;
 
 /*
@@ -15,12 +12,7 @@ extern unsigned long __stack_chk_guard;
  */
 static __always_inline void boot_init_stack_canary(void)
 {
-   unsigned long canary;
-
-   /* Try to get a semi random initial value. */
-   get_random_bytes(&canary, sizeof(canary));
-   canary ^= LINUX_VERSION_CODE;
-   canary &= CANARY_MASK;
+   unsigned long canary = get_random_canary();
 
current->stack_canary = canary;
__stack_chk_guard = current->stack_canary;
diff --git a/arch/mips/include/asm/stackprotector.h 
b/arch/mips/include/asm/stackprotector.h
index 68d4be9e1254..518c192ad982 100644
--- a/arch/mips/include/asm/stackprotector.h
+++ b/arch/mips/include/asm/stackprotector.h
@@ -15,9 +15,6 @@
 #ifndef _ASM_STACKPROTECTOR_H
 #define _ASM_STACKPROTECTOR_H 1
 
-#include 
-#include 
-
 extern unsigned long __stack_chk_guard;
 
 /*
@@ -28,11 +25,7 @@ extern unsigned long __stack_chk_guard;
  */
 static __always_inline void boot_init_stack_canary(void)
 {
-   unsigned long canary;
-
-   /* Try to get a semi random initial value. */
-   get_random_bytes(&canary, sizeof(canary));
-   canary ^= LINUX_VERSION_CODE;
+   unsigned long canary = get_random_canary();
 
current->stack_canary = canary;
__stack_chk_guard = current->stack_canary;
diff --git a/arch/powerpc/include/asm/stackprotector.h 
b/arch/powerpc/include/asm/stackprotector.h
index 1c8460e23583..283c34647856 100644
--- a/arch/powerpc/include/asm/stackprotector.h
+++ b/arch/powerpc/include/asm/stackprotector.h
@@ -7,8 +7,6 @@
 #ifndef _ASM_STACKPROTECTOR_H
 #define _ASM_STACKPROTECTOR_H
 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -21,13 +19,7 @@
  */
 static __always_inline void boot_init_stack_canary(void)
 {
-   unsigned long canary;
-
-   /* Try to get a semi random initial value. */
-   canary = 

[PATCH v1 1/2] stackprotector: move CANARY_MASK and get_random_canary() into stackprotector.h

2022-10-23 Thread Jason A. Donenfeld
This has nothing to do with random.c and everything to do with stack
protectors. Yes, it uses randomness. But many things use randomness.
random.h and random.c are concerned with the generation of randomness,
not with each and every use. So move this function into the more
specific stackprotector.h file where it belongs.

Signed-off-by: Jason A. Donenfeld 
---
 arch/x86/kernel/cpu/common.c   |  2 +-
 arch/x86/kernel/setup_percpu.c |  2 +-
 arch/x86/kernel/smpboot.c  |  2 +-
 arch/x86/xen/enlighten_pv.c|  2 +-
 include/linux/random.h | 19 ---
 include/linux/stackprotector.h | 19 +++
 kernel/fork.c  |  2 +-
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e508f239098..3f66dd03c091 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -22,9 +22,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 49325caa7307..b26123c90b4f 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -21,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 
 DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
 EXPORT_PER_CPU_SYMBOL(cpu_number);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3f3ea0287f69..dbe09fcc6604 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -80,7 +81,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /* representing HT siblings of each logical CPU */
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index f82857e48815..745420853a7c 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -64,7 +65,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/include/linux/random.h b/include/linux/random.h
index bf8ed3df3af0..182780cafd45 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -115,25 +115,6 @@ static inline u32 get_random_u32_between(u32 floor, u32 ceil)
return floor + get_random_u32_below(ceil - floor);
 }
 
-/*
- * On 64-bit architectures, protect against non-terminated C string overflows
- * by zeroing out the first byte of the canary; this leaves 56 bits of entropy.
- */
-#ifdef CONFIG_64BIT
-# ifdef __LITTLE_ENDIAN
-#  define CANARY_MASK 0xffffffffffffff00UL
-# else /* big endian, 64 bits: */
-#  define CANARY_MASK 0x00ffffffffffffffUL
-# endif
-#else /* 32 bits: */
-# define CANARY_MASK 0xffffffffUL
-#endif
-
-static inline unsigned long get_random_canary(void)
-{
-   return get_random_long() & CANARY_MASK;
-}
-
 void __init random_init_early(const char *command_line);
 void __init random_init(void);
 bool rng_is_initialized(void);
diff --git a/include/linux/stackprotector.h b/include/linux/stackprotector.h
index 4c678c4fec58..9c88707d9a0f 100644
--- a/include/linux/stackprotector.h
+++ b/include/linux/stackprotector.h
@@ -6,6 +6,25 @@
 #include 
 #include 
 
+/*
+ * On 64-bit architectures, protect against non-terminated C string overflows
+ * by zeroing out the first byte of the canary; this leaves 56 bits of entropy.
+ */
+#ifdef CONFIG_64BIT
+# ifdef __LITTLE_ENDIAN
+#  define CANARY_MASK 0xffffffffffffff00UL
+# else /* big endian, 64 bits: */
+#  define CANARY_MASK 0x00ffffffffffffffUL
+# endif
+#else /* 32 bits: */
+# define CANARY_MASK 0xffffffffUL
+#endif
+
+static inline unsigned long get_random_canary(void)
+{
+   return get_random_long() & CANARY_MASK;
+}
+
 #if defined(CONFIG_STACKPROTECTOR) || defined(CONFIG_ARM64_PTR_AUTH)
 # include 
 #else
diff --git a/kernel/fork.c b/kernel/fork.c
index 08969f5aa38d..ec57cae58ff1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -75,7 +75,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -97,6 +96,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-- 
2.38.1



[PATCH v1 0/2] cleanup stackprotector canary generation

2022-10-23 Thread Jason A. Donenfeld
Stack canary generation currently lives partially in random.h, where it
doesn't belong, and is in general a bit overcomplicated. This small
patchset fixes up both issues. I'll take these in my tree, unless
somebody else prefers to do so.
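
For illustration, here is roughly what the per-arch boot_init_stack_canary()
collapses to (a sketch based on the csky/mips hunks shown earlier in the
thread; it is not itself part of the series, and details such as the
__stack_chk_guard vs. per-task handling vary per architecture). Before:

	static __always_inline void boot_init_stack_canary(void)
	{
		unsigned long canary;

		/* Try to get a semi random initial value. */
		get_random_bytes(&canary, sizeof(canary));
		canary ^= LINUX_VERSION_CODE;
		canary &= CANARY_MASK;

		current->stack_canary = canary;
		__stack_chk_guard = current->stack_canary;
	}

After, using the helper that patch 1 moves into <linux/stackprotector.h>:

	static __always_inline void boot_init_stack_canary(void)
	{
		unsigned long canary = get_random_canary();

		current->stack_canary = canary;
		__stack_chk_guard = current->stack_canary;
	}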

Cc: Albert Ou 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Catalin Marinas 
Cc: Chris Zankel 
Cc: Christophe Leroy 
Cc: Dave Hansen 
Cc: Greg Kroah-Hartman 
Cc: Guo Ren 
Cc: H. Peter Anvin 
Cc: Ingo Molnar 
Cc: Juergen Gross 
Cc: Max Filippov 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Palmer Dabbelt 
Cc: Paul Walmsley 
Cc: Rich Felker 
Cc: Russell King 
Cc: Thomas Bogendoerfer 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: Yoshinori Sato 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: linux...@vger.kernel.org
Cc: linux-xte...@linux-xtensa.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x...@kernel.org

Jason A. Donenfeld (2):
  stackprotector: move CANARY_MASK and get_random_canary() into
stackprotector.h
  stackprotector: actually use get_random_canary()

 arch/arm/include/asm/stackprotector.h |  9 +
 arch/arm64/include/asm/stackprotector.h   |  9 +
 arch/csky/include/asm/stackprotector.h| 10 +-
 arch/mips/include/asm/stackprotector.h|  9 +
 arch/powerpc/include/asm/stackprotector.h | 10 +-
 arch/riscv/include/asm/stackprotector.h   | 10 +-
 arch/sh/include/asm/stackprotector.h  | 10 +-
 arch/x86/include/asm/stackprotector.h | 14 +-
 arch/x86/kernel/cpu/common.c  |  2 +-
 arch/x86/kernel/setup_percpu.c|  2 +-
 arch/x86/kernel/smpboot.c |  2 +-
 arch/x86/xen/enlighten_pv.c   |  2 +-
 arch/xtensa/include/asm/stackprotector.h  |  7 +--
 include/linux/random.h| 19 ---
 include/linux/stackprotector.h| 19 +++
 kernel/fork.c |  2 +-
 16 files changed, 33 insertions(+), 103 deletions(-)

-- 
2.38.1



Re: [PATCH v3 0/2] Copy-on-write poison recovery

2022-10-23 Thread Shuai Xue



On 2022/10/22 4:01 AM, Tony Luck wrote:
> Part 1 deals with the process that triggered the copy on write
> fault with a store to a shared read-only page. That process is
> sent a SIGBUS with the usual machine check decoration to specify
> the virtual address of the lost page, together with the scope.
> 
> Part 2 sets up to asynchronously take the page with the uncorrected
> error offline to prevent additional machine check faults. H/t to
> Miaohe Lin  and Shuai Xue 
> for pointing me to the existing function to queue a call to
> memory_failure().
> 
> On x86 there is some duplicate reporting (because the error is
> also signalled by the memory controller as well as by the core
> that triggered the machine check). Console logs look like this:
> 
> [ 1647.723403] mce: [Hardware Error]: Machine check events logged
>   Machine check from kernel copy routine
> 
> [ 1647.723414] MCE: Killing einj_mem_uc:3600 due to hardware memory corruption fault at 7f3309503400
>   x86 fault handler sends SIGBUS to child process
> 
> [ 1647.735183] Memory failure: 0x905b92d: recovery action for dirty LRU page: Recovered
>   Async call to memory_failure() from copy on write path

The recovery action might also be handled asynchronously in the CMCI
uc_decode_notifier() handler signaled by the memory controller, right?

I have one more memory failure log than yours.

[ 3187.485742] MCE: Killing einj_mem_uc:31746 due to hardware memory corruption fault at 7fc4bf7cf400
[ 3187.740620] Memory failure: 0x1a3b80: recovery action for dirty LRU page: Recovered
uc_decode_notifier() processes memory controller report

[ 3187.748272] Memory failure: 0x1a3b80: already hardware poisoned
Workqueue: events memory_failure_work_func // queued by ghes_do_memory_failure

[ 3187.754194] Memory failure: 0x1a3b80: already hardware poisoned
Workqueue: events memory_failure_work_func // queued by __wp_page_copy_user

[ 3188.615920] MCE: Killing einj_mem_uc:31745 due to hardware memory corruption fault at 7fc4bf7cf400

Best Regards,
Shuai

> 
> [ 1647.748397] Memory failure: 0x905b92d: already hardware poisoned
>   uc_decode_notifier() processes memory controller report
> 
> [ 1647.761313] MCE: Killing einj_mem_uc:3599 due to hardware memory corruption fault at 7f3309503400
>   Parent process tries to read poisoned page. Page has been unmapped, so
>   #PF handler sends SIGBUS
> 
> 
> Tony Luck (2):
>   mm, hwpoison: Try to recover from copy-on write faults
>   mm, hwpoison: When copy-on-write hits poison, take page offline
> 
>  include/linux/highmem.h | 24 
>  include/linux/mm.h  |  5 -
>  mm/memory.c | 32 ++--
>  3 files changed, 50 insertions(+), 11 deletions(-)
> 
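
For readers following the thread, a minimal sketch of the recovery step
described in the quoted cover letter (illustrative only: the helper name
below is made up, it ignores highmem/kmap, and it assumes an architecture
that provides copy_mc_to_kernel(), i.e. CONFIG_ARCH_HAS_COPY_MC; see the
actual series for how this is wired into the copy-on-write path):

	/* Copy the COW source page with a machine-check safe copy. */
	static int cow_copy_page_recoverable(struct page *dst, struct page *src)
	{
		/* copy_mc_to_kernel() returns the number of bytes not copied. */
		if (copy_mc_to_kernel(page_address(dst), page_address(src), PAGE_SIZE)) {
			/* Part 2: queue the poisoned source page for offlining. */
			memory_failure_queue(page_to_pfn(src), 0);
			/* Part 1: caller unwinds and the faulting task gets SIGBUS. */
			return -EHWPOISON;
		}
		return 0;
	}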


Re: [PATCH v2] mm, hwpoison: Try to recover from copy-on write faults

2022-10-23 Thread Shuai Xue



On 2022/10/22 12:30 AM, Luck, Tony wrote:
>>> But maybe it is some RMW instruction ... then, if all the above options
>>> didn't happen ... we could get another machine check from the same
>>> address. But then we just follow the usual recovery path.
> 
> 
>> Let's assume the instruction that caused the COW is in the 63/64 case, i.e.,
>> it is writing a different cache line from the poisoned one. But the new_page
>> allocated in COW is dropped, right? So it might page fault again?
> 
> It can, but this should be no surprise to a user that has a signal handler for
> a h/w event (SIGBUS, SIGSEGV, SIGILL) that does nothing to address the
> problem, but simply returns to re-execute the same instruction that caused
> the original trap.
> 
> There may be badly written signal handlers that do this. But they just cause
> pain for themselves. Linux can keep taking the traps and fixing things up and
> sending a new signal over and over.
> 
> In this case that loop may involve taking the machine check again, so some
> extra pain for the kernel, but recoverable machine checks on Intel/x86
> switched from broadcast to delivery to just the logical CPU that tried to
> consume the poison a few generations back. So only a bit more painful than
> a repeated page fault.
> 
> -Tony
> 
> 

I see, thanks for your patient explanation :)

Best Regards,
Shuai
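
(As an illustration of the "badly written signal handler" case Tony
describes: a userspace sketch, not something from the thread. The handler
does nothing to address the fault, so returning from it simply re-executes
the faulting instruction and takes the same trap, and the same signal,
again.)

	#include <signal.h>
	#include <string.h>

	/* No remediation here: no remapping, no siglongjmp, no _exit. */
	static void bad_sigbus_handler(int sig, siginfo_t *info, void *ucontext)
	{
	}

	static void install_bad_handler(void)
	{
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_flags = SA_SIGINFO;
		sa.sa_sigaction = bad_sigbus_handler;
		sigaction(SIGBUS, &sa, NULL);
	}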