Re: Issues with the first PowerPC updates for the kernel 6.1

2022-10-30 Thread Christian Zigotzky

On 29 October 2022 at 5:33 pm, Segher Boessenkool wrote:

On Mon, Oct 17, 2022 at 09:53:04AM +0200, Christian Zigotzky wrote:

On 17. Oct 2022, at 02:43, Michael Ellerman  wrote:
Previously BIG_ENDIAN && GENERIC_CPU would use -mcpu=power5, now it uses
-mcpu=power4.

Maybe this is the issue. We will wait and not release RC1 for testing,
because this issue makes these new kernels risky for our testers.

It is really important not to rewrite code that worked well before.
Bug fixing and adding some new features is fine, but rewriting good code is
expensive and doesn't make any sense.

It was just a bugfix, and a (partial) revert.

471d7ff8b51b says it removed ISA 2.00 support (original power4, "GP").
Support for ISA 2.01 was retained it says.  That is power4+, "GQ", but
also 970 (Apple G5).  That patch actually switched to ISA 2.02 though,
unintentionally, and code generated for ISA 2.02 will not run on systems
like the 970, in principle.  It is just one uncommon instruction that is
problematic, namely popcntb, because the kernel does not use floating
point at all, so that is why we got away with it for so long (most code
that does use fp will fall flat on its face in no time).  It still is a
bug fix though!

PA6T is ISA 2.04, it's not clear how this (bugfix, and revert!) change
made code not run on PA6T anymore.  Smells a lot like something indirect
(or triply indirect), a separate bug, something that was introduced in
the last two years maybe, but I'll even bet it is something *exposed* in
that time, a bug that has been here for longer!


Segher

Unfortunately my FSL P5040 system is also affected.

-- Christian


Re: Issues with the first PowerPC updates for the kernel 6.1

2022-10-30 Thread Christian Zigotzky

On 29 October 2022 at 01:44 pm, Christian Zigotzky wrote:

On 17 October 2022 at 09:53 am, Christian Zigotzky wrote:

On 17. Oct 2022, at 02:43, Michael Ellerman  wrote:
Previously BIG_ENDIAN && GENERIC_CPU would use -mcpu=power5, now it uses
-mcpu=power4.

Maybe this is the issue. We will wait and not release RC1 for testing,
because this issue makes these new kernels risky for our testers.




cheers



I compiled the RC2 of kernel 6.1 today.

After the first boot of the RC2, the file system was immediately 100%
full.  This is the same issue we saw with the git kernel 3
weeks ago.


The Cyrus+ and Nemo boards are affected.

I wrote 3 weeks ago:

Hi All,

I successfully compiled the latest git kernel with the first PowerPC 
updates yesterday.


Unfortunately this kernel is really dangerous. Many things, for example
NetworkManager and LightDM, don't work anymore and produced several
gigabytes of config files until the partition was filled.

I deleted some files, like the resolv.conf that had a size of over 200 GB!

Unfortunately, MintPPC was still damaged. For example, LightDM doesn't
work anymore and the MATE desktop doesn't display any icons anymore
because Caja wasn't able to allocate memory.

In this case, bisecting isn't an option and I have to wait some weeks.
It is really difficult to find the issue if the userland gets damaged
again and again.


Cheers,
Christian

---

Maybe there is an issue in my kernel configs. Could you please check
them? Please find the configs attached. Could you please test
the RC2 on your FSL and pasemi machines?


Thanks,
Christian


Hi All,

I bisected today because Void PPC recovers after a reboot: the disk
space is released again. [1]


Result: c2e7a19827eec443a7cbe85e8d959052412d6dc3 (powerpc: Use generic 
fallocate compatibility syscall) is the first bad commit. [2]


I was able to create a patch for reverting this bad commit. [3]

I compiled the kernel with this patch. After that the kernel works 
without any problems.


Please check the first bad commit. [2]

Thanks,
Christian


[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=56099#p56099
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2e7a19827eec443a7cbe85e8d959052412d6dc3

[3] syscall.patch:

diff -rupN a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
--- a/arch/powerpc/include/asm/syscalls.h   2022-10-30 13:53:28.956001116 +0100
+++ b/arch/powerpc/include/asm/syscalls.h   2022-10-30 13:55:39.166300756 +0100

@@ -15,6 +15,7 @@
 #include 
 #include 

+long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2);

 #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
 long sys_ni_syscall(void);
 #else
diff -rupN a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h

--- a/arch/powerpc/include/asm/unistd.h 2022-10-30 13:53:28.957001103 +0100
+++ b/arch/powerpc/include/asm/unistd.h 2022-10-30 13:56:44.851441888 +0100
@@ -45,7 +45,6 @@
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_SYS_NEWFSTATAT
 #define __ARCH_WANT_COMPAT_STAT
-#define __ARCH_WANT_COMPAT_FALLOCATE
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
 #endif
 #define __ARCH_WANT_SYS_FORK
diff -rupN a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c

--- a/arch/powerpc/kernel/sys_ppc32.c   2022-10-30 13:53:28.967000972 +0100
+++ b/arch/powerpc/kernel/sys_ppc32.c   2022-10-30 13:58:28.993078689 +0100
@@ -97,6 +97,13 @@ PPC32_SYSCALL_DEFINE4(ppc_truncate64,
    return ksys_truncate(path, merge_64(len1, len2));
 }

+long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
+    u32 len1, u32 len2)
+{
+   return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
+    merge_64(len1, len2));
+}
+
 PPC32_SYSCALL_DEFINE4(ppc_ftruncate64,
   unsigned int, fd, u32, reg4,
   unsigned long, len1, unsigned long, len2)


Re: Issues with the first PowerPC updates for the kernel 6.1 #forregzbot

2022-10-30 Thread Thorsten Leemhuis
[Note: this mail is primarily send for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like a mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me below is based on a few template
paragraphs you might already have encountered in similar form.]

Hi, this is your Linux kernel regression tracker.

On 12.10.22 08:51, Christian Zigotzky wrote:
> Hi All,
> 
> I use the Nemo board with a PASemi PA6T CPU and have some issues since the 
> first PowerPC updates for the kernel 6.1.
> 
> I successfully compiled the git kernel with the first PowerPC updates two 
> days ago.
> 
> Unfortunately this kernel is really dangerous. Many things, for example 
> NetworkManager and LightDM, don't work anymore and produced several gigabytes 
> of config files until the partition was filled.
> 
> I deleted some files, like the resolv.conf that had a size of over 200 GB!
> 
> Unfortunately, MintPPC was still damaged. For example, LightDM doesn't work 
> anymore and the MATE desktop doesn't display any icons anymore because Caja 
> wasn't able to allocate memory.
> 
> In this case, bisecting isn't an option and I have to wait some weeks. It is 
> really difficult to find the issue if the userland gets damaged again and 
> again.

Thanks for the report. To be sure below issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced c2e7a19827eec443a7cb
#regzbot title ppc: PASemi PA6T CPU: NetworkManager and LightDM broken,
fill volume with data
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the above webpage explains why this
is important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


Re: [PATCH v6 21/25] powerpc: Provide syscall wrapper

2022-10-30 Thread Andreas Schwab
This breaks powerpc32.  The fallocate syscall misinterprets its
arguments.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v6 21/25] powerpc: Provide syscall wrapper

2022-10-30 Thread Andreas Schwab
It probably breaks every syscall with a 64-bit argument.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH 1/2] powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs

2022-10-30 Thread Andreas Schwab
On Oct 12 2022, Nicholas Piggin wrote:

> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 2bca64f96164..e9e0df4f9a61 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -228,8 +228,10 @@
>  176  64  rt_sigtimedwait sys_rt_sigtimedwait
>  177  nospu   rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
>  178  nospu   rt_sigsuspend   sys_rt_sigsuspend   compat_sys_rt_sigsuspend
> -179  common  pread64 sys_pread64 compat_sys_ppc_pread64
> -180  common  pwrite64    sys_pwrite64    compat_sys_ppc_pwrite64
> +179  32  pread64 sys_ppc_pread64 compat_sys_ppc_pread64
> +179  64  pread64 sys_pread64
> +180  32  pwrite64    sys_ppc_pwrite64    compat_sys_ppc_pwrite64
> +180  64  pwrite64    sys_pwrite64

Doesn't that lack entries for SPU?  Likewise for all other former common
syscalls in this patch.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v4] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-30 Thread Peter Xu
On Fri, Oct 28, 2022 at 11:11:08AM -0700, Mike Kravetz wrote:
> +struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> + unsigned long address, unsigned int flags)
> +{
> + struct hstate *h = hstate_vma(vma);
> + struct mm_struct *mm = vma->vm_mm;
> + unsigned long haddr = address & huge_page_mask(h);
> + struct page *page = NULL;
> + spinlock_t *ptl;
> + pte_t *pte, entry;
> +
> + /*
> +  * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
> +  * follow_hugetlb_page().
> +  */
> + if (WARN_ON_ONCE(flags & FOLL_PIN))
> + return NULL;
> +
> +retry:
> + pte = huge_pte_offset(mm, haddr, huge_page_size(h));
> + if (!pte)
> + return NULL;
> +
> + ptl = huge_pte_lock(h, mm, pte);
> + entry = huge_ptep_get(pte);
> + if (pte_present(entry)) {
> + page = pte_page(entry) +
> + ((address & ~huge_page_mask(h)) >> PAGE_SHIFT);
> + /*
> +  * Note that page may be a sub-page, and with vmemmap
> +  * optimizations the page struct may be read only.
> +  * try_grab_page() will increase the ref count on the
> +  * head page, so this will be OK.
> +  *
> +  * try_grab_page() should always succeed here, because we hold
> +  * the ptl lock and have verified pte_present().
> +  */
> + if (WARN_ON_ONCE(!try_grab_page(page, flags))) {
> + page = NULL;
> + goto out;
> + }
> + } else {
> + if (is_hugetlb_entry_migration(entry)) {
> + spin_unlock(ptl);
> + hugetlb_vma_unlock_read(vma);

Just noticed it when I pulled the latest mm-unstable: this line seems to be a
left-over from v3 and is not needed now?

> + __migration_entry_wait_huge(pte, ptl);
> + goto retry;
> + }
> + /*
> +  * hwpoisoned entry is treated as no_page_table in
> +  * follow_page_mask().
> +  */
> + }
> +out:
> + spin_unlock(ptl);
> + return page;
> +}

-- 
Peter Xu



[PATCH v5] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-30 Thread Mike Kravetz
During discussions of this series [1], it was suggested that hugetlb
handling code in follow_page_mask could be simplified.  At the beginning
of follow_page_mask, there currently is a call to follow_huge_addr which
'may' handle hugetlb pages.  ia64 is the only architecture which provides
a follow_huge_addr routine that does not return error.  Instead, at each
level of the page table a check is made for a hugetlb entry.  If a hugetlb
entry is found, a call to a routine associated with that entry is made.

Currently, there are two checks for hugetlb entries at each page table
level.  The first check is of the form:

	if (p?d_huge())
		page = follow_huge_p?d();

the second check is of the form:

	if (is_hugepd())
		page = follow_huge_pd().

We can replace these checks, as well as the special handling routines
such as follow_huge_p?d() and follow_huge_pd() with a single routine to
handle hugetlb vmas.

A new routine hugetlb_follow_page_mask is called for hugetlb vmas at the
beginning of follow_page_mask.  hugetlb_follow_page_mask will use the
existing routine huge_pte_offset to walk page tables looking for hugetlb
entries.  huge_pte_offset can be overwritten by architectures, and already
handles special cases such as hugepd entries.

[1] https://lore.kernel.org/linux-mm/cover.1661240170.git.baolin.w...@linux.alibaba.com/

Suggested-by: David Hildenbrand 
Signed-off-by: Mike Kravetz 
---
v5 -Remove left over hugetlb_vma_unlock_read
v4 -Remove vma (pmd sharing) locking as this can be called with
FOLL_NOWAIT. Peter
v3 -Change WARN_ON_ONCE() to BUILD_BUG() as reminded by Christophe Leroy
v2 -Added WARN_ON_ONCE() and updated comment as suggested by David
Fixed build issue found by kernel test robot
Added vma (pmd sharing) locking to hugetlb_follow_page_mask
ReBased on Baolin's patch to fix issues with CONT_* entries

 arch/ia64/mm/hugetlbpage.c|  15 ---
 arch/powerpc/mm/hugetlbpage.c |  37 
 include/linux/hugetlb.h   |  50 ++
 mm/gup.c  |  80 +++-
 mm/hugetlb.c  | 172 +++---
 5 files changed, 76 insertions(+), 278 deletions(-)

diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index f993cb36c062..380d2f3966c9 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -91,21 +91,6 @@ int prepare_hugepage_range(struct file *file,
return 0;
 }
 
-struct page *follow_huge_addr(struct mm_struct *mm, unsigned long addr, int 
write)
-{
-   struct page *page;
-   pte_t *ptep;
-
-   if (REGION_NUMBER(addr) != RGN_HPAGE)
-   return ERR_PTR(-EINVAL);
-
-   ptep = huge_pte_offset(mm, addr, HPAGE_SIZE);
-   if (!ptep || pte_none(*ptep))
-   return NULL;
-   page = pte_page(*ptep);
-   page += ((addr & ~HPAGE_MASK) >> PAGE_SHIFT);
-   return page;
-}
 int pmd_huge(pmd_t pmd)
 {
return 0;
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 5852a86d990d..f1ba8d1e8c1a 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -506,43 +506,6 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
} while (addr = next, addr != end);
 }
 
-struct page *follow_huge_pd(struct vm_area_struct *vma,
-   unsigned long address, hugepd_t hpd,
-   int flags, int pdshift)
-{
-   pte_t *ptep;
-   spinlock_t *ptl;
-   struct page *page = NULL;
-   unsigned long mask;
-   int shift = hugepd_shift(hpd);
-   struct mm_struct *mm = vma->vm_mm;
-
-retry:
-   /*
-* hugepage directory entries are protected by mm->page_table_lock
-* Use this instead of huge_pte_lockptr
-*/
-   ptl = >page_table_lock;
-   spin_lock(ptl);
-
-   ptep = hugepte_offset(hpd, address, pdshift);
-   if (pte_present(*ptep)) {
-   mask = (1UL << shift) - 1;
-   page = pte_page(*ptep);
-   page += ((address & mask) >> PAGE_SHIFT);
-   if (flags & FOLL_GET)
-   get_page(page);
-   } else {
-   if (is_hugetlb_entry_migration(*ptep)) {
-   spin_unlock(ptl);
-   __migration_entry_wait(mm, ptep, ptl);
-   goto retry;
-   }
-   }
-   spin_unlock(ptl);
-   return page;
-}
-
 bool __init arch_hugetlb_valid_size(unsigned long size)
 {
int shift = __ffs(size);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8b4f93e84868..4a76c0fc6bbf 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -149,6 +149,8 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 unsigned long len);
 int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *,
struct 

Re: [PATCH v6 21/25] powerpc: Provide syscall wrapper

2022-10-30 Thread Andreas Schwab
On Oct 30 2022, Arnd Bergmann wrote:

> On Sun, Oct 30, 2022, at 16:34, Andreas Schwab wrote:
>> This breaks powerpc32.  The fallocate syscall misinterprets its
>> arguments.
>
> It was fixed in

Nope.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v6 21/25] powerpc: Provide syscall wrapper

2022-10-30 Thread Arnd Bergmann
On Sun, Oct 30, 2022, at 16:34, Andreas Schwab wrote:
> This breaks powerpc32.  The fallocate syscall misinterprets its
> arguments.

It was fixed in

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e237506238352f3bfa9cf3983cdab873e35651eb

 Arnd


Re: [PATCH v4] hugetlb: simplify hugetlb handling in follow_page_mask

2022-10-30 Thread Mike Kravetz
On 10/30/22 15:45, Peter Xu wrote:
> On Fri, Oct 28, 2022 at 11:11:08AM -0700, Mike Kravetz wrote:
> > +   } else {
> > +   if (is_hugetlb_entry_migration(entry)) {
> > +   spin_unlock(ptl);
> > +   hugetlb_vma_unlock_read(vma);
> 
> Just noticed it when I pulled the latest mm-unstable: this line seems to be a
> left-over from v3 and is not needed now?
> 
> > +   __migration_entry_wait_huge(pte, ptl);
> > +   goto retry;
> > +   }

Thanks Peter!

Sent v5 with that line removed.

-- 
Mike Kravetz


Re: [PATCH v6 21/25] powerpc: Provide syscall wrapper

2022-10-30 Thread Michael Ellerman
Andreas Schwab  writes:
> On Oct 30 2022, Arnd Bergmann wrote:
>
>> On Sun, Oct 30, 2022, at 16:34, Andreas Schwab wrote:
>>> This breaks powerpc32.  The fallocate syscall misinterprets its
>>> arguments.
>>
>> It was fixed in
>
> Nope.

Ack.


Re: [PATCH v2 -next] powerpc/powermac: Fix symbol not declared warnings

2022-10-30 Thread chenlifu

On 2022/8/19 21:06, Chen Lifu wrote:

1. ppc_override_l2cr and ppc_override_l2cr_value are only used in the
l2cr_init() function; remove them and use *l2cr directly.
2. has_l2cache is not used outside of this file, so mark it static and
do not initialise the static to 0.

This fixes the following warnings:

arch/powerpc/platforms/powermac/setup.c:74:5: warning: symbol 'ppc_override_l2cr' was not declared. Should it be static?
arch/powerpc/platforms/powermac/setup.c:75:5: warning: symbol 'ppc_override_l2cr_value' was not declared. Should it be static?
arch/powerpc/platforms/powermac/setup.c:76:5: warning: symbol 'has_l2cache' was not declared. Should it be static?

Signed-off-by: Chen Lifu 
---
  arch/powerpc/platforms/powermac/setup.c | 19 ++-
  1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/setup.c b/arch/powerpc/platforms/powermac/setup.c
index 04daa7f0a03c..49faa066b372 100644
--- a/arch/powerpc/platforms/powermac/setup.c
+++ b/arch/powerpc/platforms/powermac/setup.c
@@ -68,13 +68,11 @@
  
  #include "pmac.h"
  
  #undef SHOW_GATWICK_IRQS
  
-int ppc_override_l2cr = 0;
-int ppc_override_l2cr_value;
-int has_l2cache = 0;
+static int has_l2cache;
  
  int pmac_newworld;
  
  static int current_root_goodness = -1;
  
@@ -234,26 +232,21 @@ static void __init l2cr_init(void)
  
  		for_each_of_cpu_node(np) {

const unsigned int *l2cr =
of_get_property(np, "l2cr-value", NULL);
if (l2cr) {
-   ppc_override_l2cr = 1;
-   ppc_override_l2cr_value = *l2cr;
_set_L2CR(0);
-   _set_L2CR(ppc_override_l2cr_value);
+   _set_L2CR(*l2cr);
+   printk(KERN_INFO "L2CR overridden (0x%x), "
+  "backside cache is %s\n",
+  *l2cr, ((*l2cr) & 0x8000) ?
+  "enabled" : "disabled");
}
of_node_put(np);
break;
}
}
-
-   if (ppc_override_l2cr)
-   printk(KERN_INFO "L2CR overridden (0x%x), "
-  "backside cache is %s\n",
-  ppc_override_l2cr_value,
-  (ppc_override_l2cr_value & 0x8000)
-   ? "enabled" : "disabled");
  }
  #endif
  
  static void __init pmac_setup_arch(void)

  {


Friendly ping ...



[RFC PATCH 00/19] Remove STACK_FRAME_OVERHEAD

2022-10-30 Thread Nicholas Piggin
This is some quick hacking, hardly tested but might have potential.

I think we're not validating the perf kernel stack walker bounds
quite correctly, and not setting up decent stack frames for the child
in copy_thread. So at least those two things we could do. Maybe
patch 1 should go upstream as a fix.

Thanks,
Nick

Nicholas Piggin (19):
  powerpc/perf: callchain validate kernel stack pointer bounds
  powerpc: Rearrange copy_thread child stack creation
  powerpc/64: Remove asm interrupt tracing call helpers
  powerpc/pseries: hvcall stack frame overhead
  powerpc/32: Use load and store multiple in GPR save/restore macros
  powerpc: simplify ppc_save_regs
  powerpc: add definition for pt_regs offset within an interrupt frame
  powerpc: add a definition for the marker offset within the interrupt
frame
  powerpc: Rename STACK_FRAME_MARKER and derive it from frame offset
  powerpc: add a define for the user interrupt frame size
  powerpc: add a define for the switch frame size and regs offset
  powerpc: copy_thread fill in interrupt frame marker and back chain
  powerpc: copy_thread add a back chain to the switch stack frame
  powerpc: split validate_sp into two functions
  powerpc: allow minimum sized kernel stack frames
  powerpc/64: ELFv2 use minimal stack frames in int and switch frame
sizes
  powerpc: remove STACK_FRAME_OVERHEAD
  powerpc: change stack marker memory operations to 32-bit
  powerpc/64: ELFv2 use reserved word in the stack frame for the regs
marker

 arch/powerpc/include/asm/irqflags.h   | 29 ---
 arch/powerpc/include/asm/ppc_asm.h| 18 +++-
 arch/powerpc/include/asm/processor.h  | 15 +++-
 arch/powerpc/include/asm/ptrace.h | 41 +++---
 arch/powerpc/kernel/asm-offsets.c |  9 +-
 arch/powerpc/kernel/entry_32.S| 14 ++--
 arch/powerpc/kernel/exceptions-64e.S  | 44 +-
 arch/powerpc/kernel/exceptions-64s.S  | 82 +--
 arch/powerpc/kernel/head_32.h |  4 +-
 arch/powerpc/kernel/head_40x.S|  2 +-
 arch/powerpc/kernel/head_44x.S|  6 +-
 arch/powerpc/kernel/head_64.S |  6 +-
 arch/powerpc/kernel/head_85xx.S   |  8 +-
 arch/powerpc/kernel/head_8xx.S|  2 +-
 arch/powerpc/kernel/head_book3s_32.S  |  4 +-
 arch/powerpc/kernel/head_booke.h  |  4 +-
 arch/powerpc/kernel/interrupt_64.S| 32 
 arch/powerpc/kernel/irq.c |  4 +-
 arch/powerpc/kernel/kgdb.c|  2 +-
 arch/powerpc/kernel/misc_32.S |  2 +-
 arch/powerpc/kernel/misc_64.S |  4 +-
 arch/powerpc/kernel/optprobes_head.S  |  4 +-
 arch/powerpc/kernel/ppc_save_regs.S   | 58 -
 arch/powerpc/kernel/process.c | 54 +++-
 arch/powerpc/kernel/smp.c |  2 +-
 arch/powerpc/kernel/stacktrace.c  | 10 +--
 arch/powerpc/kernel/tm.S  |  8 +-
 arch/powerpc/kernel/trace/ftrace_mprofile.S   |  2 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  2 +-
 .../lib/test_emulate_step_exec_instr.S|  2 +-
 arch/powerpc/perf/callchain.c |  9 +-
 arch/powerpc/platforms/pseries/hvCall.S   | 38 +
 arch/powerpc/xmon/xmon.c  | 10 +--
 33 files changed, 263 insertions(+), 268 deletions(-)

-- 
2.37.2



[RFC PATCH 01/19] powerpc/perf: callchain validate kernel stack pointer bounds

2022-10-30 Thread Nicholas Piggin
The interrupt frame detection and loads from the hypothetical pt_regs
are not bounds-checked. The next-frame validation only bounds-checks
STACK_FRAME_OVERHEAD, which does not include the pt_regs. Add another
test for this.

Signed-off-by: Nicholas Piggin 
---

Could the user set r1 to be equal to the address matching the first
interrupt frame - STACK_INT_FRAME_SIZE, which is in the previous page
due to the kernel redzone, and induce the kernel to load the marker from
there? Possibly it could cause a crash at least.

It also seems a bit rude to put a fancy next-frame-validation out in
perf/ rather than with the rest of the frame validation code.

Thanks,
Nick

 arch/powerpc/perf/callchain.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index 082f6d0308a4..8718289c051d 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -61,6 +61,7 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
next_sp = fp[0];
 
if (next_sp == sp + STACK_INT_FRAME_SIZE &&
+   validate_sp(sp, current, STACK_INT_FRAME_SIZE) &&
fp[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
/*
 * This looks like an interrupt frame for an
-- 
2.37.2



[RFC PATCH 02/19] powerpc: Rearrange copy_thread child stack creation

2022-10-30 Thread Nicholas Piggin
This makes it a bit clearer where the stack frame is created, and will
allow easier use of some of the stack offset constants in a later
change.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/process.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 67da147fe34d..acfa197fb2df 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1726,13 +1726,16 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 
klp_init_thread_info(p);
 
+   /* Create initial stack frame. */
+   sp -= (sizeof(struct pt_regs) + STACK_FRAME_OVERHEAD);
+   ((unsigned long *)sp)[0] = 0;
+
/* Copy registers */
-   sp -= sizeof(struct pt_regs);
-   childregs = (struct pt_regs *) sp;
+   childregs = (struct pt_regs *)(sp + STACK_FRAME_OVERHEAD);
if (unlikely(args->fn)) {
/* kernel thread */
memset(childregs, 0, sizeof(struct pt_regs));
-   childregs->gpr[1] = sp + sizeof(struct pt_regs);
+   childregs->gpr[1] = sp + (sizeof(struct pt_regs) + STACK_FRAME_OVERHEAD);
/* function */
if (args->fn)
childregs->gpr[14] = ppc_function_entry((void 
*)args->fn);
@@ -1767,7 +1770,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
f = ret_from_fork;
}
childregs->msr &= ~(MSR_FP|MSR_VEC|MSR_VSX);
-   sp -= STACK_FRAME_OVERHEAD;
 
/*
 * The way this works is that at some point in the future
@@ -1777,7 +1779,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 * do some house keeping and then return from the fork or clone
 * system call, using the stack frame created above.
 */
-   ((unsigned long *)sp)[0] = 0;
sp -= sizeof(struct pt_regs);
kregs = (struct pt_regs *) sp;
sp -= STACK_FRAME_OVERHEAD;
-- 
2.37.2



[RFC PATCH 04/19] powerpc/pseries: hvcall stack frame overhead

2022-10-30 Thread Nicholas Piggin
This call may use the minimum-size stack frame. The scratch space used is
in the caller's parameter save area, not this function's frame.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/hvCall.S | 38 +
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
index 762eb15d3bd4..783c16ad648b 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -27,7 +27,9 @@ hcall_tracepoint_refcount:
 
 /*
  * precall must preserve all registers.  use unused STK_PARAM()
- * areas to save snapshots and opcode.
+ * areas to save snapshots and opcode. STK_PARAM() in the caller's
+ * frame will be available even on ELFv2 because these are all
+ * variadic functions.
  */
 #define HCALL_INST_PRECALL(FIRST_REG)  \
mflrr0; \
@@ -41,29 +43,29 @@ hcall_tracepoint_refcount:
std r10,STK_PARAM(R10)(r1); \
std r0,16(r1);  \
addir4,r1,STK_PARAM(FIRST_REG); \
-   stdur1,-STACK_FRAME_OVERHEAD(r1);   \
+   stdur1,-STACK_FRAME_MIN_SIZE(r1);   \
bl  __trace_hcall_entry;\
-   ld  r3,STACK_FRAME_OVERHEAD+STK_PARAM(R3)(r1);  \
-   ld  r4,STACK_FRAME_OVERHEAD+STK_PARAM(R4)(r1);  \
-   ld  r5,STACK_FRAME_OVERHEAD+STK_PARAM(R5)(r1);  \
-   ld  r6,STACK_FRAME_OVERHEAD+STK_PARAM(R6)(r1);  \
-   ld  r7,STACK_FRAME_OVERHEAD+STK_PARAM(R7)(r1);  \
-   ld  r8,STACK_FRAME_OVERHEAD+STK_PARAM(R8)(r1);  \
-   ld  r9,STACK_FRAME_OVERHEAD+STK_PARAM(R9)(r1);  \
-   ld  r10,STACK_FRAME_OVERHEAD+STK_PARAM(R10)(r1)
+   ld  r3,STACK_FRAME_MIN_SIZE+STK_PARAM(R3)(r1);  \
+   ld  r4,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1);  \
+   ld  r5,STACK_FRAME_MIN_SIZE+STK_PARAM(R5)(r1);  \
+   ld  r6,STACK_FRAME_MIN_SIZE+STK_PARAM(R6)(r1);  \
+   ld  r7,STACK_FRAME_MIN_SIZE+STK_PARAM(R7)(r1);  \
+   ld  r8,STACK_FRAME_MIN_SIZE+STK_PARAM(R8)(r1);  \
+   ld  r9,STACK_FRAME_MIN_SIZE+STK_PARAM(R9)(r1);  \
+   ld  r10,STACK_FRAME_MIN_SIZE+STK_PARAM(R10)(r1)
 
 /*
  * postcall is performed immediately before function return which
  * allows liberal use of volatile registers.
  */
 #define __HCALL_INST_POSTCALL  \
-   ld  r0,STACK_FRAME_OVERHEAD+STK_PARAM(R3)(r1);  \
-   std r3,STACK_FRAME_OVERHEAD+STK_PARAM(R3)(r1);  \
+   ld  r0,STACK_FRAME_MIN_SIZE+STK_PARAM(R3)(r1);  \
+   std r3,STACK_FRAME_MIN_SIZE+STK_PARAM(R3)(r1);  \
mr  r4,r3;  \
mr  r3,r0;  \
bl  __trace_hcall_exit; \
-   ld  r0,STACK_FRAME_OVERHEAD+16(r1); \
-   addir1,r1,STACK_FRAME_OVERHEAD; \
+   ld  r0,STACK_FRAME_MIN_SIZE+16(r1); \
+   addir1,r1,STACK_FRAME_MIN_SIZE; \
ld  r3,STK_PARAM(R3)(r1);   \
mtlrr0
 
@@ -303,14 +305,14 @@ plpar_hcall9_trace:
mr  r7,r8
mr  r8,r9
mr  r9,r10
-   ld  r10,STACK_FRAME_OVERHEAD+STK_PARAM(R11)(r1)
-   ld  r11,STACK_FRAME_OVERHEAD+STK_PARAM(R12)(r1)
-   ld  r12,STACK_FRAME_OVERHEAD+STK_PARAM(R13)(r1)
+   ld  r10,STACK_FRAME_MIN_SIZE+STK_PARAM(R11)(r1)
+   ld  r11,STACK_FRAME_MIN_SIZE+STK_PARAM(R12)(r1)
+   ld  r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R13)(r1)
 
HVSC
 
mr  r0,r12
-   ld  r12,STACK_FRAME_OVERHEAD+STK_PARAM(R4)(r1)
+   ld  r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1)
std r4,0(r12)
std r5,8(r12)
std r6,16(r12)
-- 
2.37.2



[RFC PATCH 03/19] powerpc/64: Remove asm interrupt tracing call helpers

2022-10-30 Thread Nicholas Piggin
These are unused. Remove.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/irqflags.h | 29 -
 1 file changed, 29 deletions(-)

diff --git a/arch/powerpc/include/asm/irqflags.h b/arch/powerpc/include/asm/irqflags.h
index 1a6c1ce17735..81e0a5025be8 100644
--- a/arch/powerpc/include/asm/irqflags.h
+++ b/arch/powerpc/include/asm/irqflags.h
@@ -13,32 +13,6 @@
 
 #else
 #ifdef CONFIG_TRACE_IRQFLAGS
-#ifdef CONFIG_IRQSOFF_TRACER
-/*
- * Since the ftrace irqsoff latency trace checks CALLER_ADDR1,
- * which is the stack frame here, we need to force a stack frame
- * in case we came from user space.
- */
-#define TRACE_WITH_FRAME_BUFFER(func)  \
-   mflrr0; \
-   stdur1, -STACK_FRAME_OVERHEAD(r1);  \
-   std r0, 16(r1); \
-   stdur1, -STACK_FRAME_OVERHEAD(r1);  \
-   bl func;\
-   ld  r1, 0(r1);  \
-   ld  r1, 0(r1);
-#else
-#define TRACE_WITH_FRAME_BUFFER(func)  \
-   bl func;
-#endif
-
-/*
- * These are calls to C code, so the caller must be prepared for volatiles to
- * be clobbered.
- */
-#define TRACE_ENABLE_INTS  TRACE_WITH_FRAME_BUFFER(trace_hardirqs_on)
-#define TRACE_DISABLE_INTS TRACE_WITH_FRAME_BUFFER(trace_hardirqs_off)
-
 /*
  * This is used by assembly code to soft-disable interrupts first and
  * reconcile irq state.
@@ -59,9 +33,6 @@
 44:
 
 #else
-#define TRACE_ENABLE_INTS
-#define TRACE_DISABLE_INTS
-
 #define RECONCILE_IRQ_STATE(__rA, __rB)\
lbz __rA,PACAIRQHAPPENED(r13);  \
li  __rB,IRQS_DISABLED; \
-- 
2.37.2



[RFC PATCH 05/19] powerpc/32: Use load and store multiple in GPR save/restore macros

2022-10-30 Thread Nicholas Piggin
---
 arch/powerpc/include/asm/ppc_asm.h | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 753a2757bcd4..ac44383d350a 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -57,8 +57,22 @@
 #define SAVE_NVGPRS(base)  SAVE_GPRS(14, 31, base)
 #define REST_NVGPRS(base)  REST_GPRS(14, 31, base)
 #else
-#define SAVE_GPRS(start, end, base)OP_REGS stw, 4, start, end, base, GPR0
-#define REST_GPRS(start, end, base)OP_REGS lwz, 4, start, end, base, GPR0
+.macro __SAVE_GPRS start, end, base, offset
+   .if \end == 31
+   stmw\start,\offset(\base)
+   .else
+   OP_REGS stw, 4, \start, \end, \base, \offset
+   .endif
+.endm
+.macro __REST_GPRS start, end, base, offset
+   .if \end == 31
+   lmw \start,\offset(\base)
+   .else
+   OP_REGS lwz, 4, \start, \end, \base, \offset
+   .endif
+.endm
+#define SAVE_GPRS(start, end, base)__SAVE_GPRS start, end, base, GPR0
+#define REST_GPRS(start, end, base)__REST_GPRS start, end, base, GPR0
 #define SAVE_NVGPRS(base)  SAVE_GPRS(13, 31, base)
 #define REST_NVGPRS(base)  REST_GPRS(13, 31, base)
 #endif
-- 
2.37.2