[Bug 217427] Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)

2023-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217427

--- Comment #2 from Artem S. Tashkinov (a...@gmx.com) ---
Please also report to https://gitlab.freedesktop.org/drm/amd/-/issues

Performing git bisect would be nice:
https://docs.kernel.org/admin-guide/bug-bisect.html


Re: linux-next: build warnings in powerpc build

2023-05-10 Thread Michael Ellerman
Stephen Rothwell  writes:
> Hi all,
>
> Today's (and a few before) linux-next build (powerpc pseries_le_defconfig)
> produced these warnings:

Those aren't really warnings, it's just merge_config.sh telling you what
it's doing. The whole point of the generated configs is to override
those symbols.

Looks like there is a way to silence them, by using 
merge_into_defconfig_override.

cheers

> Building: powerpc pseries_le_defconfig
> Using /home/sfr/next/next/arch/powerpc/configs/ppc64_defconfig as base
> Merging /home/sfr/next/next/arch/powerpc/configs/le.config
> Merging /home/sfr/next/next/arch/powerpc/configs/guest.config
> Value of CONFIG_VIRTIO_BLK is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VIRTIO_BLK=m
> New value: CONFIG_VIRTIO_BLK=y
>
> Value of CONFIG_SCSI_VIRTIO is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_SCSI_VIRTIO=m
> New value: CONFIG_SCSI_VIRTIO=y
>
> Value of CONFIG_VIRTIO_NET is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VIRTIO_NET=m
> New value: CONFIG_VIRTIO_NET=y
>
> Value of CONFIG_VIRTIO_CONSOLE is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VIRTIO_CONSOLE=m
> New value: CONFIG_VIRTIO_CONSOLE=y
>
> Value of CONFIG_VIRTIO_PCI is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VIRTIO_PCI=m
> New value: CONFIG_VIRTIO_PCI=y
>
> Value of CONFIG_VIRTIO_BALLOON is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VIRTIO_BALLOON=m
> New value: CONFIG_VIRTIO_BALLOON=y
>
> Value of CONFIG_VHOST_NET is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_VHOST_NET=m
> New value: CONFIG_VHOST_NET=y
>
> Value of CONFIG_IBMVETH is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_IBMVETH=m
> New value: CONFIG_IBMVETH=y
>
> Value of CONFIG_IBMVNIC is redefined by fragment 
> /home/sfr/next/next/arch/powerpc/configs/guest.config:
> Previous value: CONFIG_IBMVNIC=m
> New value: CONFIG_IBMVNIC=y
>
> I am not sure exactly which change(s) introduced these warnings.
>
> -- 
> Cheers,
> Stephen Rothwell


Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Hugh Dickins
On Wed, 10 May 2023, Matthew Wilcox wrote:
> On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote:
> > Two: pte_offset_map() will need to do an rcu_read_lock(), with the
> > corresponding rcu_read_unlock() in pte_unmap().  But most architectures
> > never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
> > after pte_offset_map(), or have used userspace pte_offset_map() where
> > pte_offset_kernel() is more correct.  No problem in the current tree,
> > but a problem once an rcu_read_unlock() will be needed to keep balance.
> 
> Hi Hugh,
> 
> I shall have to spend some time looking at these patches, but at LSFMM
> just a few hours ago, I proposed and nobody objected to removing
> CONFIG_HIGHPTE.  I don't intend to take action on that consensus
> immediately, so I can certainly wait until your patches are applied, but
> if this information simplifies what you're doing, feel free to act on it.

Thanks a lot, Matthew: very considerate, as usual.

Yes, I did see your "Whither Highmem?" (wither highmem!) proposal on the
list, and it did make me think, better get these patches and preview out
soon, before you get to vanish pte_unmap() altogether.  HIGHMEM or not,
HIGHPTE or not, I think pte_offset_map() and pte_unmap() still have an
important role to play.

I don't really understand why you're going down a remove-CONFIG_HIGHPTE
route: I thought you were motivated by the awkwardness of kmap on large
folios; but I don't see how removing HIGHPTE helps with that at all
(unless you have a "large page tables" effort in mind, but I doubt it).

But I've no investment in CONFIG_HIGHPTE if people think now is the
time to remove it: I disagree, but wouldn't miss it myself - so long
as you leave pte_offset_map() and pte_unmap() (under whatever names).

I don't think removing CONFIG_HIGHPTE will simplify what I'm doing.
For a moment it looked like it would: the PAE case is nasty (and our
data centres have not been on PAE for a long time, so it wasn't a
problem I had to face before); and knowing pmd_high must be 0 for a
page table looked like it would help, but now I'm not so sure of that
(hmm, I'm changing my mind again as I write).

Peter's pmdp_get_lockless() does rely for complete correctness on
interrupts being disabled, and I suspect that I may be forced in the
PAE case to do so briefly; but I detest that notion.  For now I'm just
deferring it, hoping for a better idea before the third series is finalized.

I mention this (and Cc Peter) in passing: don't want this arch thread
to go down into that rabbit hole: we can start a fresh thread on it if
you wish, but right now my priority is commit messages for the second
series, rather than solving (or even detailing) the PAE problem.

Hugh


Re: [PATCH 01/23] arm: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Hugh Dickins
On Wed, 10 May 2023, Matthew Wilcox wrote:
> On Tue, May 09, 2023 at 09:42:44PM -0700, Hugh Dickins wrote:
> > diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> > b/arch/arm/lib/uaccess_with_memcpy.c
> > index e4c2677cc1e9..2f6163f05e93 100644
> > --- a/arch/arm/lib/uaccess_with_memcpy.c
> > +++ b/arch/arm/lib/uaccess_with_memcpy.c
> > @@ -74,6 +74,9 @@ pin_page_for_write(const void __user *_addr, pte_t 
> > **ptep, spinlock_t **ptlp)
> > return 0;
> >  
> > pte = pte_offset_map_lock(current->mm, pmd, addr, &ptl);
> > +   if (unlikely(!pte))
> > +   return 0;
> 
> Failing seems like the wrong thing to do if we transitioned from a PTE
> to PMD here?  Looks to me like we should goto a new label right after
> the 'pmd = pmd_offset(pud, addr);', no?

I'm pretty sure it's right as is; but probably more by luck than care -
I do not think I studied this code as closely as you have now made me do;
and it's clear that this is a piece of code where rare transient issues
could come up, and must be handled correctly.  Thank you for making me
look again.

The key is in the callers of pin_page_for_write(): __copy_to_user_memcpy()
and __clear_user_memset().  They're doing "while (!pin_page_for_write())"
loops - they hope for the fast path of getting pte_lock or pmd_lock on
the page, and doing a __memcpy() or __memset() to the user address; but
if anything goes "wrong", a __put_user() to fault in the page (or fail)
then pin_page_for_write() again.

"if (unlikely(!pte)) return 0" says that the expected fast path did not
succeed, so please __put_user() and have another go.

It is somewhere I could have done a "goto again", but that would be
superfluous when it's already designed that way at the outer level.

Hugh


Re: [PATCH] cachestat: wire up cachestat for other architectures

2023-05-10 Thread Michael Ellerman
Nhat Pham  writes:
> cachestat is previously only wired in for x86 (and architectures using
> the generic unistd.h table):
>
> https://lore.kernel.org/lkml/20230503013608.2431726-1-npha...@gmail.com/
>
> This patch wires cachestat in for all the other architectures.
>
> Signed-off-by: Nhat Pham 
> ---
>  arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
>  arch/arm/tools/syscall.tbl  | 1 +
>  arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
>  arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
>  arch/parisc/kernel/syscalls/syscall.tbl | 1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl| 1 +

With the change to the selftest (see my other mail), I tested this on
powerpc and all tests pass.

Tested-by: Michael Ellerman  (powerpc)


cheers


Re: [PATCH v13 3/3] selftests: Add selftests for cachestat

2023-05-10 Thread Michael Ellerman
Nhat Pham  writes:
> Test cachestat on a newly created file, /dev/ files, and /proc/ files.
> Also test on a shmem file (which can also be tested with huge pages
> since tmpfs supports huge pages).
>
> Signed-off-by: Nhat Pham 
...
> diff --git a/tools/testing/selftests/cachestat/test_cachestat.c 
> b/tools/testing/selftests/cachestat/test_cachestat.c
> new file mode 100644
> index ..c3823b809c25
> --- /dev/null
> +++ b/tools/testing/selftests/cachestat/test_cachestat.c
> @@ -0,0 +1,258 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "../kselftest.h"
> +
> +static const char * const dev_files[] = {
> + "/dev/zero", "/dev/null", "/dev/urandom",
> + "/proc/version", "/proc"
> +};
> +static const int cachestat_nr = 451;
> +
> +void print_cachestat(struct cachestat *cs)
> +{
> + ksft_print_msg(
> + "Using cachestat: Cached: %lu, Dirty: %lu, Writeback: %lu, Evicted: 
> %lu, Recently Evicted: %lu\n",
> + cs->nr_cache, cs->nr_dirty, cs->nr_writeback,
> + cs->nr_evicted, cs->nr_recently_evicted);
> +}
> +
> +bool write_exactly(int fd, size_t filesize)
> +{
> + char data[filesize];

On kernels with 64K pages (powerpc at least), this tries to allocate
64MB on the stack which segfaults.

Allocating data with malloc avoids the problem and allows the test to
pass.

Looks like this commit is still in mm-unstable, so maybe Andrew can
squash the incremental diff below in, if it looks OK to you. The diff is
a bit big because I unindented the body of the function.

cheers


diff --git a/tools/testing/selftests/cachestat/test_cachestat.c 
b/tools/testing/selftests/cachestat/test_cachestat.c
index 9be2262e5c17..54d09b820ed4 100644
--- a/tools/testing/selftests/cachestat/test_cachestat.c
+++ b/tools/testing/selftests/cachestat/test_cachestat.c
@@ -31,48 +31,59 @@ void print_cachestat(struct cachestat *cs)
 
 bool write_exactly(int fd, size_t filesize)
 {
-   char data[filesize];
-   bool ret = true;
int random_fd = open("/dev/urandom", O_RDONLY);
+   char *cursor, *data;
+   int remained;
+   bool ret;
 
if (random_fd < 0) {
ksft_print_msg("Unable to access urandom.\n");
ret = false;
goto out;
-   } else {
-   int remained = filesize;
-   char *cursor = data;
+   }
 
-   while (remained) {
-   ssize_t read_len = read(random_fd, cursor, remained);
+   data = malloc(filesize);
+   if (!data) {
+   ksft_print_msg("Unable to allocate data.\n");
+   ret = false;
+   goto close_random_fd;
+   }
 
-   if (read_len <= 0) {
-   ksft_print_msg("Unable to read from 
urandom.\n");
-   ret = false;
-   goto close_random_fd;
-   }
+   remained = filesize;
+   cursor = data;
 
-   remained -= read_len;
-   cursor += read_len;
+   while (remained) {
+   ssize_t read_len = read(random_fd, cursor, remained);
+
+   if (read_len <= 0) {
+   ksft_print_msg("Unable to read from urandom.\n");
+   ret = false;
+   goto out_free_data;
}
 
-   /* write random data to fd */
-   remained = filesize;
-   cursor = data;
-   while (remained) {
-   ssize_t write_len = write(fd, cursor, remained);
+   remained -= read_len;
+   cursor += read_len;
+   }
 
-   if (write_len <= 0) {
-   ksft_print_msg("Unable write random data to 
file.\n");
-   ret = false;
-   goto close_random_fd;
-   }
+   /* write random data to fd */
+   remained = filesize;
+   cursor = data;
+   while (remained) {
+   ssize_t write_len = write(fd, cursor, remained);
 
-   remained -= write_len;
-   cursor += write_len;
+   if (write_len <= 0) {
+   ksft_print_msg("Unable write random data to file.\n");
+   ret = false;
+   goto out_free_data;
}
+
+   remained -= write_len;
+   cursor += write_len;
}
 
+   ret = true;
+out_free_data:
+   free(data);
 close_random_fd:
close(random_fd);
 out:



Re: [PATCH 21/23] x86: Allow get_locked_pte() to fail

2023-05-10 Thread Hugh Dickins
On Wed, 10 May 2023, Peter Zijlstra wrote:

> On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote:
> > In rare transient cases, not yet made possible, pte_offset_map() and
> > pte_offset_map_lock() may not find a page table: handle appropriately.
> > 
> > Signed-off-by: Hugh Dickins 
> > ---
> >  arch/x86/kernel/ldt.c | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
> > index 525876e7b9f4..eb844549cd83 100644
> > --- a/arch/x86/kernel/ldt.c
> > +++ b/arch/x86/kernel/ldt.c
> > @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, 
> > struct ldt_struct *ldt)
> >  
> > va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
> > ptep = get_locked_pte(mm, va, &ptl);
> > -   pte_clear(mm, va, ptep);
> > -   pte_unmap_unlock(ptep, ptl);
> > +   if (ptep) {
> > +   pte_clear(mm, va, ptep);
> > +   pte_unmap_unlock(ptep, ptl);
> > +   }
> > }
> 
> Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/

I apologize for sending you back there!

> 
> At first glance this seems wrong; we can't just not unmap the LDT if we
> can't find it in a hurry. Also, IIRC this isn't in fact a regular user
> mapping, so it should not be subject to THP induced seizures.
> 
> ... memory bubbles back ... for PTI kernels we need to map this in the
> user and kernel page-tables because obviously userspace needs to be able
> to have access to the LDT. But it is not directly accessible by
> userspace. It lives in the cpu_entry_area as a virtual map of the real
> kernel allocation, and this virtual address is used for LLDT.
> Modification is done through sys_modify_ldt().

And there must be a user-style page table backing that cpu_entry_area,
because the use of get_locked_pte() and pte_unmap_unlock() implies
that there's a user page table (struct page containing spinlock if
config says so) rather than just a kernel page table mapping it.

> 
> I think I would feel much better if this were something like:
> 
>   if (!WARN_ON_ONCE(!ptep))
> 
> This really shouldn't fail and if it does, simply skipping it isn't the
> right thing either.

Sure, I'll gladly make that change when I respin - not immediately, let's
get more feedback on this arch series first, but maybe in a week's time.

Thanks for looking so quickly, Peter: I didn't Cc you on this particular
series, but shall certainly be doing so on the ones that follow, because
a few of those patches go into interesting pmdp_get_lockless() territory.

Hugh


Re: [PATCH 05/23] m68k: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Hugh Dickins
On Wed, 10 May 2023, Geert Uytterhoeven wrote:

> Hi Hugh,
> 
> Thanks for your patch!

And thank you for looking so quickly, Geert.

> 
> On Wed, May 10, 2023 at 6:48 AM Hugh Dickins  wrote:
> > In rare transient cases, not yet made possible, pte_offset_map() and
> > pte_offset_map_lock() may not find a page table: handle appropriately.
> >
> > Restructure cf_tlb_miss() with a pte_unmap() (previously omitted)
> > at label out, followed by one local_irq_restore() for all.
> 
> That's a bug fix, which should be a separate patch?

No, that's not a bug fix for the current tree, since m68k does not
offer CONFIG_HIGHPTE, so pte_unmap() is never anything but a no-op
for m68k (see include/linux/pgtable.h).

But I want to change pte_unmap() to do something even without
CONFIG_HIGHPTE, so have to fix up any such previously harmless
omissions in this series first.

> 
> >
> > Signed-off-by: Hugh Dickins 
> 
> 
> > --- a/arch/m68k/include/asm/mmu_context.h
> > +++ b/arch/m68k/include/asm/mmu_context.h
> > @@ -99,7 +99,7 @@ static inline void load_ksp_mmu(struct task_struct *task)
> > p4d_t *p4d;
> > pud_t *pud;
> > pmd_t *pmd;
> > -   pte_t *pte;
> > +   pte_t *pte = NULL;
> > unsigned long mmuar;
> >
> > local_irq_save(flags);
> > @@ -139,7 +139,7 @@ static inline void load_ksp_mmu(struct task_struct 
> > *task)
> >
> > pte = (mmuar >= PAGE_OFFSET) ? pte_offset_kernel(pmd, mmuar)
> >  : pte_offset_map(pmd, mmuar);
> > -   if (pte_none(*pte) || !pte_present(*pte))
> > +   if (!pte || pte_none(*pte) || !pte_present(*pte))
> > goto bug;
> 
> If the absence of a pte is to become a non-abnormal case, it should
> probably jump to "end" instead, to avoid spamming the kernel log.

I don't think so (but of course it's hard for you to tell, without
seeing all completed series of series).  If pmd_none(*pmd) can safely
goto bug just above, and pte_none(*pte) goto bug here, well, the !pte
case is going to be stranger than either of those.

My understanding of this function, load_ksp_mmu(), is that it's dealing
at context switch with a part of userspace which very much needs to be
present: whatever keeps that from being swapped out or migrated at
present, will be sure to keep the !pte case away - we cannot steal its
page table just at random (and a THP on m68k would be surprising too).

Though there is one case I can think of which will cause !pte here,
and so goto bug: if the pmd entry has got corrupted, and counts as
pmd_bad(), which will be tested (and cleared) in pte_offset_map().
But it is okay to report a bug in that case.

I can certainly change this to goto end instead if you still prefer,
no problem; but I'd rather keep it as is, if only for me to be proved
wrong by you actually seeing spam there.

> 
> >
> > set_pte(pte, pte_mkyoung(*pte));
> > @@ -161,6 +161,8 @@ static inline void load_ksp_mmu(struct task_struct 
> > *task)
> >  bug:
> > pr_info("ksp load failed: mm=0x%p ksp=0x08%lx\n", mm, mmuar);
> >  end:
> > +   if (pte && mmuar < PAGE_OFFSET)
> > +   pte_unmap(pte);
> 
> Is this also a bugfix, not mentioned in the patch description?

I'm not sure whether you're referring to the pte_unmap() which we
already discussed above, or you're seeing something else in addition;
but I don't think there's a bugfix here, just a rearrangement because
we now want lots of cases to do the pte_unmap() and local_irq_restore().

Hugh

> 
> > local_irq_restore(flags);
> >  }
> >
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds

linux-next: build warnings in powerpc build

2023-05-10 Thread Stephen Rothwell
Hi all,

Today's (and a few before) linux-next build (powerpc pseries_le_defconfig)
produced these warnings:

Building: powerpc pseries_le_defconfig
Using /home/sfr/next/next/arch/powerpc/configs/ppc64_defconfig as base
Merging /home/sfr/next/next/arch/powerpc/configs/le.config
Merging /home/sfr/next/next/arch/powerpc/configs/guest.config
Value of CONFIG_VIRTIO_BLK is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VIRTIO_BLK=m
New value: CONFIG_VIRTIO_BLK=y

Value of CONFIG_SCSI_VIRTIO is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_SCSI_VIRTIO=m
New value: CONFIG_SCSI_VIRTIO=y

Value of CONFIG_VIRTIO_NET is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VIRTIO_NET=m
New value: CONFIG_VIRTIO_NET=y

Value of CONFIG_VIRTIO_CONSOLE is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VIRTIO_CONSOLE=m
New value: CONFIG_VIRTIO_CONSOLE=y

Value of CONFIG_VIRTIO_PCI is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VIRTIO_PCI=m
New value: CONFIG_VIRTIO_PCI=y

Value of CONFIG_VIRTIO_BALLOON is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VIRTIO_BALLOON=m
New value: CONFIG_VIRTIO_BALLOON=y

Value of CONFIG_VHOST_NET is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_VHOST_NET=m
New value: CONFIG_VHOST_NET=y

Value of CONFIG_IBMVETH is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_IBMVETH=m
New value: CONFIG_IBMVETH=y

Value of CONFIG_IBMVNIC is redefined by fragment 
/home/sfr/next/next/arch/powerpc/configs/guest.config:
Previous value: CONFIG_IBMVNIC=m
New value: CONFIG_IBMVNIC=y

I am not sure exactly which change(s) introduced these warnings.

-- 
Cheers,
Stephen Rothwell




[PATCH] cachestat: wire up cachestat for other architectures

2023-05-10 Thread Nhat Pham
cachestat is previously only wired in for x86 (and architectures using
the generic unistd.h table):

https://lore.kernel.org/lkml/20230503013608.2431726-1-npha...@gmail.com/

This patch wires cachestat in for all the other architectures.

Signed-off-by: Nhat Pham 
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
 arch/arm/tools/syscall.tbl  | 1 +
 arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl| 1 +
 arch/s390/kernel/syscalls/syscall.tbl   | 1 +
 arch/sh/kernel/syscalls/syscall.tbl | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl  | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
 14 files changed, 14 insertions(+)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl 
b/arch/alpha/kernel/syscalls/syscall.tbl
index 8ebacf37a8cf..1f13995d00d7 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -490,3 +490,4 @@
 558common  process_mreleasesys_process_mrelease
 559common  futex_waitv sys_futex_waitv
 560common  set_mempolicy_home_node sys_ni_syscall
+561common  cachestat   sys_cachestat
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index ac964612d8b0..8ebed8a13874 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -464,3 +464,4 @@
 448common  process_mreleasesys_process_mrelease
 449common  futex_waitv sys_futex_waitv
 450common  set_mempolicy_home_node sys_set_mempolicy_home_node
+451common  cachestat   sys_cachestat
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl 
b/arch/ia64/kernel/syscalls/syscall.tbl
index 72c929d9902b..f8c74ffeeefb 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -371,3 +371,4 @@
 448common  process_mreleasesys_process_mrelease
 449common  futex_waitv sys_futex_waitv
 450common  set_mempolicy_home_node sys_set_mempolicy_home_node
+451common  cachestat   sys_cachestat
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl 
b/arch/m68k/kernel/syscalls/syscall.tbl
index b1f3940bc298..4f504783371f 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -450,3 +450,4 @@
 448common  process_mreleasesys_process_mrelease
 449common  futex_waitv sys_futex_waitv
 450common  set_mempolicy_home_node sys_set_mempolicy_home_node
+451common  cachestat   sys_cachestat
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl 
b/arch/microblaze/kernel/syscalls/syscall.tbl
index 820145e47350..858d22bf275c 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -456,3 +456,4 @@
 448common  process_mreleasesys_process_mrelease
 449common  futex_waitv sys_futex_waitv
 450common  set_mempolicy_home_node sys_set_mempolicy_home_node
+451common  cachestat   sys_cachestat
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 253ff994ed2e..1976317d4e8b 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -389,3 +389,4 @@
 448n32 process_mreleasesys_process_mrelease
 449n32 futex_waitv sys_futex_waitv
 450n32 set_mempolicy_home_node sys_set_mempolicy_home_node
+451n32 cachestat   sys_cachestat
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl 
b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 3f1886ad9d80..cfda2511badf 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -365,3 +365,4 @@
 448n64 process_mreleasesys_process_mrelease
 449n64 futex_waitv sys_futex_waitv
 450common  set_mempolicy_home_node sys_set_mempolicy_home_node
+451n64 cachestat   sys_cachestat
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 8f243e35a7b2..7692234c3768 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -438,3 +438,4 @@
 448o32 process_mreleasesys_process_mrelease
 449o32 futex_waitv sys_futex_waitv
 450

Re: [PATCH 1/4] block:sed-opal: SED Opal keystore

2023-05-10 Thread Jarkko Sakkinen
On Fri May 5, 2023 at 10:43 PM EEST,  wrote:
> From: Greg Joyce 
>
> Add read and write functions that allow SED Opal keys to be stored
> in a permanent keystore.

Please be more verbose starting from "Self-Encrypting Drive (SED)",
instead of just "SED", and take time to explain what these keys are.

>
> Signed-off-by: Greg Joyce 
> Reviewed-by: Jonathan Derrick 
> ---
>  block/Makefile   |  2 +-
>  block/sed-opal-key.c | 24 
>  include/linux/sed-opal-key.h | 15 +++
>  3 files changed, 40 insertions(+), 1 deletion(-)
>  create mode 100644 block/sed-opal-key.c
>  create mode 100644 include/linux/sed-opal-key.h
>
> diff --git a/block/Makefile b/block/Makefile
> index 4e01bb71ad6e..464a9f209552 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -35,7 +35,7 @@ obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o
>  obj-$(CONFIG_BLK_WBT)+= blk-wbt.o
>  obj-$(CONFIG_BLK_DEBUG_FS)   += blk-mq-debugfs.o
>  obj-$(CONFIG_BLK_DEBUG_FS_ZONED)+= blk-mq-debugfs-zoned.o
> -obj-$(CONFIG_BLK_SED_OPAL)   += sed-opal.o
> +obj-$(CONFIG_BLK_SED_OPAL)   += sed-opal.o sed-opal-key.o
>  obj-$(CONFIG_BLK_PM) += blk-pm.o
>  obj-$(CONFIG_BLK_INLINE_ENCRYPTION)  += blk-crypto.o blk-crypto-profile.o \
>  blk-crypto-sysfs.o
> diff --git a/block/sed-opal-key.c b/block/sed-opal-key.c
> new file mode 100644
> index ..16f380164c44
> --- /dev/null
> +++ b/block/sed-opal-key.c
> @@ -0,0 +1,24 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * SED key operations.
> + *
> + * Copyright (C) 2022 IBM Corporation
> + *
> + * These are the accessor functions (read/write) for SED Opal
> + * keys. Specific keystores can provide overrides.
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +int __weak sed_read_key(char *keyname, char *key, u_int *keylen)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +int __weak sed_write_key(char *keyname, char *key, u_int keylen)
> +{
> + return -EOPNOTSUPP;
> +}
> diff --git a/include/linux/sed-opal-key.h b/include/linux/sed-opal-key.h
> new file mode 100644
> index ..c9b1447986d8
> --- /dev/null
> +++ b/include/linux/sed-opal-key.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * SED key operations.
> + *
> + * Copyright (C) 2022 IBM Corporation
> + *
> + * These are the accessor functions (read/write) for SED Opal
> + * keys. Specific keystores can provide overrides.
> + *
> + */
> +
> +#include 
> +
> +int sed_read_key(char *keyname, char *key, u_int *keylen);
> +int sed_write_key(char *keyname, char *key, u_int keylen);
> -- 
> gjo...@linux.vnet.ibm.com


BR, Jarkko


Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI

2023-05-10 Thread Doug Anderson
Hi,

On Wed, May 10, 2023 at 9:30 AM Mark Rutland  wrote:
>
> On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > Hi,
>
> Hi Doug,
>
> > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson  
> > wrote:
> > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > could find was v7, so I called this series v8. I haven't copied all of
> > > his old changelogs here, but you can find them from the link.
> > >
> > > Since v7, I have:
> > > * Addressed the small amount of feedback that was there for v7.
> > > * Rebased.
> > > * Added a new patch that prevents us from spamming the logs with idle
> > >   tasks.
> > > * Added an extra patch to gracefully fall back to regular IPIs if
> > >   pseudo-NMIs aren't there.
> > >
> > > Since there appear to be a few different patches series related to
> > > being able to use NMIs to get stack traces of crashed systems, let me
> > > try to organize them to the best of my understanding:
> > >
> > > a) This series. On its own, a) will (among other things) enable stack
> > >traces of all running processes with the soft lockup detector if
> > >you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > >its own, a) doesn't give a hard lockup detector.
> > >
> > > b) A different recently-posted series [2] that adds a hard lockup
> > >detector based on perf. On its own, b) gives a stack crawl of the
> > >locked up CPU but no stack crawls of other CPUs (even if they're
> > >locked too). Together with a) + b) we get everything (full lockup
> > >detect, full ability to get stack crawls).
> > >
> > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > >considering trying to upstream. If b) lands then I believe c) would
> > >be redundant (at least for arm64). c) on its own is really only
> > >useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > >(see [4]). a) + c) is roughly as good as a) + b).
>
> > It's been 3 weeks and I haven't heard a peep on this series. That
> > means nobody has any objections and it's all good to land, right?
> > Right? :-P
>
> FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> support (and fixing that requires an overhaul of our DAIF / IRQ flag
> management, which I've been chipping away at for a number of releases), so I
> hadn't looked at this in detail yet because the foundations are still somewhat
> dodgy.
>
> I appreciate that this has been around for a while, and it's on my queue to
> look at.

Ah, thanks for the heads up! We've been thinking about turning this on
in production in ChromeOS because it will help us track down a whole
class of field-generated crash reports that are otherwise opaque to
us. It sounds as if maybe that's not a good idea quite yet? Do you
have any idea of how much farther along this needs to go? ...of
course, we've also run into issues with Mediatek devices because they
don't save/restore GICR registers properly [1]. In theory, we might be
able to work around that in the kernel.

In any case, even if there are bugs that would prevent turning this on
for production, it still seems like we could land this series.
It simply wouldn't do anything until someone turned on pseudo NMIs,
which wouldn't happen till the kinks are worked out.

...actually, I guess I should say that if all the patches of the
current series do land then it actually _would_ still do something,
even without pseudo-NMI. Assuming the last patch looks OK, it would at
least start falling back to using regular IPIs to do backtraces. That
wouldn't get backtraces on hard locked up CPUs but it would be better
than what we have today where we don't get any backtraces. This would
get arm64 on par with arm32...

[1] https://issuetracker.google.com/281831288


Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI

2023-05-10 Thread Mark Rutland
On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> Hi,

Hi Doug,

> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson  
> wrote:
> > This is an attempt to resurrect Sumit's old patch series [1] that
> > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > could find was v7, so I called this series v8. I haven't copied all of
> > his old changelogs here, but you can find them from the link.
> >
> > Since v7, I have:
> > * Addressed the small amount of feedback that was there for v7.
> > * Rebased.
> > * Added a new patch that prevents us from spamming the logs with idle
> >   tasks.
> > * Added an extra patch to gracefully fall back to regular IPIs if
> >   pseudo-NMIs aren't there.
> >
> > Since there appear to be a few different patches series related to
> > being able to use NMIs to get stack traces of crashed systems, let me
> > try to organize them to the best of my understanding:
> >
> > a) This series. On its own, a) will (among other things) enable stack
> >traces of all running processes with the soft lockup detector if
> >you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> >its own, a) doesn't give a hard lockup detector.
> >
> > b) A different recently-posted series [2] that adds a hard lockup
> >detector based on perf. On its own, b) gives a stack crawl of the
> >locked up CPU but no stack crawls of other CPUs (even if they're
> >locked too). Together with a) + b) we get everything (full lockup
> >detect, full ability to get stack crawls).
> >
> > c) The old Android "buddy" hard lockup detector [3] that I'm
> >considering trying to upstream. If b) lands then I believe c) would
> >be redundant (at least for arm64). c) on its own is really only
> >useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> >(see [4]). a) + c) is roughly as good as a) + b).

> It's been 3 weeks and I haven't heard a peep on this series. That
> means nobody has any objections and it's all good to land, right?
> Right? :-P

FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
support (and fixing that requires an overhaul of our DAIF / IRQ flag
management, which I've been chipping away at for a number of releases), so I
hadn't looked at this in detail yet because the foundations are still somewhat
dodgy.

I appreciate that this has been around for a while, and it's on my queue to
look at.

Thanks,
Mark.


Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI

2023-05-10 Thread Doug Anderson
Hi,

On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson  wrote:
>
> This is an attempt to resurrect Sumit's old patch series [1] that
> allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> could find was v7, so I called this series v8. I haven't copied all of
> his old changelogs here, but you can find them from the link.
>
> Since v7, I have:
> * Addressed the small amount of feedback that was there for v7.
> * Rebased.
> * Added a new patch that prevents us from spamming the logs with idle
>   tasks.
> * Added an extra patch to gracefully fall back to regular IPIs if
>   pseudo-NMIs aren't there.
>
> Since there appear to be a few different patches series related to
> being able to use NMIs to get stack traces of crashed systems, let me
> try to organize them to the best of my understanding:
>
> a) This series. On its own, a) will (among other things) enable stack
>traces of all running processes with the soft lockup detector if
>you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
>its own, a) doesn't give a hard lockup detector.
>
> b) A different recently-posted series [2] that adds a hard lockup
>detector based on perf. On its own, b) gives a stack crawl of the
>locked up CPU but no stack crawls of other CPUs (even if they're
>locked too). Together with a) + b) we get everything (full lockup
>detect, full ability to get stack crawls).
>
> c) The old Android "buddy" hard lockup detector [3] that I'm
>considering trying to upstream. If b) lands then I believe c) would
>be redundant (at least for arm64). c) on its own is really only
>useful on arm64 for platforms that can print CPU_DBGPCSR somehow
>(see [4]). a) + c) is roughly as good as a) + b).
>
> [1] 
> https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.g...@linaro.org/
> [2] 
> https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.c...@mediatek.com/
> [3] https://issuetracker.google.com/172213097
> [4] https://issuetracker.google.com/172213129
>
> Changes in v8:
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
> - Add loongarch support, too
> - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
> - "Tag the arm64 idle functions as __cpuidle" new for v8
> - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
> - "Fallback to a regular IPI if NMI isn't enabled" new for v8
>
> Douglas Anderson (3):
>   arm64: idle: Tag the arm64 idle functions as __cpuidle
>   kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
>   arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled
>
> Sumit Garg (7):
>   arm64: Add framework to turn IPI as NMI
>   irqchip/gic-v3: Enable support for SGIs to act as NMIs
>   arm64: smp: Assign and setup an IPI as NMI
>   nmi: backtrace: Allow runtime arch specific override
>   arm64: ipi_nmi: Add support for NMI backtrace
>   kgdb: Expose default CPUs roundup fallback mechanism
>   arm64: kgdb: Roundup cpus using IPI as NMI
>
>  arch/arm/include/asm/irq.h   |   2 +-
>  arch/arm/kernel/smp.c|   3 +-
>  arch/arm64/include/asm/irq.h |   4 ++
>  arch/arm64/include/asm/nmi.h |  17 +
>  arch/arm64/kernel/Makefile   |   2 +-
>  arch/arm64/kernel/idle.c |   4 +-
>  arch/arm64/kernel/ipi_nmi.c  | 103 +++
>  arch/arm64/kernel/kgdb.c |  18 ++
>  arch/arm64/kernel/smp.c  |   8 +++
>  arch/loongarch/include/asm/irq.h |   2 +-
>  arch/loongarch/kernel/process.c  |   3 +-
>  arch/mips/include/asm/irq.h  |   2 +-
>  arch/mips/kernel/process.c   |   3 +-
>  arch/powerpc/include/asm/nmi.h   |   2 +-
>  arch/powerpc/kernel/stacktrace.c |   3 +-
>  arch/sparc/include/asm/irq_64.h  |   2 +-
>  arch/sparc/kernel/process_64.c   |   4 +-
>  arch/x86/include/asm/irq.h   |   2 +-
>  arch/x86/kernel/apic/hw_nmi.c|   3 +-
>  drivers/irqchip/irq-gic-v3.c |  29 ++---
>  include/linux/kgdb.h |  13 
>  include/linux/nmi.h  |  12 ++--
>  kernel/debug/debug_core.c|   8 ++-
>  23 files changed, 217 insertions(+), 32 deletions(-)

It's been 3 weeks and I haven't heard a peep on this series. That
means nobody has any objections and it's all good to land, right?
Right? :-P

Seriously, though, I really think we should figure out how to get this
landed. There's obviously interest from several different parties and
I'm chomping at the bit waiting to spin this series according to your
wishes. I also don't think there's anything super scary/ugly here. IMO
the ideal situation is that folks are OK with the idea presented in
the last patch in the series ("arm64: ipi_nmi: Fallback to a regular
IPI if NMI isn't enabled") and then I can re-spin this series to be
_much_ simpler. That being said, I'm also OK with 

[Bug 217350] kdump kernel hangs in powerkvm guest

2023-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217350

--- Comment #2 from Cédric Le Goater (c...@kaod.org) ---
Hello Pingfan,

This was a large series. Did you identify the exact patch? It should be
related to the powerpc/pseries/pci or powerpc/xive subsystem.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH 01/23] arm: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Matthew Wilcox
On Tue, May 09, 2023 at 09:42:44PM -0700, Hugh Dickins wrote:
> diff --git a/arch/arm/lib/uaccess_with_memcpy.c 
> b/arch/arm/lib/uaccess_with_memcpy.c
> index e4c2677cc1e9..2f6163f05e93 100644
> --- a/arch/arm/lib/uaccess_with_memcpy.c
> +++ b/arch/arm/lib/uaccess_with_memcpy.c
> @@ -74,6 +74,9 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, 
> spinlock_t **ptlp)
>   return 0;
>  
>   pte = pte_offset_map_lock(current->mm, pmd, addr, &ptl);
> + if (unlikely(!pte))
> + return 0;

Failing seems like the wrong thing to do if we transitioned from a PTE
to PMD here?  Looks to me like we should goto a new label right after
the 'pmd = pmd_offset(pud, addr);', no?



Re: [PATCH 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-05-10 Thread Palmer Dabbelt

On Tue, 09 May 2023 21:59:57 PDT (-0700), hu...@google.com wrote:

pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
that: to keep balance in future, use the recently added pte_alloc_huge()
instead; with pte_offset_huge() a better name for pte_offset_kernel().

Signed-off-by: Hugh Dickins 
---
 arch/riscv/mm/hugetlbpage.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index a163a3e0f0d4..80926946759f 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -43,7 +43,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,

for_each_napot_order(order) {
if (napot_cont_size(order) == sz) {
-   pte = pte_alloc_map(mm, pmd, addr & 
napot_cont_mask(order));
+   pte = pte_alloc_huge(mm, pmd, addr & 
napot_cont_mask(order));
break;
}
}
@@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,

for_each_napot_order(order) {
if (napot_cont_size(order) == sz) {
-   pte = pte_offset_kernel(pmd, addr & 
napot_cont_mask(order));
+   pte = pte_offset_huge(pmd, addr & 
napot_cont_mask(order));
break;
}
}


Acked-by: Palmer Dabbelt 


[Bug 217427] Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)

2023-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217427

--- Comment #1 from darkbasic (darkba...@linuxsystems.it) ---
Created attachment 304241
  --> https://bugzilla.kernel.org/attachment.cgi?id=304241&action=edit
journalctl_6.3.1.log

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 217427] New: Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)

2023-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217427

Bug ID: 217427
   Summary: Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel
attempted to read user page (1128) - exploit attempt?
(uid: 0)
   Product: Platform Specific/Hardware
   Version: 2.5
  Hardware: All
OS: Linux
Status: NEW
  Severity: normal
  Priority: P3
 Component: PPC-64
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: darkba...@linuxsystems.it
Regression: No

I'm using Gentoo Linux on a Raptor CS Talos 2 ppc64le, GPU is an AMD RX 570. So
far the past dozen of kernels (up to 6.2.14) worked flawlessly, but with 6.3.1
I don't get any video output and I get the following in journalctl:

May 10 15:09:01 talos2 kernel: Kernel attempted to read user page (1128) -
exploit attempt? (uid: 0)
May 10 15:09:01 talos2 kernel: BUG: Unable to handle kernel data access on read
at 0x1128
May 10 15:09:01 talos2 kernel: Faulting instruction address: 0xc0080d1a805c
May 10 15:09:01 talos2 kernel: Oops: Kernel access of bad area, sig: 11 [#1]
May 10 15:09:01 talos2 kernel: LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=512 NUMA
PowerNV
May 10 15:09:01 talos2 kernel: Modules linked in: rfkill(+) 8021q garp mrp stp
llc binfmt_misc amdgpu uvcvideo uvc videobuf2_vmalloc videobuf2_memops
gpu_sched snd_hda_codec_hdmi i2c_algo_bit at24(+) videobuf2_v4l2 drm_ttm_helper
regmap_i2c videobuf2_common ttm snd_usb_audio drm_di>
May 10 15:09:01 talos2 kernel: CPU: 0 PID: 188 Comm: kworker/0:3 Not tainted
6.3.1-gentoo-dist #1
May 10 15:09:01 talos2 kernel: Hardware name: T2P9S01 REV 1.01 POWER9 0x4e1202
opal:skiboot-9858186 PowerNV
May 10 15:09:01 talos2 kernel: Workqueue: events_long
drm_dp_check_and_send_link_address [drm_display_helper]
May 10 15:09:01 talos2 kernel: NIP:  c0080d1a805c LR: c0080d1a8018 CTR:
c0c87900
May 10 15:09:01 talos2 kernel: REGS: cbeb3370 TRAP: 0300   Not tainted 
(6.3.1-gentoo-dist)
May 10 15:09:01 talos2 kernel: MSR:  9280b033
  CR: 88048223  XER: 005a
May 10 15:09:01 talos2 kernel: CFAR: c0c87980 DAR: 1128
DSISR: 4000 IRQMASK: 0 
   GPR00: c0080d1a8018 cbeb3610
c0080d690f00  
   GPR04: 0002 c0080d6297c0
 c0002a00b740 
   GPR08:  1124
 c0080d431560 
   GPR12: c0c87900 c2a6b000
c0170ad8 c0001a460310 
   GPR16: 0045 c00022858388
c00026000340 0001 
   GPR20:  0001
c000260001a0 4000 
   GPR24: 4000 c0002610
c000228580b8 fffd 
   GPR28:  c000228580a0
c00022856000 c00022858000 
May 10 15:09:01 talos2 kernel: NIP [c0080d1a805c]
is_synaptics_cascaded_panamera+0x244/0x600 [amdgpu]
May 10 15:09:01 talos2 kernel: LR [c0080d1a8018]
is_synaptics_cascaded_panamera+0x200/0x600 [amdgpu]
May 10 15:09:01 talos2 kernel: Call Trace:
May 10 15:09:01 talos2 kernel: [cbeb3610] [c0080d1a8018]
is_synaptics_cascaded_panamera+0x200/0x600 [amdgpu] (unreliable)
May 10 15:09:01 talos2 kernel: [cbeb36d0] [c0080b7c2b18]
drm_helper_probe_single_connector_modes+0x230/0x698 [drm_kms_helper]
May 10 15:09:01 talos2 kernel: [cbeb3810] [c0c57174]
drm_client_modeset_probe+0x2b4/0x16c0
May 10 15:09:01 talos2 kernel: [cbeb3a10] [c0080b7c7a30]
__drm_fb_helper_initial_config_and_unlock+0x68/0x640 [drm_kms_helper]
May 10 15:09:01 talos2 kernel: [cbeb3af0] [c0080b7c5b08]
drm_fbdev_client_hotplug+0x40/0x1d0 [drm_kms_helper]
May 10 15:09:01 talos2 kernel: [cbeb3b70] [c0c55480]
drm_client_dev_hotplug+0x120/0x1b0
May 10 15:09:01 talos2 kernel: [cbeb3c00] [c0080b7c1130]
drm_kms_helper_hotplug_event+0x58/0x80 [drm_kms_helper]
May 10 15:09:01 talos2 kernel: [cbeb3c30] [c0080b80b298]
drm_dp_check_and_send_link_address+0x330/0x3a0 [drm_display_helper]
May 10 15:09:01 talos2 kernel: [cbeb3cd0] [c0162d84]
process_one_work+0x2f4/0x580
May 10 15:09:01 talos2 kernel: [cbeb3d70] [c01630b8]
worker_thread+0xa8/0x600
May 10 15:09:01 talos2 kernel: [cbeb3e00] [c0170bf4]
kthread+0x124/0x130
May 10 15:09:01 talos2 kernel: [cbeb3e50] [c000dd14]
ret_from_kernel_thread+0x5c/0x64
May 10 15:09:01 talos2 kernel: --- interrupt: 0 at 0x0
May 10 15:09:01 talos2 kernel: NIP:   LR:  CTR:

May 10 15:09:01 talos2 kernel: REGS: cbeb3e80 TRAP:    Not tainted 

Re: [PATCH 21/23] x86: Allow get_locked_pte() to fail

2023-05-10 Thread Peter Zijlstra
On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote:
> In rare transient cases, not yet made possible, pte_offset_map() and
> pte_offset_map_lock() may not find a page table: handle appropriately.
> 
> Signed-off-by: Hugh Dickins 
> ---
>  arch/x86/kernel/ldt.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
> index 525876e7b9f4..eb844549cd83 100644
> --- a/arch/x86/kernel/ldt.c
> +++ b/arch/x86/kernel/ldt.c
> @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, 
> struct ldt_struct *ldt)
>  
>   va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
>   ptep = get_locked_pte(mm, va, &ptl);
> - pte_clear(mm, va, ptep);
> - pte_unmap_unlock(ptep, ptl);
> + if (ptep) {
> + pte_clear(mm, va, ptep);
> + pte_unmap_unlock(ptep, ptl);
> + }
>   }

Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/

At first glance this seems wrong; we can't just not unmap the LDT if we
can't find it in a hurry. Also, IIRC this isn't in fact a regular user
mapping, so it should not be subject to THP induced seizures.

... memory bubbles back ... for PTI kernels we need to map this in the
user and kernel page-tables because obviously userspace needs to be able
to have access to the LDT. But it is not directly acessible by
userspace. It lives in the cpu_entry_area as a virtual map of the real
kernel allocation, and this virtual address is used for LLDT.
Modification is done through sys_modify_ldt().

I think I would feel much better if this were something like:

if (!WARN_ON_ONCE(!ptep))

This really shouldn't fail and if it does, simply skipping it isn't the
right thing either.


Re: [PATCH 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-05-10 Thread Alexandre Ghiti

Hi Hugh,

On 5/10/23 06:59, Hugh Dickins wrote:

pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
that: to keep balance in future, use the recently added pte_alloc_huge()
instead; with pte_offset_huge() a better name for pte_offset_kernel().

Signed-off-by: Hugh Dickins 
---
  arch/riscv/mm/hugetlbpage.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index a163a3e0f0d4..80926946759f 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -43,7 +43,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
  
  	for_each_napot_order(order) {

if (napot_cont_size(order) == sz) {
-   pte = pte_alloc_map(mm, pmd, addr & 
napot_cont_mask(order));
+   pte = pte_alloc_huge(mm, pmd, addr & 
napot_cont_mask(order));
break;
}
}
@@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
  
  	for_each_napot_order(order) {

if (napot_cont_size(order) == sz) {
-   pte = pte_offset_kernel(pmd, addr & 
napot_cont_mask(order));
+   pte = pte_offset_huge(pmd, addr & 
napot_cont_mask(order));
break;
}
}



Reviewed-by: Alexandre Ghiti 

Thanks,

Alex



Re: [PATCH 05/23] m68k: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Geert Uytterhoeven
Hi Hugh,

Thanks for your patch!

On Wed, May 10, 2023 at 6:48 AM Hugh Dickins  wrote:
> In rare transient cases, not yet made possible, pte_offset_map() and
> pte_offset_map_lock() may not find a page table: handle appropriately.
>
> Restructure cf_tlb_miss() with a pte_unmap() (previously omitted)
> at label out, followed by one local_irq_restore() for all.

That's a bug fix, which should be a separate patch?

>
> Signed-off-by: Hugh Dickins 


> --- a/arch/m68k/include/asm/mmu_context.h
> +++ b/arch/m68k/include/asm/mmu_context.h
> @@ -99,7 +99,7 @@ static inline void load_ksp_mmu(struct task_struct *task)
> p4d_t *p4d;
> pud_t *pud;
> pmd_t *pmd;
> -   pte_t *pte;
> +   pte_t *pte = NULL;
> unsigned long mmuar;
>
> local_irq_save(flags);
> @@ -139,7 +139,7 @@ static inline void load_ksp_mmu(struct task_struct *task)
>
> pte = (mmuar >= PAGE_OFFSET) ? pte_offset_kernel(pmd, mmuar)
>  : pte_offset_map(pmd, mmuar);
> -   if (pte_none(*pte) || !pte_present(*pte))
> +   if (!pte || pte_none(*pte) || !pte_present(*pte))
> goto bug;

If the absence of a pte is to become a non-abnormal case, it should
probably jump to "end" instead, to avoid spamming the kernel log.

>
> set_pte(pte, pte_mkyoung(*pte));
> @@ -161,6 +161,8 @@ static inline void load_ksp_mmu(struct task_struct *task)
>  bug:
> pr_info("ksp load failed: mm=0x%p ksp=0x08%lx\n", mm, mmuar);
>  end:
> +   if (pte && mmuar < PAGE_OFFSET)
> +   pte_unmap(pte);

Is this also a bugfix, not mentioned in the patch description?

> local_irq_restore(flags);
>  }
>

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail

2023-05-10 Thread Matthew Wilcox
On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote:
> Two: pte_offset_map() will need to do an rcu_read_lock(), with the
> corresponding rcu_read_unlock() in pte_unmap().  But most architectures
> never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
> after pte_offset_map(), or have used userspace pte_offset_map() where
> pte_offset_kernel() is more correct.  No problem in the current tree,
> but a problem once an rcu_read_unlock() will be needed to keep balance.

Hi Hugh,

I shall have to spend some time looking at these patches, but at LSFMM
just a few hours ago, I proposed and nobody objected to removing
CONFIG_HIGHPTE.  I don't intend to take action on that consensus
immediately, so I can certainly wait until your patches are applied, but
if this information simplifies what you're doing, feel free to act on it.