[Bug 217427] Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)
https://bugzilla.kernel.org/show_bug.cgi?id=217427 --- Comment #2 from Artem S. Tashkinov (a...@gmx.com) --- Please also report to https://gitlab.freedesktop.org/drm/amd/-/issues Performing git bisect would be nice: https://docs.kernel.org/admin-guide/bug-bisect.html -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: linux-next: build warnings in powerpc build
Stephen Rothwell writes: > Hi all, > > Today's (and a few before) linux-next build (powerpc pseries_le_defconfig) > produced these warnings: Those aren't really warnings, it's just merge_config.sh telling you what it's doing. The whole point of the generated configs is to override those symbols. Looks like there is a way to silence them, by using merge_into_defconfig_override. cheers > Building: powerpc pseries_le_defconfig > Using /home/sfr/next/next/arch/powerpc/configs/ppc64_defconfig as base > Merging /home/sfr/next/next/arch/powerpc/configs/le.config > Merging /home/sfr/next/next/arch/powerpc/configs/guest.config > Value of CONFIG_VIRTIO_BLK is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VIRTIO_BLK=m > New value: CONFIG_VIRTIO_BLK=y > > Value of CONFIG_SCSI_VIRTIO is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_SCSI_VIRTIO=m > New value: CONFIG_SCSI_VIRTIO=y > > Value of CONFIG_VIRTIO_NET is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VIRTIO_NET=m > New value: CONFIG_VIRTIO_NET=y > > Value of CONFIG_VIRTIO_CONSOLE is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VIRTIO_CONSOLE=m > New value: CONFIG_VIRTIO_CONSOLE=y > > Value of CONFIG_VIRTIO_PCI is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VIRTIO_PCI=m > New value: CONFIG_VIRTIO_PCI=y > > Value of CONFIG_VIRTIO_BALLOON is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VIRTIO_BALLOON=m > New value: CONFIG_VIRTIO_BALLOON=y > > Value of CONFIG_VHOST_NET is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_VHOST_NET=m > New value: CONFIG_VHOST_NET=y > > Value of CONFIG_IBMVETH is redefined by fragment > 
/home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_IBMVETH=m > New value: CONFIG_IBMVETH=y > > Value of CONFIG_IBMVNIC is redefined by fragment > /home/sfr/next/next/arch/powerpc/configs/guest.config: > Previous value: CONFIG_IBMVNIC=m > New value: CONFIG_IBMVNIC=y > > I am not sure exactly which change(s) introduced these warnings. > > -- > Cheers, > Stephen Rothwell
Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail
On Wed, 10 May 2023, Matthew Wilcox wrote:
> On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote:
> > Two: pte_offset_map() will need to do an rcu_read_lock(), with the
> > corresponding rcu_read_unlock() in pte_unmap(). But most architectures
> > never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
> > after pte_offset_map(), or have used userspace pte_offset_map() where
> > pte_offset_kernel() is more correct. No problem in the current tree,
> > but a problem once an rcu_read_unlock() will be needed to keep balance.
>
> Hi Hugh,
>
> I shall have to spend some time looking at these patches, but at LSFMM
> just a few hours ago, I proposed and nobody objected to removing
> CONFIG_HIGHPTE. I don't intend to take action on that consensus
> immediately, so I can certainly wait until your patches are applied, but
> if this information simplifies what you're doing, feel free to act on it.

Thanks a lot, Matthew: very considerate, as usual. Yes, I did see your "Whither Highmem?" (wither highmem!) proposal on the list, and it did make me think: better get these patches and preview out soon, before you get to vanish pte_unmap() altogether. HIGHMEM or not, HIGHPTE or not, I think pte_offset_map() and pte_unmap() still have an important role to play.

I don't really understand why you're going down a remove-CONFIG_HIGHPTE route: I thought you were motivated by the awkwardness of kmap on large folios; but I don't see how removing HIGHPTE helps with that at all (unless you have a "large page tables" effort in mind, but I doubt it). But I've no investment in CONFIG_HIGHPTE if people think now is the time to remove it: I disagree, but wouldn't miss it myself - so long as you leave pte_offset_map() and pte_unmap() (under whatever names).

I don't think removing CONFIG_HIGHPTE will simplify what I'm doing.
For a moment it looked like it would: the PAE case is nasty (and our data centres have not been on PAE for a long time, so it wasn't a problem I had to face before); and knowing pmd_high must be 0 for a page table looked like it would help, but now I'm not so sure of that (hmm, I'm changing my mind again as I write). Peter's pmdp_get_lockless() does rely for complete correctness on interrupts being disabled, and I suspect that I may be forced in the PAE case to do so briefly; but detest that notion. For now I'm just deferring it, hoping for a better idea before third series finalized. I mention this (and Cc Peter) in passing: don't want this arch thread to go down into that rabbit hole: we can start a fresh thread on it if you wish, but right now my priority is commit messages for the second series, rather than solving (or even detailing) the PAE problem. Hugh
Re: [PATCH 01/23] arm: allow pte_offset_map[_lock]() to fail
On Wed, 10 May 2023, Matthew Wilcox wrote:
> On Tue, May 09, 2023 at 09:42:44PM -0700, Hugh Dickins wrote:
> > diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
> > index e4c2677cc1e9..2f6163f05e93 100644
> > --- a/arch/arm/lib/uaccess_with_memcpy.c
> > +++ b/arch/arm/lib/uaccess_with_memcpy.c
> > @@ -74,6 +74,9 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
> > 	return 0;
> >
> > 	pte = pte_offset_map_lock(current->mm, pmd, addr, &ptl);
> > +	if (unlikely(!pte))
> > +		return 0;
>
> Failing seems like the wrong thing to do if we transitioned from a PTE
> to PMD here? Looks to me like we should goto a new label right after
> the 'pmd = pmd_offset(pud, addr);', no?

I'm pretty sure it's right as is; but probably more by luck than care - I do not think I studied this code as closely as you have now made me do; and it's clear that this is a piece of code where rare transient issues could come up, and must be handled correctly. Thank you for making me look again.

The key is in the callers of pin_page_for_write(): __copy_to_user_memcpy() and __clear_user_memset(). They're doing "while (!pin_page_for_write())" loops - they hope for the fast path of getting pte_lock or pmd_lock on the page, and doing a __memcpy() or __memset() to the user address; but if anything goes "wrong", a __put_user() to fault in the page (or fail), then pin_page_for_write() again.

"if (unlikely(!pte)) return 0" says that the expected fast path did not succeed, so please __put_user() and have another go. It is somewhere I could have done a "goto again", but that would be superfluous when it's already designed that way at the outer level.

Hugh
Re: [PATCH] cachestat: wire up cachestat for other architectures
Nhat Pham writes: > cachestat is previously only wired in for x86 (and architectures using > the generic unistd.h table): > > https://lore.kernel.org/lkml/20230503013608.2431726-1-npha...@gmail.com/ > > This patch wires cachestat in for all the other architectures. > > Signed-off-by: Nhat Pham > --- > arch/alpha/kernel/syscalls/syscall.tbl | 1 + > arch/arm/tools/syscall.tbl | 1 + > arch/ia64/kernel/syscalls/syscall.tbl | 1 + > arch/m68k/kernel/syscalls/syscall.tbl | 1 + > arch/microblaze/kernel/syscalls/syscall.tbl | 1 + > arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + > arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + > arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + > arch/parisc/kernel/syscalls/syscall.tbl | 1 + > arch/powerpc/kernel/syscalls/syscall.tbl| 1 + With the change to the selftest (see my other mail), I tested this on powerpc and all tests pass. Tested-by: Michael Ellerman (powerpc) cheers
Re: [PATCH v13 3/3] selftests: Add selftests for cachestat
Nhat Pham writes: > Test cachestat on a newly created file, /dev/ files, and /proc/ files. > Also test on a shmem file (which can also be tested with huge pages > since tmpfs supports huge pages). > > Signed-off-by: Nhat Pham ... > diff --git a/tools/testing/selftests/cachestat/test_cachestat.c > b/tools/testing/selftests/cachestat/test_cachestat.c > new file mode 100644 > index ..c3823b809c25 > --- /dev/null > +++ b/tools/testing/selftests/cachestat/test_cachestat.c > @@ -0,0 +1,258 @@ > +// SPDX-License-Identifier: GPL-2.0 > +#define _GNU_SOURCE > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "../kselftest.h" > + > +static const char * const dev_files[] = { > + "/dev/zero", "/dev/null", "/dev/urandom", > + "/proc/version", "/proc" > +}; > +static const int cachestat_nr = 451; > + > +void print_cachestat(struct cachestat *cs) > +{ > + ksft_print_msg( > + "Using cachestat: Cached: %lu, Dirty: %lu, Writeback: %lu, Evicted: > %lu, Recently Evicted: %lu\n", > + cs->nr_cache, cs->nr_dirty, cs->nr_writeback, > + cs->nr_evicted, cs->nr_recently_evicted); > +} > + > +bool write_exactly(int fd, size_t filesize) > +{ > + char data[filesize]; On kernels with 64K pages (powerpc at least), this tries to allocate 64MB on the stack which segfaults. Allocating data with malloc avoids the problem and allows the test to pass. Looks like this commit is still in mm-unstable, so maybe Andrew can squash the incremental diff below in, if it looks OK to you. The diff is a bit big because I unindented the body of the function. 
cheers diff --git a/tools/testing/selftests/cachestat/test_cachestat.c b/tools/testing/selftests/cachestat/test_cachestat.c index 9be2262e5c17..54d09b820ed4 100644 --- a/tools/testing/selftests/cachestat/test_cachestat.c +++ b/tools/testing/selftests/cachestat/test_cachestat.c @@ -31,48 +31,59 @@ void print_cachestat(struct cachestat *cs) bool write_exactly(int fd, size_t filesize) { - char data[filesize]; - bool ret = true; int random_fd = open("/dev/urandom", O_RDONLY); + char *cursor, *data; + int remained; + bool ret; if (random_fd < 0) { ksft_print_msg("Unable to access urandom.\n"); ret = false; goto out; - } else { - int remained = filesize; - char *cursor = data; + } - while (remained) { - ssize_t read_len = read(random_fd, cursor, remained); + data = malloc(filesize); + if (!data) { + ksft_print_msg("Unable to allocate data.\n"); + ret = false; + goto close_random_fd; + } - if (read_len <= 0) { - ksft_print_msg("Unable to read from urandom.\n"); - ret = false; - goto close_random_fd; - } + remained = filesize; + cursor = data; - remained -= read_len; - cursor += read_len; + while (remained) { + ssize_t read_len = read(random_fd, cursor, remained); + + if (read_len <= 0) { + ksft_print_msg("Unable to read from urandom.\n"); + ret = false; + goto out_free_data; } - /* write random data to fd */ - remained = filesize; - cursor = data; - while (remained) { - ssize_t write_len = write(fd, cursor, remained); + remained -= read_len; + cursor += read_len; + } - if (write_len <= 0) { - ksft_print_msg("Unable write random data to file.\n"); - ret = false; - goto close_random_fd; - } + /* write random data to fd */ + remained = filesize; + cursor = data; + while (remained) { + ssize_t write_len = write(fd, cursor, remained); - remained -= write_len; - cursor += write_len; + if (write_len <= 0) { + ksft_print_msg("Unable write random data to file.\n"); + ret = false; + goto out_free_data; } + + remained -= write_len; + cursor += write_len; } + ret = true; 
+out_free_data: + free(data); close_random_fd: close(random_fd); out:
Re: [PATCH 21/23] x86: Allow get_locked_pte() to fail
On Wed, 10 May 2023, Peter Zijlstra wrote:
> On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote:
> > In rare transient cases, not yet made possible, pte_offset_map() and
> > pte_offset_map_lock() may not find a page table: handle appropriately.
> >
> > Signed-off-by: Hugh Dickins
> > ---
> > arch/x86/kernel/ldt.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
> > index 525876e7b9f4..eb844549cd83 100644
> > --- a/arch/x86/kernel/ldt.c
> > +++ b/arch/x86/kernel/ldt.c
> > @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
> >
> > 	va = (unsigned long)ldt_slot_va(ldt->slot) + offset;
> > 	ptep = get_locked_pte(mm, va, &ptl);
> > -	pte_clear(mm, va, ptep);
> > -	pte_unmap_unlock(ptep, ptl);
> > +	if (ptep) {
> > +		pte_clear(mm, va, ptep);
> > +		pte_unmap_unlock(ptep, ptl);
> > +	}
> > }
>
> Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/

I apologize for sending you back there!

> At first glance this seems wrong; we can't just not unmap the LDT if we
> can't find it in a hurry. Also, IIRC this isn't in fact a regular user
> mapping, so it should not be subject to THP induced seizures.
>
> ... memory bubbles back ... for PTI kernels we need to map this in the
> user and kernel page-tables because obviously userspace needs to be able
> to have access to the LDT. But it is not directly accessible by
> userspace. It lives in the cpu_entry_area as a virtual map of the real
> kernel allocation, and this virtual address is used for LLDT.
> Modification is done through sys_modify_ldt().

And there must be a user-style page table backing that cpu_entry_area, because the use of get_locked_pte() and pte_unmap_unlock() implies that there's a user page table (struct page containing spinlock if config says so) rather than just a kernel page table mapping it.
> > I think I would feel much better if this were something like: > > if (!WARN_ON_ONCE(!ptep)) > > This really shouldn't fail and if it does, simply skipping it isn't the > right thing either. Sure, I'll gladly make that change when I respin - not immediately, let's get more feedback on this arch series first, but maybe in a week's time. Thanks for looking so quickly, Peter: I didn't Cc you on this particular series, but shall certainly be doing so on the ones that follow, because a few of those patches go into interesting pmdp_get_lockless() territory. Hugh
Re: [PATCH 05/23] m68k: allow pte_offset_map[_lock]() to fail
On Wed, 10 May 2023, Geert Uytterhoeven wrote: > Hi Hugh, > > Thanks for your patch! And thank you for looking so quickly, Geert. > > On Wed, May 10, 2023 at 6:48 AM Hugh Dickins wrote: > > In rare transient cases, not yet made possible, pte_offset_map() and > > pte_offset_map_lock() may not find a page table: handle appropriately. > > > > Restructure cf_tlb_miss() with a pte_unmap() (previously omitted) > > at label out, followed by one local_irq_restore() for all. > > That's a bug fix, which should be a separate patch? No, that's not a bug fix for the current tree, since m68k does not offer CONFIG_HIGHPTE, so pte_unmap() is never anything but a no-op for m68k (see include/linux/pgtable.h). But I want to change pte_unmap() to do something even without CONFIG_HIGHPTE, so have to fix up any such previously harmless omissions in this series first. > > > > > Signed-off-by: Hugh Dickins > > > > --- a/arch/m68k/include/asm/mmu_context.h > > +++ b/arch/m68k/include/asm/mmu_context.h > > @@ -99,7 +99,7 @@ static inline void load_ksp_mmu(struct task_struct *task) > > p4d_t *p4d; > > pud_t *pud; > > pmd_t *pmd; > > - pte_t *pte; > > + pte_t *pte = NULL; > > unsigned long mmuar; > > > > local_irq_save(flags); > > @@ -139,7 +139,7 @@ static inline void load_ksp_mmu(struct task_struct > > *task) > > > > pte = (mmuar >= PAGE_OFFSET) ? pte_offset_kernel(pmd, mmuar) > > : pte_offset_map(pmd, mmuar); > > - if (pte_none(*pte) || !pte_present(*pte)) > > + if (!pte || pte_none(*pte) || !pte_present(*pte)) > > goto bug; > > If the absence of a pte is to become a non-abnormal case, it should > probably jump to "end" instead, to avoid spamming the kernel log. I don't think so (but of course it's hard for you to tell, without seeing all completed series of series). If pmd_none(*pmd) can safely goto bug just above, and pte_none(*pte) goto bug here, well, the !pte case is going to be stranger than either of those. 
My understanding of this function, load_ksp_mmu(), is that it's dealing at context switch with a part of userspace which very much needs to be present: whatever keeps that from being swapped out or migrated at present, will be sure to keep the !pte case away - we cannot steal its page table just at random (and a THP on m68k would be surprising too). Though there is one case I can think of which will cause !pte here, and so goto bug: if the pmd entry has got corrupted, and counts as pmd_bad(), which will be tested (and cleared) in pte_offset_map(). But it is okay to report a bug in that case. I can certainly change this to goto end instead if you still prefer, no problem; but I'd rather keep it as is, if only for me to be proved wrong by you actually seeing spam there. > > > > > set_pte(pte, pte_mkyoung(*pte)); > > @@ -161,6 +161,8 @@ static inline void load_ksp_mmu(struct task_struct > > *task) > > bug: > > pr_info("ksp load failed: mm=0x%p ksp=0x08%lx\n", mm, mmuar); > > end: > > + if (pte && mmuar < PAGE_OFFSET) > > + pte_unmap(pte); > > Is this also a bugfix, not mentioned in the patch description? I'm not sure whether you're referring to the pte_unmap() which we already discussed above, or you're seeing something else in addition; but I don't think there's a bugfix here, just a rearrangement because we now want lots of cases to do the pte_unmap() and local_irq_restore(). Hugh > > > local_irq_restore(flags); > > } > > > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds
linux-next: build warnings in powerpc build
Hi all, Today's (and a few before) linux-next build (powerpc pseries_le_defconfig) produced these warnings: Building: powerpc pseries_le_defconfig Using /home/sfr/next/next/arch/powerpc/configs/ppc64_defconfig as base Merging /home/sfr/next/next/arch/powerpc/configs/le.config Merging /home/sfr/next/next/arch/powerpc/configs/guest.config Value of CONFIG_VIRTIO_BLK is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VIRTIO_BLK=m New value: CONFIG_VIRTIO_BLK=y Value of CONFIG_SCSI_VIRTIO is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_SCSI_VIRTIO=m New value: CONFIG_SCSI_VIRTIO=y Value of CONFIG_VIRTIO_NET is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VIRTIO_NET=m New value: CONFIG_VIRTIO_NET=y Value of CONFIG_VIRTIO_CONSOLE is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VIRTIO_CONSOLE=m New value: CONFIG_VIRTIO_CONSOLE=y Value of CONFIG_VIRTIO_PCI is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VIRTIO_PCI=m New value: CONFIG_VIRTIO_PCI=y Value of CONFIG_VIRTIO_BALLOON is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VIRTIO_BALLOON=m New value: CONFIG_VIRTIO_BALLOON=y Value of CONFIG_VHOST_NET is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_VHOST_NET=m New value: CONFIG_VHOST_NET=y Value of CONFIG_IBMVETH is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_IBMVETH=m New value: CONFIG_IBMVETH=y Value of CONFIG_IBMVNIC is redefined by fragment /home/sfr/next/next/arch/powerpc/configs/guest.config: Previous value: CONFIG_IBMVNIC=m New value: CONFIG_IBMVNIC=y I am not sure exactly which change(s) introduced these warnings. 
-- Cheers, Stephen Rothwell
[PATCH] cachestat: wire up cachestat for other architectures
cachestat is previously only wired in for x86 (and architectures using the generic unistd.h table): https://lore.kernel.org/lkml/20230503013608.2431726-1-npha...@gmail.com/ This patch wires cachestat in for all the other architectures. Signed-off-by: Nhat Pham --- arch/alpha/kernel/syscalls/syscall.tbl | 1 + arch/arm/tools/syscall.tbl | 1 + arch/ia64/kernel/syscalls/syscall.tbl | 1 + arch/m68k/kernel/syscalls/syscall.tbl | 1 + arch/microblaze/kernel/syscalls/syscall.tbl | 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + arch/parisc/kernel/syscalls/syscall.tbl | 1 + arch/powerpc/kernel/syscalls/syscall.tbl| 1 + arch/s390/kernel/syscalls/syscall.tbl | 1 + arch/sh/kernel/syscalls/syscall.tbl | 1 + arch/sparc/kernel/syscalls/syscall.tbl | 1 + arch/xtensa/kernel/syscalls/syscall.tbl | 1 + 14 files changed, 14 insertions(+) diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index 8ebacf37a8cf..1f13995d00d7 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -490,3 +490,4 @@ 558common process_mreleasesys_process_mrelease 559common futex_waitv sys_futex_waitv 560common set_mempolicy_home_node sys_ni_syscall +561common cachestat sys_cachestat diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index ac964612d8b0..8ebed8a13874 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -464,3 +464,4 @@ 448common process_mreleasesys_process_mrelease 449common futex_waitv sys_futex_waitv 450common set_mempolicy_home_node sys_set_mempolicy_home_node +451common cachestat sys_cachestat diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl index 72c929d9902b..f8c74ffeeefb 100644 --- a/arch/ia64/kernel/syscalls/syscall.tbl +++ b/arch/ia64/kernel/syscalls/syscall.tbl @@ -371,3 +371,4 @@ 448common 
process_mreleasesys_process_mrelease 449common futex_waitv sys_futex_waitv 450common set_mempolicy_home_node sys_set_mempolicy_home_node +451common cachestat sys_cachestat diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index b1f3940bc298..4f504783371f 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -450,3 +450,4 @@ 448common process_mreleasesys_process_mrelease 449common futex_waitv sys_futex_waitv 450common set_mempolicy_home_node sys_set_mempolicy_home_node +451common cachestat sys_cachestat diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl index 820145e47350..858d22bf275c 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -456,3 +456,4 @@ 448common process_mreleasesys_process_mrelease 449common futex_waitv sys_futex_waitv 450common set_mempolicy_home_node sys_set_mempolicy_home_node +451common cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl index 253ff994ed2e..1976317d4e8b 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -389,3 +389,4 @@ 448n32 process_mreleasesys_process_mrelease 449n32 futex_waitv sys_futex_waitv 450n32 set_mempolicy_home_node sys_set_mempolicy_home_node +451n32 cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl index 3f1886ad9d80..cfda2511badf 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -365,3 +365,4 @@ 448n64 process_mreleasesys_process_mrelease 449n64 futex_waitv sys_futex_waitv 450common set_mempolicy_home_node sys_set_mempolicy_home_node +451n64 cachestat sys_cachestat diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index 
8f243e35a7b2..7692234c3768 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -438,3 +438,4 @@ 448o32 process_mreleasesys_process_mrelease 449o32 futex_waitv sys_futex_waitv 450
Re: [PATCH 1/4] block:sed-opal: SED Opal keystore
On Fri May 5, 2023 at 10:43 PM EEST, wrote: > From: Greg Joyce > > Add read and write functions that allow SED Opal keys to stored > in a permanent keystore. Please be more verbose starting from "Self-Encrypting Drive (SED)", instead of just "SED", and take time to explain what these keys are. > > Signed-off-by: Greg Joyce > Reviewed-by: Jonathan Derrick > --- > block/Makefile | 2 +- > block/sed-opal-key.c | 24 > include/linux/sed-opal-key.h | 15 +++ > 3 files changed, 40 insertions(+), 1 deletion(-) > create mode 100644 block/sed-opal-key.c > create mode 100644 include/linux/sed-opal-key.h > > diff --git a/block/Makefile b/block/Makefile > index 4e01bb71ad6e..464a9f209552 100644 > --- a/block/Makefile > +++ b/block/Makefile > @@ -35,7 +35,7 @@ obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o > obj-$(CONFIG_BLK_WBT)+= blk-wbt.o > obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o > obj-$(CONFIG_BLK_DEBUG_FS_ZONED)+= blk-mq-debugfs-zoned.o > -obj-$(CONFIG_BLK_SED_OPAL) += sed-opal.o > +obj-$(CONFIG_BLK_SED_OPAL) += sed-opal.o sed-opal-key.o > obj-$(CONFIG_BLK_PM) += blk-pm.o > obj-$(CONFIG_BLK_INLINE_ENCRYPTION) += blk-crypto.o blk-crypto-profile.o \ > blk-crypto-sysfs.o > diff --git a/block/sed-opal-key.c b/block/sed-opal-key.c > new file mode 100644 > index ..16f380164c44 > --- /dev/null > +++ b/block/sed-opal-key.c > @@ -0,0 +1,24 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * SED key operations. > + * > + * Copyright (C) 2022 IBM Corporation > + * > + * These are the accessor functions (read/write) for SED Opal > + * keys. Specific keystores can provide overrides. 
> + * > + */ > + > +#include > +#include > +#include > + > +int __weak sed_read_key(char *keyname, char *key, u_int *keylen) > +{ > + return -EOPNOTSUPP; > +} > + > +int __weak sed_write_key(char *keyname, char *key, u_int keylen) > +{ > + return -EOPNOTSUPP; > +} > diff --git a/include/linux/sed-opal-key.h b/include/linux/sed-opal-key.h > new file mode 100644 > index ..c9b1447986d8 > --- /dev/null > +++ b/include/linux/sed-opal-key.h > @@ -0,0 +1,15 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * SED key operations. > + * > + * Copyright (C) 2022 IBM Corporation > + * > + * These are the accessor functions (read/write) for SED Opal > + * keys. Specific keystores can provide overrides. > + * > + */ > + > +#include > + > +int sed_read_key(char *keyname, char *key, u_int *keylen); > +int sed_write_key(char *keyname, char *key, u_int keylen); > -- > gjo...@linux.vnet.ibm.com BR, Jarkko
Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
Hi, On Wed, May 10, 2023 at 9:30 AM Mark Rutland wrote: > > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote: > > Hi, > > Hi Doug, > > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson > > wrote: > > > This is an attempt to resurrect Sumit's old patch series [1] that > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I > > > could find was v7, so I called this series v8. I haven't copied all of > > > his old changelongs here, but you can find them from the link. > > > > > > Since v7, I have: > > > * Addressed the small amount of feedback that was there for v7. > > > * Rebased. > > > * Added a new patch that prevents us from spamming the logs with idle > > > tasks. > > > * Added an extra patch to gracefully fall back to regular IPIs if > > > pseudo-NMIs aren't there. > > > > > > Since there appear to be a few different patches series related to > > > being able to use NMIs to get stack traces of crashed systems, let me > > > try to organize them to the best of my understanding: > > > > > > a) This series. On its own, a) will (among other things) enable stack > > >traces of all running processes with the soft lockup detector if > > >you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On > > >its own, a) doesn't give a hard lockup detector. > > > > > > b) A different recently-posted series [2] that adds a hard lockup > > >detector based on perf. On its own, b) gives a stack crawl of the > > >locked up CPU but no stack crawls of other CPUs (even if they're > > >locked too). Together with a) + b) we get everything (full lockup > > >detect, full ability to get stack crawls). > > > > > > c) The old Android "buddy" hard lockup detector [3] that I'm > > >considering trying to upstream. If b) lands then I believe c) would > > >be redundant (at least for arm64). 
c) on its own is really only > > >useful on arm64 for platforms that can print CPU_DBGPCSR somehow > > >(see [4]). a) + c) is roughly as good as a) + b). > > > It's been 3 weeks and I haven't heard a peep on this series. That > > means nobody has any objections and it's all good to land, right? > > Right? :-P > > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI > support (and fixing that requires an overhaul of our DAIF / IRQ flag > management, which I've been chipping away at for a number of releases), so I > hadn't looked at this in detail yet because the foundations are still somewhat > dodgy. > > I appreciate that this has been around for a while, and it's on my queue to > look at. Ah, thanks for the heads up! We've been thinking about turning this on in production in ChromeOS because it will help us track down a whole class of field-generated crash reports that are otherwise opaque to us. It sounds as if maybe that's not a good idea quite yet? Do you have any idea of how much farther along this needs to go? ...of course, we've also run into issues with Mediatek devices because they don't save/restore GICR registers properly [1]. In theory, we might be able to work around that in the kernel. In any case, even if there are bugs that would prevent turning this on for production, it still seems like we could still land this series. It simply wouldn't do anything until someone turned on pseudo NMIs, which wouldn't happen till the kinks are worked out. ...actually, I guess I should say that if all the patches of the current series do land then it actually _would_ still do something, even without pseudo-NMI. Assuming the last patch looks OK, it would at least start falling back to using regular IPIs to do backtraces. That wouldn't get backtraces on hard locked up CPUs but it would be better than what we have today where we don't get any backtraces. This would get arm64 on par with arm32... [1] https://issuetracker.google.com/281831288
Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote: > Hi, Hi Doug, > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson > wrote: > > This is an attempt to resurrect Sumit's old patch series [1] that > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I > > could find was v7, so I called this series v8. I haven't copied all of > > his old changelongs here, but you can find them from the link. > > > > Since v7, I have: > > * Addressed the small amount of feedback that was there for v7. > > * Rebased. > > * Added a new patch that prevents us from spamming the logs with idle > > tasks. > > * Added an extra patch to gracefully fall back to regular IPIs if > > pseudo-NMIs aren't there. > > > > Since there appear to be a few different patches series related to > > being able to use NMIs to get stack traces of crashed systems, let me > > try to organize them to the best of my understanding: > > > > a) This series. On its own, a) will (among other things) enable stack > >traces of all running processes with the soft lockup detector if > >you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On > >its own, a) doesn't give a hard lockup detector. > > > > b) A different recently-posted series [2] that adds a hard lockup > >detector based on perf. On its own, b) gives a stack crawl of the > >locked up CPU but no stack crawls of other CPUs (even if they're > >locked too). Together with a) + b) we get everything (full lockup > >detect, full ability to get stack crawls). > > > > c) The old Android "buddy" hard lockup detector [3] that I'm > >considering trying to upstream. If b) lands then I believe c) would > >be redundant (at least for arm64). c) on its own is really only > >useful on arm64 for platforms that can print CPU_DBGPCSR somehow > >(see [4]). a) + c) is roughly as good as a) + b). > It's been 3 weeks and I haven't heard a peep on this series. 
That > means nobody has any objections and it's all good to land, right? > Right? :-P FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI support (and fixing that requires an overhaul of our DAIF / IRQ flag management, which I've been chipping away at for a number of releases), so I hadn't looked at this in detail yet because the foundations are still somewhat dodgy. I appreciate that this has been around for a while, and it's on my queue to look at. Thanks, Mark.
Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
Hi, On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson wrote: > > This is an attempt to resurrect Sumit's old patch series [1] that > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and > also to round up CPUs in kdb/kgdb. The last post from Sumit that I > could find was v7, so I called this series v8. I haven't copied all of > his old changelogs here, but you can find them from the link. > > Since v7, I have: > * Addressed the small amount of feedback that was there for v7. > * Rebased. > * Added a new patch that prevents us from spamming the logs with idle > tasks. > * Added an extra patch to gracefully fall back to regular IPIs if > pseudo-NMIs aren't there. > > Since there appear to be a few different patch series related to > being able to use NMIs to get stack traces of crashed systems, let me > try to organize them to the best of my understanding: > > a) This series. On its own, a) will (among other things) enable stack > traces of all running processes with the soft lockup detector if > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On > its own, a) doesn't give a hard lockup detector. > > b) A different recently-posted series [2] that adds a hard lockup > detector based on perf. On its own, b) gives a stack crawl of the > locked up CPU but no stack crawls of other CPUs (even if they're > locked too). Together with a) + b) we get everything (full lockup > detect, full ability to get stack crawls). > > c) The old Android "buddy" hard lockup detector [3] that I'm > considering trying to upstream. If b) lands then I believe c) would > be redundant (at least for arm64). c) on its own is really only > useful on arm64 for platforms that can print CPU_DBGPCSR somehow > (see [4]). a) + c) is roughly as good as a) + b). 
> > [1] > https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.g...@linaro.org/ > [2] > https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.c...@mediatek.com/ > [3] https://issuetracker.google.com/172213097 > [4] https://issuetracker.google.com/172213129 > > Changes in v8: > - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param > - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param > - Add loongarch support, too > - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP > - "Tag the arm64 idle functions as __cpuidle" new for v8 > - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8 > - "Fallback to a regular IPI if NMI isn't enabled" new for v8 > > Douglas Anderson (3): > arm64: idle: Tag the arm64 idle functions as __cpuidle > kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB > arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled > > Sumit Garg (7): > arm64: Add framework to turn IPI as NMI > irqchip/gic-v3: Enable support for SGIs to act as NMIs > arm64: smp: Assign and setup an IPI as NMI > nmi: backtrace: Allow runtime arch specific override > arm64: ipi_nmi: Add support for NMI backtrace > kgdb: Expose default CPUs roundup fallback mechanism > arm64: kgdb: Roundup cpus using IPI as NMI > > arch/arm/include/asm/irq.h | 2 +- > arch/arm/kernel/smp.c| 3 +- > arch/arm64/include/asm/irq.h | 4 ++ > arch/arm64/include/asm/nmi.h | 17 + > arch/arm64/kernel/Makefile | 2 +- > arch/arm64/kernel/idle.c | 4 +- > arch/arm64/kernel/ipi_nmi.c | 103 +++ > arch/arm64/kernel/kgdb.c | 18 ++ > arch/arm64/kernel/smp.c | 8 +++ > arch/loongarch/include/asm/irq.h | 2 +- > arch/loongarch/kernel/process.c | 3 +- > arch/mips/include/asm/irq.h | 2 +- > arch/mips/kernel/process.c | 3 +- > arch/powerpc/include/asm/nmi.h | 2 +- > arch/powerpc/kernel/stacktrace.c | 3 +- > arch/sparc/include/asm/irq_64.h | 2 +- > arch/sparc/kernel/process_64.c | 4 +- > arch/x86/include/asm/irq.h | 
2 +- > arch/x86/kernel/apic/hw_nmi.c| 3 +- > drivers/irqchip/irq-gic-v3.c | 29 ++--- > include/linux/kgdb.h | 13 > include/linux/nmi.h | 12 ++-- > kernel/debug/debug_core.c| 8 ++- > 23 files changed, 217 insertions(+), 32 deletions(-) It's been 3 weeks and I haven't heard a peep on this series. That means nobody has any objections and it's all good to land, right? Right? :-P Seriously, though, I really think we should figure out how to get this landed. There's obviously interest from several different parties and I'm chomping at the bit waiting to spin this series according to your wishes. I also don't think there's anything super scary/ugly here. IMO the ideal situation is that folks are OK with the idea presented in the last patch in the series ("arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled") and then I can re-spin this series to be _much_ simpler. That being said, I'm also OK with
[Bug 217350] kdump kernel hangs in powerkvm guest
https://bugzilla.kernel.org/show_bug.cgi?id=217350 --- Comment #2 from Cédric Le Goater (c...@kaod.org) --- Hello Pingfan, This was a large series. Did you identify the exact patch? It should be related to the powerpc/pseries/pci or powerpc/xive subsystem.
Re: [PATCH 01/23] arm: allow pte_offset_map[_lock]() to fail
On Tue, May 09, 2023 at 09:42:44PM -0700, Hugh Dickins wrote: > diff --git a/arch/arm/lib/uaccess_with_memcpy.c > b/arch/arm/lib/uaccess_with_memcpy.c > index e4c2677cc1e9..2f6163f05e93 100644 > --- a/arch/arm/lib/uaccess_with_memcpy.c > +++ b/arch/arm/lib/uaccess_with_memcpy.c > @@ -74,6 +74,9 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, > spinlock_t **ptlp) > return 0; > > pte = pte_offset_map_lock(current->mm, pmd, addr, &ptl); > + if (unlikely(!pte)) > + return 0; Failing seems like the wrong thing to do if we transitioned from a PTE to PMD here? Looks to me like we should goto a new label right after the 'pmd = pmd_offset(pud, addr);', no?
Re: [PATCH 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge()
On Tue, 09 May 2023 21:59:57 PDT (-0700), hu...@google.com wrote: pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/riscv/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c index a163a3e0f0d4..80926946759f 100644 --- a/arch/riscv/mm/hugetlbpage.c +++ b/arch/riscv/mm/hugetlbpage.c @@ -43,7 +43,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_alloc_map(mm, pmd, addr & napot_cont_mask(order)); + pte = pte_alloc_huge(mm, pmd, addr & napot_cont_mask(order)); break; } } @@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_offset_kernel(pmd, addr & napot_cont_mask(order)); + pte = pte_offset_huge(pmd, addr & napot_cont_mask(order)); break; } } Acked-by: Palmer Dabbelt
[Bug 217427] Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)
https://bugzilla.kernel.org/show_bug.cgi?id=217427 --- Comment #1 from darkbasic (darkba...@linuxsystems.it) --- Created attachment 304241 --> https://bugzilla.kernel.org/attachment.cgi?id=304241&action=edit journalctl_6.3.1.log
[Bug 217427] New: Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0)
https://bugzilla.kernel.org/show_bug.cgi?id=217427 Bug ID: 217427 Summary: Linux 6.3.1 + AMD RX 570 on ppc64le 4K: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0) Product: Platform Specific/Hardware Version: 2.5 Hardware: All OS: Linux Status: NEW Severity: normal Priority: P3 Component: PPC-64 Assignee: platform_ppc...@kernel-bugs.osdl.org Reporter: darkba...@linuxsystems.it Regression: No I'm using Gentoo Linux on a Raptor CS Talos 2 ppc64le, GPU is an AMD RX 570. So far the past dozen of kernels (up to 6.2.14) worked flawlessly, but with 6.3.1 I don't get any video output and I get the following in journalctl: May 10 15:09:01 talos2 kernel: Kernel attempted to read user page (1128) - exploit attempt? (uid: 0) May 10 15:09:01 talos2 kernel: BUG: Unable to handle kernel data access on read at 0x1128 May 10 15:09:01 talos2 kernel: Faulting instruction address: 0xc0080d1a805c May 10 15:09:01 talos2 kernel: Oops: Kernel access of bad area, sig: 11 [#1] May 10 15:09:01 talos2 kernel: LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=512 NUMA PowerNV May 10 15:09:01 talos2 kernel: Modules linked in: rfkill(+) 8021q garp mrp stp llc binfmt_misc amdgpu uvcvideo uvc videobuf2_vmalloc videobuf2_memops gpu_sched snd_hda_codec_hdmi i2c_algo_bit at24(+) videobuf2_v4l2 drm_ttm_helper regmap_i2c videobuf2_common ttm snd_usb_audio drm_di> May 10 15:09:01 talos2 kernel: CPU: 0 PID: 188 Comm: kworker/0:3 Not tainted 6.3.1-gentoo-dist #1 May 10 15:09:01 talos2 kernel: Hardware name: T2P9S01 REV 1.01 POWER9 0x4e1202 opal:skiboot-9858186 PowerNV May 10 15:09:01 talos2 kernel: Workqueue: events_long drm_dp_check_and_send_link_address [drm_display_helper] May 10 15:09:01 talos2 kernel: NIP: c0080d1a805c LR: c0080d1a8018 CTR: c0c87900 May 10 15:09:01 talos2 kernel: REGS: cbeb3370 TRAP: 0300 Not tainted (6.3.1-gentoo-dist) May 10 15:09:01 talos2 kernel: MSR: 9280b033 CR: 88048223 XER: 005a May 10 15:09:01 talos2 kernel: CFAR: c0c87980 DAR: 1128 DSISR: 4000 IRQMASK: 0 GPR00: 
c0080d1a8018 cbeb3610 c0080d690f00 GPR04: 0002 c0080d6297c0 c0002a00b740 GPR08: 1124 c0080d431560 GPR12: c0c87900 c2a6b000 c0170ad8 c0001a460310 GPR16: 0045 c00022858388 c00026000340 0001 GPR20: 0001 c000260001a0 4000 GPR24: 4000 c0002610 c000228580b8 fffd GPR28: c000228580a0 c00022856000 c00022858000 May 10 15:09:01 talos2 kernel: NIP [c0080d1a805c] is_synaptics_cascaded_panamera+0x244/0x600 [amdgpu] May 10 15:09:01 talos2 kernel: LR [c0080d1a8018] is_synaptics_cascaded_panamera+0x200/0x600 [amdgpu] May 10 15:09:01 talos2 kernel: Call Trace: May 10 15:09:01 talos2 kernel: [cbeb3610] [c0080d1a8018] is_synaptics_cascaded_panamera+0x200/0x600 [amdgpu] (unreliable) May 10 15:09:01 talos2 kernel: [cbeb36d0] [c0080b7c2b18] drm_helper_probe_single_connector_modes+0x230/0x698 [drm_kms_helper] May 10 15:09:01 talos2 kernel: [cbeb3810] [c0c57174] drm_client_modeset_probe+0x2b4/0x16c0 May 10 15:09:01 talos2 kernel: [cbeb3a10] [c0080b7c7a30] __drm_fb_helper_initial_config_and_unlock+0x68/0x640 [drm_kms_helper] May 10 15:09:01 talos2 kernel: [cbeb3af0] [c0080b7c5b08] drm_fbdev_client_hotplug+0x40/0x1d0 [drm_kms_helper] May 10 15:09:01 talos2 kernel: [cbeb3b70] [c0c55480] drm_client_dev_hotplug+0x120/0x1b0 May 10 15:09:01 talos2 kernel: [cbeb3c00] [c0080b7c1130] drm_kms_helper_hotplug_event+0x58/0x80 [drm_kms_helper] May 10 15:09:01 talos2 kernel: [cbeb3c30] [c0080b80b298] drm_dp_check_and_send_link_address+0x330/0x3a0 [drm_display_helper] May 10 15:09:01 talos2 kernel: [cbeb3cd0] [c0162d84] process_one_work+0x2f4/0x580 May 10 15:09:01 talos2 kernel: [cbeb3d70] [c01630b8] worker_thread+0xa8/0x600 May 10 15:09:01 talos2 kernel: [cbeb3e00] [c0170bf4] kthread+0x124/0x130 May 10 15:09:01 talos2 kernel: [cbeb3e50] [c000dd14] ret_from_kernel_thread+0x5c/0x64 May 10 15:09:01 talos2 kernel: --- interrupt: 0 at 0x0 May 10 15:09:01 talos2 kernel: NIP: LR: CTR: May 10 15:09:01 talos2 kernel: REGS: cbeb3e80 TRAP: Not tainted
Re: [PATCH 21/23] x86: Allow get_locked_pte() to fail
On Tue, May 09, 2023 at 10:08:37PM -0700, Hugh Dickins wrote: > In rare transient cases, not yet made possible, pte_offset_map() and > pte_offset_map_lock() may not find a page table: handle appropriately. > > Signed-off-by: Hugh Dickins > --- > arch/x86/kernel/ldt.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c > index 525876e7b9f4..eb844549cd83 100644 > --- a/arch/x86/kernel/ldt.c > +++ b/arch/x86/kernel/ldt.c > @@ -367,8 +367,10 @@ static void unmap_ldt_struct(struct mm_struct *mm, > struct ldt_struct *ldt) > > va = (unsigned long)ldt_slot_va(ldt->slot) + offset; > ptep = get_locked_pte(mm, va, &ptl); > - pte_clear(mm, va, ptep); > - pte_unmap_unlock(ptep, ptl); > + if (ptep) { > + pte_clear(mm, va, ptep); > + pte_unmap_unlock(ptep, ptl); > + } > } Ow geez, now I have to go remember how the whole PTI/LDT crud worked :/ At first glance this seems wrong; we can't just not unmap the LDT if we can't find it in a hurry. Also, IIRC this isn't in fact a regular user mapping, so it should not be subject to THP induced seizures. ... memory bubbles back ... for PTI kernels we need to map this in the user and kernel page-tables because obviously userspace needs to be able to have access to the LDT. But it is not directly accessible by userspace. It lives in the cpu_entry_area as a virtual map of the real kernel allocation, and this virtual address is used for LLDT. Modification is done through sys_modify_ldt(). I think I would feel much better if this were something like: if (!WARN_ON_ONCE(!ptep)) This really shouldn't fail and if it does, simply skipping it isn't the right thing either.
Re: [PATCH 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge()
Hi Hugh, On 5/10/23 06:59, Hugh Dickins wrote: pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits that: to keep balance in future, use the recently added pte_alloc_huge() instead; with pte_offset_huge() a better name for pte_offset_kernel(). Signed-off-by: Hugh Dickins --- arch/riscv/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c index a163a3e0f0d4..80926946759f 100644 --- a/arch/riscv/mm/hugetlbpage.c +++ b/arch/riscv/mm/hugetlbpage.c @@ -43,7 +43,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_alloc_map(mm, pmd, addr & napot_cont_mask(order)); + pte = pte_alloc_huge(mm, pmd, addr & napot_cont_mask(order)); break; } } @@ -90,7 +90,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, for_each_napot_order(order) { if (napot_cont_size(order) == sz) { - pte = pte_offset_kernel(pmd, addr & napot_cont_mask(order)); + pte = pte_offset_huge(pmd, addr & napot_cont_mask(order)); break; } } Reviewed-by: Alexandre Ghiti Thanks, Alex
Re: [PATCH 05/23] m68k: allow pte_offset_map[_lock]() to fail
Hi Hugh, Thanks for your patch! On Wed, May 10, 2023 at 6:48 AM Hugh Dickins wrote: > In rare transient cases, not yet made possible, pte_offset_map() and > pte_offset_map_lock() may not find a page table: handle appropriately. > > Restructure cf_tlb_miss() with a pte_unmap() (previously omitted) > at label out, followed by one local_irq_restore() for all. That's a bug fix, which should be a separate patch? > > Signed-off-by: Hugh Dickins > --- a/arch/m68k/include/asm/mmu_context.h > +++ b/arch/m68k/include/asm/mmu_context.h > @@ -99,7 +99,7 @@ static inline void load_ksp_mmu(struct task_struct *task) > p4d_t *p4d; > pud_t *pud; > pmd_t *pmd; > - pte_t *pte; > + pte_t *pte = NULL; > unsigned long mmuar; > > local_irq_save(flags); > @@ -139,7 +139,7 @@ static inline void load_ksp_mmu(struct task_struct *task) > > pte = (mmuar >= PAGE_OFFSET) ? pte_offset_kernel(pmd, mmuar) > : pte_offset_map(pmd, mmuar); > - if (pte_none(*pte) || !pte_present(*pte)) > + if (!pte || pte_none(*pte) || !pte_present(*pte)) > goto bug; If the absence of a pte is to become a non-abnormal case, it should probably jump to "end" instead, to avoid spamming the kernel log. > > set_pte(pte, pte_mkyoung(*pte)); > @@ -161,6 +161,8 @@ static inline void load_ksp_mmu(struct task_struct *task) > bug: > pr_info("ksp load failed: mm=0x%p ksp=0x08%lx\n", mm, mmuar); > end: > + if (pte && mmuar < PAGE_OFFSET) > + pte_unmap(pte); Is this also a bugfix, not mentioned in the patch description? > local_irq_restore(flags); > } > Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail
On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote: > Two: pte_offset_map() will need to do an rcu_read_lock(), with the > corresponding rcu_read_unlock() in pte_unmap(). But most architectures > never supported CONFIG_HIGHPTE, so some don't always call pte_unmap() > after pte_offset_map(), or have used userspace pte_offset_map() where > pte_offset_kernel() is more correct. No problem in the current tree, > but a problem once an rcu_read_unlock() will be needed to keep balance. Hi Hugh, I shall have to spend some time looking at these patches, but at LSFMM just a few hours ago, I proposed and nobody objected to removing CONFIG_HIGHPTE. I don't intend to take action on that consensus immediately, so I can certainly wait until your patches are applied, but if this information simplifies what you're doing, feel free to act on it.