Re: [PATCH] arch: mm: rename FORCE_MAX_ZONEORDER to ARCH_FORCE_MAX_ORDER
Zi Yan writes:
> From: Zi Yan
>
> This Kconfig option is used by individual arch to set its desired
> MAX_ORDER. Rename it to reflect its actual use.
>
> Acked-by: Mike Rapoport
> Signed-off-by: Zi Yan
...
>  arch/powerpc/Kconfig                         | 2 +-
>  arch/powerpc/configs/85xx/ge_imp3a_defconfig | 2 +-
>  arch/powerpc/configs/fsl-emb-nonhw.config    | 2 +-

Acked-by: Michael Ellerman (powerpc)

cheers
Re: [PATCH] MAINTAINERS: Remove myself as EEH maintainer
On 2022-08-11 17:19:47 Thu, Oliver O'Halloran wrote:
> On Thu, Aug 11, 2022 at 4:22 PM Michael Ellerman wrote:
> >
> > Russell Currey writes:
> > > I haven't touched EEH in a long time and I don't have much knowledge
> > > of the subsystem at this point either, so it's misleading to have me
> > > as a maintainer.
> >
> > Thank you for your service.
> >
> > > I remain grateful to Oliver for picking up my slack over the years.
> >
> > Ack.
> >
> > But I wonder if he is still happy being listed as the only maintainer.
> > Given the status is "Supported" that means "Someone is actually paid to
> > look after this" - and I suspect Oracle are probably not paying him to
> > do that?
>
> I'm still happy to field questions and/or give reviews occasionally if
> needed, but yeah I don't have the time, hardware, or inclination to do
> any actual maintenance. IIRC Mahesh was supposed to take over
> supporting EEH after I left IBM. If he's still around he should
> probably be listed as a maintainer.

Yes, I am still around. I am currently looking into EEH and will be glad
to take over the maintainership of EEH for powerpc. Please feel free to
add me as maintainer for EEH.

Thanks,
--
Mahesh J Salgaonkar
Re: [PATCH] ASoC: fsl_sai: Remove unnecessary FIFO reset in ISR
On Tue, Aug 16, 2022 at 10:41 PM Shengjiu Wang wrote:
>
> The FIFO reset drops the words in the FIFO, which may cause
> channel swap when SAI module is running, especially when the
> DMA speed is low. So it is not good to do FIFO reset in ISR,
> then remove the operation.

I don't recall the details of adding this many years ago, but leaving
underrun/overrun errors unhandled does not sound right to me either.
Would it result in a channel swap also? Perhaps there needs to be a
reset routine that stops and restarts the DMA as well?
Re: [PATCH v4 2/2] selftests/powerpc: Add a test for execute-only memory
On Wed, 2022-08-17 at 15:06 +1000, Russell Currey wrote: > From: Nicholas Miehlbradt > > This selftest is designed to cover execute-only protections > on the Radix MMU but will also work with Hash. > > The tests are based on those found in pkey_exec_test with modifications > to use the generic mprotect() instead of the pkey variants. Would it make sense to rename pkey_exec_test to exec_test and have this test be apart of that? > > Signed-off-by: Nicholas Miehlbradt > Signed-off-by: Russell Currey > --- > v4: new > > tools/testing/selftests/powerpc/mm/Makefile | 3 +- > .../testing/selftests/powerpc/mm/exec_prot.c | 231 ++ > 2 files changed, 233 insertions(+), 1 deletion(-) > create mode 100644 tools/testing/selftests/powerpc/mm/exec_prot.c > > diff --git a/tools/testing/selftests/powerpc/mm/Makefile > b/tools/testing/selftests/powerpc/mm/Makefile > index 27dc09d0bfee..19dd0b2ea397 100644 > --- a/tools/testing/selftests/powerpc/mm/Makefile > +++ b/tools/testing/selftests/powerpc/mm/Makefile > @@ -3,7 +3,7 @@ noarg: > $(MAKE) -C ../ > > TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors > wild_bctr \ > - large_vm_fork_separation bad_accesses pkey_exec_prot \ > + large_vm_fork_separation bad_accesses exec_prot > pkey_exec_prot \ > pkey_siginfo stack_expansion_signal stack_expansion_ldst \ > large_vm_gpr_corruption > TEST_PROGS := stress_code_patching.sh > @@ -22,6 +22,7 @@ $(OUTPUT)/wild_bctr: CFLAGS += -m64 > $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64 > $(OUTPUT)/large_vm_gpr_corruption: CFLAGS += -m64 > $(OUTPUT)/bad_accesses: CFLAGS += -m64 > +$(OUTPUT)/exec_prot: CFLAGS += -m64 > $(OUTPUT)/pkey_exec_prot: CFLAGS += -m64 > $(OUTPUT)/pkey_siginfo: CFLAGS += -m64 > > diff --git a/tools/testing/selftests/powerpc/mm/exec_prot.c > b/tools/testing/selftests/powerpc/mm/exec_prot.c > new file mode 100644 > index ..db75b2225de1 > --- /dev/null > +++ b/tools/testing/selftests/powerpc/mm/exec_prot.c > @@ -0,0 +1,231 @@ > +// 
SPDX-License-Identifier: GPL-2.0 > + > +/* > + * Copyright 2022, Nicholas Miehlbradt, IBM Corporation > + * based on pkey_exec_prot.c > + * > + * Test if applying execute protection on pages works as expected. > + */ > + > +#define _GNU_SOURCE > +#include > +#include > +#include > +#include > + > +#include > +#include > + > +#include "pkeys.h" > + > + > +#define PPC_INST_NOP 0x6000 > +#define PPC_INST_TRAP0x7fe8 > +#define PPC_INST_BLR 0x4e800020 > + > +static volatile sig_atomic_t fault_code; > +static volatile sig_atomic_t remaining_faults; > +static volatile unsigned int *fault_addr; > +static unsigned long pgsize, numinsns; > +static unsigned int *insns; > +static bool pkeys_supported; > + > +static bool is_fault_expected(int fault_code) > +{ > + if (fault_code == SEGV_ACCERR) > + return true; > + > + /* Assume any pkey error is fine since pkey_exec_prot test covers them > */ > + if (fault_code == SEGV_PKUERR && pkeys_supported) > + return true; > + > + return false; > +} > + > +static void trap_handler(int signum, siginfo_t *sinfo, void *ctx) > +{ > + /* Check if this fault originated from the expected address */ > + if (sinfo->si_addr != (void *)fault_addr) > + sigsafe_err("got a fault for an unexpected address\n"); > + > + _exit(1); > +} > + > +static void segv_handler(int signum, siginfo_t *sinfo, void *ctx) > +{ > + fault_code = sinfo->si_code; > + > + /* Check if this fault originated from the expected address */ > + if (sinfo->si_addr != (void *)fault_addr) { > + sigsafe_err("got a fault for an unexpected address\n"); > + _exit(1); > + } > + > + /* Check if too many faults have occurred for a single test case */ > + if (!remaining_faults) { > + sigsafe_err("got too many faults for the same address\n"); > + _exit(1); > + } > + > + > + /* Restore permissions in order to continue */ > + if (is_fault_expected(fault_code)) { > + if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE | > PROT_EXEC)) { > + sigsafe_err("failed to set access permissions\n"); > + 
_exit(1); > + } > + } else { > + sigsafe_err("got a fault with an unexpected code\n"); > + _exit(1); > + } > + > + remaining_faults--; > +} > + > +static int check_exec_fault(int rights) > +{ > + /* > + * Jump to the executable region. > + * > + * The first iteration also checks if the overwrite of the > + * first instruction word from a trap to a no-op succeeded. > + */ > + fault_code = -1; > + remaining_faults = 0; > + if (!(rights & PROT_EXEC)) > + remaining_faults = 1; > + > + FAIL_IF(mprotect(insns, pgsize, rights) != 0); > + asm volatile("mtctr %0; b
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
Peter Xu writes: > On Wed, Aug 17, 2022 at 11:49:03AM +1000, Alistair Popple wrote: >> >> Peter Xu writes: >> >> > On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: >> >> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> >> > bool anon_exclusive; >> >> > pte_t swp_pte; >> >> > >> >> > + flush_cache_page(vma, addr, pte_pfn(*ptep)); >> >> > + pte = ptep_clear_flush(vma, addr, ptep); >> >> >> >> Although I think it's possible to batch the TLB flushing just before >> >> unlocking PTL. The current code looks correct. >> > >> > If we're with unconditionally ptep_clear_flush(), does it mean we should >> > probably drop the "unmapped" and the last flush_tlb_range() already since >> > they'll be redundant? >> >> This patch does that, unless I missed something? > > Yes it does. Somehow I didn't read into the real v2 patch, sorry! > >> >> > If that'll need to be dropped, it looks indeed better to still keep the >> > batch to me but just move it earlier (before unlock iiuc then it'll be >> > safe), then we can keep using ptep_get_and_clear() afaiu but keep "pte" >> > updated. >> >> I think we would also need to check should_defer_flush(). Looking at >> try_to_unmap_one() there is this comment: >> >> if (should_defer_flush(mm, flags) && !anon_exclusive) { >> /* >> * We clear the PTE but do not flush so >> potentially >> * a remote CPU could still be writing to the >> folio. >> * If the entry was previously clean then the >> * architecture must guarantee that a >> clear->dirty >> * transition on a cached TLB entry is written >> through >> * and traps if the PTE is unmapped. >> */ >> >> And as I understand it we'd need the same guarantee here. Given >> try_to_migrate_one() doesn't do batched TLB flushes either I'd rather >> keep the code as consistent as possible between >> migrate_vma_collect_pmd() and try_to_migrate_one(). I could look at >> introducing TLB flushing for both in some future patch series. 
> > should_defer_flush() is TTU-specific code? I'm not sure, but I think we need the same guarantee here as mentioned in the comment otherwise we wouldn't see a subsequent CPU write that could dirty the PTE after we have cleared it but before the TLB flush. My assumption was should_defer_flush() would ensure we have that guarantee from the architecture, but maybe there are alternate/better ways of enforcing that? > IIUC the caller sets TTU_BATCH_FLUSH showing that tlb can be omitted since > the caller will be responsible for doing it. In migrate_vma_collect_pmd() > iiuc we don't need that hint because it'll be flushed within the same > function but just only after the loop of modifying the ptes. Also it'll be > with the pgtable lock held. Right, but the pgtable lock doesn't protect against HW PTE changes such as setting the dirty bit so we need to ensure the HW does the right thing here and I don't know if all HW does. > Indeed try_to_migrate_one() doesn't do batching either, but IMHO it's just > harder to do due to using the vma walker (e.g., the lock is released in > not_found() implicitly so iiuc it's hard to do tlb flush batching safely in > the loop of page_vma_mapped_walk). Also that's less a concern since the > loop will only operate upon >1 ptes only if it's a thp page mapped in ptes. > OTOH migrate_vma_collect_pmd() operates on all ptes on a pmd always. Yes, I had forgotten we loop over multiple ptes under the same PTL so didn't think removing the batching/range flush would cause all that much of a problem. > No strong opinion anyway, it's just a bit of a pity because fundamentally > this patch is removing the batching tlb flush. I also don't know whether > there'll be observe-able perf degrade for migrate_vma_collect_pmd(), > especially on large machines. I agree it's a pity. OTOH the original code isn't correct, and it's not entirely clear to me that just moving it under the PTL is entirely correct either. 
So unless someone is confident and can convince me that just moving it under the PTL is fine I'd rather stick with this fix which we all agree is at least correct. My primary concern with batching is ensuring a CPU write after clearing a clean PTE but before flushing the TLB does the "right thing" (ie. faults if the PTE is not present). > Thanks,
Re: [6.0-rc1] Kernel crash while running MCE tests
Sachin Sant writes: > Following crash is seen while running powerpc/mce subtest on > a Power10 LPAR. > > 1..1 > # selftests: powerpc/mce: inject-ra-err > [ 155.240591] BUG: Unable to handle kernel data access on read at > 0xc00e00022d55b503 > [ 155.240618] Faulting instruction address: 0xc06f1f0c > [ 155.240627] Oops: Kernel access of bad area, sig: 11 [#1] > [ 155.240633] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries > [ 155.240642] Modules linked in: dm_mod mptcp_diag xsk_diag tcp_diag > udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag > nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack > nf_defrag_ipv6 nf_defrag_ipv4 bonding rfkill tls ip_set nf_tables nfnetlink > sunrpc binfmt_misc pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c > sd_mod t10_pi sr_mod crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg > ibmvscsi ibmveth scsi_transport_srp xts vmx_crypto fuse > [ 155.240750] CPU: 4 PID: 3645 Comm: inject-ra-err Not tainted 6.0.0-rc1 #2 > [ 155.240761] NIP: c06f1f0c LR: c00630d0 CTR: > > [ 155.240768] REGS: c000ff887890 TRAP: 0300 Not tainted (6.0.0-rc1) > [ 155.240776] MSR: 80001003 CR: 48002828 XER: > ^ MMU is off, aka. real mode. 
> [ 155.240792] CFAR: c00630cc DAR: c00e00022d55b503 DSISR: 4000 IRQMASK: 3
> [ 155.240792] GPR00: c00630d0 c000ff887b30 c44afe00 c0116aada818
> [ 155.240792] GPR04: 4d43 0008 c00630d0 004d4249
> [ 155.240792] GPR08: 0001 18022d55b503 a80e 0348
> [ 155.240792] GPR12: c000b700
> [ 155.240792] GPR16:
> [ 155.240792] GPR20: 1b30
> [ 155.240792] GPR24: 7fff8dad 7fff8dacf6d8 7fffd1551e98 1001fce8
> [ 155.240792] GPR28: c0116aada888 c0116aada800 4d43 c0116aada818
> [ 155.240885] NIP [c06f1f0c] __asan_load2+0x5c/0xe0
> [ 155.240898] LR [c00630d0] pseries_errorlog_id+0x20/0x40
> [ 155.240910] Call Trace:
> [ 155.240914] [c000ff887b50] [c00630d0] pseries_errorlog_id+0x20/0x40
> [ 155.240925] [c000ff887b80] [c15595c8] get_pseries_errorlog+0xa8/0x110

get_pseries_errorlog() is marked noinstr. And pseries_errorlog_id() is:

static inline uint16_t pseries_errorlog_id(struct pseries_errorlog *sect)
{
	return be16_to_cpu(sect->id);
}

So I guess the compiler has decided not to inline it (why?!), and it is
not marked noinstr, so it gets KASAN instrumentation, which crashes in
real mode.

We'll have to make sure everything get_pseries_errorlog() calls is
either forced inline, or marked noinstr.

cheers
Re: [PATCH v4 2/2] selftests/powerpc: Add a test for execute-only memory
Le 17/08/2022 à 07:06, Russell Currey a écrit : > From: Nicholas Miehlbradt > > This selftest is designed to cover execute-only protections > on the Radix MMU but will also work with Hash. > > The tests are based on those found in pkey_exec_test with modifications > to use the generic mprotect() instead of the pkey variants. > > Signed-off-by: Nicholas Miehlbradt > Signed-off-by: Russell Currey > --- > v4: new > > tools/testing/selftests/powerpc/mm/Makefile | 3 +- > .../testing/selftests/powerpc/mm/exec_prot.c | 231 ++ There is a lot of code in common with pkey_exec_prot.c Isn't there a way to refactor ? > 2 files changed, 233 insertions(+), 1 deletion(-) > create mode 100644 tools/testing/selftests/powerpc/mm/exec_prot.c > > diff --git a/tools/testing/selftests/powerpc/mm/Makefile > b/tools/testing/selftests/powerpc/mm/Makefile > index 27dc09d0bfee..19dd0b2ea397 100644 > --- a/tools/testing/selftests/powerpc/mm/Makefile > +++ b/tools/testing/selftests/powerpc/mm/Makefile > @@ -3,7 +3,7 @@ noarg: > $(MAKE) -C ../ > > TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors > wild_bctr \ > - large_vm_fork_separation bad_accesses pkey_exec_prot \ > + large_vm_fork_separation bad_accesses exec_prot > pkey_exec_prot \ > pkey_siginfo stack_expansion_signal stack_expansion_ldst \ > large_vm_gpr_corruption > TEST_PROGS := stress_code_patching.sh > @@ -22,6 +22,7 @@ $(OUTPUT)/wild_bctr: CFLAGS += -m64 > $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64 > $(OUTPUT)/large_vm_gpr_corruption: CFLAGS += -m64 > $(OUTPUT)/bad_accesses: CFLAGS += -m64 > +$(OUTPUT)/exec_prot: CFLAGS += -m64 > $(OUTPUT)/pkey_exec_prot: CFLAGS += -m64 > $(OUTPUT)/pkey_siginfo: CFLAGS += -m64 > > diff --git a/tools/testing/selftests/powerpc/mm/exec_prot.c > b/tools/testing/selftests/powerpc/mm/exec_prot.c > new file mode 100644 > index ..db75b2225de1 > --- /dev/null > +++ b/tools/testing/selftests/powerpc/mm/exec_prot.c > @@ -0,0 +1,231 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > 
+/* > + * Copyright 2022, Nicholas Miehlbradt, IBM Corporation > + * based on pkey_exec_prot.c > + * > + * Test if applying execute protection on pages works as expected. > + */ > + > +#define _GNU_SOURCE > +#include > +#include > +#include > +#include > + > +#include > +#include > + > +#include "pkeys.h" > + > + > +#define PPC_INST_NOP 0x6000 > +#define PPC_INST_TRAP0x7fe8 > +#define PPC_INST_BLR 0x4e800020 > + > +static volatile sig_atomic_t fault_code; > +static volatile sig_atomic_t remaining_faults; > +static volatile unsigned int *fault_addr; > +static unsigned long pgsize, numinsns; > +static unsigned int *insns; > +static bool pkeys_supported; > + > +static bool is_fault_expected(int fault_code) > +{ > + if (fault_code == SEGV_ACCERR) > + return true; > + > + /* Assume any pkey error is fine since pkey_exec_prot test covers them > */ > + if (fault_code == SEGV_PKUERR && pkeys_supported) > + return true; > + > + return false; > +} > + > +static void trap_handler(int signum, siginfo_t *sinfo, void *ctx) > +{ > + /* Check if this fault originated from the expected address */ > + if (sinfo->si_addr != (void *)fault_addr) > + sigsafe_err("got a fault for an unexpected address\n"); > + > + _exit(1); > +} > + > +static void segv_handler(int signum, siginfo_t *sinfo, void *ctx) > +{ > + fault_code = sinfo->si_code; > + > + /* Check if this fault originated from the expected address */ > + if (sinfo->si_addr != (void *)fault_addr) { > + sigsafe_err("got a fault for an unexpected address\n"); > + _exit(1); > + } > + > + /* Check if too many faults have occurred for a single test case */ > + if (!remaining_faults) { > + sigsafe_err("got too many faults for the same address\n"); > + _exit(1); > + } > + > + > + /* Restore permissions in order to continue */ > + if (is_fault_expected(fault_code)) { > + if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE | > PROT_EXEC)) { > + sigsafe_err("failed to set access permissions\n"); > + _exit(1); > + } > + } else { > + 
sigsafe_err("got a fault with an unexpected code\n"); > + _exit(1); > + } > + > + remaining_faults--; > +} > + > +static int check_exec_fault(int rights) > +{ > + /* > + * Jump to the executable region. > + * > + * The first iteration also checks if the overwrite of the > + * first instruction word from a trap to a no-op succeeded. > + */ > + fault_code = -1; > + remaining_faults = 0; > + if (!(rights & PROT_EXEC)) > + remaining_faults = 1; > + > + FAIL_IF(mprotect(insns, pgsize, rights) != 0); > + asm volatile("mtctr %0; bctr
[PATCH v4 2/2] selftests/powerpc: Add a test for execute-only memory
From: Nicholas Miehlbradt This selftest is designed to cover execute-only protections on the Radix MMU but will also work with Hash. The tests are based on those found in pkey_exec_test with modifications to use the generic mprotect() instead of the pkey variants. Signed-off-by: Nicholas Miehlbradt Signed-off-by: Russell Currey --- v4: new tools/testing/selftests/powerpc/mm/Makefile | 3 +- .../testing/selftests/powerpc/mm/exec_prot.c | 231 ++ 2 files changed, 233 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/powerpc/mm/exec_prot.c diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile index 27dc09d0bfee..19dd0b2ea397 100644 --- a/tools/testing/selftests/powerpc/mm/Makefile +++ b/tools/testing/selftests/powerpc/mm/Makefile @@ -3,7 +3,7 @@ noarg: $(MAKE) -C ../ TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors wild_bctr \ - large_vm_fork_separation bad_accesses pkey_exec_prot \ + large_vm_fork_separation bad_accesses exec_prot pkey_exec_prot \ pkey_siginfo stack_expansion_signal stack_expansion_ldst \ large_vm_gpr_corruption TEST_PROGS := stress_code_patching.sh @@ -22,6 +22,7 @@ $(OUTPUT)/wild_bctr: CFLAGS += -m64 $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64 $(OUTPUT)/large_vm_gpr_corruption: CFLAGS += -m64 $(OUTPUT)/bad_accesses: CFLAGS += -m64 +$(OUTPUT)/exec_prot: CFLAGS += -m64 $(OUTPUT)/pkey_exec_prot: CFLAGS += -m64 $(OUTPUT)/pkey_siginfo: CFLAGS += -m64 diff --git a/tools/testing/selftests/powerpc/mm/exec_prot.c b/tools/testing/selftests/powerpc/mm/exec_prot.c new file mode 100644 index ..db75b2225de1 --- /dev/null +++ b/tools/testing/selftests/powerpc/mm/exec_prot.c @@ -0,0 +1,231 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright 2022, Nicholas Miehlbradt, IBM Corporation + * based on pkey_exec_prot.c + * + * Test if applying execute protection on pages works as expected. 
+ */ + +#define _GNU_SOURCE +#include +#include +#include +#include + +#include +#include + +#include "pkeys.h" + + +#define PPC_INST_NOP 0x6000 +#define PPC_INST_TRAP 0x7fe8 +#define PPC_INST_BLR 0x4e800020 + +static volatile sig_atomic_t fault_code; +static volatile sig_atomic_t remaining_faults; +static volatile unsigned int *fault_addr; +static unsigned long pgsize, numinsns; +static unsigned int *insns; +static bool pkeys_supported; + +static bool is_fault_expected(int fault_code) +{ + if (fault_code == SEGV_ACCERR) + return true; + + /* Assume any pkey error is fine since pkey_exec_prot test covers them */ + if (fault_code == SEGV_PKUERR && pkeys_supported) + return true; + + return false; +} + +static void trap_handler(int signum, siginfo_t *sinfo, void *ctx) +{ + /* Check if this fault originated from the expected address */ + if (sinfo->si_addr != (void *)fault_addr) + sigsafe_err("got a fault for an unexpected address\n"); + + _exit(1); +} + +static void segv_handler(int signum, siginfo_t *sinfo, void *ctx) +{ + fault_code = sinfo->si_code; + + /* Check if this fault originated from the expected address */ + if (sinfo->si_addr != (void *)fault_addr) { + sigsafe_err("got a fault for an unexpected address\n"); + _exit(1); + } + + /* Check if too many faults have occurred for a single test case */ + if (!remaining_faults) { + sigsafe_err("got too many faults for the same address\n"); + _exit(1); + } + + + /* Restore permissions in order to continue */ + if (is_fault_expected(fault_code)) { + if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE | PROT_EXEC)) { + sigsafe_err("failed to set access permissions\n"); + _exit(1); + } + } else { + sigsafe_err("got a fault with an unexpected code\n"); + _exit(1); + } + + remaining_faults--; +} + +static int check_exec_fault(int rights) +{ + /* +* Jump to the executable region. +* +* The first iteration also checks if the overwrite of the +* first instruction word from a trap to a no-op succeeded. 
+*/ + fault_code = -1; + remaining_faults = 0; + if (!(rights & PROT_EXEC)) + remaining_faults = 1; + + FAIL_IF(mprotect(insns, pgsize, rights) != 0); + asm volatile("mtctr %0; bctrl" : : "r"(insns)); + + FAIL_IF(remaining_faults != 0); + if (!(rights & PROT_EXEC)) + FAIL_IF(!is_fault_expected(fault_code)); + + return 0; +} + +static int test(void) +{ + struct sigaction segv_act, trap_act; + int i; + + /* Skip the test if the CPU doesn't support Radix */ + SKIP_IF(!have_hwcap2
[PATCH v4 1/2] powerpc/mm: Support execute-only memory on the Radix MMU
Add support for execute-only memory (XOM) for the Radix MMU by using an
execute-only mapping, as opposed to the RX mapping used by powerpc's
other MMUs.

The Hash MMU already supports XOM through the execute-only pkey, which
is a separate mechanism shared with x86. A PROT_EXEC-only mapping will
map to RX, and then the pkey will be applied on top of it.

mmap() and mprotect() consumers in userspace should observe the same
behaviour on Hash and Radix despite the differences in implementation.

Replacing the vma_is_accessible() check in access_error() with a read
check should be functionally equivalent for non-Radix MMUs, since it
follows write and execute checks. For Radix, the change enables
detecting faults on execute-only mappings where vma_is_accessible()
would return true.

Signed-off-by: Russell Currey
---
v4: Reword commit message, add changes suggested by Christophe and Aneesh

 arch/powerpc/include/asm/book3s/64/pgtable.h |  2 ++
 arch/powerpc/mm/book3s64/pgtable.c           | 11 +++++++++--
 arch/powerpc/mm/fault.c                      |  6 +++++-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 392ff48f77df..486902aff040 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -151,6 +151,8 @@
 #define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
 #define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_READ)
 #define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
+/* Radix only, Hash uses PAGE_READONLY_X + execute-only pkey instead */
+#define PAGE_EXECONLY	__pgprot(_PAGE_BASE | _PAGE_EXEC)
 
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 7b9966402b25..f6151a589298 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -553,8 +553,15 @@ EXPORT_SYMBOL_GPL(memremap_compat_align);
 
 pgprot_t vm_get_page_prot(unsigned long vm_flags)
 {
-	unsigned long prot = pgprot_val(protection_map[vm_flags &
-					(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
+	unsigned long prot;
+
+	/* Radix supports execute-only, but protection_map maps X -> RX */
+	if (radix_enabled() && ((vm_flags & VM_ACCESS_FLAGS) == VM_EXEC)) {
+		prot = pgprot_val(PAGE_EXECONLY);
+	} else {
+		prot = pgprot_val(protection_map[vm_flags &
+						 (VM_ACCESS_FLAGS | VM_SHARED)]);
+	}
 
 	if (vm_flags & VM_SAO)
 		prot |= _PAGE_SAO;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 014005428687..1566804e4b3d 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -270,7 +270,11 @@ static bool access_error(bool is_write, bool is_exec, struct vm_area_struct *vma
 		return false;
 	}
 
-	if (unlikely(!vma_is_accessible(vma)))
+	/*
+	 * Check for a read fault. This could be caused by a read on an
+	 * inaccessible page (i.e. PROT_NONE), or a Radix MMU execute-only page.
+	 */
+	if (unlikely(!(vma->vm_flags & VM_READ)))
 		return true;
 	/*
 	 * We should ideally do the vma pkey access check here. But in the
-- 
2.37.2
Re: [PATCH] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
On Tue, 2022-08-16 at 12:25 -0700, Kees Cook wrote:
> Applied to for-next/hardening, thanks!
>
> [1/1] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
>       https://git.kernel.org/kees/c/2d08c71d2c79

Thanks Kees! Can we make sure this lands in rc2?

--
Andrew Donnellan              OzLabs, ADL Canberra
a...@linux.ibm.com            IBM Australia Limited
Re: [PATCH] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
On Tue, 2022-08-16 at 05:14 -0700, Yury Norov wrote:
> On Tue, Aug 16, 2022 at 03:17:20PM +1000, Andrew Donnellan wrote:
> > Commit 36d4b36b6959 ("lib/nodemask: inline next_node_in() and
> > node_random()") refactored some code by moving node_random() from
> > lib/nodemask.c to include/linux/nodemask.h, thus requiring nodemask.h
> > to include random.h, which conditionally defines add_latent_entropy()
> > depending on whether the macro LATENT_ENTROPY_PLUGIN is defined.
> >
> > This broke the build on powerpc, where nodemask.h is indirectly
> > included in arch/powerpc/kernel/prom_init.c, part of the early boot
> > machinery that is excluded from the latent entropy plugin using
> > DISABLE_LATENT_ENTROPY_PLUGIN. It turns out that while we add a gcc
> > flag to disable the actual plugin, we don't undefine
> > LATENT_ENTROPY_PLUGIN.
> >
> > This leads to the following:
> >
> >     CC      arch/powerpc/kernel/prom_init.o
> >     In file included from ./include/linux/nodemask.h:97,
> >                      from ./include/linux/mmzone.h:17,
> >                      from ./include/linux/gfp.h:7,
> >                      from ./include/linux/xarray.h:15,
>
> As a side note, xarray can go with gfp_types.h instead of gfp.h

Indeed, I just saw your patch to fix this.

> > Fixes: 36d4b36b6959 ("lib/nodemask: inline next_node_in() and
> > node_random()")
>
> I think it rather fixes 38addce8b600ca33 ("gcc-plugins: Add
> latent_entropy plugin").

You're right, I was in a rush and should have tagged that appropriately.

> For the rest,
> Reviewed-by: Yury Norov

Thanks!

--
Andrew Donnellan              OzLabs, ADL Canberra
a...@linux.ibm.com            IBM Australia Limited
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
On Wed, Aug 17, 2022 at 11:49:03AM +1000, Alistair Popple wrote: > > Peter Xu writes: > > > On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: > >> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > >> > bool anon_exclusive; > >> > pte_t swp_pte; > >> > > >> > + flush_cache_page(vma, addr, pte_pfn(*ptep)); > >> > + pte = ptep_clear_flush(vma, addr, ptep); > >> > >> Although I think it's possible to batch the TLB flushing just before > >> unlocking PTL. The current code looks correct. > > > > If we're with unconditionally ptep_clear_flush(), does it mean we should > > probably drop the "unmapped" and the last flush_tlb_range() already since > > they'll be redundant? > > This patch does that, unless I missed something? Yes it does. Somehow I didn't read into the real v2 patch, sorry! > > > If that'll need to be dropped, it looks indeed better to still keep the > > batch to me but just move it earlier (before unlock iiuc then it'll be > > safe), then we can keep using ptep_get_and_clear() afaiu but keep "pte" > > updated. > > I think we would also need to check should_defer_flush(). Looking at > try_to_unmap_one() there is this comment: > > if (should_defer_flush(mm, flags) && !anon_exclusive) { > /* >* We clear the PTE but do not flush so > potentially >* a remote CPU could still be writing to the > folio. >* If the entry was previously clean then the >* architecture must guarantee that a > clear->dirty >* transition on a cached TLB entry is written > through >* and traps if the PTE is unmapped. >*/ > > And as I understand it we'd need the same guarantee here. Given > try_to_migrate_one() doesn't do batched TLB flushes either I'd rather > keep the code as consistent as possible between > migrate_vma_collect_pmd() and try_to_migrate_one(). I could look at > introducing TLB flushing for both in some future patch series. should_defer_flush() is TTU-specific code? 
IIUC the caller sets TTU_BATCH_FLUSH showing that tlb can be omitted since the caller will be responsible for doing it. In migrate_vma_collect_pmd() iiuc we don't need that hint because it'll be flushed within the same function but just only after the loop of modifying the ptes. Also it'll be with the pgtable lock held. Indeed try_to_migrate_one() doesn't do batching either, but IMHO it's just harder to do due to using the vma walker (e.g., the lock is released in not_found() implicitly so iiuc it's hard to do tlb flush batching safely in the loop of page_vma_mapped_walk). Also that's less a concern since the loop will only operate upon >1 ptes only if it's a thp page mapped in ptes. OTOH migrate_vma_collect_pmd() operates on all ptes on a pmd always. No strong opinion anyway, it's just a bit of a pity because fundamentally this patch is removing the batching tlb flush. I also don't know whether there'll be observe-able perf degrade for migrate_vma_collect_pmd(), especially on large machines. Thanks, -- Peter Xu
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
Peter Xu writes: > On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: >> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> > bool anon_exclusive; >> > pte_t swp_pte; >> > >> > + flush_cache_page(vma, addr, pte_pfn(*ptep)); >> > + pte = ptep_clear_flush(vma, addr, ptep); >> >> Although I think it's possible to batch the TLB flushing just before >> unlocking PTL. The current code looks correct. > > If we're with unconditionally ptep_clear_flush(), does it mean we should > probably drop the "unmapped" and the last flush_tlb_range() already since > they'll be redundant? This patch does that, unless I missed something? > If that'll need to be dropped, it looks indeed better to still keep the > batch to me but just move it earlier (before unlock iiuc then it'll be > safe), then we can keep using ptep_get_and_clear() afaiu but keep "pte" > updated. I think we would also need to check should_defer_flush(). Looking at try_to_unmap_one() there is this comment: if (should_defer_flush(mm, flags) && !anon_exclusive) { /* * We clear the PTE but do not flush so potentially * a remote CPU could still be writing to the folio. * If the entry was previously clean then the * architecture must guarantee that a clear->dirty * transition on a cached TLB entry is written through * and traps if the PTE is unmapped. */ And as I understand it we'd need the same guarantee here. Given try_to_migrate_one() doesn't do batched TLB flushes either I'd rather keep the code as consistent as possible between migrate_vma_collect_pmd() and try_to_migrate_one(). I could look at introducing TLB flushing for both in some future patch series. - Alistair > Thanks,
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
huang ying writes: > On Tue, Aug 16, 2022 at 3:39 PM Alistair Popple wrote: >> >> migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that >> installs migration entries directly if it can lock the migrating page. >> When removing a dirty pte the dirty bit is supposed to be carried over >> to the underlying page to prevent it being lost. >> >> Currently migrate_vma_*() can only be used for private anonymous >> mappings. That means loss of the dirty bit usually doesn't result in >> data loss because these pages are typically not file-backed. However >> pages may be backed by swap storage which can result in data loss if an >> attempt is made to migrate a dirty page that doesn't yet have the >> PageDirty flag set. >> >> In this case migration will fail due to unexpected references but the >> dirty pte bit will be lost. If the page is subsequently reclaimed data >> won't be written back to swap storage as it is considered uptodate, >> resulting in data loss if the page is subsequently accessed. >> >> Prevent this by copying the dirty bit to the page when removing the pte >> to match what try_to_migrate_one() does. >> >> Signed-off-by: Alistair Popple >> Acked-by: Peter Xu >> Reported-by: Huang Ying >> Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while >> collecting pages") >> Cc: sta...@vger.kernel.org >> >> --- >> >> Changes for v2: >> >> - Fixed up Reported-by tag. >> - Added Peter's Acked-by. >> - Atomically read and clear the pte to prevent the dirty bit getting >>set after reading it. 
>> - Added fixes tag >> --- >> mm/migrate_device.c | 21 - >> 1 file changed, 8 insertions(+), 13 deletions(-) >> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c >> index 27fb37d..e2d09e5 100644 >> --- a/mm/migrate_device.c >> +++ b/mm/migrate_device.c >> @@ -7,6 +7,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> #include >> @@ -61,7 +62,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> struct migrate_vma *migrate = walk->private; >> struct vm_area_struct *vma = walk->vma; >> struct mm_struct *mm = vma->vm_mm; >> - unsigned long addr = start, unmapped = 0; >> + unsigned long addr = start; >> spinlock_t *ptl; >> pte_t *ptep; >> >> @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> bool anon_exclusive; >> pte_t swp_pte; >> >> + flush_cache_page(vma, addr, pte_pfn(*ptep)); >> + pte = ptep_clear_flush(vma, addr, ptep); > > Although I think it's possible to batch the TLB flushing just before > unlocking PTL. The current code looks correct. I think you might be right but I'd rather deal with batch TLB flushing as a separate change that implements it for normal migration as well given we don't seem to do it there either. > Reviewed-by: "Huang, Ying" Thanks. > Best Regards, > Huang, Ying > >> anon_exclusive = PageAnon(page) && >> PageAnonExclusive(page); >> if (anon_exclusive) { >> - flush_cache_page(vma, addr, pte_pfn(*ptep)); >> - ptep_clear_flush(vma, addr, ptep); >> - >> if (page_try_share_anon_rmap(page)) { >> set_pte_at(mm, addr, ptep, pte); >> unlock_page(page); >> @@ -205,12 +205,14 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> mpfn = 0; >> goto next; >> } >> - } else { >> - ptep_get_and_clear(mm, addr, ptep); >> } >> >> migrate->cpages++; >> >> + /* Set the dirty flag on the folio now the pte is >> gone. 
*/ >> + if (pte_dirty(pte)) >> + folio_mark_dirty(page_folio(page)); >> + >> /* Setup special migration page table entry */ >> if (mpfn & MIGRATE_PFN_WRITE) >> entry = make_writable_migration_entry( >> @@ -242,9 +244,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> */ >> page_remove_rmap(page, vma, false); >> put_page(page); >> - >> - if (pte_present(pte)) >> - unmapped++; >> } else { >> put_page(page); >> mpfn = 0; >> @@ -257,10 +256,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> arch_leave_lazy_mmu_mode(); >> pte_unmap_unlock(ptep - 1, ptl); >> >> - /* Only flush the TLB if we actually modified any entries */ >>
[Bug 216367] Kernel 6.0-rc1 fails to build with GCC_PLUGIN_LATENT_ENTROPY=y (PowerMac G5 11,2)
https://bugzilla.kernel.org/show_bug.cgi?id=216367 --- Comment #3 from Andrew Donnellan (a...@linux.ibm.com) --- Patch has been taken via Kees Cook's hardening tree: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/hardening&id=012e8d2034f1bda8863435cd589636e618d6a659 -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
[PATCH] KVM: PPC: Book3S HV: Fix decrementer migration
We used to have a workaround[1] for a hang during migration that was made ineffective when we converted the decrementer expiry to be relative to guest timebase. The point of the workaround was that in the absence of an explicit decrementer expiry value provided by userspace during migration, KVM needs to initialize dec_expires to a value that will result in an expired decrementer after subtracting the current guest timebase. That stops the vcpu from hanging after migration due to a decrementer that's too large. If the dec_expires is now relative to guest timebase, its initialization needs to be guest timebase-relative as well, otherwise we end up with a decrementer expiry that is still larger than the guest timebase. 1- https://git.kernel.org/torvalds/c/5855564c8ab2 Fixes: 3c1a4322bba7 ("KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase") Signed-off-by: Fabiano Rosas --- arch/powerpc/kvm/book3s_hv.c | 18 -- arch/powerpc/kvm/powerpc.c | 1 - 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 57d0835e56fd..917abda9e5ce 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2517,10 +2517,24 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id, r = set_vpa(vcpu, &vcpu->arch.dtl, addr, len); break; case KVM_REG_PPC_TB_OFFSET: + { /* round up to multiple of 2^24 */ - vcpu->arch.vcore->tb_offset = - ALIGN(set_reg_val(id, *val), 1UL << 24); + u64 tb_offset = ALIGN(set_reg_val(id, *val), 1UL << 24); + + /* +* Now that we know the timebase offset, update the +* decrementer expiry with a guest timebase value. If +* the userspace does not set DEC_EXPIRY, this ensures +* a migrated vcpu at least starts with an expired +* decrementer, which is better than a large one that +* causes a hang. 
+*/ + if (!vcpu->arch.dec_expires && tb_offset) + vcpu->arch.dec_expires = get_tb() + tb_offset; + + vcpu->arch.vcore->tb_offset = tb_offset; break; + } case KVM_REG_PPC_LPCR: kvmppc_set_lpcr(vcpu, set_reg_val(id, *val), true); break; diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index fb1490761c87..757491dd6b7b 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -786,7 +786,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) hrtimer_init(&vcpu->arch.dec_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS); vcpu->arch.dec_timer.function = kvmppc_decrementer_wakeup; - vcpu->arch.dec_expires = get_tb(); #ifdef CONFIG_KVM_EXIT_TIMING mutex_init(&vcpu->arch.exit_timing_lock); -- 2.35.3
[PATCH 6/8] serial: Make ->set_termios() old ktermios const
There should be no reason to adjust old ktermios which is going to get discarded anyway. Signed-off-by: Ilpo Järvinen --- drivers/tty/serial/21285.c | 2 +- drivers/tty/serial/8250/8250_bcm7271.c | 2 +- drivers/tty/serial/8250/8250_dw.c | 2 +- drivers/tty/serial/8250/8250_dwlib.c| 3 ++- drivers/tty/serial/8250/8250_dwlib.h| 2 +- drivers/tty/serial/8250/8250_fintek.c | 2 +- drivers/tty/serial/8250/8250_lpss.c | 2 +- drivers/tty/serial/8250/8250_mid.c | 5 ++--- drivers/tty/serial/8250/8250_mtk.c | 2 +- drivers/tty/serial/8250/8250_omap.c | 2 +- drivers/tty/serial/8250/8250_port.c | 6 +++--- drivers/tty/serial/altera_jtaguart.c| 4 ++-- drivers/tty/serial/altera_uart.c| 2 +- drivers/tty/serial/amba-pl010.c | 2 +- drivers/tty/serial/amba-pl011.c | 4 ++-- drivers/tty/serial/apbuart.c| 2 +- drivers/tty/serial/ar933x_uart.c| 2 +- drivers/tty/serial/arc_uart.c | 2 +- drivers/tty/serial/atmel_serial.c | 5 +++-- drivers/tty/serial/bcm63xx_uart.c | 5 ++--- drivers/tty/serial/clps711x.c | 2 +- drivers/tty/serial/cpm_uart/cpm_uart_core.c | 2 +- drivers/tty/serial/digicolor-usart.c| 2 +- drivers/tty/serial/dz.c | 2 +- drivers/tty/serial/fsl_linflexuart.c| 2 +- drivers/tty/serial/fsl_lpuart.c | 4 ++-- drivers/tty/serial/icom.c | 5 ++--- drivers/tty/serial/imx.c| 2 +- drivers/tty/serial/ip22zilog.c | 2 +- drivers/tty/serial/jsm/jsm_tty.c| 4 ++-- drivers/tty/serial/lantiq.c | 4 ++-- drivers/tty/serial/liteuart.c | 2 +- drivers/tty/serial/lpc32xx_hs.c | 2 +- drivers/tty/serial/max3100.c| 2 +- drivers/tty/serial/max310x.c| 2 +- drivers/tty/serial/mcf.c| 2 +- drivers/tty/serial/men_z135_uart.c | 4 ++-- drivers/tty/serial/meson_uart.c | 2 +- drivers/tty/serial/milbeaut_usio.c | 3 ++- drivers/tty/serial/mpc52xx_uart.c | 12 ++-- drivers/tty/serial/mps2-uart.c | 2 +- drivers/tty/serial/msm_serial.c | 2 +- drivers/tty/serial/mux.c| 2 +- drivers/tty/serial/mvebu-uart.c | 2 +- drivers/tty/serial/mxs-auart.c | 2 +- drivers/tty/serial/omap-serial.c| 2 +- drivers/tty/serial/owl-uart.c | 2 +- 
drivers/tty/serial/pch_uart.c | 3 ++- drivers/tty/serial/pic32_uart.c | 2 +- drivers/tty/serial/pmac_zilog.c | 4 ++-- drivers/tty/serial/pxa.c| 2 +- drivers/tty/serial/qcom_geni_serial.c | 3 ++- drivers/tty/serial/rda-uart.c | 2 +- drivers/tty/serial/rp2.c| 5 ++--- drivers/tty/serial/sa1100.c | 2 +- drivers/tty/serial/samsung_tty.c| 2 +- drivers/tty/serial/sb1250-duart.c | 2 +- drivers/tty/serial/sc16is7xx.c | 2 +- drivers/tty/serial/sccnxp.c | 3 ++- drivers/tty/serial/serial-tegra.c | 3 ++- drivers/tty/serial/serial_core.c| 2 +- drivers/tty/serial/serial_txx9.c| 2 +- drivers/tty/serial/sh-sci.c | 2 +- drivers/tty/serial/sifive.c | 2 +- drivers/tty/serial/sprd_serial.c| 5 ++--- drivers/tty/serial/st-asc.c | 2 +- drivers/tty/serial/stm32-usart.c| 2 +- drivers/tty/serial/sunhv.c | 2 +- drivers/tty/serial/sunplus-uart.c | 2 +- drivers/tty/serial/sunsab.c | 2 +- drivers/tty/serial/sunsu.c | 2 +- drivers/tty/serial/sunzilog.c | 2 +- drivers/tty/serial/tegra-tcu.c | 2 +- drivers/tty/serial/timbuart.c | 4 ++-- drivers/tty/serial/uartlite.c | 5 +++-- drivers/tty/serial/ucc_uart.c | 3 ++- drivers/tty/serial/vt8500_serial.c | 2 +- drivers/tty/serial/xilinx_uartps.c | 3 ++- drivers/tty/serial/zs.c | 2 +- drivers/tty/tty_ioctl.c | 2 +- include/linux/serial_8250.h | 4 ++-- include/linux/serial_core.h | 6 +++--- 82 files changed, 117 insertions(+), 112 deletions(-) diff --git a/drivers/tty/serial/21285.c b/drivers/tty/serial/21285.c index 7520cc02fd4d..2f17bf4b221e 100644 --- a/drivers/tty/serial/21285.c +++ b/drivers/tty/serial/21285.c @@ -243,7 +243,7 @@ static void serial21285_shutdown(struct uart_port *port) static void serial21285_set_termios(struct uart_port *port, struct ktermios *termios, - struct ktermios *old) +
Re: [PATCH v2 1/4] Make place for common balloon code
Hello, On 16.08.22 12:49, Greg Kroah-Hartman wrote: On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote: rename include/linux/{balloon_compaction.h => balloon_common.h} (99%) Why rename the .h file? It still handles the "balloon compaction" logic. The file contains code that is common to balloon drivers; compaction is only part of it, and this series adds more code, since it was suggested to use this file for such common code. I find that "common" is a better name for it, hence the rename. I can easily drop the rename in the next iteration if you suggest so. -- Regards, Alexander Atanasov
Re: [PATCH] arch: mm: rename FORCE_MAX_ZONEORDER to ARCH_FORCE_MAX_ORDER
For LoongArch: Acked-by: Huacai Chen On Tue, Aug 16, 2022 at 6:30 PM Catalin Marinas wrote: > > On Mon, Aug 15, 2022 at 10:39:59AM -0400, Zi Yan wrote: > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > > index 571cc234d0b3..c6fcd8746f60 100644 > > --- a/arch/arm64/Kconfig > > +++ b/arch/arm64/Kconfig > > @@ -1401,7 +1401,7 @@ config XEN > > help > > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64. > > > > -config FORCE_MAX_ZONEORDER > > +config ARCH_FORCE_MAX_ORDER > > int > > default "14" if ARM64_64K_PAGES > > default "12" if ARM64_16K_PAGES > > For arm64: > > Acked-by: Catalin Marinas
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote: > > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > > bool anon_exclusive; > > pte_t swp_pte; > > > > + flush_cache_page(vma, addr, pte_pfn(*ptep)); > > + pte = ptep_clear_flush(vma, addr, ptep); > > Although I think it's possible to batch the TLB flushing just before > unlocking PTL. The current code looks correct. If we're with unconditionally ptep_clear_flush(), does it mean we should probably drop the "unmapped" and the last flush_tlb_range() already since they'll be redundant? If that'll need to be dropped, it looks indeed better to still keep the batch to me but just move it earlier (before unlock iiuc then it'll be safe), then we can keep using ptep_get_and_clear() afaiu but keep "pte" updated. Thanks, -- Peter Xu
Re: [PATCH] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
On Tue, 16 Aug 2022 15:17:20 +1000, Andrew Donnellan wrote: > Commit 36d4b36b6959 ("lib/nodemask: inline next_node_in() and > node_random()") refactored some code by moving node_random() from > lib/nodemask.c to include/linux/nodemask.h, thus requiring nodemask.h to > include random.h, which conditionally defines add_latent_entropy() > depending on whether the macro LATENT_ENTROPY_PLUGIN is defined. > > This broke the build on powerpc, where nodemask.h is indirectly included > in arch/powerpc/kernel/prom_init.c, part of the early boot machinery that > is excluded from the latent entropy plugin using > DISABLE_LATENT_ENTROPY_PLUGIN. It turns out that while we add a gcc flag > to disable the actual plugin, we don't undefine LATENT_ENTROPY_PLUGIN. > > [...] Applied to for-next/hardening, thanks! [1/1] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file https://git.kernel.org/kees/c/2d08c71d2c79 -- Kees Cook
Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
Hi Steven, Steven Rostedt wrote: On Wed, 27 Apr 2022 15:01:20 +0530 "Naveen N. Rao" wrote: This solves a build issue on powerpc with binutils v2.36 and newer [1]. Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols") [2], binutils started dropping section symbols that it thought were unused. Due to this, in certain scenarios, recordmcount is unable to find a non-weak symbol to generate a relocation record against. Clang integrated assembler is also aggressive in dropping section symbols [3]. In the past, there have been various workarounds to address this. See commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch: https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n@linux.vnet.ibm.com/T/#u Fix this issue by using the weak symbol in the relocation record. This can result in duplicate locations in the mcount table if those weak functions are overridden, so have ftrace skip duplicate entries. Objtool already follows this approach, so patch 2 updates recordmcount to do the same. Patch 1 updates ftrace to skip duplicate entries. - Naveen [1] https://github.com/linuxppc/issues/issues/388 [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 [3] https://github.com/ClangBuiltLinux/linux/issues/981 There's been work to handle weak functions, but I'm not sure that work handled the issues here. Are these patches still needed, or was there another workaround to handle the problems this addressed? I'm afraid these patches are still needed to address issues in recordmcount. 
I submitted patches to remove use of weak functions in the kexec subsystem, but those have only enabled building ppc64le defconfig without errors: https://lore.kernel.org/all/20220519091237.676736-1-naveen.n@linux.vnet.ibm.com/ https://lore.kernel.org/all/cover.1656659357.git.naveen.n@linux.vnet.ibm.com/ The patch adding support for FTRACE_MCOUNT_MAX_OFFSET to powerpc only helps ignore weak functions during runtime: https://lore.kernel.org/all/20220809105425.424045-1-naveen.n@linux.vnet.ibm.com/ We still see errors from recordmcount when trying to build certain powerpc configs. We are pursuing support for objtool, which doesn't have the same issues: https://lore.kernel.org/all/20220808114908.240813-1...@linux.ibm.com/ - Naveen
[6.0-rc1] Kernel crash while running MCE tests
Following crash is seen while running powerpc/mce subtest on a Power10 LPAR. 1..1 # selftests: powerpc/mce: inject-ra-err [ 155.240591] BUG: Unable to handle kernel data access on read at 0xc00e00022d55b503 [ 155.240618] Faulting instruction address: 0xc06f1f0c [ 155.240627] Oops: Kernel access of bad area, sig: 11 [#1] [ 155.240633] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries [ 155.240642] Modules linked in: dm_mod mptcp_diag xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding rfkill tls ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sr_mod crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp xts vmx_crypto fuse [ 155.240750] CPU: 4 PID: 3645 Comm: inject-ra-err Not tainted 6.0.0-rc1 #2 [ 155.240761] NIP: c06f1f0c LR: c00630d0 CTR: [ 155.240768] REGS: c000ff887890 TRAP: 0300 Not tainted (6.0.0-rc1) [ 155.240776] MSR: 80001003 CR: 48002828 XER: [ 155.240792] CFAR: c00630cc DAR: c00e00022d55b503 DSISR: 4000 IRQMASK: 3 [ 155.240792] GPR00: c00630d0 c000ff887b30 c44afe00 c0116aada818 [ 155.240792] GPR04: 4d43 0008 c00630d0 004d4249 [ 155.240792] GPR08: 0001 18022d55b503 a80e 0348 [ 155.240792] GPR12: c000b700 [ 155.240792] GPR16: [ 155.240792] GPR20: 1b30 [ 155.240792] GPR24: 7fff8dad 7fff8dacf6d8 7fffd1551e98 1001fce8 [ 155.240792] GPR28: c0116aada888 c0116aada800 4d43 c0116aada818 [ 155.240885] NIP [c06f1f0c] __asan_load2+0x5c/0xe0 [ 155.240898] LR [c00630d0] pseries_errorlog_id+0x20/0x40 [ 155.240910] Call Trace: [ 155.240914] [c000ff887b50] [c00630d0] pseries_errorlog_id+0x20/0x40 [ 155.240925] [c000ff887b80] [c15595c8] get_pseries_errorlog+0xa8/0x110 [ 155.240937] [c000ff887bc0] [c014e080] pseries_machine_check_realmode+0x140/0x2d0 [ 
155.240949] [c000ff887ca0] [c005e5b8] machine_check_early+0x68/0xc0 [ 155.240959] [c000ff887cf0] [c0008364] machine_check_early_common+0x134/0x1f8 [ 155.240971] --- interrupt: 200 at 0x1e48 [ 155.240978] NIP: 1e48 LR: 1e40 CTR: [ 155.240984] REGS: c000ff887d60 TRAP: 0200 Not tainted (6.0.0-rc1) [ 155.240991] MSR: 82a0f033 CR: 82002822 XER: [ 155.241015] CFAR: 021c DAR: 7fff8da3 DSISR: 0208 IRQMASK: 0 [ 155.241015] GPR00: 1e40 7fffd15517b0 10027f00 7fff8da3 [ 155.241015] GPR04: 1000 0003 0001 0005 [ 155.241015] GPR08: f000 [ 155.241015] GPR12: 7fff8dada5e0 [ 155.241015] GPR16: [ 155.241015] GPR20: 1b30 [ 155.241015] GPR24: 7fff8dad 7fff8dacf6d8 7fffd1551e98 1001fce8 [ 155.241015] GPR28: 7fffd1552020 0001 0005 [ 155.241104] NIP [1e48] 0x1e48 [ 155.241109] LR [1e40] 0x1e40 [ 155.241115] --- interrupt: 200 [ 155.241119] Instruction dump: [ 155.241125] 6129 792907c6 6529 6129 7c234840 40810058 39230001 71280007 [ 155.241141] 41820034 3d40a80e 7929e8c2 794a07c6 <7d2950ae> 7d290775 4082006c 38210020 [ 155.241160] ---[ end trace ]--- [ 155.247904] The crash is seen only with CONFIG_KASAN enabled. After disabling KASAN the test runs to completion. # cat .config | grep KASAN CONFIG_HAVE_ARCH_KASAN=y CONFIG_HAVE_ARCH_KASAN_VMALLOC=y CONFIG_ARCH_DISABLE_KASAN_INLINE=y CONFIG_CC_HAS_KASAN_GENERIC=y # CONFIG_KASAN is not set # 1..1 # selftests: powerpc/mce: inject-ra-err [ 42.777173] Disabling lock debugging due to kernel taint [ 42.777195] MCE: CPU2: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered] [ 42.777203] MCE: CPU2: PID: 2920 Comm: inject-ra-err NIP: [1e48] [ 42.777208] MCE: CPU2: Initiator CPU [ 42.777210] MCE: CPU2: Unknown # test: inject-ra-err # tags: git_version:v6.0-rc1-0-g568035b01cfb # success: inject-ra-err ok 1 selftests: powerpc/mce: inj
Re: [PATCH] powerpc/ftrace: Ignore weak functions
On Tue, 9 Aug 2022 16:24:25 +0530 "Naveen N. Rao" wrote: > Extend commit b39181f7c6907d ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to > avoid adding weak function") to ppc32 and ppc64 -mprofile-kernel by > defining FTRACE_MCOUNT_MAX_OFFSET. > > For ppc64 -mprofile-kernel ABI, we can have two instructions at function > entry for TOC setup followed by 'mflr r0' and 'bl _mcount'. So, the > mcount location is at most the 4th instruction in a function. For ppc32, > mcount location is always the 3rd instruction in a function, preceded by > 'mflr r0' and 'stw r0,4(r1)'. > > With this patch, and with ppc64le_guest_defconfig and some ftrace/bpf > config items enabled: > # grep __ftrace_invalid_address available_filter_functions | wc -l > 79 I wonder if this patch answers the question to my last email. ;-) Acked-by: Steven Rostedt (Google) -- Steve > > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/include/asm/ftrace.h | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/powerpc/include/asm/ftrace.h > b/arch/powerpc/include/asm/ftrace.h > index 3cee7115441b41..ade406dc6504e3 100644 > --- a/arch/powerpc/include/asm/ftrace.h > +++ b/arch/powerpc/include/asm/ftrace.h > @@ -10,6 +10,13 @@ > > #define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR > > +/* Ignore unused weak functions which will have larger offsets */ > +#ifdef CONFIG_MPROFILE_KERNEL > +#define FTRACE_MCOUNT_MAX_OFFSET 12 > +#elif defined(CONFIG_PPC32) > +#define FTRACE_MCOUNT_MAX_OFFSET 8 > +#endif > + > #ifndef __ASSEMBLY__ > extern void _mcount(void); > > > base-commit: ff1ed171e05c971652a0ede3d716997de8ee41c9
Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
On Wed, 27 Apr 2022 15:01:20 +0530 "Naveen N. Rao" wrote: > This solves a build issue on powerpc with binutils v2.36 and newer [1]. > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section > symbols") [2], binutils started dropping section symbols that it thought > were unused. Due to this, in certain scenarios, recordmcount is unable > to find a non-weak symbol to generate a relocation record against. > > Clang integrated assembler is also aggressive in dropping section > symbols [3]. > > In the past, there have been various workarounds to address this. See > commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and > 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch: > https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n@linux.vnet.ibm.com/T/#u > > Fix this issue by using the weak symbol in the relocation record. This > can result in duplicate locations in the mcount table if those weak > functions are overridden, so have ftrace skip duplicate entries. > > Objtool already follows this approach, so patch 2 updates recordmcount > to do the same. Patch 1 updates ftrace to skip duplicate entries. > > - Naveen > > > [1] https://github.com/linuxppc/issues/issues/388 > [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 > [3] https://github.com/ClangBuiltLinux/linux/issues/981 > > There's been work to handle weak functions, but I'm not sure that work handled the issues here. Are these patches still needed, or was there another workaround to handle the problems this addressed? -- Steve
Re: [PATCH v2] ASoC: fsl_sai: fix incorrect mclk number in error message
On Sat, 13 Aug 2022 10:33:52 +0200, Pieterjan Camerlynck wrote: > In commit c3ecef21c3f26 ("ASoC: fsl_sai: add sai master mode support") > the loop was changed to start iterating from 1 instead of 0. The error > message however was not updated, reporting the wrong clock to the user. > > Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next Thanks! [1/1] ASoC: fsl_sai: fix incorrect mclk number in error message commit: dcdfa3471f9c28ee716c687d85701353e2e86fde All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [PATCH] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
On Tue, Aug 16, 2022 at 03:17:20PM +1000, Andrew Donnellan wrote: > Commit 36d4b36b6959 ("lib/nodemask: inline next_node_in() and > node_random()") refactored some code by moving node_random() from > lib/nodemask.c to include/linux/nodemask.h, thus requiring nodemask.h to > include random.h, which conditionally defines add_latent_entropy() > depending on whether the macro LATENT_ENTROPY_PLUGIN is defined. > > This broke the build on powerpc, where nodemask.h is indirectly included > in arch/powerpc/kernel/prom_init.c, part of the early boot machinery that > is excluded from the latent entropy plugin using > DISABLE_LATENT_ENTROPY_PLUGIN. It turns out that while we add a gcc flag > to disable the actual plugin, we don't undefine LATENT_ENTROPY_PLUGIN. > > This leads to the following: > > CC arch/powerpc/kernel/prom_init.o > In file included from ./include/linux/nodemask.h:97, >from ./include/linux/mmzone.h:17, >from ./include/linux/gfp.h:7, >from ./include/linux/xarray.h:15, As a side note, xarray can go with gfp_types.h instead of gfp.h >from ./include/linux/radix-tree.h:21, >from ./include/linux/idr.h:15, >from ./include/linux/kernfs.h:12, >from ./include/linux/sysfs.h:16, >from ./include/linux/kobject.h:20, >from ./include/linux/pci.h:35, >from arch/powerpc/kernel/prom_init.c:24: > ./include/linux/random.h: In function 'add_latent_entropy': > ./include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first > use in this function); did you mean 'add_latent_entropy'? 
> 25 | add_device_randomness((const void *)&latent_entropy, > sizeof(latent_entropy)); > | ^~ > | add_latent_entropy > ./include/linux/random.h:25:46: note: each undeclared identifier is > reported only once for each function it appears in > make[2]: *** [scripts/Makefile.build:249: arch/powerpc/kernel/prom_init.o] > Fehler 1 > make[1]: *** [scripts/Makefile.build:465: arch/powerpc/kernel] Fehler 2 > make: *** [Makefile:1855: arch/powerpc] Error 2 > > Change the DISABLE_LATENT_ENTROPY_PLUGIN flags to undefine > LATENT_ENTROPY_PLUGIN for files where the plugin is disabled. > > Cc: Yury Norov > Cc: Emese Revfy > Fixes: 36d4b36b6959 ("lib/nodemask: inline next_node_in() and node_random()") I think it rather fixes 38addce8b600ca33 ("gcc-plugins: Add latent_entropy plugin"). For the rest, Reviewed-by: Yury Norov > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216367 > Link: > https://lore.kernel.org/linuxppc-dev/alpine.deb.2.22.394.2208152006320.289...@ramsan.of.borg/ > Reported-by: Erhard Furtner > Signed-off-by: Andrew Donnellan > --- > scripts/Makefile.gcc-plugins | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins > index 692d64a70542..e4deaf5fa571 100644 > --- a/scripts/Makefile.gcc-plugins > +++ b/scripts/Makefile.gcc-plugins > @@ -4,7 +4,7 @@ gcc-plugin-$(CONFIG_GCC_PLUGIN_LATENT_ENTROPY)+= > latent_entropy_plugin.so > gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_LATENT_ENTROPY)\ > += -DLATENT_ENTROPY_PLUGIN > ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY > -DISABLE_LATENT_ENTROPY_PLUGIN += > -fplugin-arg-latent_entropy_plugin-disable > +DISABLE_LATENT_ENTROPY_PLUGIN += > -fplugin-arg-latent_entropy_plugin-disable -ULATENT_ENTROPY_PLUGIN > endif > export DISABLE_LATENT_ENTROPY_PLUGIN > > -- > 2.30.2
Re: [PATCH v2 1/4] Make place for common balloon code
On Tue, Aug 16, 2022 at 01:56:32PM +0200, Greg Kroah-Hartman wrote: > On Tue, Aug 16, 2022 at 02:47:22PM +0300, Alexander Atanasov wrote: > > Hello, > > > > On 16.08.22 12:49, Greg Kroah-Hartman wrote: > > > On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote: > > > > > > rename include/linux/{balloon_compaction.h => balloon_common.h} (99%) > > > > > > Why rename the .h file? It still handles the "balloon compaction" > > > logic. > > > > File contains code that is common to balloon drivers, > > compaction is only part of it. Series add more code to it. > > Since it was suggested to use it for such common code. > > I find that common becomes a better name for it so the rename. > > I can drop the rename easy on next iteration if you suggest to. > > "balloon_common.h" is very vague, you should only need one balloon.h > file in the include/linux/ directory, right, so of course it is "common" > :) > > thanks, > > greg "naming is hard" k-h Yea, just call it balloon.h and balloon.c then. -- MST
Re: [PATCH v2 1/4] Make place for common balloon code
On Tue, Aug 16, 2022 at 02:47:22PM +0300, Alexander Atanasov wrote: > Hello, > > On 16.08.22 12:49, Greg Kroah-Hartman wrote: > > On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote: > > > > rename include/linux/{balloon_compaction.h => balloon_common.h} (99%) > > > > Why rename the .h file? It still handles the "balloon compaction" > > logic. > > File contains code that is common to balloon drivers, > compaction is only part of it. Series add more code to it. > Since it was suggested to use it for such common code. > I find that common becomes a better name for it so the rename. > I can drop the rename easy on next iteration if you suggest to. "balloon_common.h" is very vague, you should only need one balloon.h file in the include/linux/ directory, right, so of course it is "common" :) thanks, greg "naming is hard" k-h
Re: [PATCH] arch: mm: rename FORCE_MAX_ZONEORDER to ARCH_FORCE_MAX_ORDER
On Mon, Aug 15, 2022 at 10:39:59AM -0400, Zi Yan wrote: > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 571cc234d0b3..c6fcd8746f60 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -1401,7 +1401,7 @@ config XEN > help > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64. > > -config FORCE_MAX_ZONEORDER > +config ARCH_FORCE_MAX_ORDER > int > default "14" if ARM64_64K_PAGES > default "12" if ARM64_16K_PAGES For arm64: Acked-by: Catalin Marinas
[Bug 216367] Kernel 6.0-rc1 fails to build with GCC_PLUGIN_LATENT_ENTROPY=y (PowerMac G5 11,2)
https://bugzilla.kernel.org/show_bug.cgi?id=216367 --- Comment #2 from Erhard F. (erhar...@mailbox.org) --- (In reply to Andrew Donnellan from comment #1) > I've sent a patch: > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220816051720.44108- > 1-...@linux.ibm.com/ > > Please let me know if it works for you. Applied it on top of v6.0-rc1 and it works fine. Many thanks!
Re: [PATCH v2 1/4] Make place for common balloon code
On Tue, Aug 16, 2022 at 12:41:14PM +0300, Alexander Atanasov wrote: > File already contains code that is common along balloon > drivers so rename it to reflect its contents. > mm/balloon_compaction.c -> mm/balloon_common.c > > Signed-off-by: Alexander Atanasov > --- > MAINTAINERS | 4 ++-- > arch/powerpc/platforms/pseries/cmm.c | 2 +- > drivers/misc/vmw_balloon.c | 2 +- > drivers/virtio/virtio_balloon.c | 2 +- > include/linux/{balloon_compaction.h => balloon_common.h} | 2 +- > mm/Makefile | 2 +- > mm/{balloon_compaction.c => balloon_common.c}| 4 ++-- > mm/migrate.c | 2 +- > mm/vmscan.c | 2 +- > 9 files changed, 11 insertions(+), 11 deletions(-) > rename include/linux/{balloon_compaction.h => balloon_common.h} (99%) Why rename the .h file? It still handles the "balloon compaction" logic. thanks, greg k-h
[PATCH] [backport for 4.14] powerpc/ptdump: Fix display of RW pages on FSL_BOOK3E
[ Upstream commit dd8de84b57b02ba9c1fe530a6d916c0853f136bd ] On FSL_BOOK3E, _PAGE_RW is defined with two bits, one for user and one for supervisor. As soon as one of the two bits is set, the page has to be displayed as RW. But the way it is implemented today requires both bits to be set in order to display it as RW. Instead of displaying RW when _PAGE_RW bits are set and RO otherwise, reverse the logic and display RO when _PAGE_RW bits are all 0 and RW otherwise. This change has no impact on other platforms as _PAGE_RW is a single bit on all of them. Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables") Cc: sta...@vger.kernel.org Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/0c33b96317811edf691e81698aaee8fa45ec3449.1656427391.git.christophe.le...@csgroup.eu --- arch/powerpc/mm/dump_linuxpagetables.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c index 0bbaf7344872..07541d322b56 100644 --- a/arch/powerpc/mm/dump_linuxpagetables.c +++ b/arch/powerpc/mm/dump_linuxpagetables.c @@ -123,15 +123,10 @@ static const struct flag_info flag_array[] = { .set= "user", .clear = "", }, { -#if _PAGE_RO == 0 - .mask = _PAGE_RW, - .val= _PAGE_RW, -#else - .mask = _PAGE_RO, - .val= 0, -#endif - .set= "rw", - .clear = "ro", + .mask = _PAGE_RW | _PAGE_RO, + .val= _PAGE_RO, + .set= "ro", + .clear = "rw", }, { .mask = _PAGE_EXEC, .val= _PAGE_EXEC, -- 2.37.1
Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
On Tue, Aug 16, 2022 at 3:39 PM Alistair Popple wrote: > > migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that > installs migration entries directly if it can lock the migrating page. > When removing a dirty pte the dirty bit is supposed to be carried over > to the underlying page to prevent it being lost. > > Currently migrate_vma_*() can only be used for private anonymous > mappings. That means loss of the dirty bit usually doesn't result in > data loss because these pages are typically not file-backed. However > pages may be backed by swap storage which can result in data loss if an > attempt is made to migrate a dirty page that doesn't yet have the > PageDirty flag set. > > In this case migration will fail due to unexpected references but the > dirty pte bit will be lost. If the page is subsequently reclaimed data > won't be written back to swap storage as it is considered uptodate, > resulting in data loss if the page is subsequently accessed. > > Prevent this by copying the dirty bit to the page when removing the pte > to match what try_to_migrate_one() does. > > Signed-off-by: Alistair Popple > Acked-by: Peter Xu > Reported-by: Huang Ying > Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while > collecting pages") > Cc: sta...@vger.kernel.org > > --- > > Changes for v2: > > - Fixed up Reported-by tag. > - Added Peter's Acked-by. > - Atomically read and clear the pte to prevent the dirty bit getting >set after reading it. 
> - Added fixes tag > --- > mm/migrate_device.c | 21 - > 1 file changed, 8 insertions(+), 13 deletions(-) > > diff --git a/mm/migrate_device.c b/mm/migrate_device.c > index 27fb37d..e2d09e5 100644 > --- a/mm/migrate_device.c > +++ b/mm/migrate_device.c > @@ -7,6 +7,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -61,7 +62,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > struct migrate_vma *migrate = walk->private; > struct vm_area_struct *vma = walk->vma; > struct mm_struct *mm = vma->vm_mm; > - unsigned long addr = start, unmapped = 0; > + unsigned long addr = start; > spinlock_t *ptl; > pte_t *ptep; > > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > bool anon_exclusive; > pte_t swp_pte; > > + flush_cache_page(vma, addr, pte_pfn(*ptep)); > + pte = ptep_clear_flush(vma, addr, ptep); Although I think it's possible to batch the TLB flushing just before unlocking PTL. The current code looks correct. Reviewed-by: "Huang, Ying" Best Regards, Huang, Ying > anon_exclusive = PageAnon(page) && > PageAnonExclusive(page); > if (anon_exclusive) { > - flush_cache_page(vma, addr, pte_pfn(*ptep)); > - ptep_clear_flush(vma, addr, ptep); > - > if (page_try_share_anon_rmap(page)) { > set_pte_at(mm, addr, ptep, pte); > unlock_page(page); > @@ -205,12 +205,14 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > mpfn = 0; > goto next; > } > - } else { > - ptep_get_and_clear(mm, addr, ptep); > } > > migrate->cpages++; > > + /* Set the dirty flag on the folio now the pte is > gone. 
*/ > + if (pte_dirty(pte)) > + folio_mark_dirty(page_folio(page)); > + > /* Setup special migration page table entry */ > if (mpfn & MIGRATE_PFN_WRITE) > entry = make_writable_migration_entry( > @@ -242,9 +244,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > */ > page_remove_rmap(page, vma, false); > put_page(page); > - > - if (pte_present(pte)) > - unmapped++; > } else { > put_page(page); > mpfn = 0; > @@ -257,10 +256,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > arch_leave_lazy_mmu_mode(); > pte_unmap_unlock(ptep - 1, ptl); > > - /* Only flush the TLB if we actually modified any entries */ > - if (unmapped) > - flush_tlb_range(walk->vma, start, end); > - > return 0; > } > > > base-commit: ffcf9c5700e49c0aee42dcba9a12ba21338e8136 > -- > git-series 0.9.1 >
Re: [PATCH v2] ASoC: fsl_sai: fix incorrect mclk number in error message
On Sat, Aug 13, 2022 at 4:34 PM Pieterjan Camerlynck < pieterjan.camerly...@gmail.com> wrote: > In commit c3ecef21c3f26 ("ASoC: fsl_sai: add sai master mode support") > the loop was changed to start iterating from 1 instead of 0. The error > message however was not updated, reporting the wrong clock to the user. > > Signed-off-by: Pieterjan Camerlynck > Acked-by: Shengjiu Wang Best regards Wang shengjiu > --- > V2: rebase against latest version > --- > sound/soc/fsl/fsl_sai.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c > index 7523bb944b21..d430eece1d6b 100644 > --- a/sound/soc/fsl/fsl_sai.c > +++ b/sound/soc/fsl/fsl_sai.c > @@ -1306,7 +1306,7 @@ static int fsl_sai_probe(struct platform_device > *pdev) > sai->mclk_clk[i] = devm_clk_get(dev, tmp); > if (IS_ERR(sai->mclk_clk[i])) { > dev_err(dev, "failed to get mclk%d clock: %ld\n", > - i + 1, PTR_ERR(sai->mclk_clk[i])); > + i, PTR_ERR(sai->mclk_clk[i])); > sai->mclk_clk[i] = NULL; > } > } > -- > 2.25.1 > >
[PATCH v2 2/2] selftests/hmm-tests: Add test for dirty bits
We were not correctly copying PTE dirty bits to pages during migrate_vma_setup() calls. This could potentially lead to data loss, so add a test for this. Signed-off-by: Alistair Popple --- tools/testing/selftests/vm/hmm-tests.c | 124 ++- 1 file changed, 124 insertions(+) diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c index 529f53b..70fdb49 100644 --- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -1200,6 +1200,130 @@ TEST_F(hmm, migrate_multiple) } } +static char cgroup[] = "/sys/fs/cgroup/hmm-test-XXXXXX"; +static int write_cgroup_param(char *cgroup_path, char *param, long value) +{ + int ret; + FILE *f; + char *filename; + + if (asprintf(&filename, "%s/%s", cgroup_path, param) < 0) + return -1; + + f = fopen(filename, "w"); + if (!f) { + ret = -1; + goto out; + } + + ret = fprintf(f, "%ld\n", value); + if (ret < 0) + goto out1; + + ret = 0; + +out1: + fclose(f); +out: + free(filename); + + return ret; +} + +static int setup_cgroup(void) +{ + pid_t pid = getpid(); + int ret; + + if (!mkdtemp(cgroup)) + return -1; + + ret = write_cgroup_param(cgroup, "cgroup.procs", pid); + if (ret) + return ret; + + return 0; +} + +static int destroy_cgroup(void) +{ + pid_t pid = getpid(); + int ret; + + ret = write_cgroup_param("/sys/fs/cgroup", + "cgroup.procs", pid); + if (ret) + return ret; + + if (rmdir(cgroup)) + return -1; + + return 0; +} + +/* + * Try and migrate a dirty page that has previously been swapped to disk. This + * checks that we don't lose dirty bits. 
+ */ +TEST_F(hmm, migrate_dirty_page) +{ + struct hmm_buffer *buffer; + unsigned long npages; + unsigned long size; + unsigned long i; + int *ptr; + int tmp = 0; + + npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift; + ASSERT_NE(npages, 0); + size = npages << self->page_shift; + + buffer = malloc(sizeof(*buffer)); + ASSERT_NE(buffer, NULL); + + buffer->fd = -1; + buffer->size = size; + buffer->mirror = malloc(size); + ASSERT_NE(buffer->mirror, NULL); + + ASSERT_EQ(setup_cgroup(), 0); + + buffer->ptr = mmap(NULL, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + buffer->fd, 0); + ASSERT_NE(buffer->ptr, MAP_FAILED); + + /* Initialize buffer in system memory. */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ptr[i] = 0; + + ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30)); + + /* Fault pages back in from swap as clean pages */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + tmp += ptr[i]; + + /* Dirty the pte */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ptr[i] = i; + + /* +* Attempt to migrate memory to device, which should fail because +* hopefully some pages are backed by swap storage. +*/ + ASSERT_TRUE(hmm_migrate_sys_to_dev(self->fd, buffer, npages)); + + ASSERT_FALSE(write_cgroup_param(cgroup, "memory.reclaim", 1UL<<30)); + + /* Check we still see the updated data after restoring from swap. */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ASSERT_EQ(ptr[i], i); + + hmm_buffer_free(buffer); + destroy_cgroup(); +} + /* * Read anonymous memory multiple times. */ -- git-series 0.9.1
[PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that installs migration entries directly if it can lock the migrating page. When removing a dirty pte the dirty bit is supposed to be carried over to the underlying page to prevent it being lost. Currently migrate_vma_*() can only be used for private anonymous mappings. That means loss of the dirty bit usually doesn't result in data loss because these pages are typically not file-backed. However pages may be backed by swap storage which can result in data loss if an attempt is made to migrate a dirty page that doesn't yet have the PageDirty flag set. In this case migration will fail due to unexpected references but the dirty pte bit will be lost. If the page is subsequently reclaimed data won't be written back to swap storage as it is considered uptodate, resulting in data loss if the page is subsequently accessed. Prevent this by copying the dirty bit to the page when removing the pte to match what try_to_migrate_one() does. Signed-off-by: Alistair Popple Acked-by: Peter Xu Reported-by: Huang Ying Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Cc: sta...@vger.kernel.org --- Changes for v2: - Fixed up Reported-by tag. - Added Peter's Acked-by. - Atomically read and clear the pte to prevent the dirty bit getting set after reading it. 
- Added fixes tag --- mm/migrate_device.c | 21 - 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 27fb37d..e2d09e5 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -61,7 +62,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, struct migrate_vma *migrate = walk->private; struct vm_area_struct *vma = walk->vma; struct mm_struct *mm = vma->vm_mm; - unsigned long addr = start, unmapped = 0; + unsigned long addr = start; spinlock_t *ptl; pte_t *ptep; @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, bool anon_exclusive; pte_t swp_pte; + flush_cache_page(vma, addr, pte_pfn(*ptep)); + pte = ptep_clear_flush(vma, addr, ptep); anon_exclusive = PageAnon(page) && PageAnonExclusive(page); if (anon_exclusive) { - flush_cache_page(vma, addr, pte_pfn(*ptep)); - ptep_clear_flush(vma, addr, ptep); - if (page_try_share_anon_rmap(page)) { set_pte_at(mm, addr, ptep, pte); unlock_page(page); @@ -205,12 +205,14 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, mpfn = 0; goto next; } - } else { - ptep_get_and_clear(mm, addr, ptep); } migrate->cpages++; + /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pte)) + folio_mark_dirty(page_folio(page)); + /* Setup special migration page table entry */ if (mpfn & MIGRATE_PFN_WRITE) entry = make_writable_migration_entry( @@ -242,9 +244,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, */ page_remove_rmap(page, vma, false); put_page(page); - - if (pte_present(pte)) - unmapped++; } else { put_page(page); mpfn = 0; @@ -257,10 +256,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, arch_leave_lazy_mmu_mode(); pte_unmap_unlock(ptep - 1, ptl); - /* Only flush the TLB if we actually modified any entries */ - if (unmapped) - flush_tlb_range(walk->vma, start, end); - return 0; } base-commit: ffcf9c5700e49c0aee42dcba9a12ba21338e8136 -- git-series 0.9.1
[Bug 216368] do_IRQ: stack overflow at boot on a PowerMac G5 11,2
https://bugzilla.kernel.org/show_bug.cgi?id=216368 Christophe Leroy (christophe.le...@csgroup.eu) changed: CC added: christophe.le...@csgroup.eu --- Comment #3 from Christophe Leroy (christophe.le...@csgroup.eu) --- That might be a consequence of commit 41f20d6db2b6 ("powerpc/irq: Increase stack_overflow detection limit when KASAN is enabled") which increased the limit from 2k to 4k for PPC64. This happens when you get an IRQ while being deep into BTRFS handling, it seems. It should be investigated with the BTRFS team why the call stack is so deep. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.