Re: [PATCH] mm: kfence: Fix false positives on big endian

2023-05-18 Thread Benjamin Gray
On Fri, 2023-05-19 at 15:14 +1000, Michael Ellerman wrote: > Andrew Morton writes: > > On Fri, 5 May 2023 16:02:17 + David Laight > > wrote: > > > > > From: Michael Ellerman > > > > Sent: 05 May 2023 04:51 > > > > > > > > Since commit 1ba3cbf3ec3b ("mm: kfence: improve the performance > > >

Re: [PATCH] mm: kfence: Fix false positives on big endian

2023-05-18 Thread Christophe Leroy
On 18/05/2023 at 00:20, Andrew Morton wrote: > On Fri, 5 May 2023 16:02:17 + David Laight > wrote: > >> From: Michael Ellerman >>> Sent: 05 May 2023 04:51 >>> >>> Since commit 1ba3cbf3ec3b ("mm: kfence: improve the performance of >>> __kfence_alloc() and __kfence_free()"), kfence reports

[PATCH v3 01/12] powerpc/book3s: Add missing include

2023-05-18 Thread Benjamin Gray
The functions here use struct thread_struct fields, so need to import the full definition from . The header that defines current only forward declares struct thread_struct. Failing to include this header leads to a compilation error when a translation unit does not also include indirectly. Sig

Re: [PATCH] mm: kfence: Fix false positives on big endian

2023-05-18 Thread Michael Ellerman
Andrew Morton writes: > On Fri, 5 May 2023 16:02:17 + David Laight > wrote: > >> From: Michael Ellerman >> > Sent: 05 May 2023 04:51 >> > >> > Since commit 1ba3cbf3ec3b ("mm: kfence: improve the performance of >> > __kfence_alloc() and __kfence_free()"), kfence reports failures in >> > rand

[PATCH v3 11/12] selftests/powerpc/dexcr: Add hashst/hashchk test

2023-05-18 Thread Benjamin Gray
Test the kernel DEXCR[NPHIE] interface and hashchk exception handling. Introduces with it a DEXCR utils library for common DEXCR operations. Volatile is used to prevent the compiler optimising away the signal tests. Signed-off-by: Benjamin Gray --- v1: * Clean up dexcr makefile * I

[PATCH v3 06/12] powerpc/dexcr: Support custom default DEXCR value

2023-05-18 Thread Benjamin Gray
Make the DEXCR value configurable at config time. Intentionally don't limit possible values to support future aspects without needing kernel updates. The default config value enables hashst/hashchk in problem state. This should be safe, as generally software needs to request these instructions be

[PATCH v3 08/12] powerpc/ptrace: Expose HASHKEYR register to ptrace

2023-05-18 Thread Benjamin Gray
The HASHKEYR register contains a secret per-process key to enable unique hashes per process. In general it should not be exposed to userspace at all and a regular process has no need to know its key. However, checkpoint restore in userspace (CRIU) functionality requires that a process be able to s

[PATCH v3 02/12] powerpc/ptrace: Add missing include

2023-05-18 Thread Benjamin Gray
ptrace-decl.h uses user_regset_get2_fn (among other things) from regset.h. While all current users of ptrace-decl.h include regset.h before it anyway, it adds an implicit ordering dependency and breaks source tooling that tries to inspect ptrace-decl.h by itself. Signed-off-by: Benjamin Gray Revi

[PATCH v3 07/12] powerpc/ptrace: Expose DEXCR and HDEXCR registers to ptrace

2023-05-18 Thread Benjamin Gray
The DEXCR register is of interest when ptracing processes. Currently it is static, but eventually will be dynamically controllable by a process. If a process can control its own, then it is useful for it to be ptrace-able too (e.g., for checkpoint-restore functionality). It is also relevant to core

[PATCH v3 10/12] selftests/powerpc: Add more utility macros

2023-05-18 Thread Benjamin Gray
Adds _MSG assertion variants to provide more context behind why a failure occurred. Also include unistd.h for _exit() and stdio.h for fprintf(), and move ARRAY_SIZE macro to utils.h. The _MSG variants and ARRAY_SIZE will be used by the following DEXCR selftests. Signed-off-by: Benjamin Gray Revi

[PATCH v3 12/12] selftests/powerpc/dexcr: Add DEXCR status utility lsdexcr

2023-05-18 Thread Benjamin Gray
Add a utility 'lsdexcr' to print the current DEXCR status. Useful for quickly checking the status such as when debugging test failures or verifying the new default DEXCR does what you want (for userspace at least). Example output: # ./lsdexcr uDEXCR: 0400 (NPHIE) HDEXCR:

[PATCH v3 05/12] powerpc/dexcr: Support userspace ROP protection

2023-05-18 Thread Benjamin Gray
The ISA 3.1B hashst and hashchk instructions use a per-cpu SPR HASHKEYR to hold a key used in the hash calculation. This key should be different for each process to make it harder for a malicious process to recreate valid hash values for a victim process. Add support for storing a per-thread hash

[PATCH v3 09/12] Documentation: Document PowerPC kernel DEXCR interface

2023-05-18 Thread Benjamin Gray
Describe the DEXCR and document how to configure it. Signed-off-by: Benjamin Gray Reviewed-by: Russell Currey --- v3: * Add ruscur reviewed-by v2: * Document coredump & ptrace support v1: * Remove the dynamic control docs, describe the static config option This documenta

[PATCH v3 00/12] Add static DEXCR support

2023-05-18 Thread Benjamin Gray
The DEXCR is a SPR that allows control over various execution 'aspects', such as indirect branch prediction and enabling the hashst/hashchk instructions. Further details are in ISA 3.1B Book 3 chapter 9. This series adds static (compile time) support for initialising the DEXCR, and basic HASHKEYR

[PATCH v3 04/12] powerpc/dexcr: Handle hashchk exception

2023-05-18 Thread Benjamin Gray
Recognise and pass the appropriate signal to the user program when a hashchk instruction triggers. This is independent of allowing configuration of DEXCR[NPHIE], as a hypervisor can enforce this aspect regardless of the kernel. The signal mirrors how ARM reports their similar check failure. For ex

[PATCH v3 03/12] powerpc/dexcr: Add initial Dynamic Execution Control Register (DEXCR) support

2023-05-18 Thread Benjamin Gray
ISA 3.1B introduces the Dynamic Execution Control Register (DEXCR). It is a per-cpu register that allows control over various CPU behaviours including branch hint usage, indirect branch speculation, and hashst/hashchk support. Add some definitions and basic support for the DEXCR in the kernel. Rig

Re: [PATCH AUTOSEL 6.3 6/7] powerpc/fsl_uli1575: Allow to disable FSL_ULI1575 support

2023-05-18 Thread Sasha Levin
On Tue, May 09, 2023 at 09:18:35AM +0200, Pali Rohár wrote: On Tuesday 09 May 2023 17:14:48 Michael Ellerman wrote: Randy Dunlap writes: > Hi-- > > Just a heads up. This patch can cause build errors. > I sent a patch for these on 2023-APR-28: > https://lore.kernel.org/linuxppc-dev/2023042904

Re: [PATCH v2 00/34] Split ptdesc from struct page

2023-05-18 Thread Jason Gunthorpe
On Mon, May 01, 2023 at 12:27:55PM -0700, Vishal Moola (Oracle) wrote: > The MM subsystem is trying to shrink struct page. This patchset > introduces a memory descriptor for page table tracking - struct ptdesc. > > This patchset introduces ptdesc, splits ptdesc from struct page, and > converts man

Re: [PATCH v2 00/25] iommu: Make default_domain's mandatory

2023-05-18 Thread Steven Price
On 16/05/2023 01:00, Jason Gunthorpe wrote: > This is on github: > https://github.com/jgunthorpe/linux/commits/iommu_all_defdom Tested-by: Steven Price Works fine on my Firefly-RK3288. Thanks, Steve

[RESEND PATCH v9 1/2] mm/tlbbatch: Introduce arch_tlbbatch_should_defer()

2023-05-18 Thread Yicong Yang
From: Anshuman Khandual The entire scheme of deferred TLB flush in reclaim path rests on the fact that the cost to refill TLB entries is less than flushing out individual entries by sending IPI to remote CPUs. But architecture can have different ways to evaluate that. Hence apart from checking TT

[RESEND PATCH v9 0/2] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-05-18 Thread Yicong Yang
From: Yicong Yang Though ARM64 has the hardware to do tlb shootdown, the hardware broadcasting is not free. The simplest micro benchmark shows that even on Snapdragon 888 with only 8 cores, the overhead for ptep_clear_flush is huge even for paging out one page mapped by only one process: 5.36% a.out

[RESEND PATCH v9 2/2] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-05-18 Thread Yicong Yang
From: Barry Song On x86, batched and deferred tlb shootdown has led to a 90% performance increase on tlb shootdown. On arm64, HW can do tlb shootdown without software IPI. But sync tlbi is still quite expensive. Even running the simplest program which requires swapout can prove this is true, #incl