Re: [FSL P50x0] [GIT KERNEL] [VDSO] compiling issue
On 20/09/2024 at 06:30, Michael Ellerman wrote:
> Christian Zigotzky writes:
>> Hi All,
>>
>> Compiling the latest Git kernel no longer works for our FSL P5020/P5040 boards [1] since the random-6.12-rc1 updates [2].
>>
>> Error messages:
>>
>> arch/powerpc/kernel/vdso/vdso32.so.dbg: dynamic relocations are not supported
>> make[2]: *** [arch/powerpc/kernel/vdso/Makefile:75: arch/powerpc/kernel/vdso/vdso32.so.dbg]
>>
>> Reverting the vdso updates solved the compiling issue.
>>
>> Could you please check the random-6.12-rc1 updates? [2]
>>
>> Thanks,
>> Christian
>>
>> [1] http://wiki.amiga.org/index.php?title=X5000
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4a39ac5b7d62679c07a3e3d12b0f6982377d8a7d
>>
>> + Kernel config
>> Link: https://raw.githubusercontent.com/chzigotzky/kernels/refs/heads/main/configs/x5000_defconfig
>
> Your config has:
>
> # CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>
> But all our defconfigs use CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE, which explains why we didn't catch this in build testing.
>
> I've added a build with OPTIMIZE_FOR_SIZE=y so hopefully we'll catch any similar errors in future.

And I sent a patch to fix it: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/aded2b257018fe654db759fdfa4ab1a0b5426b1b.1726772140.git.christophe.le...@csgroup.eu/

Christophe
[PATCH] powerpc/vdso32: Fix use of crtsavres for PPC64
The content of crtsavres.S is enclosed by a #ifndef CONFIG_PPC64.

To be used by VDSO32 on PPC64, its content must be available on PPC64 as well.

Replace #ifndef CONFIG_PPC64 by #ifndef __powerpc64__, as __powerpc64__ is not set when building VDSO32 on PPC64.

Reported-by: Christian Zigotzky
Closed: https://lore.kernel.org/linuxppc-dev/047b7503-af0c-4bb0-b12a-2f6b1e461...@csgroup.eu/T/
Fixes: b163596a5b6f ("powerpc/vdso32: Add crtsavres")
Signed-off-by: Christophe Leroy
---
 arch/powerpc/lib/crtsavres.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/crtsavres.S b/arch/powerpc/lib/crtsavres.S
index 7e5e1c28e56a..8967903c15e9 100644
--- a/arch/powerpc/lib/crtsavres.S
+++ b/arch/powerpc/lib/crtsavres.S
@@ -46,7 +46,7 @@

 	.section ".text"

-#ifndef CONFIG_PPC64
+#ifndef __powerpc64__

 /* Routines for saving integer registers, called by the compiler. */
 /* Called with r11 pointing to the stack header word of the caller of the */
--
2.44.0
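The distinction the patch relies on can be sketched in plain C. A Kconfig symbol such as CONFIG_PPC64 describes the kernel being configured and is the same for every object in the build, whereas __powerpc64__ is predefined by the compiler only when the current object is compiled for 64-bit PowerPC, so it is absent when the 32-bit vDSO is built. In this sketch the `#define CONFIG_PPC64 1` merely stands in for what Kconfig would generate, and the helper names are hypothetical.

```c
#include <assert.h>

/* Stand-in for the Kconfig-generated symbol (assumption of this sketch). */
#define CONFIG_PPC64 1

/* Hypothetical helper: what the kernel configuration says. */
static inline int kernel_config_is_ppc64(void)
{
#ifdef CONFIG_PPC64
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical helper: what this object is actually being compiled for. */
static inline int object_targets_powerpc64(void)
{
#ifdef __powerpc64__
	return 1;
#else
	return 0;
#endif
}

/* Mirrors the guard after the patch: keep the routines unless the
 * object itself is 64-bit, whatever the kernel configuration is. */
static inline int save_restore_routines_kept(void)
{
	return !object_targets_powerpc64();
}

static inline void example_checks(void)
{
	/* The configuration symbol is set regardless of the target. */
	assert(kernel_config_is_ppc64() == 1);
	/* The two guards are complementary by construction. */
	assert(save_restore_routines_kept() == !object_targets_powerpc64());
}
```

Compiled for any 32-bit target, `save_restore_routines_kept()` returns 1 even though the configuration symbol is set, which is the situation of a 32-bit vDSO inside a 64-bit kernel.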
Re: [FSL P50x0] [GIT KERNEL] [VDSO] compiling issue
Hi Christian,

On 19/09/2024 at 17:02, Christian Zigotzky wrote:
> Hi All,
>
> Compiling the latest Git kernel no longer works for our FSL P5020/P5040 boards [1] since the random-6.12-rc1 updates [2].
>
> Error messages:
>
> arch/powerpc/kernel/vdso/vdso32.so.dbg: dynamic relocations are not supported
> make[2]: *** [arch/powerpc/kernel/vdso/Makefile:75: arch/powerpc/kernel/vdso/vdso32.so.dbg]
>
> Reverting the vdso updates solved the compiling issue.
>
> Could you please check the random-6.12-rc1 updates? [2]
>
> Thanks,
> Christian
>
> [1] http://wiki.amiga.org/index.php?title=X5000
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4a39ac5b7d62679c07a3e3d12b0f6982377d8a7d
>
> + Kernel config
> Link: https://raw.githubusercontent.com/chzigotzky/kernels/refs/heads/main/configs/x5000_defconfig
>
> + Christophe Leroy
> + Michael Ellerman

Can you try with the following change:

diff --git a/arch/powerpc/lib/crtsavres.S b/arch/powerpc/lib/crtsavres.S
index 7e5e1c28e56a..8967903c15e9 100644
--- a/arch/powerpc/lib/crtsavres.S
+++ b/arch/powerpc/lib/crtsavres.S
@@ -46,7 +46,7 @@

 	.section ".text"

-#ifndef CONFIG_PPC64
+#ifndef __powerpc64__

 /* Routines for saving integer registers, called by the compiler. */
 /* Called with r11 pointing to the stack header word of the caller of the */
Re: [RFC v2 03/13] book3s64/hash: Remove kfence support temporarily
On 19/09/2024 at 04:56, Ritesh Harjani (IBM) wrote:
> Kfence on book3s Hash on pseries is broken anyway. It fails to boot due to the RMA size limitation. That is because kfence with Hash uses the debug_pagealloc infrastructure. debug_pagealloc allocates a linear map for the entire DRAM size instead of just the kfence relevant objects. This means that for 16TB of DRAM it will require (16TB >> PAGE_SHIFT), which is 256MB, which is half of the RMA region on P8. The crash kernel reserves 256MB and we also need 2048 * 16KB * 3 for the emergency stack and some more for paca allocations. That means there is not enough memory for reserving the full linear map in the RMA region if the DRAM size is too big (>=16TB). (The issue is seen above 8TB with a crash kernel 256 MB reservation.)
>
> Now, Kfence does not require a linear memory map for the entire DRAM. It only needs one for kfence objects. So this patch temporarily removes the kfence functionality, since the debug_pagealloc code needs some refactoring. We will bring in kfence on Hash support in later patches.
>
> Signed-off-by: Ritesh Harjani (IBM)
> ---
>  arch/powerpc/include/asm/kfence.h | 5 +
>  arch/powerpc/mm/book3s64/hash_utils.c | 16 +++-
>  2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kfence.h b/arch/powerpc/include/asm/kfence.h
> index fab124ada1c7..f3a9476a71b3 100644
> --- a/arch/powerpc/include/asm/kfence.h
> +++ b/arch/powerpc/include/asm/kfence.h
> @@ -10,6 +10,7 @@
>
>  #include
>  #include
> +#include
>
>  #ifdef CONFIG_PPC64_ELF_ABI_V1
>  #define ARCH_FUNC_PREFIX "."
> @@ -25,6 +26,10 @@ static inline void disable_kfence(void)
>
>  static inline bool arch_kfence_init_pool(void)
>  {
> +#ifdef CONFIG_PPC64
> +	if (!radix_enabled())

No need for a #ifdef here, you can just do:

	if (IS_ENABLED(CONFIG_PPC64) && !radix_enabled())
		return false;

> +		return false;
> +#endif
>  	return !kfence_disabled;

But why not just set kfence_disabled to true by calling disable_kfence() from one of the powerpc init functions?

>  }
>  #endif
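The memory budget in the commit message can be worked through as a small C sketch. It assumes 64KB pages (PAGE_SHIFT of 16) and one byte of tracking state per page, which is what makes "16TB >> PAGE_SHIFT" come out at 256MB; the 512MB RMA size is inferred from "half of RMA region on P8". The function names are hypothetical and the paca allocations are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB (1ULL << 20)
#define TB (1ULL << 40)

/* Bytes needed to track every page of DRAM, assuming one byte per page. */
static inline uint64_t linear_map_tracking_bytes(uint64_t dram_bytes,
						 unsigned int page_shift)
{
	assert(page_shift < 64);
	return dram_bytes >> page_shift;
}

/* Emergency stacks: nr_cpus * stack size * number of stacks per CPU. */
static inline uint64_t emergency_stack_bytes(uint64_t nr_cpus,
					     uint64_t stack_bytes,
					     uint64_t stacks_per_cpu)
{
	return nr_cpus * stack_bytes * stacks_per_cpu;
}

/* True when the three reservations fit in the RMA (paca not counted). */
static inline bool fits_in_rma(uint64_t dram_bytes, unsigned int page_shift,
			       uint64_t crash_bytes, uint64_t rma_bytes)
{
	uint64_t need = linear_map_tracking_bytes(dram_bytes, page_shift) +
			crash_bytes +
			emergency_stack_bytes(2048, 16 * 1024, 3);

	return need <= rma_bytes;
}
```

With these assumptions, 16TB of DRAM needs 256MB + 256MB + 96MB = 608MB, which exceeds a 512MB RMA, while 8TB needs 480MB before paca allocations, consistent with the issue appearing just above 8TB.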
Re: [RFC v2 02/13] powerpc: mm: Fix kfence page fault reporting
On 19/09/2024 at 04:56, Ritesh Harjani (IBM) wrote:
> copy_from_kernel_nofault() can be called when reading /proc/kcore. /proc/kcore can contain some unmapped kfence objects which, when read via copy_from_kernel_nofault(), can cause page faults. Since *_nofault() functions define their own fixup table for handling faults, use that instead of asking kfence to handle such faults.
>
> Hence we search the exception tables for the nip which generated the fault. If there is an entry, then we let the fixup table handler handle the page fault by returning an error from within ___do_page_fault().

Searching the exception table is a heavy operation, and a lot has been done in the past to minimise the number of times it is called; see for instance commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy search_exception_tables() verification").

Also, by trying to hide false positives you also hide real ones. For instance, if csum_partial_copy_generic() is using a kfence protected area, it will now go undetected.

IIUC, your problem here is limited to copy_from_kernel_nofault(). You should handle the root cause, not its effects. For that, you could perform additional verifications in copy_from_kernel_nofault_allowed().

Christophe
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
Hi,

On 14/09/2024 at 04:22, Luming Yu wrote:
> On Fri, Sep 13, 2024 at 02:15:40PM +0200, Christophe Leroy wrote:
>> On 13/09/2024 at 14:02, Luming Yu wrote:
>>>> ... nothing happens after that.
>>> reproduced with ppc64_defconfig
>>>
>>> [0.818972][T1] Run /init as init process
>>> [5.851684][ T240] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
>>> [5.851742][ T240] kworker/u33:18 (240) used greatest stack depth: 13584 bytes left
>>> [5.860081][ T232] kworker/u33:16 (232) used greatest stack depth: 13072 bytes left
>>> [5.863145][ T210] kworker/u35:13 (210) used greatest stack depth: 12928 bytes left
>>> [5.865000][T1] Failed to execute /init (error -8)
>>> [5.868897][T1] Run /sbin/init as init process
>>> [ 10.891673][ T315] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
>>> [ 10.894036][T1] Starting init: /sbin/init exists but couldn't execute it (error -8)
>>> [ 10.901455][T1] Run /etc/init as init process
>>> [ 10.903154][T1] Run /bin/init as init process
>>> [ 10.904747][T1] Run /bin/sh as init process
>>> [ 15.931679][ T367] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
>>> [ 15.934689][T1] Starting init: /bin/sh exists but couldn't execute it (error -8)
>>
>> That's something different: this is because you built a big-endian kernel and you are trying to run a little-endian userspace.
> okay
>> Does it work with ppc64le_defconfig ?
> make ppc64le_defconfig
> yes, it builds && boots just fine.
the host is a p8 powernv system , the qemu command line is as below: qemu-system-ppc64 -m 64g -smp 16,cores=4,threads=4 --enable-kvm -nographic -net nic -net tap,ifname=tap0,script=/etc/qemu-ifup-nat,downscript=/etc/qemu-ifdown-nat Downloads/Fedora-Cloud-Base-38-1.6.ppc64le.qcow2 With that command you don't boot a freshly built kernel, you boot: Linux version 6.2.9-300.fc38.ppc64le (mockbuild@0e2dbea752814aea985bdc5347ce35da) (gcc (GCC) 13.0.1 20230318 (Red Hat 13.0.1-0), GNU ld version 2.39-9.fc38) Are you sure you tried with the ppc64le_defconfig ? On my side the boot fails as follows when using a ppc64le_defconfig vmlinux with the file Fedora-Cloud-Base-38-1.6.ppc64le.qcow2: ... [2.602758][T1] md: autorun ... [2.602808][T1] md: ... autorun DONE. [2.612596][ T189] kworker/u73:0 (189) used greatest stack depth: 29008 bytes left [2.617068][T1] /dev/root: Can't open blockdev [2.618136][T1] VFS: Cannot open root device "" or unknown-block(0,0): error -6 [2.618239][T1] Please append a correct "root=" boot option; here are the available partitions: [2.618611][T1] 0100 65536 ram0 [2.618768][T1] (driver?) [2.619101][T1] 0101 65536 ram1 [2.619120][T1] (driver?) [2.619187][T1] 0102 65536 ram2 [2.619199][T1] (driver?) [2.619251][T1] 0103 65536 ram3 [2.619261][T1] (driver?) [2.619312][T1] 0104 65536 ram4 [2.619322][T1] (driver?) [2.619372][T1] 0105 65536 ram5 [2.619382][T1] (driver?) [2.619436][T1] 0106 65536 ram6 [2.619447][T1] (driver?) [2.619500][T1] 0107 65536 ram7 [2.619519][T1] (driver?) [2.619571][T1] 0108 65536 ram8 [2.619581][T1] (driver?) [2.619631][T1] 0109 65536 ram9 [2.619641][T1] (driver?) [2.619690][T1] 010a 65536 ram10 [2.619700][T1] (driver?) [2.619754][T1] 010b 65536 ram11 [2.619764][T1] (driver?) [2.619818][T1] 010c 65536 ram12 [2.619827][T1] (driver?) [2.619880][T1] 010d 65536 ram13 [2.619889][T1] (driver?) [2.619942][T1] 010e 65536 ram14 [2.619952][T1] (driver?) [2.620023][T1] 010f 65536 ram15 [2.620036][T1] (driver?) 
[2.620116][T1] 0b00 1048575 sr0 [2.620150][T1] driver: sr [2.620221][T1] 0800 5242880 sda [2.620234][T1] driver: sd [2.620310][T1] 08014096 sda1 709431c7-74bd-4ec4-bbe8-d4f7e7e3194e [2.620369][T1] [2.620449][T1] 0802 1024000 sda2 e0b0a6de-ca8f-4e50-808c-121324c94d04 [2.620463][T1] [2.620531][T1] 0803 102400 sda3 8ed2fbf1-fd2c-4ab0-b66f-d31df1d24e3e [2.620544][T1] [2.620599][T1] 08041024 sda4 46dc7fc8-bf10-4166-9bc8-98daabbec06d [2.620610][T1] [2.620666][T1] 0805 4109312 sda5 8a52b54b-c379-43a5-bf8d-a43fdef4a370 [2.620676][T1] [2.620838][T1] List of all bdev filesystems: [2.620884][T1] ext3 [2.620918][T1] ex
Re: [PATCH net-next v2] page_pool: fix build on powerpc with GCC 14
On 14/09/2024 at 04:02, Michael Ellerman wrote:
> Mina Almasry writes:
>> Building net-next for powerpc with the GCC 14 compiler results in this build error:
>>
>> /home/sfr/next/tmp/ccuSzwiR.s: Assembler messages:
>> /home/sfr/next/tmp/ccuSzwiR.s:2579: Error: operand out of domain (39 is not a multiple of 4)
>> make[5]: *** [/home/sfr/next/next/scripts/Makefile.build:229: net/core/page_pool.o] Error 1
>>
>> Root caused in this thread:
>> https://lore.kernel.org/netdev/913e2fbd-d318-4c9b-aed2-4d333a1d5...@cs-soprasteria.com/
>
> Sorry I'm late to this, the original report wasn't Cc'ed to linuxppc-dev :D
>
> I think this is a bug in the arch/powerpc inline asm constraints.
>
> Can you try the patch below? It fixes the build error for me.
>
> I'll run it through some boot tests and turn it into a proper patch over the weekend.
>
> cheers
>
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 5bf6a4d49268..0e41c1da82dd 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -23,6 +23,12 @@
>  #define __atomic_release_fence() \
>  	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory")
>
> +#ifdef CONFIG_CC_IS_CLANG
> +#define DS_FORM_CONSTRAINT "Z<>"
> +#else
> +#define DS_FORM_CONSTRAINT "YZ<>"
> +#endif

I see we have the same in uaccess.h, added by commit 2d43cc701b96 ("powerpc/uaccess: Fix build errors seen with GCC 13/14").

Should that go in a common header, maybe ppc_asm.h ?

> +
>  static __inline__ int arch_atomic_read(const atomic_t *v)
>  {
>  	int t;
> @@ -197,7 +203,7 @@ static __inline__ s64 arch_atomic64_read(const atomic64_t *v)
>  	if (IS_ENABLED(CONFIG_PPC_KERNEL_PREFIXED))
>  		__asm__ __volatile__("ld %0,0(%1)" : "=r"(t) : "b"(&v->counter));
>  	else
> -		__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m<>"(v->counter));
> +		__asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : DS_FORM_CONSTRAINT (v->counter));
>
>  	return t;
>  }
> @@ -208,7 +214,7 @@ static __inline__ void arch_atomic64_set(atomic64_t *v, s64 i)
>  	if (IS_ENABLED(CONFIG_PPC_KERNEL_PREFIXED))
>  		__asm__ __volatile__("std %1,0(%2)" : "=m"(v->counter) : "r"(i), "b"(&v->counter));
>  	else
> -		__asm__ __volatile__("std%U0%X0 %1,%0" : "=m<>"(v->counter) : "r"(i));
> +		__asm__ __volatile__("std%U0%X0 %1,%0" : "=" DS_FORM_CONSTRAINT (v->counter) : "r"(i));
>  }
>
>  #define ATOMIC64_OP(op, asm_op) \
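The assembler message "39 is not a multiple of 4" follows from how ld and std encode their displacement: in the DS instruction form the two low bits of the offset are not stored, so only signed 16-bit offsets that are multiples of 4 can be represented. The C sketch below spells out that rule; the function name is hypothetical, and the code illustrates the encoding limit only, not what the compiler does with the "Y" constraint.

```c
#include <assert.h>
#include <stdbool.h>

/* A DS-form displacement is a signed 16-bit value whose two low bits
 * are implicitly zero, i.e. a multiple of 4 in [-32768, 32764]. */
static inline bool ds_form_displacement_ok(long offset)
{
	return offset >= -32768 && offset <= 32764 && (offset & 3) == 0;
}

static inline void example_checks(void)
{
	assert(!ds_form_displacement_ok(39));	/* the offset from the error */
	assert(ds_form_displacement_ok(40));	/* next aligned offset */
}
```

A plain "m<>" constraint lets the compiler pick any valid memory operand, including an offset such as 39 that ld cannot encode, which appears to be why the patch switches to a constraint meant for DS-form operands.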
Re: [PATCH net-next v1] mm: fix build on powerpc with GCC 14
Hi,

On 13/09/2024 at 21:22, Matthew Wilcox wrote:
> On Fri, Sep 13, 2024 at 07:20:36PM +, Mina Almasry wrote:
>> +++ b/include/linux/page-flags.h
>> @@ -239,8 +239,8 @@ static inline unsigned long _compound_head(const struct page *page)
>>  {
>>  	unsigned long head = READ_ONCE(page->compound_head);
>>
>> -	if (unlikely(head & 1))
>> -		return head - 1;
>> +	if (unlikely(head & 1UL))
>> +		return head & ~1UL;
>>  	return (unsigned long)page_fixed_fake_head(page);
>
> NAK, that pessimises compound_head().

Can you please give more details on what the difference is? I can't see what it pessimises. In both cases, you test whether the value is odd, and when it is odd you make it even.

Christophe
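The point made in the reply, that both variants compute the same value, can be checked with a short C sketch. The two helpers are hypothetical stand-ins for the two versions of the hunk; they return the input unchanged in the even case, where the real function calls page_fixed_fake_head(). The sketch says nothing about the code the compiler generates for each form, which is what the NAK is about.

```c
#include <assert.h>

/* Variant before the patch: clear bit 0 by subtracting 1. */
static inline unsigned long head_by_subtraction(unsigned long head)
{
	if (head & 1)
		return head - 1;
	return head;	/* stand-in for page_fixed_fake_head() */
}

/* Variant in the patch: clear bit 0 with a mask. */
static inline unsigned long head_by_mask(unsigned long head)
{
	if (head & 1UL)
		return head & ~1UL;
	return head;	/* stand-in for page_fixed_fake_head() */
}

static inline void example_checks(void)
{
	/* An odd value loses its low bit either way. */
	assert(head_by_subtraction(0x1001UL) == 0x1000UL);
	assert(head_by_mask(0x1001UL) == 0x1000UL);
}
```

Because bit 0 is known to be set on the taken branch, subtracting 1 and masking with ~1UL are arithmetically identical there.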
Re: [PATCH] crypto: Removing CRYPTO_AES_GCM_P10.
On 13/09/2024 at 14:30, Danny Tsen wrote:
> Removing CRYPTO_AES_GCM_P10 in Kconfig first so that we can apply the subsequent patches to fix data mismatch over ipsec tunnel.

To deactivate a driver, all you have to do is to add:

	depends on BROKEN

Christophe

> Signed-off-by: Danny Tsen
> ---
>  arch/powerpc/crypto/Kconfig | 32
>  1 file changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig
> index 09ebcbdfb34f..96ca2c4c8827 100644
> --- a/arch/powerpc/crypto/Kconfig
> +++ b/arch/powerpc/crypto/Kconfig
> @@ -105,22 +105,22 @@ config CRYPTO_AES_PPC_SPE
>  	  architecture specific assembler implementations that work on 1KB
>  	  tables or 256 bytes S-boxes.
>
> -config CRYPTO_AES_GCM_P10
> -	tristate "Stitched AES/GCM acceleration support on P10 or later CPU (PPC)"
> -	depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
> -	select CRYPTO_LIB_AES
> -	select CRYPTO_ALGAPI
> -	select CRYPTO_AEAD
> -	select CRYPTO_SKCIPHER
> -	help
> -	  AEAD cipher: AES cipher algorithms (FIPS-197)
> -	  GCM (Galois/Counter Mode) authenticated encryption mode (NIST SP800-38D)
> -	  Architecture: powerpc64 using:
> -	  - little-endian
> -	  - Power10 or later features
> -
> -	  Support for cryptographic acceleration instructions on Power10 or
> -	  later CPU. This module supports stitched acceleration for AES/GCM.
> +#config CRYPTO_AES_GCM_P10
> +#	tristate "Stitched AES/GCM acceleration support on P10 or later CPU (PPC)"
> +#	depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
> +#	select CRYPTO_LIB_AES
> +#	select CRYPTO_ALGAPI
> +#	select CRYPTO_AEAD
> +#	select CRYPTO_SKCIPHER
> +#	help
> +#AEAD cipher: AES cipher algorithms (FIPS-197)
> +#GCM (Galois/Counter Mode) authenticated encryption mode (NIST SP800-38D)
> +#Architecture: powerpc64 using:
> +#	- little-endian
> +#	- Power10 or later features
> +#
> +#Support for cryptographic acceleration instructions on Power10 or
> +#later CPU. This module supports stitched acceleration for AES/GCM.
>
>  config CRYPTO_CHACHA20_P10
>  	tristate "Ciphers: ChaCha20, XChacha20, XChacha12 (P10 or later)"
> --
> 2.43.0
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
On 13/09/2024 at 14:02, Luming Yu wrote:
>> ... nothing happens after that.
> reproduced with ppc64_defconfig
>
> [0.818972][T1] Run /init as init process
> [5.851684][ T240] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
> [5.851742][ T240] kworker/u33:18 (240) used greatest stack depth: 13584 bytes left
> [5.860081][ T232] kworker/u33:16 (232) used greatest stack depth: 13072 bytes left
> [5.863145][ T210] kworker/u35:13 (210) used greatest stack depth: 12928 bytes left
> [5.865000][T1] Failed to execute /init (error -8)
> [5.868897][T1] Run /sbin/init as init process
> [ 10.891673][ T315] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
> [ 10.894036][T1] Starting init: /sbin/init exists but couldn't execute it (error -8)
> [ 10.901455][T1] Run /etc/init as init process
> [ 10.903154][T1] Run /bin/init as init process
> [ 10.904747][T1] Run /bin/sh as init process
> [ 15.931679][ T367] request_module: modprobe binfmt-4c46 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
> [ 15.934689][T1] Starting init: /bin/sh exists but couldn't execute it (error -8)

That's something different: this is because you built a big-endian kernel and you are trying to run a little-endian userspace.

Does it work with ppc64le_defconfig ?

On my side there is absolutely nothing happening after the last line; the screen remains steady.

Christophe
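The endianness mismatch diagnosed above can be confirmed from the first bytes of the init binary. The sketch below, with hypothetical helper names, reads the data-encoding byte of an ELF header (offset 5, value 1 for little-endian and 2 for big-endian). It also shows that the magic bytes 'L' and 'F', read as one big-endian 16-bit value, give 0x4c46, which is consistent with the "binfmt-4c46" module name in the log; that link to the log is an inference, not something stated in the thread. Error -8 is ENOEXEC on Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ELF_IDENT_DATA = 5, ELF_DATA_LSB = 1, ELF_DATA_MSB = 2 };

/* Returns ELF_DATA_LSB, ELF_DATA_MSB, or 0 if buf is not a usable ELF header. */
static inline int elf_data_encoding(const unsigned char *buf, size_t len)
{
	if (len <= ELF_IDENT_DATA)
		return 0;
	if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
		return 0;
	if (buf[ELF_IDENT_DATA] == ELF_DATA_LSB || buf[ELF_IDENT_DATA] == ELF_DATA_MSB)
		return buf[ELF_IDENT_DATA];
	return 0;
}

/* Two bytes interpreted as a big-endian 16-bit value. */
static inline uint16_t be16_at(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static inline void example_checks(void)
{
	/* Start of a 64-bit little-endian ELF file. */
	const unsigned char le_elf[] = { 0x7f, 'E', 'L', 'F', 2, 1 };

	assert(elf_data_encoding(le_elf, sizeof(le_elf)) == ELF_DATA_LSB);
	assert(be16_at(le_elf + 2) == 0x4c46);
}
```

A little-endian userspace binary reports ELF_DATA_LSB here, which a big-endian kernel built from ppc64_defconfig would not be expected to execute.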
Re: [PATCH 1/2] powerpc/vpa_pmu: Add interface to expose vpa counters via perf
Le 13/09/2024 à 10:35, kajoljain a écrit : On 9/13/24 12:00, Christophe Leroy wrote: Le 28/08/2024 à 12:21, Kajol Jain a écrit : The pseries Shared Processor Logical Partition(SPLPAR) machines can retrieve a log of dispatch and preempt events from the hypervisor using data from Disptach Trace Log(DTL) buffer. With this information, user can retrieve when and why each dispatch & preempt has occurred. Added an interface to expose the Virtual Processor Area(VPA) DTL counters via perf. The following events are available and exposed in sysfs: vpa_dtl/dtl_cede/ - Trace voluntary (OS initiated) virtual processor waits vpa_dtl/dtl_preempt/ - Trace time slice preempts vpa_dtl/dtl_fault/ - Trace virtual partition memory page faults. vpa_dtl/dtl_all/ - Trace all (dtl_cede/dtl_preempt/dtl_fault) Added interface defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. User could use the standard perf tool to access perf events exposed via vpa-dtl pmu. The VPA DTL PMU counters do not interrupt on overflow or generate any PMI interrupts. Therefore, the kernel needs to poll the counters, added hrtimer code to do that. The timer interval can be provided by user via sample_period field in nano seconds. Result on power10 SPLPAR system with 656 cpu threads. In the below perf record command with vpa_dtl pmu, -c option is used to provide sample_period whch corresponding to 10ns i.e; 1sec and the workload time is also 1 second, hence we are getting 656 samples: [command] perf record -a -R -e vpa_dtl/dtl_all/ -c 10 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.828 MB perf.data (656 samples) ] There is one hrtimer added per vpa-dtl pmu thread. Code added to handle addition of dtl buffer data in the raw sample. 
Since DTL does not provide IP address for a sample and it just have traces on reason of dispatch/preempt, we directly saving DTL buffer data to perf.data file as raw sample. For each hrtimer restart call, interface will dump all the new dtl entries added to dtl buffer as a raw sample. To ensure there are no other conflicting dtl users (example: debugfs dtl or /proc/powerpc/vcpudispatch_stats), interface added code to use "down_write_trylock" call to take the dtl_access_lock. The dtl_access_lock is defined in dtl.h file. Also added global reference count variable called "dtl_global_refc", to ensure dtl data can be captured per-cpu. Code also added global lock called "dtl_global_lock" to avoid race condition. Signed-off-by: Kajol Jain --- Notes: - Made code changes on top of recent fix sent by Michael Ellerman. Link to the patch: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.ozlabs.org%2Fproject%2Flinuxppc-dev%2Fpatch%2F20240819122401.513203-1-mpe%40ellerman.id.au%2F&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C95cfb2842b2a44907c9108dcd3cf0b12%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638618133431151306%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=IqFjcvT9G0HYGIbuUWhCrnXkFr9yMtSC1mGFFKZ66MI%3D&reserved=0 arch/powerpc/perf/Makefile | 2 +- arch/powerpc/perf/vpa-pmu.c | 469 include/linux/cpuhotplug.h | 1 + 3 files changed, 471 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/perf/vpa-pmu.c Seems like it doesn't build on PPC64: arch/powerpc/perf/vpa-pmu.c#L212 passing argument 1 of 'up_write' from incompatible pointer type [-Wincompatible-pointer-types] arch/powerpc/perf/vpa-pmu.c#L261 passing argument 1 of 'down_write_trylock' from incompatible pointer type [-Wincompatible-pointer-types] arch/powerpc/perf/vpa-pmu.c#L402 passing argument 1 of 'up_write' from incompatible pointer type [-Wincompatible-pointer-types] Hi Christophe, Thanks for checking the patch. 
These changes are on top of the fix patch sent by Michael Ellerman, https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20240819122401.513203-1-mpe@ellerman.id.au/, since he changed dtl_access_lock to be a rw_semaphore. Are you trying with Michael's patch changes?

No, I only saw the CI test failure here: https://github.com/linuxppc/linux-snowpatch/actions/runs/10594868105

Sorry, I didn't see that you mentioned it in a note.

Christophe
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
Le 13/09/2024 à 09:38, Luming Yu a écrit : On Fri, Sep 13, 2024 at 08:54:12AM +0200, Christophe Leroy wrote: Le 13/09/2024 à 03:40, Luming Yu a écrit : On Thu, Sep 12, 2024 at 12:23:29PM +0200, Christophe Leroy wrote: Le 12/09/2024 à 10:24, Luming Yu a écrit : From: Yu Luming convert powerpc entry code in syscall and fault to use syscall_work and irqentry_state as well as common calls from generic entry infrastructure. Signed-off-by: Luming Yu --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/hw_irq.h | 5 + arch/powerpc/include/asm/processor.h | 6 ++ arch/powerpc/include/asm/syscall.h | 5 + arch/powerpc/include/asm/thread_info.h | 1 + arch/powerpc/kernel/syscall.c | 6 +- arch/powerpc/mm/fault.c| 5 + 7 files changed, 28 insertions(+), 1 deletion(-) There is another build problem: CC kernel/entry/common.o kernel/entry/common.c: In function 'irqentry_exit': kernel/entry/common.c:335:21: error: implicit declaration of function 'regs_irqs_disabled'; did you mean 'raw_irqs_disabled'? [-Werror=implicit-function-declaration] 335 | } else if (!regs_irqs_disabled(regs)) { | ^~ | raw_irqs_disabled You have put regs_irqs_disabled() in a section dedicated to PPC64, so it fails on PPC32. After fixing this problem and providing an empty asm/entry-common.h it is now possible to build the kernel. But that's not enough, the board is stuck after: ... [2.871391] Freeing unused kernel image (initmem) memory: 1228K [2.877990] Run /init as init process Thanks for these questions. :-) I haven't gotten chance to run it in ppc32 qemu. the common syscall trace enter lost this hunk - if (!is_32bit_task()) - audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4], - regs->gpr[5], regs->gpr[6]); - else - audit_syscall_entry(regs->gpr[0], - regs->gpr[3] & 0x, - regs->gpr[4] & 0x, - regs->gpr[5] & 0x, - regs->gpr[6] & 0x); which I don't understand whether we need a arch callbacks for it. I don't thing so. 
As far as I can see, audit_syscall_entry() is called by syscall_enter_audit() in kernel/entry/common.c And the masking of arguments based on is_32bit_task() is done in syscall_get_arguments() with is called by syscall_enter_audit() just before calling audit_syscall_entry() and which is an arch callback that does the same as the removed hunk. so, syscall_get_arguments is the ppc arch callback. thanks. :-) Before I sent out the RFC patch set, the very limited compile and boot test goes well with a ppc64 qemu VM. Surely, there will be a lot of test, debug and following up patch set update that is necessary to make it a complete convert. Even on ppc64 it doesn't build, at the first place because arch/powerpc/include/asm/entry-common.h is missing in your patch. Did you forget to 'git add' it ? oh, I forget that I was testing this patch on top of the early user notifier patch: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Flinuxppc%2Fissues%2Fissues%2F477&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C35a08ca9a81f4c6ff8ce08dcd3c73555%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638618099770810941%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=yCQWLIAXL%2BNHnzrh0e91WIBvF0c5WfF6pY6ZMHstocA%3D&reserved=0, https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.ozlabs.org%2Fproject%2Flinuxppc-dev%2Fpatch%2F1FD36D52828D2506%2B20231218031309.2063-1-luming.yu%40shingroup.cn%2F&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C35a08ca9a81f4c6ff8ce08dcd3c73555%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638618099770819779%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=0WBSUlpAbL6EMdPEAtJv1HUHkbeUjjUcP98wYf9IxM4%3D&reserved=0 and the entry-common.h is as follows: [root@localhost linux]# cat arch/powerpc/include/asm/entry-common.h /* SPDX-License-Identifier: GPL-2.0 */ #ifndef ARCH_POWERPC_ENTRY_COMMON_H 
#define ARCH_POWERPC_ENTRY_COMMON_H #include static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, unsigned long ti_work) { if (ti_work & _TIF_USER_RETURN_NOTIFY) fire_user_return_notifiers(); } #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare #endif As you could see , it looks irrelevant. And same as with PPC32, when I build P
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
On 13/09/2024 at 03:40, Luming Yu wrote:
> On Thu, Sep 12, 2024 at 12:23:29PM +0200, Christophe Leroy wrote:
>> On 12/09/2024 at 10:24, Luming Yu wrote:
>>> From: Yu Luming
>>>
>>> convert powerpc entry code in syscall and fault to use syscall_work and irqentry_state as well as common calls from generic entry infrastructure.
>>>
>>> Signed-off-by: Luming Yu
>>> ---
>>>  arch/powerpc/Kconfig | 1 +
>>>  arch/powerpc/include/asm/hw_irq.h | 5 +
>>>  arch/powerpc/include/asm/processor.h | 6 ++
>>>  arch/powerpc/include/asm/syscall.h | 5 +
>>>  arch/powerpc/include/asm/thread_info.h | 1 +
>>>  arch/powerpc/kernel/syscall.c | 6 +-
>>>  arch/powerpc/mm/fault.c | 5 +
>>>  7 files changed, 28 insertions(+), 1 deletion(-)
>>
>> There is another build problem:
>>
>> CC kernel/entry/common.o
>> kernel/entry/common.c: In function 'irqentry_exit':
>> kernel/entry/common.c:335:21: error: implicit declaration of function 'regs_irqs_disabled'; did you mean 'raw_irqs_disabled'? [-Werror=implicit-function-declaration]
>> 335 | } else if (!regs_irqs_disabled(regs)) {
>>     | ^~
>>     | raw_irqs_disabled
>>
>> You have put regs_irqs_disabled() in a section dedicated to PPC64, so it fails on PPC32.
>>
>> After fixing this problem and providing an empty asm/entry-common.h, it is now possible to build the kernel. But that's not enough, the board is stuck after:
>>
>> ...
>> [2.871391] Freeing unused kernel image (initmem) memory: 1228K
>> [2.877990] Run /init as init process
>
> Thanks for these questions. :-)
> I haven't had a chance to run it in ppc32 qemu.
> The common syscall trace enter lost this hunk:
>
> -	if (!is_32bit_task())
> -		audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
> -				    regs->gpr[5], regs->gpr[6]);
> -	else
> -		audit_syscall_entry(regs->gpr[0],
> -				    regs->gpr[3] & 0x,
> -				    regs->gpr[4] & 0x,
> -				    regs->gpr[5] & 0x,
> -				    regs->gpr[6] & 0x);
>
> and I don't understand whether we need an arch callback for it.

I don't think so.

As far as I can see, audit_syscall_entry() is called by syscall_enter_audit() in kernel/entry/common.c.

The masking of arguments based on is_32bit_task() is done in syscall_get_arguments(), which is called by syscall_enter_audit() just before calling audit_syscall_entry(), and which is an arch callback that does the same as the removed hunk.

> Before I sent out the RFC patch set, the very limited compile and boot test went well with a ppc64 qemu VM. Surely there will be a lot of testing, debugging and follow-up patch set updates necessary to make it a complete conversion.

Even on ppc64 it doesn't build, in the first place because arch/powerpc/include/asm/entry-common.h is missing from your patch. Did you forget to 'git add' it ?

And same as with PPC32, when I build PPC64 with an empty asm/entry-common.h, it doesn't work. So I guess you had some needed code in that file and you have to send it.
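The masking described in the reply can be illustrated with a stand-alone C sketch. It is not the kernel's syscall_get_arguments(); the function name, the argument count and the plain register array are hypothetical simplifications, and the 32-bit mask value is an assumption of this sketch, since the constants in the quoted hunk are truncated in this archive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SYSCALL_ARGS 6

/* Copy syscall arguments out of a register array (arguments start at
 * gpr[3]), truncating each one to 32 bits for a 32-bit task. */
static inline void get_syscall_args_sketch(const uint64_t gpr[],
					   bool task_is_32bit,
					   uint64_t args[NR_SYSCALL_ARGS])
{
	const uint64_t mask = task_is_32bit ? 0xffffffffULL : ~0ULL;

	for (int i = 0; i < NR_SYSCALL_ARGS; i++)
		args[i] = gpr[3 + i] & mask;
}

static inline void example_checks(void)
{
	uint64_t gpr[9] = { 0 };
	uint64_t args[NR_SYSCALL_ARGS];

	gpr[3] = 0x1234567890abcdefULL;
	get_syscall_args_sketch(gpr, true, args);
	assert(args[0] == 0x90abcdefULL);	/* upper half dropped */
}
```

Doing the masking once, in the helper that fetches the arguments, is what makes a separate audit-specific branch on is_32bit_task() unnecessary.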
Re: [PATCH 1/2] powerpc/vpa_pmu: Add interface to expose vpa counters via perf
On 28/08/2024 at 12:21, Kajol Jain wrote: The pseries Shared Processor Logical Partition(SPLPAR) machines can retrieve a log of dispatch and preempt events from the hypervisor using data from Dispatch Trace Log(DTL) buffer. With this information, user can retrieve when and why each dispatch & preempt has occurred. Added an interface to expose the Virtual Processor Area(VPA) DTL counters via perf. The following events are available and exposed in sysfs: vpa_dtl/dtl_cede/ - Trace voluntary (OS initiated) virtual processor waits vpa_dtl/dtl_preempt/ - Trace time slice preempts vpa_dtl/dtl_fault/ - Trace virtual partition memory page faults. vpa_dtl/dtl_all/ - Trace all (dtl_cede/dtl_preempt/dtl_fault) Added interface defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. User could use the standard perf tool to access perf events exposed via vpa-dtl pmu. The VPA DTL PMU counters do not interrupt on overflow or generate any PMI interrupts. Therefore, the kernel needs to poll the counters, added hrtimer code to do that. The timer interval can be provided by user via sample_period field in nano seconds. Result on power10 SPLPAR system with 656 cpu threads. In the below perf record command with vpa_dtl pmu, -c option is used to provide sample_period which corresponds to 1000000000 ns, i.e. 1 sec, and the workload time is also 1 second, hence we are getting 656 samples: [command] perf record -a -R -e vpa_dtl/dtl_all/ -c 1000000000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.828 MB perf.data (656 samples) ] There is one hrtimer added per vpa-dtl pmu thread. Code added to handle addition of dtl buffer data in the raw sample. Since DTL does not provide IP address for a sample and it just has traces on reason of dispatch/preempt, we directly save DTL buffer data to perf.data file as raw sample.
For each hrtimer restart call, interface will dump all the new dtl entries added to dtl buffer as a raw sample. To ensure there are no other conflicting dtl users (example: debugfs dtl or /proc/powerpc/vcpudispatch_stats), interface added code to use "down_write_trylock" call to take the dtl_access_lock. The dtl_access_lock is defined in dtl.h file. Also added global reference count variable called "dtl_global_refc", to ensure dtl data can be captured per-cpu. Code also added global lock called "dtl_global_lock" to avoid race condition. Signed-off-by: Kajol Jain --- Notes: - Made code changes on top of recent fix sent by Michael Ellerman. Link to the patch: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20240819122401.513203-1-...@ellerman.id.au/ arch/powerpc/perf/Makefile | 2 +- arch/powerpc/perf/vpa-pmu.c | 469 include/linux/cpuhotplug.h | 1 + 3 files changed, 471 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/perf/vpa-pmu.c Seems like it doesn't build on PPC64: arch/powerpc/perf/vpa-pmu.c#L212 passing argument 1 of 'up_write' from incompatible pointer type [-Wincompatible-pointer-types] arch/powerpc/perf/vpa-pmu.c#L261 passing argument 1 of 'down_write_trylock' from incompatible pointer type [-Wincompatible-pointer-types] arch/powerpc/perf/vpa-pmu.c#L402 passing argument 1 of 'up_write' from incompatible pointer type [-Wincompatible-pointer-types]
[RFC PATCH] powerpc/vdso: Should VDSO64 functions be flagged as functions like VDSO32 ?
On powerpc64 as shown below by readelf, vDSO functions symbols have type NOTYPE.

  $ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg
  ELF Header:
    Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
    Class:                             ELF64
    Data:                              2's complement, big endian
    Version:                           1 (current)
    OS/ABI:                            UNIX - System V
    ABI Version:                       0
    Type:                              DYN (Shared object file)
    Machine:                           PowerPC64
    Version:                           0x1
  ...
  Symbol table '.dynsym' contains 12 entries:
     Num:    Value          Size Type    Bind   Vis      Ndx Name
  ...
       1: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
  ...
       4: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
       5: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15

  Symbol table '.symtab' contains 56 entries:
     Num:    Value          Size Type    Bind   Vis      Ndx Name
  ...
      45: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
      46: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __kernel_getcpu
      47: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_getres

To overcome that, commit ba83b3239e65 ("selftests: vDSO: fix vDSO symbols lookup for powerpc64") was proposed to make selftests also look for NOTYPE symbols, but is it the correct fix ? VDSO32 functions are flagged as functions, why not VDSO64 functions ? Is it because VDSO functions are not traditional C functions using the standard API ? But it is exactly the same for VDSO32 functions, although they are flagged as functions. So let's flag them as functions and revert the selftest change. What's your opinion on that ? It predates git kernel history and both VDSO32 and VDSO64 were brought by arch/ppc64/ with that difference already.
Signed-off-by: Christophe Leroy --- commit ba83b3239e65 is in random git tree at the moment : https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/commit/?id=ba83b3239e657469709d15dcea5f9b65bf9dbf34 On the list at : https://lore.kernel.org/lkml/fc1a0862516b1e11b336d409f2cb8aab10a97337.1725020674.git.christophe.le...@csgroup.eu/T/#u --- arch/powerpc/include/asm/vdso.h | 1 + tools/testing/selftests/vDSO/parse_vdso.c | 3 +-- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h index 7650b6ce14c8..8d972bc98b55 100644 --- a/arch/powerpc/include/asm/vdso.h +++ b/arch/powerpc/include/asm/vdso.h @@ -25,6 +25,7 @@ int vdso_getcpu_init(void); #ifdef __VDSO64__ #define V_FUNCTION_BEGIN(name) \ .globl name;\ + .type name,@function; \ name: \ #define V_FUNCTION_END(name) \ diff --git a/tools/testing/selftests/vDSO/parse_vdso.c b/tools/testing/selftests/vDSO/parse_vdso.c index d9ccc5acac18..4ae417372e9e 100644 --- a/tools/testing/selftests/vDSO/parse_vdso.c +++ b/tools/testing/selftests/vDSO/parse_vdso.c @@ -216,8 +216,7 @@ void *vdso_sym(const char *version, const char *name) ELF(Sym) *sym = &vdso_info.symtab[chain]; /* Check for a defined global or weak function w/ right name. */ - if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC && - ELF64_ST_TYPE(sym->st_info) != STT_NOTYPE) + if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC) continue; if (ELF64_ST_BIND(sym->st_info) != STB_GLOBAL && ELF64_ST_BIND(sym->st_info) != STB_WEAK) -- 2.44.0
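To make the effect of the selftest hunk above concrete, here is a small sketch of the symbol filter once STT_NOTYPE is no longer accepted. The helper name vdso_sym_is_candidate() is hypothetical and the name/version matching done by the real parse_vdso.c is left out; only the ELF macros from <elf.h> are real.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>

/*
 * Keep a symbol only if it is a defined, global or weak function.
 * With this filter a NOTYPE symbol, as currently emitted for VDSO64,
 * is skipped, hence the need for ".type name,@function".
 */
static bool vdso_sym_is_candidate(const Elf64_Sym *sym)
{
	assert(sym);
	if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC)
		return false;
	if (ELF64_ST_BIND(sym->st_info) != STB_GLOBAL &&
	    ELF64_ST_BIND(sym->st_info) != STB_WEAK)
		return false;
	return sym->st_shndx != SHN_UNDEF;
}
```

Fed with the entries shown by readelf in the description, symbol 46 (__kernel_getcpu, NOTYPE) would be rejected by this filter before the header change and accepted after it.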
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
On 12/09/2024 at 10:24, Luming Yu wrote: From: Yu Luming convert powerpc entry code in syscall and fault to use syscall_work and irqentry_state as well as common calls from generic entry infrastructure. Signed-off-by: Luming Yu --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/hw_irq.h | 5 + arch/powerpc/include/asm/processor.h | 6 ++ arch/powerpc/include/asm/syscall.h | 5 + arch/powerpc/include/asm/thread_info.h | 1 + arch/powerpc/kernel/syscall.c | 6 +- arch/powerpc/mm/fault.c | 5 + 7 files changed, 28 insertions(+), 1 deletion(-) There is another build problem: CC kernel/entry/common.o kernel/entry/common.c: In function 'irqentry_exit': kernel/entry/common.c:335:21: error: implicit declaration of function 'regs_irqs_disabled'; did you mean 'raw_irqs_disabled'? [-Werror=implicit-function-declaration] 335 | } else if (!regs_irqs_disabled(regs)) { | ^~ | raw_irqs_disabled You have put regs_irqs_disabled() in a section dedicated to PPC64, so it fails on PPC32. After fixing this problem and providing an empty asm/entry-common.h it is now possible to build the kernel. But that's not enough, the board is stuck after: ... [2.871391] Freeing unused kernel image (initmem) memory: 1228K [2.877990] Run /init as init process Christophe
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
On 12/09/2024 at 10:24, Luming Yu wrote: From: Yu Luming convert powerpc entry code in syscall and fault to use syscall_work and irqentry_state as well as common calls from generic entry infrastructure. Signed-off-by: Luming Yu --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/hw_irq.h | 5 + arch/powerpc/include/asm/processor.h | 6 ++ arch/powerpc/include/asm/syscall.h | 5 + arch/powerpc/include/asm/thread_info.h | 1 + arch/powerpc/kernel/syscall.c | 6 +- arch/powerpc/mm/fault.c | 5 + 7 files changed, 28 insertions(+), 1 deletion(-) asm/entry-common.h is missing, this patch doesn't build.
Re: [PATCH 1/2] powerpc/entry: convert to common and generic entry
On 12/09/2024 at 10:24, Luming Yu wrote: From: Yu Luming convert powerpc entry code in syscall and fault to use syscall_work and irqentry_state as well as common calls from generic entry infrastructure. Could you add more description about the change ? When I look at x86, riscv or s390 commits for the same thing, they tell a lot more: Commit 27d6b4d14f5c ("x86/entry: Use generic syscall entry function") Commit f0bddf50586d ("riscv: entry: Convert to generic entry") Commit 56e62a737028 ("s390: convert to generic entry") Can you also provide some benchmark comparisons, at least using the null_syscall selftest tools/testing/selftests/powerpc/benchmarks/null_syscall.c Signed-off-by: Luming Yu --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/hw_irq.h | 5 + arch/powerpc/include/asm/processor.h | 6 ++ arch/powerpc/include/asm/syscall.h | 5 + arch/powerpc/include/asm/thread_info.h | 1 + arch/powerpc/kernel/syscall.c | 6 +- arch/powerpc/mm/fault.c | 5 + 7 files changed, 28 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e21f72bcb61f..e94e7e4bfd40 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -202,6 +202,7 @@ config PPC select GENERIC_IRQ_SHOW_LEVEL select GENERIC_PCI_IOMAP if PCI select GENERIC_PTDUMP +select GENERIC_ENTRY select GENERIC_SMP_IDLE_THREAD select GENERIC_TIME_VSYSCALL select GENERIC_VDSO_TIME_NS diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 317659fdeacf..a3d591784c95 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -216,6 +216,11 @@ static inline bool arch_irqs_disabled(void) return arch_irqs_disabled_flags(arch_local_save_flags()); } +/*common entry*/ +static __always_inline bool regs_irqs_disabled(struct pt_regs *regs) +{ + return arch_irqs_disabled(); +} static inline void set_pmi_irq_pending(void) { /* diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index
b2c51d337e60..1292282f8b0e 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -383,6 +383,12 @@ int validate_sp(unsigned long sp, struct task_struct *p); int validate_sp_size(unsigned long sp, struct task_struct *p, unsigned long nbytes); +/*for common entry*/ +static __always_inline bool on_thread_stack(void) +{ + return validate_sp(current_stack_pointer, current); I don't understand. Other architectures have something more simple for on_thread_stack(). Also, validate_sp() will also return true when on irq_stack or emergency stack. +} + /* * Prefetch macros. */ diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index 3dd36c5e334a..0e94806c7bfe 100644 --- a/arch/powerpc/include/asm/syscall.h +++ b/arch/powerpc/include/asm/syscall.h @@ -119,4 +119,9 @@ static inline int syscall_get_arch(struct task_struct *task) else return AUDIT_ARCH_PPC64; } + +static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs) +{ + return false; +} #endif/* _ASM_SYSCALL_H */ diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index 47e226032f9c..c52ca3aaebb5 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -58,6 +58,7 @@ struct thread_info { unsigned intcpu; #endif unsigned long local_flags;/* private flags for thread */ + unsigned long syscall_work; #ifdef CONFIG_LIVEPATCH_64 unsigned long *livepatch_sp; #endif diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c index 77fedb190c93..cbf0510ed10e 100644 --- a/arch/powerpc/kernel/syscall.c +++ b/arch/powerpc/kernel/syscall.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include @@ -131,7 +132,7 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0) * and the test against NR_syscalls will fail and the return * value to be used is in regs->gpr[3]. 
*/ - r0 = do_syscall_trace_enter(regs); + syscall_enter_from_user_mode(regs, r0); shouldn't this be: r0 = syscall_enter_from_user_mode(regs, r0); if (unlikely(r0 >= NR_syscalls)) return regs->gpr[3]; @@ -185,5 +186,8 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0) */ choose_random_kstack_offset(mftb()); + /*common entry*/ + syscall_exit_to_user_mode(regs); + This seems to do a lot. Isn't there stuff that was previously done by powerpc and need
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
diff --git a/include/uapi/linux/personality.h b/include/uapi/linux/personality.h index 49796b7756af..cd3b8c154d9b 100644 --- a/include/uapi/linux/personality.h +++ b/include/uapi/linux/personality.h @@ -22,6 +22,7 @@ enum { WHOLE_SECONDS = 0x2000000, STICKY_TIMEOUTS = 0x4000000, ADDR_LIMIT_3GB = 0x8000000, + ADDR_LIMIT_47BIT = 0x10000000, }; I wonder if ADDR_LIMIT_128T would be clearer? I don't follow, what does 128T represent? 128T is 128 Terabytes, the maximum size addressable with a 47-bit address. That naming would be more consistent with the ADDR_LIMIT_3GB just above, which means a 3 Gigabytes limit. Christophe
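The 47-bit / 128T equivalence mentioned in the reply is plain arithmetic: 2^47 bytes divided by 2^40 bytes per terabyte (binary) gives 2^7 = 128. A tiny sketch, with hypothetical helper names, for anyone who wants to check it:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an address space limited to 'bits' address bits. */
static uint64_t addr_space_bytes(unsigned int bits)
{
	assert(bits < 64);
	return UINT64_C(1) << bits;
}

/* Same size expressed in units of 2^40 bytes. */
static uint64_t addr_space_tib(unsigned int bits)
{
	return addr_space_bytes(bits) >> 40;
}
```

addr_space_tib(47) evaluates to 128, so a 47-bit limit and a 128T limit describe the same boundary.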
Re: No rule to make target 'arch/powerpc/boot/dtbImage.ps3', needed by 'arch/powerpc/boot/zImage'.
On 09/09/2024 at 21:25, Naresh Kamboju wrote: The Powerpc cell_defconfig and mpc83xx_defconfig builds failed on the Linux next-20240909 due to following build warnings / errors with gcc-13 and clang-19. First seen on next-20240909 Good: next-20240906 BAD: next-20240909 Reported-by: Linux Kernel Functional Testing build log: make[3]: *** No rule to make target 'arch/powerpc/boot/dtbImage.ps3', needed by 'arch/powerpc/boot/zImage'. make[3]: Target 'arch/powerpc/boot/zImage' not remade because of errors. See https://lore.kernel.org/linuxppc-dev/b154ab25-70f6-46cd-99db-ccfbe3e13...@csgroup.eu/T/#m7cc489243ce5a17af97ff8ec7cc15c663565b6fd Christophe Build Log links, - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240909/testrun/25078675/suite/build/test/clang-19-cell_defconfig/log Build failed comparison: - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240909/testrun/25078675/suite/build/test/clang-19-cell_defconfig/history/ metadata: git describe: next-20240909 git repo:
https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git sha: 100cc857359b5d731407d1038f7e76cd0e871d94 kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2lpXzh3wwbuC6nYpMV2nPNA0IpF/config build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2lpXzh3wwbuC6nYpMV2nPNA0IpF/ toolchain: gcc-13, clang-19 and clang-nightly config: cell_defconfig and mpc83xx_defconfig Steps to reproduce: - - # tuxmake --runtime podman --target-arch powerpc --toolchain clang-19 --kconfig cell_defconfig LLVM_IAS=0 - # tuxmake --runtime podman --target-arch powerpc --toolchain gcc-13 --kconfig mpc83xx_defconfig -- Linaro LKFT
https://lkft.linaro.org/
Re: linux-next: build failure after merge of the powerpc tree
On 09/09/2024 at 18:23, Masahiro Yamada wrote: On Mon, Sep 9, 2024 at 11:58 PM Stephen Rothwell wrote: Hi Christophe, On Mon, 9 Sep 2024 16:22:26 +0200 Christophe Leroy wrote: On 09/09/2024 at 12:09, Stephen Rothwell wrote: Hi all, After merging the powerpc tree, today's linux-next build (powerpc ppc44x_defconfig) failed like this: make[3]: *** No rule to make target 'arch/powerpc/boot/treeImage.ebony', needed by 'arch/powerpc/boot/zImage'. Stop. make[2]: *** [/home/sfr/next/next/arch/powerpc/Makefile:236: zImage] Error 2 make[1]: *** [/home/sfr/next/next/Makefile:224: __sub-make] Error 2 make: *** [Makefile:224: __sub-make] Error 2 It is not obvious to me what change caused this, so I have just left the build broken for today. Bisected to commit e6abfb536d16 ("kbuild: split device tree build rules into scripts/Makefile.dtbs") Thanks for that. -- Cheers, Stephen Rothwell I squashed the following fix. Hopefully, it will be ok tomorrow. diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 6385e7aa5dbb..8403eba15457 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -444,7 +444,7 @@ ifneq ($(userprogs),) include $(srctree)/scripts/Makefile.userprogs endif -ifneq ($(need-dtbslist)$(dtb-y)$(dtb-)$(filter %.dtb.o %.dtbo.o,$(targets)),) +ifneq ($(need-dtbslist)$(dtb-y)$(dtb-)$(filter %.dtb %.dtb.o %.dtbo.o,$(targets)),) include $(srctree)/scripts/Makefile.dtbs endif The build of ppc44x_defconfig is ok with this change on top of next-20240909
Re: linux-next: build failure after merge of the powerpc tree
On 09/09/2024 at 12:09, Stephen Rothwell wrote: Hi all, After merging the powerpc tree, today's linux-next build (powerpc ppc44x_defconfig) failed like this: make[3]: *** No rule to make target 'arch/powerpc/boot/treeImage.ebony', needed by 'arch/powerpc/boot/zImage'. Stop. make[2]: *** [/home/sfr/next/next/arch/powerpc/Makefile:236: zImage] Error 2 make[1]: *** [/home/sfr/next/next/Makefile:224: __sub-make] Error 2 make: *** [Makefile:224: __sub-make] Error 2 It is not obvious to me what change caused this, so I have just left the build broken for today. Bisected to commit e6abfb536d16 ("kbuild: split device tree build rules into scripts/Makefile.dtbs") Christophe
[PATCH] powerpc: Add __must_check to set_memory_...()
After the following powerpc commits, all calls to set_memory_...() functions check the returned value. - Commit 8f17bd2f4196 ("powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()") - Commit f7f18e30b468 ("powerpc/kprobes: Handle error returned by set_memory_rox()") - Commit 009cf11d4aab ("powerpc: Don't ignore errors from set_memory_{n}p() in __kernel_map_pages()") - Commit 9cbacb834b4a ("powerpc: Don't ignore errors from set_memory_{n}p() in __kernel_map_pages()") - Commit 78cb0945f714 ("powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()") All calls in core parts of the kernel also always check the returned value, which can be verified with the following query: $ git grep -w -e set_memory_ro -e set_memory_rw -e set_memory_x -e set_memory_nx -e set_memory_rox `find . -maxdepth 1 -type d | grep -v arch | grep /` It is now possible to flag those functions with __must_check to make sure no new unchecked call is added. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/set_memory.h | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/set_memory.h b/arch/powerpc/include/asm/set_memory.h index 9a025b776a4b..9c8d5747755d 100644 --- a/arch/powerpc/include/asm/set_memory.h +++ b/arch/powerpc/include/asm/set_memory.h @@ -12,37 +12,37 @@ int change_memory_attr(unsigned long addr, int numpages, long action); -static inline int set_memory_ro(unsigned long addr, int numpages) +static inline int __must_check set_memory_ro(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_RO); } -static inline int set_memory_rw(unsigned long addr, int numpages) +static inline int __must_check set_memory_rw(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_RW); } -static inline int set_memory_nx(unsigned long addr, int numpages) +static inline int __must_check set_memory_nx(unsigned long addr, int numpages) { return
change_memory_attr(addr, numpages, SET_MEMORY_NX); } -static inline int set_memory_x(unsigned long addr, int numpages) +static inline int __must_check set_memory_x(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_X); } -static inline int set_memory_np(unsigned long addr, int numpages) +static inline int __must_check set_memory_np(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_NP); } -static inline int set_memory_p(unsigned long addr, int numpages) +static inline int __must_check set_memory_p(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_P); } -static inline int set_memory_rox(unsigned long addr, int numpages) +static inline int __must_check set_memory_rox(unsigned long addr, int numpages) { return change_memory_attr(addr, numpages, SET_MEMORY_ROX); } -- 2.44.0
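As an illustration of what the annotation buys, here is a user-space sketch. It assumes __must_check maps to the compiler's warn_unused_result attribute (GCC and clang); set_memory_ro_demo() and protect_region() are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: user-space equivalent of the kernel's __must_check. */
#define __must_check __attribute__((__warn_unused_result__))

/* Hypothetical stand-in for set_memory_ro(): may fail, so must be checked. */
static int __must_check set_memory_ro_demo(unsigned long addr, int numpages)
{
	if (numpages <= 0 || (addr & 0xfff))
		return -EINVAL;
	return 0;
}

/* A caller that propagates the error instead of silently dropping it. */
static int protect_region(unsigned long addr, int numpages)
{
	int err = set_memory_ro_demo(addr, numpages);

	if (err)
		return err;
	return 0;
}

/* Small usage check of the two helpers above. */
static void demo_usage(void)
{
	assert(protect_region(0x1000, 1) == 0);
	assert(protect_region(0x1001, 1) == -EINVAL);
}
```

Writing a bare "set_memory_ro_demo(addr, 1);" statement in a caller would trigger an unused-result warning at build time, which is how new unchecked calls get caught.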
[PATCH] set_memory: Add __must_check to generic stubs
Following query shows that architectures that don't provide asm/set_memory.h don't use set_memory_...() functions. $ git grep set_memory_ alpha arc csky hexagon loongarch m68k microblaze mips nios2 openrisc parisc sh sparc um xtensa Following query shows that all core users of set_memory_...() functions always take returned value into account: $ git grep -w -e set_memory_ro -e set_memory_rw -e set_memory_x -e set_memory_nx -e set_memory_rox `find . -maxdepth 1 -type d | grep -v arch | grep /` set_memory_...() functions can fail, leaving the memory attributes unchanged. Make sure all callers check the returned code. Link: https://github.com/KSPP/linux/issues/7 Signed-off-by: Christophe Leroy --- include/linux/set_memory.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h index 95ac8398ee72..e7aec20fb44f 100644 --- a/include/linux/set_memory.h +++ b/include/linux/set_memory.h @@ -8,10 +8,10 @@ #ifdef CONFIG_ARCH_HAS_SET_MEMORY #include #else -static inline int set_memory_ro(unsigned long addr, int numpages) { return 0; } -static inline int set_memory_rw(unsigned long addr, int numpages) { return 0; } -static inline int set_memory_x(unsigned long addr, int numpages) { return 0; } -static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; } +static inline int __must_check set_memory_ro(unsigned long addr, int numpages) { return 0; } +static inline int __must_check set_memory_rw(unsigned long addr, int numpages) { return 0; } +static inline int __must_check set_memory_x(unsigned long addr, int numpages) { return 0; } +static inline int __must_check set_memory_nx(unsigned long addr, int numpages) { return 0; } #endif #ifndef set_memory_rox -- 2.44.0
Re: [PATCH 2/2] Fixup for 3279be36b671 ("powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32")
On 07/09/2024 at 16:35, Jason A. Donenfeld wrote: On Fri, Sep 06, 2024 at 08:54:49PM +0200, Jason A. Donenfeld wrote: On Fri, Sep 06, 2024 at 05:14:43PM +0200, Christophe Leroy wrote: On 06/09/2024 at 16:46, Jason A. Donenfeld wrote: On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote: In the long run I wonder if we should try to find a more generic solution for getrandom instead of requiring each architecture to handle it. On gettimeofday the selection of the right page is embedded in the generic part, see for instance : static __maybe_unused __kernel_old_time_t __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time) { __kernel_old_time_t t; if (IS_ENABLED(CONFIG_TIME_NS) && vd->clock_mode == VDSO_CLOCKMODE_TIMENS) vd = __arch_get_timens_vdso_data(vd); t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec); if (time) *time = t; return t; } and powerpc just provides: static __always_inline const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd) { return (void *)vd + (1U << CONFIG_PAGE_SHIFT); } It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't have this problem at all, because the layout of their vvars doesn't require it. So the vd->clock_mode access is unnecessary. Or another solution could be to put random data in a third page that is always at the same place regardless of timens ? Maybe that's the easier way, yea. Potentially wasteful, though. Indeed I just looked at Loongarch and that's exactly what they do: they have a third page after the two pages dedicated to TIME for arch specific data, and they have added getrandom data there. The third page is common to every process so it won't waste more than a few bytes. It doesn't worry me even on the older boards that only have 32 Mbytes of RAM. So yes, I may have a look at that in the future, what we have at the moment is good enough to move forward.
My x86 code is kind of icky for this: static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void) { if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS) return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data); return &__vdso_rng_data; } Doing the subtraction like that means that this is more clearly correct. But it also makes the compiler insert two jumps for the branch, and then reads the addresses of those variables and such. If I change it to: static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void) { if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS) return (void *)&__vdso_rng_data + (3UL << CONFIG_PAGE_SHIFT); return &__vdso_rng_data; } Then there's a much nicer single `cmov` with no branching. But if I want to do that for real, I'll have to figure out what set of nice compile-time constants I can use. I haven't looked into this yet. https://lore.kernel.org/all/20240906190655.2777023-1-Jason@zx2c4.com/ Looks good.
Although other architectures don't use defines but enums for that: arch/arm64/kernel/vdso.c-36- arch/arm64/kernel/vdso.c-37-enum vvar_pages { arch/arm64/kernel/vdso.c:38:VVAR_DATA_PAGE_OFFSET, arch/arm64/kernel/vdso.c:39:VVAR_TIMENS_PAGE_OFFSET, arch/arm64/kernel/vdso.c-40-VVAR_NR_PAGES, arch/arm64/kernel/vdso.c-41-}; -- arch/loongarch/include/asm/vdso/vdso.h-36- arch/loongarch/include/asm/vdso/vdso.h-37-enum vvar_pages { arch/loongarch/include/asm/vdso/vdso.h:38: VVAR_GENERIC_PAGE_OFFSET, arch/loongarch/include/asm/vdso/vdso.h:39: VVAR_TIMENS_PAGE_OFFSET, arch/loongarch/include/asm/vdso/vdso.h-40- VVAR_LOONGARCH_PAGES_START, arch/loongarch/include/asm/vdso/vdso.h-41- VVAR_LOONGARCH_PAGES_END = VVAR_LOONGARCH_PAGES_START + LOONGARCH_VDSO_DATA_PAGES - 1, -- arch/powerpc/kernel/vdso.c-54- arch/powerpc/kernel/vdso.c-55-enum vvar_pages { arch/powerpc/kernel/vdso.c:56: VVAR_DATA_PAGE_OFFSET, arch/powerpc/kernel/vdso.c:57: VVAR_TIMENS_PAGE_OFFSET, arch/powerpc/kernel/vdso.c-58- VVAR_NR_PAGES, arch/powerpc/kernel/vdso.c-59-}; -- arch/riscv/kernel/vdso.c-19- arch/riscv/kernel/vdso.c-20-enum vvar_pages { arch/riscv/kernel/vdso.c:21:VVAR_DATA_PAGE_OFFSET, arch/riscv/kernel/vdso.c:22:VVAR_TIMENS_PAGE_OFFSET, arch/riscv/kernel/v
Re: [PATCH 2/2] Fixup for 3279be36b671 ("powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32")
On 06/09/2024 at 16:46, Jason A. Donenfeld wrote: On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote: In the long run I wonder if we should try to find a more generic solution for getrandom instead of requiring each architecture to handle it. On gettimeofday the selection of the right page is embedded in the generic part, see for instance : static __maybe_unused __kernel_old_time_t __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time) { __kernel_old_time_t t; if (IS_ENABLED(CONFIG_TIME_NS) && vd->clock_mode == VDSO_CLOCKMODE_TIMENS) vd = __arch_get_timens_vdso_data(vd); t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec); if (time) *time = t; return t; } and powerpc just provides: static __always_inline const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd) { return (void *)vd + (1U << CONFIG_PAGE_SHIFT); } It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't have this problem at all, because the layout of their vvars doesn't require it. So the vd->clock_mode access is unnecessary. Or another solution could be to put random data in a third page that is always at the same place regardless of timens ? Maybe that's the easier way, yea. Potentially wasteful, though. Indeed I just looked at Loongarch and that's exactly what they do: they have a third page after the two pages dedicated to TIME for arch specific data, and they have added getrandom data there. The third page is common to every process so it won't waste more than a few bytes. It doesn't worry me even on the older boards that only have 32 Mbytes of RAM. So yes, I may have a look at that in the future, what we have at the moment is good enough to move forward. Christophe
Re: [PATCH 2/2] Fixup for 3279be36b671 ("powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32")
On 06/09/2024 at 16:07, Jason A. Donenfeld wrote: On Fri, Sep 06, 2024 at 10:33:44AM +0200, Christophe Leroy wrote: Use the new get_realdatapage macro instead of get_datapage Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/vdso/getrandom.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S index a957cd2b2b03..f3bbf931931c 100644 --- a/arch/powerpc/kernel/vdso/getrandom.S +++ b/arch/powerpc/kernel/vdso/getrandom.S @@ -31,7 +31,7 @@ PPC_STL r2, PPC_MIN_STKFRM + STK_GOT(r1) .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT #endif - get_datapage r8 + get_realdatapage r8, r11 addi r8, r8, VDSO_RNG_DATA_OFFSET bl CFUNC(DOTSYM(\funct)) PPC_LL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) I tested that this is working as intended on powerpc, powerpc64, and powerpc64le. Thanks for writing the patch so quickly. You are welcome. And thanks for playing with it while I was sleeping and getting ideas too. Did you learn powerpc assembly during the night or did you know it already ? In the end I ended up with something which I think is simple enough for a backport to stable. In the long run I wonder if we should try to find a more generic solution for getrandom instead of requiring each architecture to handle it.
On gettimeofday the selection of the right page is embeded in the generic part, see for instance : static __maybe_unused __kernel_old_time_t __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time) { __kernel_old_time_t t; if (IS_ENABLED(CONFIG_TIME_NS) && vd->clock_mode == VDSO_CLOCKMODE_TIMENS) vd = __arch_get_timens_vdso_data(vd); t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec); if (time) *time = t; return t; } and powerpc just provides: static __always_inline const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd) { return (void *)vd + (1U << CONFIG_PAGE_SHIFT); } I know it may not be that simple for getrandom but its probably worth trying. Or another solution could be to put random data in a third page that is always at the same place regardless of timens ? Christophe
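For readers following the thread, the page selection quoted above can be sketched in plain user-space C. This is only an illustration under stated assumptions: the sketch_ names, the 4 KiB page size and the reduced structure are hypothetical stand-ins, not the kernel's definitions. The value 0x7fffffff mirrors the constant loaded by the assembly snippet elsewhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT       12                      /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE        (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_CLOCKMODE_TIMENS 0x7fffffff              /* stand-in for VDSO_CLOCKMODE_TIMENS */

/* Hypothetical, heavily reduced stand-in for struct vdso_data. */
struct sketch_vdso_data {
	int32_t clock_mode;
	int64_t sec;
};

/* One "page"; only the first bytes are used by the sketch. */
union sketch_page {
	struct sketch_vdso_data vd;
	unsigned char raw[SKETCH_PAGE_SIZE];
};

/*
 * If the page we were handed is the per-namespace page (clock_mode set to
 * the TIMENS marker), the global data lives exactly one page further.
 * Otherwise the pointer already designates the global data.
 */
static const struct sketch_vdso_data *
sketch_real_data(const struct sketch_vdso_data *vd)
{
	if (vd->clock_mode == SKETCH_CLOCKMODE_TIMENS)
		vd = (const struct sketch_vdso_data *)
			((const unsigned char *)vd + SKETCH_PAGE_SIZE);
	return vd;
}
```

With two consecutive pages where the first carries the TIMENS marker, sketch_real_data() returns a pointer into the second page; called on a page without the marker it returns its argument unchanged.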
Re: [PATCH 1/2] powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
On 06/09/2024 at 14:23, Michael Ellerman wrote:

Christophe Leroy writes:

When running in a non-root time namespace, the global VDSO data page is replaced by a dedicated namespace data page and the global data page is mapped next to it. Detailed explanations can be found at commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").

When this happens, __kernel_get_syscall_map, __kernel_get_tbfreq and __kernel_sync_dicache don't work anymore because they read 0 instead of the data they need.

To address that, clock_mode has to be read. When it is set to VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page and the global data is located on the following page.

Add a macro called get_realdatapage which reads clock_mode and adds PAGE_SIZE to the pointer provided by the get_datapage macro when clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro instead of the get_datapage macro, except for time functions as they handle it internally.

Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
Signed-off-by: Christophe Leroy

Oops. I guess it should also have:

Cc: sta...@vger.kernel.org # v5.13+
Reported-by: Jason A. Donenfeld
Closes: https://lore.kernel.org/all/ztnyqzi-nrsns...@zx2c4.com/

Jason only reported a problem with getrandom, the other three are "cherry on the cake". The bug has been there for 3 years, I'm sure it can stay 3-4 more weeks, so I'm not sure there is a need to apply it in both trees.

As far as I understood, Jason was about to squash the fix into his tree, so I was expecting him to apply patch 1 before the "vDSO getrandom implementation for powerpc" patches and then squash patch 2 in place.

Jason, how do you want to handle this? I can put patch 1 in a topic branch that we both merge? Then you can apply patch 2 on top of that merge in your tree. Or we could both apply patch 1 to our trees; it might lead to a conflict but it wouldn't be anything drastic.
Re: [PATCH v2 7/8] execmem: add support for cache of large ROX pages
On 26/08/2024 at 08:55, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)"
>
> Using large pages to map text areas reduces iTLB pressure and improves
> performance.
>
> Extend execmem_alloc() with an ability to use PMD_SIZE'ed pages with ROX
> permissions as a cache for smaller allocations.

Why only PMD_SIZE? On power 8xx, PMD_SIZE is 4M and the 8xx doesn't have such a page size. When you call vmalloc() with VM_ALLOW_HUGE_VMAP you get 16k pages or 512k pages depending on the size you ask for, see function arch_vmap_pte_supported_shift().

> To populate the cache, a writable large page is allocated from vmalloc
> with VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then
> remapped as ROX.
>
> Portions of that large page are handed out to execmem_alloc() callers
> without any changes to the permissions.
>
> When the memory is freed with execmem_free() it is invalidated again so
> that it won't contain stale instructions.
>
> The cache is enabled when an architecture sets EXECMEM_ROX_CACHE flag in
> definition of an execmem_range.

Christophe
[PATCH 2/2] Fixup for 3279be36b671 ("powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32")
Use the new get_realdatapage macro instead of get_datapage

Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/vdso/getrandom.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S
index a957cd2b2b03..f3bbf931931c 100644
--- a/arch/powerpc/kernel/vdso/getrandom.S
+++ b/arch/powerpc/kernel/vdso/getrandom.S
@@ -31,7 +31,7 @@
         PPC_STL         r2, PPC_MIN_STKFRM + STK_GOT(r1)
   .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT
 #endif
-        get_datapage    r8
+        get_realdatapage        r8, r11
         addi            r8, r8, VDSO_RNG_DATA_OFFSET
         bl              CFUNC(DOTSYM(\funct))
         PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
--
2.44.0
[PATCH 1/2] powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
When running in a non-root time namespace, the global VDSO data page is replaced by a dedicated namespace data page and the global data page is mapped next to it. Detailed explanations can be found at commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").

When this happens, __kernel_get_syscall_map, __kernel_get_tbfreq and __kernel_sync_dicache don't work anymore because they read 0 instead of the data they need.

To address that, clock_mode has to be read. When it is set to VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page and the global data is located on the following page.

Add a macro called get_realdatapage which reads clock_mode and adds PAGE_SIZE to the pointer provided by the get_datapage macro when clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro instead of the get_datapage macro, except for time functions as they handle it internally.

Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/vdso_datapage.h | 15 +++
 arch/powerpc/kernel/asm-offsets.c        |  2 ++
 arch/powerpc/kernel/vdso/cacheflush.S    |  2 +-
 arch/powerpc/kernel/vdso/datapage.S      |  4 ++--
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index e17500c5237e..248dee138f7b 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -113,6 +113,21 @@ extern struct vdso_arch_data *vdso_data;
         addi    \ptr, \ptr, (_vdso_datapage - 999b)@l
 .endm

+#include
+#include
+
+.macro get_realdatapage ptr scratch
+        get_datapage \ptr
+#ifdef CONFIG_TIME_NS
+        lwz     \scratch, VDSO_CLOCKMODE_OFFSET(\ptr)
+        xoris   \scratch, \scratch, VDSO_CLOCKMODE_TIMENS@h
+        xori    \scratch, \scratch, VDSO_CLOCKMODE_TIMENS@l
+        cntlzw  \scratch, \scratch
+        rlwinm  \scratch, \scratch, PAGE_SHIFT - 5, 1 << PAGE_SHIFT
+        add     \ptr, \ptr, \scratch
+#endif
+.endm
+
 #endif /* __ASSEMBLY__ */

 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index eedb2e04c785..131a8cc10dbe 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -347,6 +347,8 @@ int main(void)
 #else
         OFFSET(CFG_SYSCALL_MAP32, vdso_arch_data, syscall_map);
 #endif
+        OFFSET(VDSO_CLOCKMODE_OFFSET, vdso_arch_data, data[0].clock_mode);
+        DEFINE(VDSO_CLOCKMODE_TIMENS, VDSO_CLOCKMODE_TIMENS);

 #ifdef CONFIG_BUG
         DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
diff --git a/arch/powerpc/kernel/vdso/cacheflush.S b/arch/powerpc/kernel/vdso/cacheflush.S
index 0085ae464dac..3b2479bd2f9a 100644
--- a/arch/powerpc/kernel/vdso/cacheflush.S
+++ b/arch/powerpc/kernel/vdso/cacheflush.S
@@ -30,7 +30,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 #ifdef CONFIG_PPC64
         mflr    r12
   .cfi_register lr,r12
-        get_datapage    r10
+        get_realdatapage        r10, r11
         mtlr    r12
   .cfi_restore  lr
 #endif
diff --git a/arch/powerpc/kernel/vdso/datapage.S b/arch/powerpc/kernel/vdso/datapage.S
index db8e167f0166..2b19b6201a33 100644
--- a/arch/powerpc/kernel/vdso/datapage.S
+++ b/arch/powerpc/kernel/vdso/datapage.S
@@ -28,7 +28,7 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
         mflr    r12
   .cfi_register lr,r12
         mr.     r4,r3
-        get_datapage    r3
+        get_realdatapage        r3, r11
         mtlr    r12
 #ifdef __powerpc64__
         addi    r3,r3,CFG_SYSCALL_MAP64
@@ -52,7 +52,7 @@ V_FUNCTION_BEGIN(__kernel_get_tbfreq)
   .cfi_startproc
         mflr    r12
   .cfi_register lr,r12
-        get_datapage    r3
+        get_realdatapage        r3, r11
 #ifndef __powerpc64__
         lwz     r4,(CFG_TB_TICKS_PER_SEC + 4)(r3)
 #endif
--
2.44.0
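The get_realdatapage macro in the patch above avoids a branch. As I read it, the xoris/xori pair yields zero only when clock_mode equals VDSO_CLOCKMODE_TIMENS, cntlzw then returns 32 (bit 5 set) only for that zero, and rlwinm rotates bit 5 up to bit PAGE_SHIFT and masks everything else away. The following is a user-space C sketch of that arithmetic, not kernel code: the sketch_ helpers are hypothetical emulations of the instructions, and 0x7fffffff is assumed for the TIMENS marker, matching the constant loaded by the assembly posted elsewhere in this thread.

```c
#include <stdint.h>

#define SKETCH_CLOCKMODE_TIMENS 0x7fffffffu	/* assumed marker value */

/* Emulates cntlzw: number of leading zero bits, 32 when x is zero. */
static unsigned int sketch_cntlzw(uint32_t x)
{
	unsigned int n = 0;

	while (n < 32 && !(x & (0x80000000u >> n)))
		n++;
	return n;
}

/* 32-bit rotate left, the rotation half of rlwinm. */
static uint32_t sketch_rotl32(uint32_t x, unsigned int n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/*
 * Returns 1 << page_shift when clock_mode is the TIMENS marker, else 0.
 * Assumes 5 <= page_shift <= 31.
 */
static uint32_t sketch_timens_offset(uint32_t clock_mode, unsigned int page_shift)
{
	uint32_t scratch;

	scratch = clock_mode ^ SKETCH_CLOCKMODE_TIMENS;	/* xoris + xori */
	scratch = sketch_cntlzw(scratch);		/* 32 only if they were equal */
	/* rlwinm: move bit 5 to bit page_shift, keep only that bit */
	scratch = sketch_rotl32(scratch, page_shift - 5) & (1u << page_shift);
	return scratch;
}
```

For any other clock_mode the count of leading zeros is at most 31, so bit 5 is clear and the masked result is 0, which is why the final add in the macro leaves the pointer untouched outside a time namespace.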
Re: [PATCH v5 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
Hi Jason,

On 06/09/2024 at 05:24, Jason A. Donenfeld wrote:
> On Fri, Sep 06, 2024 at 04:48:28AM +0200, Jason A. Donenfeld wrote:
>> On Thu, Sep 05, 2024 at 10:41:40PM +0200, Jason A. Donenfeld wrote:
>>> On Thu, Sep 05, 2024 at 06:13:29PM +0200, Jason A. Donenfeld wrote:
>>>>> +/*
>>>>> + * The macro sets two stack frames, one for the caller and one for the callee
>>>>> + * because there are no requirement for the caller to set a stack frame when
>>>>> + * calling VDSO so it may have omitted to set one, especially on PPC64
>>>>> + */
>>>>> +
>>>>> +.macro cvdso_call funct
>>>>> +  .cfi_startproc
>>>>> +        PPC_STLU        r1, -PPC_MIN_STKFRM(r1)
>>>>> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
>>>>> +        mflr            r0
>>>>> +        PPC_STLU        r1, -PPC_MIN_STKFRM(r1)
>>>>> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
>>>>> +        PPC_STL         r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
>>>>> +  .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF
>>>>> +        get_datapage    r8
>>>>> +        addi            r8, r8, VDSO_RNG_DATA_OFFSET
>>>>> +        bl              CFUNC(DOTSYM(\funct))
>>>>> +        PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
>>>>> +        cmpwi           r3, 0
>>>>> +        mtlr            r0
>>>>> +        addi            r1, r1, 2 * PPC_MIN_STKFRM
>>>>> +  .cfi_restore lr
>>>>> +  .cfi_def_cfa_offset 0
>>>>> +        crclr           so
>>>>> +        bgelr+
>>>>> +        crset           so
>>>>> +        neg             r3, r3
>>>>> +        blr
>>>>> +  .cfi_endproc
>>>>> +.endm
>>>>
>>>> Can you figure out what's going on and send a fix, which I'll squash
>>>> into this commit?
>>>
>>> This doesn't work, but I wonder if something like it is what we want. I
>>> need to head out for the day, but here's what I've got. It's all wrong
>>> but might be of interest.
>>
>> Oh, I just got one small detail wrong before. The below actually works,
>> and uses the same strategy as on arm64.
>>
>> Let me know if you'd like me to fix up this commit with the below patch,
>> or if you have another way you'd like to go about it.
>
> And here's the much shorter version in assembly, which maybe you prefer.
> Also works, and is a bit less invasive than the other thing.
>
> diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S
> index a957cd2b2b03..070daba2d547 100644
> --- a/arch/powerpc/kernel/vdso/getrandom.S
> +++ b/arch/powerpc/kernel/vdso/getrandom.S
> @@ -32,6 +32,14 @@
>    .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT
>  #endif
>          get_datapage    r8
> +#ifdef CONFIG_TIME_NS
> +        lis             r10, 0x7fff
> +        ori             r10, r10, 0x
> +        lwz             r9, VDSO_DATA_OFFSET + 4(r8)
> +        cmpw            r9, r10
> +        bne             +8
> +        addi            r8, r8, (1 << CONFIG_PAGE_SHIFT)
> +#endif
>          addi            r8, r8, VDSO_RNG_DATA_OFFSET
>          bl              CFUNC(DOTSYM(\funct))
>          PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)

Thanks for looking.

I came to more or less the same solution as you, with the following that seems to work:

diff --git a/arch/powerpc/kernel/vdso/vgetrandom.c b/arch/powerpc/kernel/vdso/vgetrandom.c
index 5f855d45fb7b..9705344d39d0 100644
--- a/arch/powerpc/kernel/vdso/vgetrandom.c
+++ b/arch/powerpc/kernel/vdso/vgetrandom.c
@@ -4,11 +4,19 @@
  *
  * Copyright (C) 2024 Christophe Leroy , CS GROUP France
  */
+#include
 #include
 #include

+#include
+
 ssize_t __c_kernel_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state,
                              size_t opaque_len, const struct vdso_rng_data *vd)
 {
+        struct vdso_arch_data *arch_data = container_of(vd, struct vdso_arch_data, rng_data);
+
+        if (IS_ENABLED(CONFIG_TIME_NS) && arch_data->data[0].clock_mode == VDSO_CLOCKMODE_TIMENS)
+                vd = (void *)vd + (1UL << CONFIG_PAGE_SHIFT);
+
         return __cvdso_getrandom_data(vd, buffer, len, flags, opaque_state, opaque_len);
 }

However, if we have this problem with __kernel_getrandom, don't we also have it with the following?

        __kernel_get_syscall_map;
        __kernel_get_tbfreq;
        __kernel_sync_dicache;

If they are also affected, then the get_datapage macro is the place to fix.

I will check all of this now and keep you updated before noon (Paris time).

Christophe
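The C variant above relies on container_of() to get from the rng data pointer back to the enclosing arch data, so that clock_mode can be inspected. A minimal user-space sketch of that step, assuming a simplified layout: the sketch_ structures and macro are hypothetical illustrations, not the kernel's struct vdso_arch_data or its container_of() implementation.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the rng data and the structure embedding it. */
struct sketch_rng_data {
	unsigned long generation;
};

struct sketch_arch_data {
	int clock_mode;
	struct sketch_rng_data rng_data;
};

/* Given only the rng data pointer, recover clock_mode of the enclosing page. */
static int sketch_clock_mode_of(struct sketch_rng_data *rng)
{
	struct sketch_arch_data *arch =
		sketch_container_of(rng, struct sketch_arch_data, rng_data);

	return arch->clock_mode;
}
```

The design point is that the callee receives a pointer that was already offset by VDSO_RNG_DATA_OFFSET, so it has to undo that offset before it can look at the field that tells it which page it was handed.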
Re: [PATCH v5 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
On 05/09/2024 at 18:13, Jason A. Donenfeld wrote:
>> +/*
>> + * The macro sets two stack frames, one for the caller and one for the callee
>> + * because there are no requirement for the caller to set a stack frame when
>> + * calling VDSO so it may have omitted to set one, especially on PPC64
>> + */
>> +
>> +.macro cvdso_call funct
>> +  .cfi_startproc
>> +        PPC_STLU        r1, -PPC_MIN_STKFRM(r1)
>> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
>> +        mflr            r0
>> +        PPC_STLU        r1, -PPC_MIN_STKFRM(r1)
>> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
>> +        PPC_STL         r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
>> +  .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF
>> +        get_datapage    r8
>> +        addi            r8, r8, VDSO_RNG_DATA_OFFSET
>> +        bl              CFUNC(DOTSYM(\funct))
>> +        PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
>> +        cmpwi           r3, 0
>> +        mtlr            r0
>> +        addi            r1, r1, 2 * PPC_MIN_STKFRM
>> +  .cfi_restore lr
>> +  .cfi_def_cfa_offset 0
>> +        crclr           so
>> +        bgelr+
>> +        crset           so
>> +        neg             r3, r3
>> +        blr
>> +  .cfi_endproc
>> +.endm
>
> You wrote in an earlier email that this worked with time namespaces, but
> in my testing that doesn't seem to be the case.

Did I write that? I can't remember, and neither can I remember testing it with time namespaces.

> From my test harness [1]:
>
> Normal single thread
>    vdso: 2500 times in 12.494133131 seconds
>    libc: 2500 times in 69.594625188 seconds
> syscall: 2500 times in 67.349243972 seconds
> Time namespace single thread
>    vdso: 2500 times in 71.673057436 seconds
>    libc: 2500 times in 71.712774121 seconds
> syscall: 2500 times in 66.902318080 seconds
>
> I'm seeing this on ppc, ppc64, and ppc64le.

What is the command to use to test with a time namespace?

> Can you figure out what's going on and send a fix, which I'll squash
> into this commit?

Sure

> Jason
>
> [1] https://git.zx2c4.com/linux-rng/commit/?h=jd/vdso-test-harness
Re: [PATCH] soc: fsl: qe: ucc: Export ucc_mux_set_grant_tsa_bkpt
On 05/09/2024 at 09:22, Herve Codina wrote:
> When TSA is compiled as module the following error is reported:
>   "ucc_mux_set_grant_tsa_bkpt" [drivers/soc/fsl/qe/tsa.ko] undefined!
>
> Indeed, the ucc_mux_set_grant_tsa_bkpt symbol is not exported.
>
> Simply export ucc_mux_set_grant_tsa_bkpt.
>
> Reported-by: kernel test robot
> Closes: https://lore.kernel.org/oe-kbuild-all/202409051409.fszn8reo-...@intel.com/
> Signed-off-by: Herve Codina

Acked-by: Christophe Leroy

Arnd, is it OK for you to take this patch directly?

Thanks
Christophe

> ---
>  drivers/soc/fsl/qe/ucc.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
> index 21dbcd787cd5..892aa5931d5b 100644
> --- a/drivers/soc/fsl/qe/ucc.c
> +++ b/drivers/soc/fsl/qe/ucc.c
> @@ -114,6 +114,7 @@ int ucc_mux_set_grant_tsa_bkpt(unsigned int ucc_num, int set, u32 mask)
>
>          return 0;
>  }
> +EXPORT_SYMBOL(ucc_mux_set_grant_tsa_bkpt);
>
>  int ucc_set_qe_mux_rxtx(unsigned int ucc_num, enum qe_clock clock,
>          enum comm_dir mode)
Re: [PATCH v5 0/5] Wire up getrandom() vDSO implementation on powerpc
Le 04/09/2024 à 16:16, Jason A. Donenfeld a écrit : Hi Christophe, Michael, On Mon, Sep 02, 2024 at 09:17:17PM +0200, Christophe Leroy wrote: This series wires up getrandom() vDSO implementation on powerpc. Tested on PPC32 on real hardware. Tested on PPC64 (both BE and LE) on QEMU: Performance on powerpc 885: ~# ./vdso_test_getrandom bench-single vdso: 2500 times in 62.938002291 seconds libc: 2500 times in 535.581916866 seconds syscall: 2500 times in 531.525042806 seconds Performance on powerpc 8321: ~# ./vdso_test_getrandom bench-single vdso: 2500 times in 16.899318858 seconds libc: 2500 times in 131.050596522 seconds syscall: 2500 times in 129.794790389 seconds Performance on QEMU pseries: ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 4.97162 seconds libc: 2500 times in 75.516749981 seconds syscall: 2500 times in 86.842242014 seconds Looking good. I have no remaining nits on this patchset; it looks good to me. A review from Michael would be nice though (in addition to the necessary "Ack" I need to commit this to my tree), because there are a lot of PPC particulars that I don't know enough about to review properly. For example, you use -ffixed-r30 on PPC64. I'm sure there's a good reason for this, but I don't know enough to assess it. And cvdso_call I have no idea what's going on. Etc. You can learn a bit more about cvdso_call in commit ce7d8056e38b ("powerpc/vdso: Prepare for switching VDSO to generic C implementation.") About the fixed-r30, you can learn more in commit a88603f4b92e ("powerpc/vdso: Don't use r30 to avoid breaking Go lang") But anyway, awesome work, and I look forward to the final stretches. Thanks, looking forward to getting this series applied. Christophe
Profiling of vdso_test_random
Hi,

I've done a 'perf record' on vdso_test_random reduced to the vdso test only, and I get the following function usage profile.

Do you see the same kind of percentages on your platforms?

I would have expected most of the time to be spent in __arch_chacha20_blocks_nostack(), but that's in fact not the case.

# Samples: 61K of event 'task-clock:ppp'
# Event count (approx.): 1546350
#
# Overhead  Command          Shared Object        Symbol
# ...
#
    57.74%  vdso_test_getra  [vdso]               [.] __c_kernel_getrandom
    22.49%  vdso_test_getra  [vdso]               [.] __arch_chacha20_blocks_nostack
    10.80%  vdso_test_getra  vdso_test_getrandom  [.] test_vdso_getrandom
     8.89%  vdso_test_getra  [vdso]               [.] __kernel_getrandom
     0.01%  vdso_test_getra  [kernel.kallsyms]    [k] finish_task_switch.isra.0

Christophe
[PATCH] soc: fsl: cpm1: qmc: Fix dependency on fsl_soc.h
QMC driver requires fsl_soc.h to use function get_immrbase(). This header is provided by powerpc architecture and the functions it declares are defined only when FSL_SOC is selected. Today the dependency is the following: depends on CPM1 || QUICC_ENGINE || \ (FSL_SOC && (CPM || QUICC_ENGINE) && COMPILE_TEST) This dependency tentatively ensure that FSL_SOC is there when doing a COMPILE_TEST. CPM1 is only selected by PPC_8xx and cannot be selected manually. CPM1 selects FSL_SOC QUICC_ENGINE on the other hand can be selected by ARM or ARM64 which doesn't select FSL_SOC. QUICC_ENGINE can also be selected with just COMPILE_TEST. It is therefore possible to end up with CPM_QMC selected without FSL_SOC. So fix it by making it depend on FSL_SOC at all time. The rest of the above dependency is the same as the one for CPM_TSA on which CPM_QMC also depends, so it can go away, leaving only a simple dependency on FSL_SOC. Reported-by: Stephen Rothwell Closes: https://lore.kernel.org/lkml/20240904104859.020fe...@canb.auug.org.au/ Fixes: 8655b76b7004 ("soc: fsl: cpm1: qmc: Handle QUICC Engine (QE) soft-qmc firmware") Signed-off-by: Christophe Leroy --- drivers/soc/fsl/qe/Kconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig index 5e3c996eb19e..eb03f42ab978 100644 --- a/drivers/soc/fsl/qe/Kconfig +++ b/drivers/soc/fsl/qe/Kconfig @@ -48,8 +48,7 @@ config CPM_TSA config CPM_QMC tristate "CPM/QE QMC support" depends on OF && HAS_IOMEM - depends on CPM1 || QUICC_ENGINE || \ - (FSL_SOC && (CPM || QUICC_ENGINE) && COMPILE_TEST) + depends on FSL_SOC depends on CPM_TSA help Freescale CPM/QE QUICC Multichannel Controller -- 2.44.0
Re: [PATCH 2/2] mm: make copy_to_kernel_nofault() not fault on user addresses
Hi,

On 02/09/2024 at 07:31, Omar Sandoval wrote:
> From: Omar Sandoval
>
> I found that on x86, copy_to_kernel_nofault() still faults on addresses
> outside of the kernel address range (including NULL):
>
>   # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
>   # echo g > /proc/sysrq-trigger
>   ...
>   [15]kdb> mm 0 1234
>   [ 94.652476] BUG: kernel NULL pointer dereference, address: ...
>
> Note that copy_to_kernel_nofault() uses pagefault_disable(), but it
> still faults. This is because with Supervisor Mode Access Prevention
> (SMAP) enabled, do_user_addr_fault() Oopses on a fault for a user
> address from kernel space _before_ checking faulthandler_disabled().
>
> copy_from_kernel_nofault() avoids this by checking that the address is
> in the kernel before doing the actual memory access. Do the same in
> copy_to_kernel_nofault() so that we get an error as expected:
>
>   # echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
>   # echo g > /proc/sysrq-trigger
>   ...
>   [17]kdb> mm 0 1234
>   kdb_putarea_size: Bad address 0x0
>   diag: -21: Invalid address
>
> Signed-off-by: Omar Sandoval
> ---
>  mm/maccess.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/maccess.c b/mm/maccess.c
> index 72e9c03ea37f..d67dee51a1cc 100644
> --- a/mm/maccess.c
> +++ b/mm/maccess.c
> @@ -61,6 +61,9 @@ long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
>          if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
>                  align = (unsigned long)dst | (unsigned long)src;
>
> +        if (!copy_kernel_nofault_allowed(dst, size))
> +                return -ERANGE;
> +
>          pagefault_disable();
>          if (!(align & 7))
>                  copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
> --
> 2.46.0

This patch leads to the following errors on ppc64le_defconfig:

[2.423930][T1] Running code patching self-tests ...
[2.428912][T1] code-patching: test failed at line 395
[2.429085][T1] code-patching: test failed at line 398
[2.429561][T1] code-patching: test failed at line 432
[2.429679][T1] code-patching: test failed at line 435

This seems to be linked to commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU"): copy_from_kernel_nofault_allowed() returns false for the patching area.

Christophe
Re: [PATCH RFC v2 2/4] mm: Add hint and mmap_flags to struct vm_unmapped_area_info
Hi Charlie,

On 29/08/2024 at 09:15, Charlie Jenkins wrote:
> The hint address and mmap_flags are necessary to determine if
> MAP_BELOW_HINT requirements are satisfied.
>
> Signed-off-by: Charlie Jenkins
> ---
>  arch/alpha/kernel/osf_sys.c      | 2 ++
>  arch/arc/mm/mmap.c               | 3 +++
>  arch/arm/mm/mmap.c               | 7 +++
>  arch/csky/abiv1/mmap.c           | 3 +++
>  arch/loongarch/mm/mmap.c         | 3 +++
>  arch/mips/mm/mmap.c              | 3 +++
>  arch/parisc/kernel/sys_parisc.c  | 3 +++
>  arch/powerpc/mm/book3s64/slice.c | 7 +++
>  arch/s390/mm/hugetlbpage.c       | 4
>  arch/s390/mm/mmap.c              | 6 ++
>  arch/sh/mm/mmap.c                | 6 ++
>  arch/sparc/kernel/sys_sparc_32.c | 3 +++
>  arch/sparc/kernel/sys_sparc_64.c | 6 ++
>  arch/sparc/mm/hugetlbpage.c      | 4
>  arch/x86/kernel/sys_x86_64.c     | 6 ++
>  arch/x86/mm/hugetlbpage.c        | 4
>  fs/hugetlbfs/inode.c             | 4
>  include/linux/mm.h               | 2 ++
>  mm/mmap.c                        | 6 ++
>  19 files changed, 82 insertions(+)
>
> diff --git a/arch/powerpc/mm/book3s64/slice.c b/arch/powerpc/mm/book3s64/slice.c
> index ef3ce37f1bb3..f0e2550af6d0 100644
> --- a/arch/powerpc/mm/book3s64/slice.c
> +++ b/arch/powerpc/mm/book3s64/slice.c
> @@ -286,6 +286,10 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
>                  .length = len,
>                  .align_mask = PAGE_MASK & ((1ul << pshift) - 1),
>          };
> +
> +        info.hint = addr;
> +        info.mmap_flags = flags;
> +
>          /*
>           * Check till the allow max value for this mmap request
>           */
> @@ -331,6 +335,9 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
>          };
>          unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);
>
> +        info.hint = addr;
> +        info.mmap_flags = flags;
> +
>          /*
>           * If we are trying to allocate above DEFAULT_MAP_WINDOW
>           * Add the different to the mmap_base.

ppc64_defconfig:

  CC      arch/powerpc/mm/book3s64/slice.o
arch/powerpc/mm/book3s64/slice.c: In function 'slice_find_area_bottomup':
arch/powerpc/mm/book3s64/slice.c:291:27: error: 'flags' undeclared (first use in this function)
  291 |         info.mmap_flags = flags;
      |                           ^
arch/powerpc/mm/book3s64/slice.c:291:27: note: each undeclared identifier is reported only once for each function it appears in
arch/powerpc/mm/book3s64/slice.c: In function 'slice_find_area_topdown':
arch/powerpc/mm/book3s64/slice.c:339:27: error: 'flags' undeclared (first use in this function)
  339 |         info.mmap_flags = flags;
      |                           ^
make[5]: *** [scripts/Makefile.build:244: arch/powerpc/mm/book3s64/slice.o] Error 1
Re: [PATCH] of/irq: handle irq_of_parse_and_map() errors
On 30/08/2024 at 16:21, Ma Ke wrote:
> Zero and negative number is not a valid IRQ for in-kernel code and the
> irq_of_parse_and_map() function returns zero on error. So this check for
> valid IRQs should only accept values > 0.

unsigned int irq_of_parse_and_map(struct device_node *node, int index);

I can't see how an 'unsigned int' can be negative.

Christophe

> Cc: sta...@vger.kernel.org
> Fixes: f7578496a671 ("of/irq: Use irq_of_parse_and_map()")
> Signed-off-by: Ma Ke
> ---
>  drivers/i2c/busses/i2c-cpm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-cpm.c b/drivers/i2c/busses/i2c-cpm.c
> index 4794ec066eb0..41e3c95c0ef7 100644
> --- a/drivers/i2c/busses/i2c-cpm.c
> +++ b/drivers/i2c/busses/i2c-cpm.c
> @@ -435,7 +435,7 @@ static int cpm_i2c_setup(struct cpm_i2c *cpm)
>          init_waitqueue_head(&cpm->i2c_wait);
>
>          cpm->irq = irq_of_parse_and_map(ofdev->dev.of_node, 0);
> -        if (!cpm->irq)
> +        if (cpm->irq <= 0)
>                  return -EINVAL;
>
>          /* Install interrupt handler. */
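To make the objection above concrete, here is a small user-space sketch (the sketch_ function names are hypothetical). It assumes nothing about the driver beyond the prototype quoted above: for an unsigned int, "<= 0" can only ever mean "== 0", and a negative error code stored in such a variable becomes a large positive number that the check does not catch.

```c
#include <stdbool.h>

/* The original test: zero means "no IRQ". */
static bool sketch_irq_is_zero(unsigned int irq)
{
	return !irq;
}

/* The proposed test: for an unsigned value this is the same as "== 0". */
static bool sketch_irq_le_zero(unsigned int irq)
{
	return irq <= 0;
}
```

Both helpers agree for every input, including (unsigned int)-22, which is a large positive value rather than a negative one.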
Re: [PATCH] soc: fsl: qbman: Remove redundant warnings
On 02/08/2024 at 04:16, Xiaolei Wang wrote:
> RESERVEDMEM_OF_DECLARE usage has been removed. For
> non-powerpc platforms, such as ls1043, this warning
> is redundant. ls1043 itself uses shared-dma-mem.
>
> Fixes: 3e62273ac63a ("soc: fsl: qbman: Remove RESERVEDMEM_OF_DECLARE usage")
> Signed-off-by: Xiaolei Wang
> ---
>  drivers/soc/fsl/qbman/qman_ccsr.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/soc/fsl/qbman/qman_ccsr.c b/drivers/soc/fsl/qbman/qman_ccsr.c
> index 392e54f14dbe..aa5348f4902f 100644
> --- a/drivers/soc/fsl/qbman/qman_ccsr.c
> +++ b/drivers/soc/fsl/qbman/qman_ccsr.c
> @@ -791,8 +791,6 @@ static int fsl_qman_probe(struct platform_device *pdev)
>           * FQD memory MUST be zero'd by software
>           */
>          zero_priv_mem(fqd_a, fqd_sz);
> -#else
> -        WARN(1, "Unexpected architecture using non shared-dma-mem reservations");
>  #endif
>          dev_dbg(dev, "Allocated FQD 0x%llx 0x%zx\n", fqd_a, fqd_sz);
>
> --
> 2.25.1

Applied for 6.12

Thanks
Christophe
Re: [PATCH 1/1] soc/fsl/qbman: Use iommu_paging_domain_alloc()
On 12/08/2024 at 09:25, Lu Baolu wrote:
> An iommu domain is allocated in portal_set_cpu() and is attached to
> pcfg->dev in the same function.
>
> Use iommu_paging_domain_alloc() to make it explicit.
>
> Signed-off-by: Lu Baolu
> Reviewed-by: Jason Gunthorpe
> Link: https://lore.kernel.org/r/2024061008.88197-14-baolu...@linux.intel.com
> ---
>  drivers/soc/fsl/qbman/qman_portal.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index e23b60618c1a..456ef5d5c199 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -48,9 +48,10 @@ static void portal_set_cpu(struct qm_portal_config *pcfg, int cpu)
>          struct device *dev = pcfg->dev;
>          int ret;
>
> -        pcfg->iommu_domain = iommu_domain_alloc(&platform_bus_type);
> -        if (!pcfg->iommu_domain) {
> +        pcfg->iommu_domain = iommu_paging_domain_alloc(dev);
> +        if (IS_ERR(pcfg->iommu_domain)) {
>                  dev_err(dev, "%s(): iommu_domain_alloc() failed", __func__);
> +                pcfg->iommu_domain = NULL;
>                  goto no_iommu;
>          }
>          ret = fsl_pamu_configure_l1_stash(pcfg->iommu_domain, cpu);
> --
> 2.34.1

Applied for 6.12

Thanks
Christophe
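The switch from a NULL test to IS_ERR() in the patch above matters because an error-encoding pointer is not NULL. A minimal user-space sketch of the idea follows, assuming the usual convention that the top 4095 addresses encode negative errno values. The sketch_ helpers are hypothetical stand-ins written for illustration, not the kernel's own definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095	/* assumed size of the error range */

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True when the pointer falls in the range reserved for error codes. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

An allocator following this convention never returns NULL on failure, so a check of the form "if (!ptr)" would treat the error pointer as a valid object. That is also why the patch resets the field to NULL after logging the failure.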
Re: [PATCH v2 00/36] soc: fsl: Add support for QUICC Engine TSA and QMC
Le 08/08/2024 à 09:10, Herve Codina a écrit : Hi, This series add support for the QUICC Engine (QE) version of TSA and QMC components. CPM1 version is already supported and, as the QE version of those component are pretty similar to the CPM1 version, the series extend the already existing drivers to support for the QE version. The TSA and QMC components are tightly coupled and so the series provides modifications on both components. Of course, this series can be split if it is needed. Let me know. The series is composed of: - Patches 1 and 2: Fixes related to TRNSYNC in the QMC driver - Patches 3..6: Fixes of checkpatch detected issues in the TSA driver - Patch 7: The QE TSA device-tree binding - Patches 8..13: TSA driver preparations for adding support for QE - Patches 14 and 15: The support for QE in TSA + MAINTAINERS update - Patch 16: A TSA API improvement needed for the QE QMC driver - Patch 17: A clarification in the QE QMC driver - Patches 18..22: Fixes of checkpatch detected issues in the QMC driver - Patch 23: The QE QMC device-tree binding - Patches 24..31: QMC driver preparations for adding support for QE - Patches 32 and 33: Missing features additions in QE code - Patches 34..36: The QMC support for QE in QMC + MAINTAINERS update Compared to the previous iteration, this v2 series updates device-tree bindings and fixes issues detected by kernel test robots. Related to the QE QMC device-tree binding, I kept the unit address in decimal and the 3 compatible strings in order to avoid blocking the review waiting for a confirmation. Of course, this can be change in a next iteration. Series applied for 6.12 Thanks Christophe
[GIT PULL] SOC FSL for 6.12 (retry)
Hi Arnd, Please pull the following Freescale Soc Drivers changes for 6.12 There are no conflicts with latest linux-next tree. Thanks Christophe The following changes since commit 8400291e289ee6b2bf9779ff1c83a291501f017b: Linux 6.11-rc1 (2024-07-28 14:19:55 -0700) are available in the Git repository at: https://github.com/chleroy/linux.git tags/soc_fsl-6.12-2 for you to fetch changes up to 7a99b1c0bce5cf8c554ceecd29ad1e8085557fd3: Merge branch 'support-for-quicc-engine-tsa-and-qmc' (2024-09-03 07:51:34 +0200) - A series from Hervé Codina that bring support for the newer version of QMC (QUICC Multi-channel Controller) and TSA (Time Slots Assigner) found on MPC 83xx micro-controllers. - Misc changes for qbman freescale drivers for removing a redundant warning and using iommu_paging_domain_alloc() ---- Christophe Leroy (1): Merge branch 'support-for-quicc-engine-tsa-and-qmc' Herve Codina (36): soc: fsl: cpm1: qmc: Update TRNSYNC only in transparent mode soc: fsl: cpm1: qmc: Enable TRNSYNC only when needed soc: fsl: cpm1: tsa: Fix tsa_write8() soc: fsl: cpm1: tsa: Use BIT(), GENMASK() and FIELD_PREP() macros soc: fsl: cpm1: tsa: Fix blank line and spaces soc: fsl: cpm1: tsa: Add missing spinlock comment dt-bindings: soc: fsl: cpm_qe: Add QUICC Engine (QE) TSA controller soc: fsl: cpm1: tsa: Remove unused registers offset definition soc: fsl: cpm1: tsa: Use ARRAY_SIZE() instead of hardcoded integer values soc: fsl: cpm1: tsa: Make SIRAM entries specific to CPM1 soc: fsl: cpm1: tsa: Introduce tsa_setup() and its CPM1 compatible version soc: fsl: cpm1: tsa: Isolate specific CPM1 part from tsa_serial_{dis}connect() soc: fsl: cpm1: tsa: Introduce tsa_version soc: fsl: cpm1: tsa: Add support for QUICC Engine (QE) implementation MAINTAINERS: Add QE files related to the Freescale TSA controller soc: fsl: cpm1: tsa: Introduce tsa_serial_get_num() soc: fsl: cpm1: qmc: Rename QMC_TSA_MASK soc: fsl: cpm1: qmc: Use BIT(), GENMASK() and FIELD_PREP() macros soc: fsl: cpm1: qmc: Fix 
blank line and spaces soc: fsl: cpm1: qmc: Remove unneeded parenthesis soc: fsl: cpm1: qmc: Fix 'transmiter' typo soc: fsl: cpm1: qmc: Add missing spinlock comment dt-bindings: soc: fsl: cpm_qe: Add QUICC Engine (QE) QMC controller soc: fsl: cpm1: qmc: Introduce qmc_data structure soc: fsl: cpm1: qmc: Re-order probe() operations soc: fsl: cpm1: qmc: Introduce qmc_init_resource() and its CPM1 version soc: fsl: cpm1: qmc: Introduce qmc_{init,exit}_xcc() and their CPM1 version soc: fsl: cpm1: qmc: Rename qmc_chan_command() soc: fsl: cpm1: qmc: Handle RPACK initialization soc: fsl: cpm1: qmc: Rename SCC_GSMRL_MODE_QMC soc: fsl: cpm1: qmc: Introduce qmc_version soc: fsl: qe: Add resource-managed muram allocators soc: fsl: qe: Add missing PUSHSCHED command soc: fsl: cpm1: qmc: Add support for QUICC Engine (QE) implementation soc: fsl: cpm1: qmc: Handle QUICC Engine (QE) soft-qmc firmware MAINTAINERS: Add QE files related to the Freescale QMC controller Lu Baolu (1): soc: fsl: qbman: Use iommu_paging_domain_alloc() Xiaolei Wang (1): soc: fsl: qbman: Remove redundant warnings .../bindings/soc/fsl/cpm_qe/fsl,qe-tsa.yaml| 210 +++ .../bindings/soc/fsl/cpm_qe/fsl,qe-ucc-qmc.yaml| 197 ++ MAINTAINERS| 3 + drivers/soc/fsl/qbman/qman_ccsr.c | 2 - drivers/soc/fsl/qbman/qman_portal.c| 5 +- drivers/soc/fsl/qe/Kconfig | 18 +- drivers/soc/fsl/qe/qe_common.c | 80 +++ drivers/soc/fsl/qe/qmc.c | 667 - drivers/soc/fsl/qe/tsa.c | 659 +++- drivers/soc/fsl/qe/tsa.h | 3 + include/dt-bindings/soc/qe-fsl,tsa.h | 13 + include/soc/fsl/qe/qe.h| 23 +- 12 files changed, 1552 insertions(+), 328 deletions(-) create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qe-tsa.yaml create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,qe-ucc-qmc.yaml create mode 100644 include/dt-bindings/soc/qe-fsl,tsa.h
[PATCH v5 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64
Extend getrandom() vDSO implementation to VDSO64 Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig. The results are not precise as it is QEMU on an x86 laptop, but no need to be precise to see the benefit. ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 4.97162 seconds libc: 2500 times in 75.516749981 seconds syscall: 2500 times in 86.842242014 seconds ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 6.473814156 seconds libc: 2500 times in 73.875109463 seconds syscall: 2500 times in 71.805066229 seconds Signed-off-by: Christophe Leroy --- v5: - VDSO32 for both PPC32 and PPC64 is in previous patch. This patch have the logic for VDSO64. v4: - Use __BIG_ENDIAN__ which is defined by GCC instead of CONFIG_CPU_BIG_ENDIAN which is unknown by selftests - Implement a cleaner/smaller output copy for little endian instead of keeping compat macro. v3: New (split out of previous patch) --- arch/powerpc/Kconfig | 2 +- arch/powerpc/kernel/vdso/Makefile| 8 ++- arch/powerpc/kernel/vdso/getrandom.S | 8 +++ arch/powerpc/kernel/vdso/vdso64.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 53 5 files changed, 69 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e500a59ddecc..b45452ac4a73 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,7 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT - select VDSO_GETRANDOM if VDSO32 + select VDSO_GETRANDOM # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index 7a4a935406d8..56fb1633529a 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -9,6 +9,7 @@ obj-vdso32 = sigtramp32-32.o gettimeofday-32.o datapage-32.o cacheflush-32.o not obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o note-64.o getcpu-64.o obj-vdso32 += getrandom-32.o vgetrandom-chacha-32.o +obj-vdso64 += getrandom-64.o vgetrandom-chacha-64.o ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y) @@ -21,6 +22,7 @@ endif ifneq ($(c-getrandom-y),) CFLAGS_vgetrandom-32.o += -include $(c-getrandom-y) + CFLAGS_vgetrandom-64.o += -include $(c-getrandom-y) $(call cc-option, -ffixed-r30) endif # Build rules @@ -34,7 +36,7 @@ endif targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o vgetrandom-32.o targets += crtsavres-32.o obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32)) -targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o +targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o vgetrandom-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64)) ccflags-y := -fno-common -fno-builtin -DBUILD_VDSO @@ -71,7 +73,7 @@ CPPFLAGS_vdso64.lds += -P -C # link rule for the .so file, .lds has to be first $(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/vgetrandom-32.o $(obj)/crtsavres-32.o FORCE $(call if_changed,vdso32ld_and_check) -$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE +$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o $(obj)/vgetrandom-64.o FORCE $(call if_changed,vdso64ld_and_check) # assembly rules for the .S files @@ -87,6 +89,8 @@ $(obj-vdso64): %-64.o: %.S FORCE $(call if_changed_dep,vdso64as) $(obj)/vgettimeofday-64.o: %-64.o: %.c FORCE $(call if_changed_dep,cc_o_c) +$(obj)/vgetrandom-64.o: %-64.o: %.c FORCE + $(call if_changed_dep,cc_o_c) # Generate VDSO 
offsets using helper script gen-vdso32sym := $(src)/gen_vdso32_offsets.sh diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S index 21773ef3fc1d..a957cd2b2b03 100644 --- a/arch/powerpc/kernel/vdso/getrandom.S +++ b/arch/powerpc/kernel/vdso/getrandom.S @@ -27,10 +27,18 @@ .cfi_adjust_cfa_offset PPC_MIN_STKFRM PPC_STL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF +#ifdef __powerpc64__ + PPC_STL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT +#endif get_datapager8 addir8, r8, VDSO_RNG_DATA_OFFSET bl CFUNC(DOTSYM(\funct)) PPC_LL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) +#ifdef __powerpc64__ + PPC_LL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_restore r2 +#endif cmpwi r3, 0 mtlrr0 addir1, r1, 2 * PPC_MIN_STKFRM diff --git a/arch/powerpc/kernel/vdso/vdso64.lds.S b/arch/powerpc/kernel/vdso/vdso64.lds.S index 400819258c06..9481e4b892ed 100644 --- a/arch
[PATCH v5 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
To be consistent with other VDSO functions, the function is called
__kernel_getrandom()

The __arch_chacha20_blocks_nostack() function is implemented basically
with 32-bit operations. It performs 4 QUARTERROUND operations in
parallel. There are enough registers to avoid using the stack:

On input:
	r3: output bytes
	r4: 32-byte key input
	r5: 8-byte counter input/output
	r6: number of 64-byte blocks to write to output

During operation:
	stack: pointer to counter (r5) and non-volatile registers (r14-r31)
	r0: counter of blocks (initialised with r6)
	r4: Value '4' after key has been read, used for indexing
	r5-r12: key
	r14-r15: block counter
	r16-r31: chacha state

At the end:
	r0, r6-r12: Zeroised
	r5, r14-r31: Restored

Performance on powerpc 885 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 62.938002291 seconds
	   libc: 2500 times in 535.581916866 seconds
	syscall: 2500 times in 531.525042806 seconds

Performance on powerpc 8321 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 16.899318858 seconds
	   libc: 2500 times in 131.050596522 seconds
	syscall: 2500 times in 129.794790389 seconds

This first patch adds support for VDSO32. As selftests cannot easily
be generated only for VDSO32, and because the following patch brings
support for VDSO64 anyway, this patch opts out all code in
__arch_chacha20_blocks_nostack() so that vdso_test_chacha will not
fail to compile and will not crash on PPC64/PPC64LE, although the
selftest itself will fail.

Signed-off-by: Christophe Leroy
---
v5:
- Add back vdso symlink that vanished in v4 after a rebase back and forth with rejected patch "selftests: vDSO: Do not rely on $ARCH for vdso_test_getrandom && vdso_test_chacha"
- Set meaningful names to registers and constants in chacha assembly
- Add 32-bit LE logic in this patch as well, although it is only useful for ppc64le.
- Remove the temporary ppc64 __kernel_getrandom added in v4, selftest will return KSFT_FAIL until following patch, not a big issue. - Move the -DBUILD_VDSO logic in patch 3 to allow build VDSO32 on ppc64. v4: - Counter has native byte order - Fix selftest build on ppc64le until implemented. - On ppc64, for now implement __kernel_getrandom to return ENOSYS error - Use stwbrx directly, not compat macro. v3: - Preserve r13, implies saving r5 on stack - Split PPC64 implementation out. --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/include/asm/vdso/getrandom.h| 54 arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c| 1 + arch/powerpc/kernel/vdso/Makefile| 14 +- arch/powerpc/kernel/vdso/getrandom.S | 50 +++ arch/powerpc/kernel/vdso/vdso32.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 312 +++ arch/powerpc/kernel/vdso/vgetrandom.c| 14 + tools/arch/powerpc/vdso | 1 + tools/testing/selftests/vDSO/Makefile| 2 +- 13 files changed, 455 insertions(+), 5 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c create mode 12 tools/arch/powerpc/vdso diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d7b09b064a8a..e500a59ddecc 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,6 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT + select VDSO_GETRANDOM if VDSO32 # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 17a77d47ed6d..42a51a993d94 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -6,7 +6,7 @@ #include -#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) && !defined(BUILD_VDSO) #include #include diff --git a/arch/powerpc/include/asm/vdso/getrandom.h b/arch/powerpc/include/asm/vdso/getrandom.h new file mode 100644 index ..501d6bb14e8a --- /dev/null +++ b/arch/powerpc/include/asm/vdso/getrandom.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Christophe Leroy , CS GROUP France + */ +#ifndef _ASM_POWERPC_VDSO_GETRANDOM_H +#define _ASM_POWERPC_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +static __always_inline int do_syscall_3(const unsigned long _r0, const unsigned long _r3, +
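For readers who want to relate the register layout in the commit message to the algorithm itself, here is a minimal C sketch of the standard ChaCha quarter round and of one double round (four column quarter rounds followed by four diagonal ones, which is the natural reading of "4 QUARTERROUND operations in parallel"). This is not the powerpc assembly from the patch; the names rotl32, chacha_quarterround and chacha_doubleround are made up for this illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four 32-bit words of the state. */
static void chacha_quarterround(uint32_t *a, uint32_t *b,
				uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/*
 * One double round over the 16-word state: the four column quarter
 * rounds are independent of each other, and so are the four diagonal
 * ones, which is what allows them to be interleaved in registers.
 */
static void chacha_doubleround(uint32_t x[16])
{
	/* Column round */
	chacha_quarterround(&x[0], &x[4], &x[8],  &x[12]);
	chacha_quarterround(&x[1], &x[5], &x[9],  &x[13]);
	chacha_quarterround(&x[2], &x[6], &x[10], &x[14]);
	chacha_quarterround(&x[3], &x[7], &x[11], &x[15]);
	/* Diagonal round */
	chacha_quarterround(&x[0], &x[5], &x[10], &x[15]);
	chacha_quarterround(&x[1], &x[6], &x[11], &x[12]);
	chacha_quarterround(&x[2], &x[7], &x[8],  &x[13]);
	chacha_quarterround(&x[3], &x[4], &x[9],  &x[14]);
}
```

ChaCha20 applies ten such double rounds per 64-byte block, which matches the sixteen state words kept in r16-r31 in the description above.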
[PATCH v5 3/5] powerpc/vdso: Refactor CFLAGS for CVDSO build
In order to avoid too much duplication when we add new VDSO
functionalities in C like getrandom, refactor common CFLAGS.

Signed-off-by: Christophe Leroy
---
v3: Also refactor removed flags
---
 arch/powerpc/kernel/vdso/Makefile | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index c07a425b8f78..67fe79d26fae 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -10,28 +10,11 @@ obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o not
 
 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-32.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-32.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-32.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-32.o = $(CC_FLAGS_FTRACE)
-  CFLAGS_REMOVE_vgettimeofday-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
-  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
-  # an unused command line flag warning for this file.
-  ifdef CONFIG_CC_IS_CLANG
-  CFLAGS_REMOVE_vgettimeofday-32.o += -fno-stack-clash-protection
-  endif
-  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-64.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-64.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-64.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-64.o = $(CC_FLAGS_FTRACE)
 # Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true
 # by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is
 # compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code
 # generation is minimal, it will just use r29 instead.
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -ffixed-r30)
+  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y) $(call cc-option, -ffixed-r30)
 endif
 
 # Build rules
@@ -49,6 +32,11 @@ targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
 
 ccflags-y := -fno-common -fno-builtin
+ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+ccflags-y += $(call cc-option, -fno-stack-protector)
+ccflags-y += -DDISABLE_BRANCH_PROFILING
+ccflags-y += -ffreestanding -fasynchronous-unwind-tables
+ccflags-remove-y := $(CC_FLAGS_FTRACE)
 ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack $(CLANG_FLAGS)
 ldflags-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld)
 ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
@@ -57,6 +45,12 @@ ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WAR
 ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) -Wa$(comma)%, $(KBUILD_CFLAGS))
 
 CC32FLAGS := -m32
+CC32FLAGSREMOVE := -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
+  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
+  # an unused command line flag warning for this file.
+ifdef CONFIG_CC_IS_CLANG
+CC32FLAGSREMOVE += -fno-stack-clash-protection
+endif
 LD32FLAGS := -Wl,-soname=linux-vdso32.so.1
 AS32FLAGS := -D__VDSO32__
 
@@ -105,7 +99,7 @@ quiet_cmd_vdso32ld_and_check = VDSO32L $@
 quiet_cmd_vdso32as = VDSO32A $@
       cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) $(AS32FLAGS) -c -o $@ $<
 quiet_cmd_vdso32cc = VDSO32C $@
-      cmd_vdso32cc = $(VDSOCC) $(c_flags) $(CC32FLAGS) -c -o $@ $<
+      cmd_vdso32cc = $(VDSOCC) $(filter-out $(CC32FLAGSREMOVE), $(c_flags)) $(CC32FLAGS) -c -o $@ $<
 
 quiet_cmd_vdso64ld_and_check = VDSO64L $@
       cmd_vdso64ld_and_check = $(VDSOCC) $(ldflags-y) $(LD64FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^); $(cmd_vdso_check)
-- 
2.44.0
[PATCH v5 2/5] powerpc/vdso32: Add crtsavres
Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix
build failure") added _restgpr_31_x to the vdso for gettimeofday, but
the work on getrandom shows that we will need more of those functions.

Remove _restgpr_31_x and link in crtsavres.o so that we get all
save/restore functions when optimising the kernel for size.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/vdso/Makefile       |  5 ++++-
 arch/powerpc/kernel/vdso/gettimeofday.S | 13 -------------
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index 1425b6edc66b..c07a425b8f78 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -43,6 +43,7 @@ else
 endif
 
 targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o
+targets += crtsavres-32.o
 obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32))
 targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
@@ -68,7 +69,7 @@ targets += vdso64.lds
 CPPFLAGS_vdso64.lds += -P -C
 
 # link rule for the .so file, .lds has to be first
-$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o FORCE
+$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/crtsavres-32.o FORCE
 	$(call if_changed,vdso32ld_and_check)
 $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE
 	$(call if_changed,vdso64ld_and_check)
@@ -76,6 +77,8 @@ $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o
 # assembly rules for the .S files
 $(obj-vdso32): %-32.o: %.S FORCE
 	$(call if_changed_dep,vdso32as)
+$(obj)/crtsavres-32.o: %-32.o: $(srctree)/arch/powerpc/lib/crtsavres.S FORCE
+	$(call if_changed_dep,vdso32as)
 $(obj)/vgettimeofday-32.o: %-32.o: %.c FORCE
 	$(call if_changed_dep,vdso32cc)
 $(obj-vdso64): %-64.o: %.S FORCE
diff --git a/arch/powerpc/kernel/vdso/gettimeofday.S b/arch/powerpc/kernel/vdso/gettimeofday.S
index 48fc6658053a..67254ac9c8bb 100644
--- a/arch/powerpc/kernel/vdso/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso/gettimeofday.S
@@ -118,16 +118,3 @@ V_FUNCTION_END(__kernel_clock_getres)
 V_FUNCTION_BEGIN(__kernel_time)
 	cvdso_call __c_kernel_time call_time=1
 V_FUNCTION_END(__kernel_time)
-
-/* Routines for restoring integer registers, called by the compiler. */
-/* Called with r11 pointing to the stack header word of the caller of the */
-/* function, just beyond the end of the integer restore area. */
-#ifndef __powerpc64__
-_GLOBAL(_restgpr_31_x)
-_GLOBAL(_rest32gpr_31_x)
-	lwz	r0,4(r11)
-	lwz	r31,-4(r11)
-	mtlr	r0
-	mr	r1,r11
-	blr
-#endif
-- 
2.44.0
[PATCH v5 1/5] mm: Define VM_DROPPABLE for powerpc/32
Commit 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always
lazily freeable mappings") only adds VM_DROPPABLE for 64 bits
architectures.

In order to also use the getrandom vDSO implementation on powerpc/32,
use VM_ARCH_1 for VM_DROPPABLE on powerpc/32. This is possible because
VM_ARCH_1 is used for VM_SAO on powerpc and VM_SAO is only for
powerpc/64. It is used in combination with PROT_SAO in some parts of
code that are restricted to CONFIG_PPC64 through #ifdefs, it is
therefore possible to define VM_SAO for CONFIG_PPC64 only.

Signed-off-by: Christophe Leroy
---
v4: Added more details in commit message following comment from Michael.
v3: Fixed build failure reported by robots.
---
 fs/proc/task_mmu.c             | 4 +++-
 include/linux/mm.h             | 4 +++-
 include/trace/events/mmflags.h | 4 ++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5f171ad7b436..3a07e13e2f81 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -987,8 +987,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_X86_USER_SHADOW_STACK
 		[ilog2(VM_SHADOW_STACK)] = "ss",
 #endif
-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 		[ilog2(VM_DROPPABLE)] = "dp",
+#endif
+#ifdef CONFIG_64BIT
 		[ilog2(VM_SEALED)] = "sl",
 #endif
 	};
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6549d0979b28..028847f39442 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -359,7 +359,7 @@ extern unsigned int kobjsize(const void *objp);
 
 #if defined(CONFIG_X86)
 # define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 # define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
 #elif defined(CONFIG_PARISC)
 # define VM_GROWSUP	VM_ARCH_1
@@ -409,6 +409,8 @@ extern unsigned int kobjsize(const void *objp);
 #ifdef CONFIG_64BIT
 #define VM_DROPPABLE_BIT	40
 #define VM_DROPPABLE		BIT(VM_DROPPABLE_BIT)
+#elif defined(CONFIG_PPC32)
+#define VM_DROPPABLE		VM_ARCH_1
 #else
 #define VM_DROPPABLE		VM_NONE
 #endif
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index b63d211bd141..37265977d524 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -143,7 +143,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
 
 #if defined(CONFIG_X86)
 #define __VM_ARCH_SPECIFIC_1 {VM_PAT, "pat" }
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 #define __VM_ARCH_SPECIFIC_1 {VM_SAO, "sao" }
 #elif defined(CONFIG_PARISC)
 #define __VM_ARCH_SPECIFIC_1 {VM_GROWSUP, "growsup" }
@@ -165,7 +165,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
 # define IF_HAVE_UFFD_MINOR(flag, name)
 #endif
 
-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
 #else
 # define IF_HAVE_VM_DROPPABLE(flag, name)
-- 
2.44.0
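The flag selection this patch sets up in include/linux/mm.h can be pictured with a small stand-alone C model. It is only a sketch: the kernel makes this choice at compile time through the preprocessor, whereas the model below takes the configuration as a run-time argument so that all three cases can be exercised at once. The sketch_* names are hypothetical, and the numeric value given to SKETCH_VM_ARCH_1 is an assumption made for illustration, not something stated in the patch.

```c
#include <stdint.h>

#define SKETCH_VM_NONE		0x00000000ULL
#define SKETCH_VM_ARCH_1	0x01000000ULL	/* assumed value, for illustration */
#define SKETCH_VM_DROPPABLE_BIT	40

/* The three configurations distinguished by the #ifdef chain in the patch. */
enum sketch_config {
	SKETCH_CONFIG_64BIT,	/* any 64-bit architecture */
	SKETCH_CONFIG_PPC32,	/* powerpc/32, the case added by this patch */
	SKETCH_CONFIG_OTHER_32,	/* every other 32-bit architecture */
};

/* Run-time model of which value VM_DROPPABLE ends up with. */
static uint64_t sketch_vm_droppable(enum sketch_config cfg)
{
	switch (cfg) {
	case SKETCH_CONFIG_64BIT:
		/* Dedicated high bit, only representable in 64-bit flags. */
		return 1ULL << SKETCH_VM_DROPPABLE_BIT;
	case SKETCH_CONFIG_PPC32:
		/* Reuse the arch-specific bit, free because VM_SAO is PPC64 only. */
		return SKETCH_VM_ARCH_1;
	default:
		/* No bit available: the flag degrades to "no flag". */
		return SKETCH_VM_NONE;
	}
}
```

The point of the PPC32 branch is that the reused bit still fits in a 32-bit flags word, which bit 40 cannot.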
[PATCH v5 0/5] Wire up getrandom() vDSO implementation on powerpc
This series wires up getrandom() vDSO implementation on powerpc.

Tested on PPC32 on real hardware.
Tested on PPC64 (both BE and LE) on QEMU:

Performance on powerpc 885:
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 62.938002291 seconds
	   libc: 2500 times in 535.581916866 seconds
	syscall: 2500 times in 531.525042806 seconds

Performance on powerpc 8321:
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 16.899318858 seconds
	   libc: 2500 times in 131.050596522 seconds
	syscall: 2500 times in 129.794790389 seconds

Performance on QEMU pseries:
	~ # ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 4.97162 seconds
	   libc: 2500 times in 75.516749981 seconds
	syscall: 2500 times in 86.842242014 seconds

Changes in v5:
- The split between last two patches is not anymore PPC32/PPC64 but VDSO32/VDSO64
- Removed the stub returning ENOSYS
- Using meaningful names for registers
- Restored symbolic link that disappeared in v4

Changes in v4:
- Rebased on recent random git tree (963233ff0133) (The new tree includes selftests fixes)
- Read/write counter in native byte order
- Don't use anymore compat macros to write output
- Fixed selftests build failure with patch 4 (without patch 5) on little endian on PPC64
- Implement a __kernel_getrandom() stub returning ENOSYS on ppc64 in patch 4 (without patch 5) to make selftests happy.

Changes in v3:
- Rebased on recent random git tree (0c7e00e22c21)
- Fixed build failures reported by robots around VM_DROPPABLE
- Fixed crash on PPC64 due to clobbered r13 by not using r13 anymore (saving it was not enough for signals).
- Split final patch in two, first for PPC32, second for PPC64
- Moved selftest fixes out of this series

Changes in v2:
- Define VM_DROPPABLE for powerpc/32
- Fixes generic vDSO getrandom headers to enable CONFIG_COMPAT build.
- Fixed size of generation counter
- Fixed selftests to work on non x86 architectures

Christophe Leroy (5):
  mm: Define VM_DROPPABLE for powerpc/32
  powerpc/vdso32: Add crtsavres
  powerpc/vdso: Refactor CFLAGS for CVDSO build
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
  powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64

 arch/powerpc/Kconfig                         |   1 +
 arch/powerpc/include/asm/mman.h              |   2 +-
 arch/powerpc/include/asm/vdso/getrandom.h    |  54 +++
 arch/powerpc/include/asm/vdso/vsyscall.h     |   6 +
 arch/powerpc/include/asm/vdso_datapage.h     |   2 +
 arch/powerpc/kernel/asm-offsets.c            |   1 +
 arch/powerpc/kernel/vdso/Makefile            |  57 +--
 arch/powerpc/kernel/vdso/getrandom.S         |  58 +++
 arch/powerpc/kernel/vdso/gettimeofday.S      |  13 -
 arch/powerpc/kernel/vdso/vdso32.lds.S        |   1 +
 arch/powerpc/kernel/vdso/vdso64.lds.S        |   1 +
 arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 365 +++
 arch/powerpc/kernel/vdso/vgetrandom.c        |  14 +
 fs/proc/task_mmu.c                           |   4 +-
 include/linux/mm.h                           |   4 +-
 include/trace/events/mmflags.h               |   4 +-
 tools/arch/powerpc/vdso                      |   1 +
 tools/testing/selftests/vDSO/Makefile        |   2 +-
 18 files changed, 547 insertions(+), 43 deletions(-)
 create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h
 create mode 100644 arch/powerpc/kernel/vdso/getrandom.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c
 create mode 120000 tools/arch/powerpc/vdso

-- 
2.44.0
Re: [PATCH v4 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64
Hi Jason, hi Michael,

On 02/09/2024 at 16:19, Jason A. Donenfeld wrote:
> On Mon, Sep 02, 2024 at 04:16:48PM +0200, Christophe Leroy wrote:
>> Can do that, but there will still be a problem with chacha selftests if
>> I don't opt-out the entire function content when it is ppc64. It will
>> build properly but if someone runs it on a ppc64 it will likely crash
>> because only the low 32 bits of registers will be saved.
>
> What if you don't wire up the selftests _at all_ until the ppc64 commit?
> Then there'll be no risk.
>
> (And I think I would prefer to see the 32-bit code all in the 32-bit
> commit; that'd make it more straight forward to review too.)

I'd be fine with that but I'd like feedback from Michael on it: is there
a risk of only getting the PPC32 part merged as a first step, or will
both PPC32 and PPC64 go together anyway?

I would prefer not to delay PPC32 because someone doesn't feel confident
with PPC64.

Christophe
Re: [PATCH v4 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64
Le 02/09/2024 à 16:00, Jason A. Donenfeld a écrit : On Mon, Sep 02, 2024 at 03:12:47PM +0200, Christophe Leroy wrote: Le 02/09/2024 à 14:41, Jason A. Donenfeld a écrit : On Mon, Sep 02, 2024 at 02:04:42PM +0200, Christophe Leroy wrote: SYM_FUNC_START(__arch_chacha20_blocks_nostack) #ifdef __powerpc64__ - blr + std r5, -216(r1) + + std r14, -144(r1) + std r15, -136(r1) + std r16, -128(r1) + std r17, -120(r1) + std r18, -112(r1) + std r19, -104(r1) + std r20, -96(r1) + std r21, -88(r1) + std r22, -80(r1) + std r23, -72(r1) + std r24, -64(r1) + std r25, -56(r1) + std r26, -48(r1) + std r27, -40(r1) + std r28, -32(r1) + std r29, -24(r1) + std r30, -16(r1) + std r31, -8(r1) #else stwur1, -96(r1) stw r5, 20(r1) +#ifdef __BIG_ENDIAN__ stmwr14, 24(r1) +#else + stw r14, 24(r1) + stw r15, 28(r1) + stw r16, 32(r1) + stw r17, 36(r1) + stw r18, 40(r1) + stw r19, 44(r1) + stw r20, 48(r1) + stw r21, 52(r1) + stw r22, 56(r1) + stw r23, 60(r1) + stw r24, 64(r1) + stw r25, 68(r1) + stw r26, 72(r1) + stw r27, 76(r1) + stw r28, 80(r1) + stw r29, 84(r1) + stw r30, 88(r1) + stw r31, 92(r1) +#endif +#endif This confuses me. Why are you adding code to the !__powerpc64__ branch in this commit? (Also, why does stmw not work on LE?) That's for the VDSO32 ie running 32 bits binaries on a 64 bits kernel. "Programming Environments Manual for 32-Bit Implementations of the PowerPC™ Architecture" say: In some implementations operating with little-endian byte order, execution of an lmw or stmw instruction causes the system alignment error handler to be invoked And GCC doesn't like it either: tools/arch/powerpc/vdso/vgetrandom-chacha.S:84: Error: `stmw' invalid when little-endian Does it make sense to do all the 32-bit stuff in the PPC32 commit (and then you can introduce the selftests there without the error you mentioned), and then add the 64-bit stuff in this commit? 
Can do that, but there will still be a problem with chacha selftests if I don't opt-out the entire function content when it is ppc64. It will build properly but if someone runs it on a ppc64 it will likely crash because only the low 32 bits of registers will be saved. That's the reason why I really prefered the approach where I set something in vdso_config.h so that the assembly is used only for powerpc32 and when building powerpc64 the assembly part is kept out and vdso_test_chacha simply tells it is not supported. Christophe
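The concern that "only the low 32 bits of registers will be saved" can be illustrated with a short C model of the two kinds of store. This is a sketch only: the function names are invented here, and real register save/restore is of course done in assembly (stw/lwz versus std/ld), not through C helpers.

```c
#include <stdint.h>

/* Model of a 32-bit store (stw): only the low word reaches memory. */
static void sketch_save_word(uint32_t *slot, uint64_t reg)
{
	*slot = (uint32_t)reg;
}

/* Model of a 32-bit load (lwz): the upper half comes back as zero. */
static uint64_t sketch_restore_word(const uint32_t *slot)
{
	return (uint64_t)*slot;
}

/* Model of a 64-bit store/load pair (std/ld): the value round-trips. */
static void sketch_save_dword(uint64_t *slot, uint64_t reg)
{
	*slot = reg;
}

static uint64_t sketch_restore_dword(const uint64_t *slot)
{
	return *slot;
}
```

If a non-volatile register held 0x1122334455667788 on entry, the word-sized pair would hand back 0x0000000055667788 to the caller, which is the kind of corruption that makes a crash on ppc64 likely.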
Re: [PATCH v4 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64
On 02/09/2024 at 14:41, Jason A. Donenfeld wrote:
> On Mon, Sep 02, 2024 at 02:04:42PM +0200, Christophe Leroy wrote:
>>  SYM_FUNC_START(__arch_chacha20_blocks_nostack)
>>  #ifdef __powerpc64__
>> -	blr
>> +	std	r5, -216(r1)
>> +
>> +	std	r14, -144(r1)
>> +	std	r15, -136(r1)
>> +	std	r16, -128(r1)
>> +	std	r17, -120(r1)
>> +	std	r18, -112(r1)
>> +	std	r19, -104(r1)
>> +	std	r20, -96(r1)
>> +	std	r21, -88(r1)
>> +	std	r22, -80(r1)
>> +	std	r23, -72(r1)
>> +	std	r24, -64(r1)
>> +	std	r25, -56(r1)
>> +	std	r26, -48(r1)
>> +	std	r27, -40(r1)
>> +	std	r28, -32(r1)
>> +	std	r29, -24(r1)
>> +	std	r30, -16(r1)
>> +	std	r31, -8(r1)
>>  #else
>>  	stwu	r1, -96(r1)
>>  	stw	r5, 20(r1)
>> +#ifdef __BIG_ENDIAN__
>>  	stmw	r14, 24(r1)
>> +#else
>> +	stw	r14, 24(r1)
>> +	stw	r15, 28(r1)
>> +	stw	r16, 32(r1)
>> +	stw	r17, 36(r1)
>> +	stw	r18, 40(r1)
>> +	stw	r19, 44(r1)
>> +	stw	r20, 48(r1)
>> +	stw	r21, 52(r1)
>> +	stw	r22, 56(r1)
>> +	stw	r23, 60(r1)
>> +	stw	r24, 64(r1)
>> +	stw	r25, 68(r1)
>> +	stw	r26, 72(r1)
>> +	stw	r27, 76(r1)
>> +	stw	r28, 80(r1)
>> +	stw	r29, 84(r1)
>> +	stw	r30, 88(r1)
>> +	stw	r31, 92(r1)
>> +#endif
>> +#endif
>
> This confuses me. Why are you adding code to the !__powerpc64__ branch
> in this commit? (Also, why does stmw not work on LE?)

That's for the VDSO32, i.e. running 32-bit binaries on a 64-bit kernel.

"Programming Environments Manual for 32-Bit Implementations of the
PowerPC™ Architecture" says:

	In some implementations operating with little-endian byte order,
	execution of an lmw or stmw instruction causes the system alignment
	error handler to be invoked

And GCC doesn't like it either:

tools/arch/powerpc/vdso/vgetrandom-chacha.S:84: Error: `stmw' invalid when little-endian
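To make the equivalence between the single stmw r14, 24(r1) and the eighteen individual stw instructions easier to see, here is a small C model of the 32-bit save area. It is a sketch under the layout shown in the quoted patch (96-byte frame, r14 at offset 24, one 4-byte slot per register up to r31); the sketch_* names and macros are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_FRAME_SIZE	96	/* stwu r1, -96(r1) */
#define SKETCH_SAVE_OFFSET	24	/* slot of the first saved register */
#define SKETCH_FIRST_REG	14
#define SKETCH_LAST_REG		31

/* Frame offset of the slot holding a given non-volatile register. */
static unsigned int sketch_slot_offset(unsigned int reg)
{
	return SKETCH_SAVE_OFFSET + 4 * (reg - SKETCH_FIRST_REG);
}

/*
 * Model of "stmw r14, 24(r1)": store r14..r31 to consecutive words.
 * Unrolling this loop gives exactly the list of stw instructions used
 * on little-endian, where stmw is not available.
 */
static void sketch_save_nonvolatile(uint8_t frame[SKETCH_FRAME_SIZE],
				    const uint32_t regs[32])
{
	unsigned int reg;

	for (reg = SKETCH_FIRST_REG; reg <= SKETCH_LAST_REG; reg++)
		memcpy(frame + sketch_slot_offset(reg), &regs[reg],
		       sizeof(regs[reg]));
}
```

The last slot starts at offset 92 and ends at 96, so the eighteen registers fill the frame exactly up to its top.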
Re: [PATCH v4 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
Le 02/09/2024 à 14:34, Jason A. Donenfeld a écrit : On Mon, Sep 02, 2024 at 02:04:41PM +0200, Christophe Leroy wrote: This first patch adds support for PPC32. As selftests cannot easily be generated only for PPC32, and because the following patch brings support for PPC64 anyway, this patch opts out all code in __arch_chacha20_blocks_nostack() so that vdso_test_chacha will not fail to compile and will not crash on PPC64/PPC64LE, allthough the selftest itself will fail. This patch also adds a dummy __kernel_getrandom() function that returns ENOSYS on PPC64 so that vdso_test_getrandom returns KSFT_SKIP instead of KSFT_FAIL. Why not just wire up the selftests in the next patch like you did for v3? This seems like extra stuff for no huge reason? In v3 selftests were already wired up in v3, and there was the following build failure: $ make ARCH=powerpc CROSS_COMPILE=powerpc64le-linux- CC vdso_test_gettimeofday CC vdso_test_getcpu CC vdso_test_abi CC vdso_test_clock_getres CC vdso_test_correctness CC vdso_test_getrandom CC vdso_test_chacha /home/chleroy/linux-powerpc/tools/testing/selftests/../../../tools/arch/powerpc/vdso/vgetrandom-chacha.S: Assembler messages: /home/chleroy/linux-powerpc/tools/testing/selftests/../../../tools/arch/powerpc/vdso/vgetrandom-chacha.S:84: Error: `stmw' invalid when little-endian /home/chleroy/linux-powerpc/tools/testing/selftests/../../../tools/arch/powerpc/vdso/vgetrandom-chacha.S:198: Error: `lmw' invalid when little-endian make: *** [../lib.mk:222: /home/chleroy/linux-powerpc/tools/testing/selftests/vDSO/vdso_test_chacha] Error 1 So I did this change to get a clean PPC32 implementation before going into PPC64. I thought it was easier to go in two steps for reviews, bisectability, etc for just a very little extra stuff. 
arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/vdso/getrandom.h| 54 + arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c| 1 + arch/powerpc/kernel/vdso/Makefile| 13 +- arch/powerpc/kernel/vdso/getrandom.S | 58 ++ arch/powerpc/kernel/vdso/vdso32.lds.S| 1 + arch/powerpc/kernel/vdso/vdso64.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 207 +++ arch/powerpc/kernel/vdso/vgetrandom.c| 16 ++ tools/testing/selftests/vDSO/Makefile| 2 +- 12 files changed, 359 insertions(+), 3 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c I think you might have forgotten to add the symlink in this commit (or the next one, per my comment above, if you agree with it). That's odd. All CI tests on github went ok !!! Looks like the CI tests for selftests are broken. Argh ! And of course on my computer the link was there so I didn't notice. +/* + * Very basic 32 bits implementation of ChaCha20. Produces a given positive number + * of blocks of output with a nonce of 0, taking an input key and 8-byte + * counter. Importantly does not spill to the stack. Its arguments are: + * + * r3: output bytes + * r4: 32-byte key input + * r5: 8-byte counter input/output (saved on stack) + * r6: number of 64-byte blocks to write to output + * + * r0: counter of blocks (initialised with r6) + * r4: Value '4' after key has been read. 
+ * r5-r12: key + * r14-r15: counter + * r16-r31: state + */ +SYM_FUNC_START(__arch_chacha20_blocks_nostack) +#ifdef __powerpc64__ + blr +#else + stwur1, -96(r1) + stw r5, 20(r1) + stmwr14, 24(r1) + + lwz r14, 0(r5) + lwz r15, 4(r5) + mr r0, r6 + subir3, r3, 4 + + lwz r5, 0(r4) + lwz r6, 4(r4) + lwz r7, 8(r4) + lwz r8, 12(r4) + lwz r9, 16(r4) + lwz r10, 20(r4) + lwz r11, 24(r4) + lwz r12, 28(r4) If you don't want to do this, don't worry about it, but while I'm commenting on things, I think it's worth noting that x86, loongarch, and arm64 implementations all use the preprocessor or macros to give names to these registers -- state1,2,3,...copy1,2,3 and so forth. Might be worth doing the same if you think there's an easy and obvious way of doing it. If not -- or if that kind of work abhors you -- don't worry about it, as I'm confident enough that this code works fine. But it might be "nice to have". Up to you. I'll have a look. + + li r4, 4 +.Lblock: + li r31, 10 + Maybe a comment here, "expand 32-byte k"
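As background for the suggested "expand 32-byte k" comment: those sixteen ASCII characters are the standard ChaCha constant, loaded as four little-endian 32-bit words into the first row of the state. A minimal C sketch of that derivation follows; the helper names are made up here and this is not code from the patch.

```c
#include <stdint.h>

/* Read four bytes as a little-endian 32-bit word, whatever the host order. */
static uint32_t sketch_load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Fill out[0..3] with the four ChaCha constant words. */
static void sketch_chacha_constants(uint32_t out[4])
{
	/* 16 characters plus the terminating NUL, which is not used. */
	static const unsigned char sigma[] = "expand 32-byte k";
	unsigned int i;

	for (i = 0; i < 4; i++)
		out[i] = sketch_load_le32(sigma + 4 * i);
}
```

The resulting words are 0x61707865, 0x3320646e, 0x79622d32 and 0x6b206574, which is why assembly implementations usually carry them as immediates with the string quoted in a comment.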
[PATCH v4 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64
Extend getrandom() vDSO implementation to powerpc64. Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig. The results are not precise as it is QEMU on an x86 laptop, but no need to be precise to see the benefit. ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 4.97162 seconds libc: 2500 times in 75.516749981 seconds syscall: 2500 times in 86.842242014 seconds ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 6.473814156 seconds libc: 2500 times in 73.875109463 seconds syscall: 2500 times in 71.805066229 seconds Signed-off-by: Christophe Leroy --- v4: - Use __BIG_ENDIAN__ which is defined by GCC instead of CONFIG_CPU_BIG_ENDIAN which is unknown by selftests - Implement a cleaner/smaller output copy for little endian instead of keeping compat macro. v3: New (split out of previous patch) --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/kernel/vdso/Makefile| 11 +- arch/powerpc/kernel/vdso/getrandom.S | 16 +-- arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 117 ++- arch/powerpc/kernel/vdso/vgetrandom.c| 2 - 6 files changed, 132 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 54b270ef18b1..b45452ac4a73 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,7 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT - select VDSO_GETRANDOM if PPC32 + select VDSO_GETRANDOM # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 17a77d47ed6d..42a51a993d94 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -6,7 +6,7 @@ #include -#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) && !defined(BUILD_VDSO) #include #include diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index af3ba61b022e..56fb1633529a 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -9,7 +9,7 @@ obj-vdso32 = sigtramp32-32.o gettimeofday-32.o datapage-32.o cacheflush-32.o not obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o note-64.o getcpu-64.o obj-vdso32 += getrandom-32.o vgetrandom-chacha-32.o -obj-vdso64 += getrandom-64.o +obj-vdso64 += getrandom-64.o vgetrandom-chacha-64.o ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y) @@ -22,6 +22,7 @@ endif ifneq ($(c-getrandom-y),) CFLAGS_vgetrandom-32.o += -include $(c-getrandom-y) + CFLAGS_vgetrandom-64.o += -include $(c-getrandom-y) $(call cc-option, -ffixed-r30) endif # Build rules @@ -35,10 +36,10 @@ endif targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o vgetrandom-32.o targets += crtsavres-32.o obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32)) -targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o +targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o vgetrandom-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64)) -ccflags-y := -fno-common -fno-builtin +ccflags-y := -fno-common -fno-builtin -DBUILD_VDSO ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN) ccflags-y += $(call cc-option, -fno-stack-protector) ccflags-y += -DDISABLE_BRANCH_PROFILING @@ -72,7 +73,7 @@ CPPFLAGS_vdso64.lds += -P -C # link rule for the .so file, .lds has to be first $(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/vgetrandom-32.o $(obj)/crtsavres-32.o FORCE $(call if_changed,vdso32ld_and_check) 
-$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE +$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o $(obj)/vgetrandom-64.o FORCE $(call if_changed,vdso64ld_and_check) # assembly rules for the .S files @@ -88,6 +89,8 @@ $(obj-vdso64): %-64.o: %.S FORCE $(call if_changed_dep,vdso64as) $(obj)/vgettimeofday-64.o: %-64.o: %.c FORCE $(call if_changed_dep,cc_o_c) +$(obj)/vgetrandom-64.o: %-64.o: %.c FORCE + $(call if_changed_dep,cc_o_c) # Generate VDSO offsets using helper script gen-vdso32sym := $(src)/gen_vdso32_offsets.sh diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S index 7db51c0635a5..a957cd2b2b03 100644 --- a/arch/powerpc/kernel/vdso/getrandom.S +++ b/arch/powerpc/kernel/vdso/getrandom.S @@ -5,8 +5,6 @@ * * Copyright (C) 2024 Christophe Leroy , CS GROUP France */ -#include - #include #include #include @@ -29,10 +27,18 @@ .cfi_adjust_cfa_offset PPC_MIN_STKFRM PPC_STL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF +#ifdef __powerpc64__ + PPC_STL
[PATCH v4 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
To be consistent with other VDSO functions, the function is called __kernel_getrandom().

The __arch_chacha20_blocks_nostack() function is implemented basically with 32-bit operations. It performs 4 QUARTERROUND operations in parallel. There are enough registers to avoid using the stack:

On input:
	r3: output bytes
	r4: 32-byte key input
	r5: 8-byte counter input/output
	r6: number of 64-byte blocks to write to output

During operation:
	stack: pointer to counter (r5) and non-volatile registers (r14-r31)
	r0: counter of blocks (initialised with r6)
	r4: Value '4' after key has been read, used for indexing
	r5-r12: key
	r14-r15: block counter
	r16-r31: chacha state

At the end:
	r0, r6-r12: Zeroised
	r5, r14-r31: Restored

Performance on powerpc 885 (using kernel selftest):

~# ./vdso_test_getrandom bench-single
   vdso: 2500 times in 62.938002291 seconds
   libc: 2500 times in 535.581916866 seconds
syscall: 2500 times in 531.525042806 seconds

Performance on powerpc 8321 (using kernel selftest):

~# ./vdso_test_getrandom bench-single
   vdso: 2500 times in 16.899318858 seconds
   libc: 2500 times in 131.050596522 seconds
syscall: 2500 times in 129.794790389 seconds

This first patch adds support for PPC32. As selftests cannot easily be generated only for PPC32, and because the following patch brings support for PPC64 anyway, this patch compiles out all code in __arch_chacha20_blocks_nostack() on PPC64 so that vdso_test_chacha will not fail to compile and will not crash on PPC64/PPC64LE, although the selftest itself will fail. This patch also adds a dummy __kernel_getrandom() function that returns ENOSYS on PPC64 so that vdso_test_getrandom returns KSFT_SKIP instead of KSFT_FAIL.

Signed-off-by: Christophe Leroy
---
v4:
- Counter has native byte order
- Fix selftest build on ppc64le until implemented.
- On ppc64, for now implement __kernel_getrandom to return ENOSYS error
- Use stwbrx directly, not compat macro.

v3:
- Preserve r13, implies saving r5 on stack
- Split PPC64 implementation out.
--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/vdso/getrandom.h| 54 + arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c| 1 + arch/powerpc/kernel/vdso/Makefile| 13 +- arch/powerpc/kernel/vdso/getrandom.S | 58 ++ arch/powerpc/kernel/vdso/vdso32.lds.S| 1 + arch/powerpc/kernel/vdso/vdso64.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 207 +++ arch/powerpc/kernel/vdso/vgetrandom.c| 16 ++ tools/testing/selftests/vDSO/Makefile| 2 +- 12 files changed, 359 insertions(+), 3 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d7b09b064a8a..54b270ef18b1 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,6 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT + select VDSO_GETRANDOM if PPC32 # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/include/asm/vdso/getrandom.h b/arch/powerpc/include/asm/vdso/getrandom.h new file mode 100644 index ..501d6bb14e8a --- /dev/null +++ b/arch/powerpc/include/asm/vdso/getrandom.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Christophe Leroy , CS GROUP France + */ +#ifndef _ASM_POWERPC_VDSO_GETRANDOM_H +#define _ASM_POWERPC_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +static __always_inline int do_syscall_3(const unsigned long _r0, const unsigned long _r3, + const unsigned long _r4, const unsigned long _r5) +{ + register long r0 asm("r0") = _r0; + register unsigned long r3 asm("r3") = _r3; + register unsigned long r4 asm("r4") = _r4; + register unsigned long r5 asm("r5") = _r5; + register int ret asm ("r3"); + + asm volatile( + " sc\n" + " bns+1f\n" + " neg %0, %0\n" + "1:\n" + : "=r" (ret), "+r" (r4), "+r" (r5), "+r" (r0) + : "r" (r3) + : "memory", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "cr0", &
[PATCH v4 3/5] powerpc/vdso: Refactor CFLAGS for CVDSO build
In order to avoid too much duplication when we add new VDSO functionalities in C like getrandom, refactor common CFLAGS.

Signed-off-by: Christophe Leroy
---
v3: Also refactor removed flags
---
 arch/powerpc/kernel/vdso/Makefile | 32 +--
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index c07a425b8f78..67fe79d26fae 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -10,28 +10,11 @@ obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o not
 
 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-32.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-32.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-32.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-32.o = $(CC_FLAGS_FTRACE)
-  CFLAGS_REMOVE_vgettimeofday-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
-  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
-  # an unused command line flag warning for this file.
-  ifdef CONFIG_CC_IS_CLANG
-  CFLAGS_REMOVE_vgettimeofday-32.o += -fno-stack-clash-protection
-  endif
-  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-64.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-64.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-64.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-64.o = $(CC_FLAGS_FTRACE)
 # Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true
 # by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is
 # compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code
 # generation is minimal, it will just use r29 instead.
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -ffixed-r30)
+  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y) $(call cc-option, -ffixed-r30)
 endif
 
 # Build rules
@@ -49,6 +32,11 @@ targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
 
 ccflags-y := -fno-common -fno-builtin
+ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+ccflags-y += $(call cc-option, -fno-stack-protector)
+ccflags-y += -DDISABLE_BRANCH_PROFILING
+ccflags-y += -ffreestanding -fasynchronous-unwind-tables
+ccflags-remove-y := $(CC_FLAGS_FTRACE)
 ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack $(CLANG_FLAGS)
 ldflags-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld)
 ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
@@ -57,6 +45,12 @@ ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WAR
 ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) -Wa$(comma)%, $(KBUILD_CFLAGS))
 
 CC32FLAGS := -m32
+CC32FLAGSREMOVE := -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
+  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
+  # an unused command line flag warning for this file.
+ifdef CONFIG_CC_IS_CLANG
+CC32FLAGSREMOVE += -fno-stack-clash-protection
+endif
 LD32FLAGS := -Wl,-soname=linux-vdso32.so.1
 AS32FLAGS := -D__VDSO32__
 
@@ -105,7 +99,7 @@ quiet_cmd_vdso32ld_and_check = VDSO32L $@
 quiet_cmd_vdso32as = VDSO32A $@
       cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) $(AS32FLAGS) -c -o $@ $<
 quiet_cmd_vdso32cc = VDSO32C $@
-      cmd_vdso32cc = $(VDSOCC) $(c_flags) $(CC32FLAGS) -c -o $@ $<
+      cmd_vdso32cc = $(VDSOCC) $(filter-out $(CC32FLAGSREMOVE), $(c_flags)) $(CC32FLAGS) -c -o $@ $<
 
 quiet_cmd_vdso64ld_and_check = VDSO64L $@
       cmd_vdso64ld_and_check = $(VDSOCC) $(ldflags-y) $(LD64FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^); $(cmd_vdso_check)
-- 
2.44.0
[PATCH v4 1/5] mm: Define VM_DROPPABLE for powerpc/32
Commit 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") only adds VM_DROPPABLE for 64-bit architectures.

In order to also use the getrandom vDSO implementation on powerpc/32, use VM_ARCH_1 for VM_DROPPABLE on powerpc/32. This is possible because VM_ARCH_1 is used for VM_SAO on powerpc and VM_SAO is only for powerpc/64. It is used in combination with PROT_SAO in some parts of code that are restricted to CONFIG_PPC64 through #ifdefs, so it is possible to define VM_SAO for CONFIG_PPC64 only.

Signed-off-by: Christophe Leroy
---
v4: Added more details in commit message following comment from Michael.
v3: Fixed build failure reported by robots.
---
 fs/proc/task_mmu.c             | 4 +++-
 include/linux/mm.h             | 4 +++-
 include/trace/events/mmflags.h | 4 ++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5f171ad7b436..3a07e13e2f81 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -987,8 +987,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_X86_USER_SHADOW_STACK
 		[ilog2(VM_SHADOW_STACK)] = "ss",
 #endif
-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 		[ilog2(VM_DROPPABLE)] = "dp",
+#endif
+#ifdef CONFIG_64BIT
 		[ilog2(VM_SEALED)] = "sl",
 #endif
 	};
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6549d0979b28..028847f39442 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -359,7 +359,7 @@ extern unsigned int kobjsize(const void *objp);
 
 #if defined(CONFIG_X86)
 # define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 # define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
 #elif defined(CONFIG_PARISC)
 # define VM_GROWSUP	VM_ARCH_1
@@ -409,6 +409,8 @@ extern unsigned int kobjsize(const void *objp);
 #ifdef CONFIG_64BIT
 #define VM_DROPPABLE_BIT	40
 #define VM_DROPPABLE		BIT(VM_DROPPABLE_BIT)
+#elif defined(CONFIG_PPC32)
+#define VM_DROPPABLE		VM_ARCH_1
 #else
 #define VM_DROPPABLE		VM_NONE
 #endif
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index b63d211bd141..37265977d524 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -143,7 +143,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
 
 #if defined(CONFIG_X86)
 #define __VM_ARCH_SPECIFIC_1 {VM_PAT, "pat" }
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 #define __VM_ARCH_SPECIFIC_1 {VM_SAO, "sao" }
 #elif defined(CONFIG_PARISC)
 #define __VM_ARCH_SPECIFIC_1 {VM_GROWSUP, "growsup" }
@@ -165,7 +165,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
 # define IF_HAVE_UFFD_MINOR(flag, name)
 #endif
 
-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
 #else
 # define IF_HAVE_VM_DROPPABLE(flag, name)
-- 
2.44.0
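The preprocessor selection that patch 1/5 adds to include/linux/mm.h can be tried outside the kernel. What follows is a minimal userspace sketch, not kernel code: CONFIG_PPC32 is defined by hand to stand in for the kernel configuration, the helper vm_droppable() is hypothetical, and the value 0x01000000 for VM_ARCH_1 is an assumption about the kernel header rather than something stated in the patch.

```c
/* Assumption: stand-ins for kernel config symbols, set by hand here. */
#define CONFIG_PPC32 1
/* #define CONFIG_64BIT 1 */

#define VM_NONE   0x00000000ULL
#define VM_ARCH_1 0x01000000ULL	/* assumed value of the arch-specific flag */

/* Same three-way choice as in the patch: 64-bit, powerpc/32, everything else. */
#ifdef CONFIG_64BIT
#define VM_DROPPABLE_BIT 40
#define VM_DROPPABLE (1ULL << VM_DROPPABLE_BIT)
#elif defined(CONFIG_PPC32)
#define VM_DROPPABLE VM_ARCH_1
#else
#define VM_DROPPABLE VM_NONE
#endif

/* Hypothetical helper so the selected value can be inspected at run time. */
static unsigned long long vm_droppable(void)
{
	return VM_DROPPABLE;
}
```

With CONFIG_PPC32 defined as above, vm_droppable() yields the VM_ARCH_1 value. Swapping the two config defines selects bit 40 instead, which does not fit in a 32-bit flags word; that is why powerpc/32 has to reuse an existing bit.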
[PATCH v4 2/5] powerpc/vdso32: Add crtsavres
Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix build failure") added _restgpr_31_x to the vdso for gettimeofday, but the work on getrandom shows that we will need more of those functions.

Remove _restgpr_31_x and link in crtsavres.o so that we get all save/restore functions when optimising the kernel for size.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/vdso/Makefile       |  5 -
 arch/powerpc/kernel/vdso/gettimeofday.S | 13 -
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index 1425b6edc66b..c07a425b8f78 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -43,6 +43,7 @@ else
 endif
 
 targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o
+targets += crtsavres-32.o
 obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32))
 targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
@@ -68,7 +69,7 @@ targets += vdso64.lds
 CPPFLAGS_vdso64.lds += -P -C
 
 # link rule for the .so file, .lds has to be first
-$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o FORCE
+$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/crtsavres-32.o FORCE
 	$(call if_changed,vdso32ld_and_check)
 $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE
 	$(call if_changed,vdso64ld_and_check)
@@ -76,6 +77,8 @@ $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o
 # assembly rules for the .S files
 $(obj-vdso32): %-32.o: %.S FORCE
 	$(call if_changed_dep,vdso32as)
+$(obj)/crtsavres-32.o: %-32.o: $(srctree)/arch/powerpc/lib/crtsavres.S FORCE
+	$(call if_changed_dep,vdso32as)
 $(obj)/vgettimeofday-32.o: %-32.o: %.c FORCE
 	$(call if_changed_dep,vdso32cc)
 $(obj-vdso64): %-64.o: %.S FORCE
diff --git a/arch/powerpc/kernel/vdso/gettimeofday.S b/arch/powerpc/kernel/vdso/gettimeofday.S
index 48fc6658053a..67254ac9c8bb 100644
--- a/arch/powerpc/kernel/vdso/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso/gettimeofday.S
@@ -118,16 +118,3 @@ V_FUNCTION_END(__kernel_clock_getres)
 V_FUNCTION_BEGIN(__kernel_time)
 	cvdso_call __c_kernel_time call_time=1
 V_FUNCTION_END(__kernel_time)
-
-/* Routines for restoring integer registers, called by the compiler. */
-/* Called with r11 pointing to the stack header word of the caller of the */
-/* function, just beyond the end of the integer restore area. */
-#ifndef __powerpc64__
-_GLOBAL(_restgpr_31_x)
-_GLOBAL(_rest32gpr_31_x)
-	lwz	r0,4(r11)
-	lwz	r31,-4(r11)
-	mtlr	r0
-	mr	r1,r11
-	blr
-#endif
-- 
2.44.0
[PATCH v4 0/5] Wire up getrandom() vDSO implementation on powerpc
This series wires up getrandom() vDSO implementation on powerpc. Tested on PPC32 on real hardware. Tested on PPC64 (both BE and LE) on QEMU: Performance on powerpc 885: ~# ./vdso_test_getrandom bench-single vdso: 2500 times in 62.938002291 seconds libc: 2500 times in 535.581916866 seconds syscall: 2500 times in 531.525042806 seconds Performance on powerpc 8321: ~# ./vdso_test_getrandom bench-single vdso: 2500 times in 16.899318858 seconds libc: 2500 times in 131.050596522 seconds syscall: 2500 times in 129.794790389 seconds Performance on QEMU pseries: ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 4.97162 seconds libc: 2500 times in 75.516749981 seconds syscall: 2500 times in 86.842242014 seconds Changes in v4: - Rebased on recent random git tree (963233ff0133) (The new tree includes selftests fixes) - Read/write counter in native byte order - Don't use anymore compat macros to write output - Fixed selftests build failure with patch 4 (without patch 5) on little endian on PPC64 - Implement a __kernel_getrandom() stub returning ENOSYS on ppc64 in patch 4 (without patch 5) to make selftests happy. Changes in v3: - Rebased on recent random git tree (0c7e00e22c21) - Fixed build failures reported by robots around VM_DROPPABLE - Fixed crash on PPC64 due to clobbered r13 by not using r13 anymore (saving it was not enough for signals). - Split final patch in two, first for PPC32, second for PPC64 - Moved selftest fixes out of this series Changes in v2: - Define VM_DROPPABLE for powerpc/32 - Fixes generic vDSO getrandom headers to enable CONFIG_COMPAT build. 
- Fixed size of generation counter - Fixed selftests to work on non x86 architectures Christophe Leroy (5): mm: Define VM_DROPPABLE for powerpc/32 powerpc/vdso32: Add crtsavres powerpc/vdso: Refactor CFLAGS for CVDSO build powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32 powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64 arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/include/asm/vdso/getrandom.h| 54 arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c| 1 + arch/powerpc/kernel/vdso/Makefile| 57 ++-- arch/powerpc/kernel/vdso/getrandom.S | 58 arch/powerpc/kernel/vdso/gettimeofday.S | 13 - arch/powerpc/kernel/vdso/vdso32.lds.S| 1 + arch/powerpc/kernel/vdso/vdso64.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 320 +++ arch/powerpc/kernel/vdso/vgetrandom.c| 14 + fs/proc/task_mmu.c | 4 +- include/linux/mm.h | 4 +- include/trace/events/mmflags.h | 4 +- tools/testing/selftests/vDSO/Makefile| 2 +- 17 files changed, 501 insertions(+), 43 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c -- 2.44.0
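The benchmark figures quoted in the cover letter are easier to compare as ratios. The sketch below only restates that arithmetic; the helper name speedup() is hypothetical, and the constants are the PPC32 hardware timings copied from the cover letter (the QEMU figures are left out because the cover letter itself says they are not precise).

```c
/* Timings in seconds for 2500 calls, copied from the cover letter. */
static const double ppc885_vdso     = 62.938002291;
static const double ppc885_libc     = 535.581916866;
static const double ppc885_syscall  = 531.525042806;
static const double ppc8321_vdso    = 16.899318858;
static const double ppc8321_libc    = 131.050596522;
static const double ppc8321_syscall = 129.794790389;

/* How many times faster the quicker path is than the slower one. */
static double speedup(double slow_seconds, double fast_seconds)
{
	return slow_seconds / fast_seconds;
}
```

For example, speedup(ppc885_libc, ppc885_vdso) comes out at roughly 8.5 and speedup(ppc8321_syscall, ppc8321_vdso) at roughly 7.7, so on both boards the vDSO path is on the order of eight times faster than going through the syscall.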
Re: [PATCH 0/2] mm: make copy_to_kernel_nofault() not fault on user addresses
On 02/09/2024 at 07:31, Omar Sandoval wrote:

From: Omar Sandoval

Hi,

I hit a case where copy_to_kernel_nofault() will fault (lol): if the destination address is in userspace and x86 Supervisor Mode Access Prevention is enabled. Patch 2 has the details and the fix. Patch 1 renames a helper function so that its use in patch 2 makes more sense. If the rename is too intrusive, I can drop it.

The name of the function is "copy_to_kernel". If the destination is a user address, it is not a copy to kernel but a copy to user, and you already have the function copy_to_user() for that. copy_to_user() properly handles SMAP.

Christophe

Thanks,
Omar

Omar Sandoval (2):
  mm: rename copy_from_kernel_nofault_allowed() to copy_kernel_nofault_allowed()
  mm: make copy_to_kernel_nofault() not fault on user addresses

 arch/arm/mm/fault.c         |  2 +-
 arch/loongarch/mm/maccess.c |  2 +-
 arch/mips/mm/maccess.c      |  2 +-
 arch/parisc/lib/memcpy.c    |  2 +-
 arch/powerpc/mm/maccess.c   |  2 +-
 arch/um/kernel/maccess.c    |  2 +-
 arch/x86/mm/maccess.c       |  4 ++--
 include/linux/uaccess.h     |  2 +-
 mm/maccess.c                | 10 ++
 9 files changed, 15 insertions(+), 13 deletions(-)

-- 
2.46.0
[PATCH] selftests: vDSO: Also test counter in vdso_test_chacha
The chacha vDSO selftest doesn't check the way the counter is handled by __arch_chacha20_blocks_nostack(). It indirectly checks that the counter is written on exit and read back on new entry, but it doesn't check that the format is correct. This let an erroneous implementation on powerpc, where the counter was written and read in the wrong byte order, go unnoticed.

Also, the counter uses two words, but the tests start with a zero counter and use a small number of blocks, so at the end the upper part of the counter is always 0 and is never checked.

Add a verification of the counter's content in addition to the verification of the output.

Also add two tests where the counter crosses the u32 upper limit. The first test verifies that the function properly writes back the upper word, the second test verifies that the function properly reads back the upper word.

While at it, remove 'nonce', which is not used anymore after the replacement of libsodium by an open-coded chacha implementation.

Signed-off-by: Christophe Leroy
---
 .../testing/selftests/vDSO/vdso_test_chacha.c | 39 ++-
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/vDSO/vdso_test_chacha.c b/tools/testing/selftests/vDSO/vdso_test_chacha.c
index 9d18d49a82f8..ed6cf372d9ee 100644
--- a/tools/testing/selftests/vDSO/vdso_test_chacha.c
+++ b/tools/testing/selftests/vDSO/vdso_test_chacha.c
@@ -17,11 +17,12 @@ static uint32_t rol32(uint32_t word, unsigned int shift)
 	return (word << (shift & 31)) | (word >> ((-shift) & 31));
 }
 
-static void reference_chacha20_blocks(uint8_t *dst_bytes, const uint32_t *key, size_t nblocks)
+static void reference_chacha20_blocks(uint8_t *dst_bytes, const uint32_t *key, uint32_t *counter, size_t nblocks)
 {
 	uint32_t s[16] = {
 		0x61707865U, 0x3320646eU, 0x79622d32U, 0x6b206574U,
-		key[0], key[1], key[2], key[3], key[4], key[5], key[6], key[7]
+		key[0], key[1], key[2], key[3], key[4], key[5], key[6], key[7],
+		counter[0], counter[1],
 	};
 
 	while (nblocks--) {
@@ -52,6 +53,8 @@ static void reference_chacha20_blocks(uint8_t *dst_bytes, const uint32_t *key, s
 		if (!++s[12])
 			++s[13];
 	}
+	counter[0] = s[12];
+	counter[1] = s[13];
 }
 
 typedef uint8_t u8;
@@ -66,8 +69,7 @@ typedef uint64_t u64;
 int main(int argc, char *argv[])
 {
 	enum { TRIALS = 1000, BLOCKS = 128, BLOCK_SIZE = 64 };
-	static const uint8_t nonce[8] = { 0 };
-	uint32_t counter[2];
+	uint32_t counter1[2], counter2[2];
 	uint32_t key[8];
 	uint8_t output1[BLOCK_SIZE * BLOCKS], output2[BLOCK_SIZE * BLOCKS];
 
@@ -84,17 +86,36 @@ int main(int argc, char *argv[])
 			printf("getrandom() failed!\n");
 			return KSFT_SKIP;
 		}
-		reference_chacha20_blocks(output1, key, BLOCKS);
+		memset(counter1, 0, sizeof(counter1));
+		reference_chacha20_blocks(output1, key, counter1, BLOCKS);
 		for (unsigned int split = 0; split < BLOCKS; ++split) {
 			memset(output2, 'X', sizeof(output2));
-			memset(counter, 0, sizeof(counter));
+			memset(counter2, 0, sizeof(counter2));
 			if (split)
-				__arch_chacha20_blocks_nostack(output2, key, counter, split);
-			__arch_chacha20_blocks_nostack(output2 + split * BLOCK_SIZE, key, counter, BLOCKS - split);
-			if (memcmp(output1, output2, sizeof(output1)))
+				__arch_chacha20_blocks_nostack(output2, key, counter2, split);
+			__arch_chacha20_blocks_nostack(output2 + split * BLOCK_SIZE, key, counter2, BLOCKS - split);
+			if (memcmp(output1, output2, sizeof(output1)) ||
+			    memcmp(counter1, counter2, sizeof(counter1)))
 				return KSFT_FAIL;
 		}
 	}
+	memset(counter1, 0, sizeof(counter1));
+	counter1[0] = (uint32_t)-BLOCKS + 2;
+	memset(counter2, 0, sizeof(counter2));
+	counter2[0] = (uint32_t)-BLOCKS + 2;
+
+	reference_chacha20_blocks(output1, key, counter1, BLOCKS);
+	__arch_chacha20_blocks_nostack(output2, key, counter2, BLOCKS);
+	if (memcmp(output1, output2, sizeof(output1)) ||
+	    memcmp(counter1, counter2, sizeof(counter1)))
+		return KSFT_FAIL;
+
+	reference_chacha20_blocks(output1, key, counter1, BLOCKS);
+	__arch_chacha20_blocks_nostack(output2, key, counter2, BLOCKS);
+	if (memcmp(output1, output2, sizeof(output1)) ||
+	    memcmp(counter1, counter2, sizeof(counter1)))
+		return KSFT_FAIL;
+
 	ksft_test_result_pass("chacha: PASS\n");
 	return KSFT_PASS;
 }
-- 
2.44.0
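The two extra tests in this patch rely on the counter starting just below the u32 limit. The sketch below isolates that arithmetic under stated assumptions: counter_near_wrap() and counter_advance() are hypothetical helper names, not functions from the selftest, and the increment simply mirrors the `if (!++s[12]) ++s[13];` step of the reference implementation quoted in the diff.

```c
#include <stdint.h>

enum { BLOCKS = 128 };

/* Start value used by the new tests: 126 below the point where the low word wraps. */
static void counter_near_wrap(uint32_t counter[2])
{
	counter[0] = (uint32_t)-BLOCKS + 2;	/* 0xFFFFFF82 */
	counter[1] = 0;
}

/* Advance a 64-bit block counter kept as two 32-bit words, low word first. */
static void counter_advance(uint32_t counter[2], uint32_t nblocks)
{
	while (nblocks--) {
		if (!++counter[0])	/* low word wrapped to zero: carry */
			++counter[1];
	}
}
```

After the first run of BLOCKS blocks the low word has wrapped to 2 and the high word is 1, which is what the first added test observes being written back. The second run starts from a non-zero high word, so an implementation that ignores the upper word on entry produces different output from the reference.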
[PATCH] selftests: vDSO: Build vDSO tests with O2 optimisation
Without -O2, the generated code for testing the chacha function is awful. GCC even implements rol32() as a function that is 20 instructions long, instead of just using the rotlwi instruction.

~# time ./vdso_test_chacha
TAP version 13
1..1
ok 1 chacha: PASS
real	0m 37.16s
user	0m 36.89s
sys	0m 0.26s

Several other selftest directories add -O2, and the kernel is also always built with optimisation active. Do the same for vDSO selftests.

With this patch the time is reduced by approx 15%.

~# time ./vdso_test_chacha
TAP version 13
1..1
ok 1 chacha: PASS
real	0m 32.09s
user	0m 31.86s
sys	0m 0.22s

Signed-off-by: Christophe Leroy
---
 tools/testing/selftests/vDSO/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index cfb7c281b22c..96f25aa2f84e 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -13,7 +13,7 @@ TEST_GEN_PROGS += vdso_test_correctness
 TEST_GEN_PROGS += vdso_test_getrandom
 TEST_GEN_PROGS += vdso_test_chacha
 
-CFLAGS := -std=gnu99
+CFLAGS := -std=gnu99 -O2
 
 ifeq ($(CONFIG_X86_32),y)
 LDLIBS += -lgcc_s
-- 
2.44.0
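To make the rol32() remark concrete, here is a small sketch of the rotate helper as it appears in vdso_test_chacha.c, together with the ChaCha quarter round that calls it four times. The quarterround() function is written here for illustration and is an assumption about how the reference code is structured, not a copy of the selftest.

```c
#include <stdint.h>

/* Rotate left by `shift` bits; written so that a shift of 0 is well defined. */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
	return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* One ChaCha quarter round on four state words (illustrative helper). */
static void quarterround(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d = rol32(*d ^ *a, 16);
	*c += *d; *b = rol32(*b ^ *c, 12);
	*a += *b; *d = rol32(*d ^ *a, 8);
	*c += *d; *b = rol32(*b ^ *c, 7);
}
```

Each 64-byte block runs this quarter round 80 times (10 double rounds of 8), so a rol32() that is emitted as an out-of-line function rather than a single rotate instruction is paid for 320 times per block. As a check, the inputs a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567 give a = 0xea2a92f4, b = 0xcb1cf8ce, c = 0x4581472e, d = 0x5881c4bb; these values were worked through by hand and are believed to match the quarter-round test vector in RFC 7539.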
Re: [PATCH v3 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
On 30/08/2024 at 18:17, Jason A. Donenfeld wrote:

On Fri, Aug 30, 2024 at 05:57:08PM +0200, Christophe Leroy wrote:

@@ -14,6 +14,10 @@ ifeq ($(uname_M),x86_64)
 TEST_GEN_PROGS += vdso_test_getrandom
 TEST_GEN_PROGS += vdso_test_chacha
 endif
+ifeq ($(ARCH),powerpc)
+TEST_GEN_PROGS += vdso_test_getrandom
+TEST_GEN_PROGS += vdso_test_chacha
+endif

FYI, as of [1], you should now be able to add powerpc to the filter list instead of having to duplicate a new stanza:

[1] https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/commit/?id=bbaae98172ed284fc0d5d39cc0d68f5d06164f06

I'm a bit sceptical about that commit. IIUC you are changing the meaning of $ARCH. How does that fit with the $ARCH we give when we cross-build, or with the ARCH which is set by the top-level Makefile in tools/testing/selftests?

Also, wouldn't there be a way to use scripts/subarch.include instead of open-coding?

After all, would it be a problem to build it even for i386? It should now be ignored anyway with your new commit f78280b1a3ce ("selftests: vDSO: skip getrandom test if architecture is unsupported").

Christophe
Re: [PATCH v3 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
On 30/08/2024 at 18:42, Christophe Leroy wrote:

On 30/08/2024 at 18:14, Jason A. Donenfeld wrote:

On Fri, Aug 30, 2024 at 05:57:08PM +0200, Christophe Leroy wrote:

+ *	r5: 8-byte counter input/output (saved on stack)
+ *
+ *	r14-r15: counter
+ */
+SYM_FUNC_START(__arch_chacha20_blocks_nostack)
+	stwu	r1, -96(r1)
+	stw	r5, 20(r1)
+	stmw	r14, 24(r1)
+	li	r31, 4
+	LWZX_LE	r14, 0, r5
+	LWZX_LE	r15, r31, r5

Why swap endian on the counter?

Unlike the keys, the counter is passed to the function as a u8 *, not as a u64 *, so I thought it was raw data in little-endian order, same as when using Sodium. Is it wrong?

Hmm. I looked again and it seems it is already a u32 *. Looks like I misread the 8-byte comment. Or I did it right in the beginning, then swapped it at the same time as I swapped the keys after my first test, when the selftest was using Sodium. I can't remember.

I'll fix it.

Christophe
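The byte-order question above comes down to how the two counter words are loaded. The following sketch contrasts the two readings under stated assumptions: load_native32() models a plain word load such as lwz, load_le32() models what a byte-reversed load (the LWZX_LE macro in the v3 patch) is assumed to do on a big-endian CPU, and both helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Native-order load: reads the word exactly as the C caller stored it. */
static uint32_t load_native32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Little-endian load: interprets the four bytes as LE regardless of the host. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
}
```

Since the caller hands over the counter as 32-bit words, the native load always returns the value the caller wrote. On a big-endian machine a counter of 1 sits in memory as the bytes 00 00 00 01, and load_le32() reads that as 0x01000000, which is the mismatch the v4 series removes by reading and writing the counter in native byte order.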
Re: [PATCH v3 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
On 30/08/2024 at 18:14, Jason A. Donenfeld wrote:

On Fri, Aug 30, 2024 at 05:57:08PM +0200, Christophe Leroy wrote:

+ *	r5: 8-byte counter input/output (saved on stack)
+ *
+ *	r14-r15: counter
+ */
+SYM_FUNC_START(__arch_chacha20_blocks_nostack)
+	stwu	r1, -96(r1)
+	stw	r5, 20(r1)
+	stmw	r14, 24(r1)
+	li	r31, 4
+	LWZX_LE	r14, 0, r5
+	LWZX_LE	r15, r31, r5

Why swap endian on the counter?

Unlike the keys, the counter is passed to the function as a u8 *, not as a u64 *, so I thought it was raw data in little-endian order, same as when using Sodium. Is it wrong?

Christophe
[PATCH v3 5/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64
Extend getrandom() vDSO implementation to powerpc64. Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig. The results are not precise as it is QEMU on an x86 laptop, but no need to be precise to see the benefit. ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 4.97162 seconds libc: 2500 times in 75.516749981 seconds syscall: 2500 times in 86.842242014 seconds ~ # ./vdso_test_getrandom bench-single vdso: 2500 times in 6.473814156 seconds libc: 2500 times in 73.875109463 seconds syscall: 2500 times in 71.805066229 seconds Signed-off-by: Christophe Leroy --- v3: New (split out of previous patch) --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/kernel/vdso/Makefile| 10 +- arch/powerpc/kernel/vdso/getrandom.S | 8 ++ arch/powerpc/kernel/vdso/vdso64.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 98 6 files changed, 116 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 54b270ef18b1..b45452ac4a73 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,7 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT - select VDSO_GETRANDOM if PPC32 + select VDSO_GETRANDOM # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 17a77d47ed6d..42a51a993d94 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -6,7 +6,7 @@ #include -#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) && !defined(BUILD_VDSO) #include #include diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index fa0b9b3b51af..56fb1633529a 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -9,6 +9,7 @@ obj-vdso32 = sigtramp32-32.o gettimeofday-32.o datapage-32.o cacheflush-32.o not obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o note-64.o getcpu-64.o obj-vdso32 += getrandom-32.o vgetrandom-chacha-32.o +obj-vdso64 += getrandom-64.o vgetrandom-chacha-64.o ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y) @@ -21,6 +22,7 @@ endif ifneq ($(c-getrandom-y),) CFLAGS_vgetrandom-32.o += -include $(c-getrandom-y) + CFLAGS_vgetrandom-64.o += -include $(c-getrandom-y) $(call cc-option, -ffixed-r30) endif # Build rules @@ -34,10 +36,10 @@ endif targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o vgetrandom-32.o targets += crtsavres-32.o obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32)) -targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o +targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o vgetrandom-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64)) -ccflags-y := -fno-common -fno-builtin +ccflags-y := -fno-common -fno-builtin -DBUILD_VDSO ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN) ccflags-y += $(call cc-option, -fno-stack-protector) ccflags-y += -DDISABLE_BRANCH_PROFILING @@ -71,7 +73,7 @@ CPPFLAGS_vdso64.lds += -P -C # link rule for the .so file, .lds has to be first $(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/vgetrandom-32.o $(obj)/crtsavres-32.o FORCE $(call if_changed,vdso32ld_and_check) -$(obj)/vdso64.so.dbg: 
$(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE +$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o $(obj)/vgetrandom-64.o FORCE $(call if_changed,vdso64ld_and_check) # assembly rules for the .S files @@ -87,6 +89,8 @@ $(obj-vdso64): %-64.o: %.S FORCE $(call if_changed_dep,vdso64as) $(obj)/vgettimeofday-64.o: %-64.o: %.c FORCE $(call if_changed_dep,cc_o_c) +$(obj)/vgetrandom-64.o: %-64.o: %.c FORCE + $(call if_changed_dep,cc_o_c) # Generate VDSO offsets using helper script gen-vdso32sym := $(src)/gen_vdso32_offsets.sh diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S index 21773ef3fc1d..a957cd2b2b03 100644 --- a/arch/powerpc/kernel/vdso/getrandom.S +++ b/arch/powerpc/kernel/vdso/getrandom.S @@ -27,10 +27,18 @@ .cfi_adjust_cfa_offset PPC_MIN_STKFRM PPC_STL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF +#ifdef __powerpc64__ + PPC_STL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT +#endif get_datapager8 addir8, r8, VDSO_RNG_DATA_OFFSET bl CFUNC(DOTSYM(\funct)) PPC_LL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) +#ifdef __powerpc64__ + PPC_LL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_restore r2
[PATCH v3 3/5] powerpc/vdso: Refactor CFLAGS for CVDSO build
In order to avoid too much duplication when we add new VDSO
functionalities in C like getrandom, refactor common CFLAGS.

Signed-off-by: Christophe Leroy
---
v3: Also refactor removed flags
---
 arch/powerpc/kernel/vdso/Makefile | 32 +--
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index c07a425b8f78..67fe79d26fae 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -10,28 +10,11 @@ obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o not

 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-32.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-32.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-32.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-32.o = $(CC_FLAGS_FTRACE)
-  CFLAGS_REMOVE_vgettimeofday-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
-  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
-  # an unused command line flag warning for this file.
-  ifdef CONFIG_CC_IS_CLANG
-  CFLAGS_REMOVE_vgettimeofday-32.o += -fno-stack-clash-protection
-  endif
-  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y)
-  CFLAGS_vgettimeofday-64.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -fno-stack-protector)
-  CFLAGS_vgettimeofday-64.o += -DDISABLE_BRANCH_PROFILING
-  CFLAGS_vgettimeofday-64.o += -ffreestanding -fasynchronous-unwind-tables
-  CFLAGS_REMOVE_vgettimeofday-64.o = $(CC_FLAGS_FTRACE)
 # Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true
 # by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is
 # compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code
 # generation is minimal, it will just use r29 instead.
-  CFLAGS_vgettimeofday-64.o += $(call cc-option, -ffixed-r30)
+  CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y) $(call cc-option, -ffixed-r30)
 endif

 # Build rules
@@ -49,6 +32,11 @@ targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))

 ccflags-y := -fno-common -fno-builtin
+ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+ccflags-y += $(call cc-option, -fno-stack-protector)
+ccflags-y += -DDISABLE_BRANCH_PROFILING
+ccflags-y += -ffreestanding -fasynchronous-unwind-tables
+ccflags-remove-y := $(CC_FLAGS_FTRACE)
 ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack $(CLANG_FLAGS)
 ldflags-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld)
 ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
@@ -57,6 +45,12 @@ ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WAR
 ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) -Wa$(comma)%, $(KBUILD_CFLAGS))

 CC32FLAGS := -m32
+CC32FLAGSREMOVE := -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc
+  # This flag is supported by clang for 64-bit but not 32-bit so it will cause
+  # an unused command line flag warning for this file.
+ifdef CONFIG_CC_IS_CLANG
+CC32FLAGSREMOVE += -fno-stack-clash-protection
+endif
 LD32FLAGS := -Wl,-soname=linux-vdso32.so.1
 AS32FLAGS := -D__VDSO32__

@@ -105,7 +99,7 @@ quiet_cmd_vdso32ld_and_check = VDSO32L $@
 quiet_cmd_vdso32as = VDSO32A $@
       cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) $(AS32FLAGS) -c -o $@ $<
 quiet_cmd_vdso32cc = VDSO32C $@
-      cmd_vdso32cc = $(VDSOCC) $(c_flags) $(CC32FLAGS) -c -o $@ $<
+      cmd_vdso32cc = $(VDSOCC) $(filter-out $(CC32FLAGSREMOVE), $(c_flags)) $(CC32FLAGS) -c -o $@ $<

 quiet_cmd_vdso64ld_and_check = VDSO64L $@
       cmd_vdso64ld_and_check = $(VDSOCC) $(ldflags-y) $(LD64FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^); $(cmd_vdso_check)
--
2.44.0
[PATCH v3 4/5] powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
To be consistent with other VDSO functions, the function is called
__kernel_getrandom().

The __arch_chacha20_blocks_nostack() function is implemented basically
with 32-bit operations. It performs 4 QUARTERROUND operations in
parallel. There are enough registers to avoid using the stack:

On input:
	r3: output bytes
	r4: 32-byte key input
	r5: 8-byte counter input/output
	r6: number of 64-byte blocks to write to output

During operation:
	stack: pointer to counter (r5) and non-volatile registers (r14-r31)
	r0: counter of blocks (initialised with r6)
	r4: Value '4' after key has been read, used for indexing
	r5-r12: key
	r14-r15: block counter
	r16-r31: chacha state

At the end:
	r0, r6-r12: Zeroised
	r5, r14-r31: Restored

Performance on powerpc 885 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 62.938002291 seconds
	   libc: 2500 times in 535.581916866 seconds
	syscall: 2500 times in 531.525042806 seconds

Performance on powerpc 8321 (using kernel selftest):
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 16.899318858 seconds
	   libc: 2500 times in 131.050596522 seconds
	syscall: 2500 times in 129.794790389 seconds

Signed-off-by: Christophe Leroy
---
v3:
- Preserve r13, implies saving r5 on stack
- Split PPC64 implementation out.
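For readers unfamiliar with the term, the QUARTERROUND mentioned in the
description above is the basic ChaCha mixing step specified in RFC 8439.
Below is a minimal C sketch of it. It is not the assembly from this
patch, and the names rotl32(), chacha_quarterround() and
chacha_columnround() are made up for illustration. The four calls in
chacha_columnround() touch disjoint state words, which is presumably
what allows the assembly to interleave four quarter-rounds over the
state held in r16-r31.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter-round on state words a, b, c, d (RFC 8439, section 2.1). */
static void chacha_quarterround(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/*
 * Column round: four quarter-rounds on disjoint words. They have no data
 * dependency on each other, so they can be computed side by side.
 */
static void chacha_columnround(uint32_t x[16])
{
	chacha_quarterround(x, 0, 4,  8, 12);
	chacha_quarterround(x, 1, 5,  9, 13);
	chacha_quarterround(x, 2, 6, 10, 14);
	chacha_quarterround(x, 3, 7, 11, 15);
}
```

With the quarter-round test vector from RFC 8439 section 2.1.1
(a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567), the
sketch yields a = 0xea2a92f4, b = 0xcb1cf8ce, c = 0x4581472e and
d = 0x5881c4bb.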
--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/asm-compat.h| 8 + arch/powerpc/include/asm/vdso/getrandom.h| 54 + arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c| 1 + arch/powerpc/kernel/vdso/Makefile| 12 +- arch/powerpc/kernel/vdso/getrandom.S | 50 + arch/powerpc/kernel/vdso/vdso32.lds.S| 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 201 +++ arch/powerpc/kernel/vdso/vgetrandom.c| 14 ++ tools/arch/powerpc/vdso | 1 + tools/testing/selftests/vDSO/Makefile| 4 + 13 files changed, 353 insertions(+), 2 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c create mode 12 tools/arch/powerpc/vdso diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d7b09b064a8a..54b270ef18b1 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,6 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT + select VDSO_GETRANDOM if PPC32 # # Please keep this list sorted alphabetically. 
# diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h index b0b209c1df50..cce8c31b1b33 100644 --- a/arch/powerpc/include/asm/asm-compat.h +++ b/arch/powerpc/include/asm/asm-compat.h @@ -59,4 +59,12 @@ #endif +#ifdef __BIG_ENDIAN__ +#define LWZX_LEstringify_in_c(lwbrx) +#define STWX_LEstringify_in_c(stwbrx) +#else +#define LWZX_LEstringify_in_c(lwzx) +#define STWX_LEstringify_in_c(stwx) +#endif + #endif /* _ASM_POWERPC_ASM_COMPAT_H */ diff --git a/arch/powerpc/include/asm/vdso/getrandom.h b/arch/powerpc/include/asm/vdso/getrandom.h new file mode 100644 index ..501d6bb14e8a --- /dev/null +++ b/arch/powerpc/include/asm/vdso/getrandom.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Christophe Leroy , CS GROUP France + */ +#ifndef _ASM_POWERPC_VDSO_GETRANDOM_H +#define _ASM_POWERPC_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +static __always_inline int do_syscall_3(const unsigned long _r0, const unsigned long _r3, + const unsigned long _r4, const unsigned long _r5) +{ + register long r0 asm("r0") = _r0; + register unsigned long r3 asm("r3") = _r3; + register unsigned long r4 asm("r4") = _r4; + register unsigned long r5 asm("r5") = _r5; + register int ret asm ("r3"); + + asm volatile( + " sc\n" + " bns+1f\n" + " neg %0, %0\n" + "1:\n" + : "=r" (ret), "+r" (r4), "+r" (r5), "+r" (r0) + : "r" (r3) + : "memory", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "cr0", "ctr"); + + return ret; +} + +/** + * getrandom_syscall - Invoke the getrandom() sy
[PATCH v3 2/5] powerpc/vdso32: Add crtsavres
Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix
build failure") added _restgpr_31_x to the vdso for gettimeofday, but
the work on getrandom shows that we will need more of those functions.

Remove _restgpr_31_x and link in crtsavres.o so that we get all
save/restore functions when optimising the kernel for size.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/vdso/Makefile       |  5 -
 arch/powerpc/kernel/vdso/gettimeofday.S | 13 -
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
index 1425b6edc66b..c07a425b8f78 100644
--- a/arch/powerpc/kernel/vdso/Makefile
+++ b/arch/powerpc/kernel/vdso/Makefile
@@ -43,6 +43,7 @@ else
 endif

 targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o
+targets += crtsavres-32.o
 obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32))
 targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
@@ -68,7 +69,7 @@ targets += vdso64.lds
 CPPFLAGS_vdso64.lds += -P -C

 # link rule for the .so file, .lds has to be first
-$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o FORCE
+$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/crtsavres-32.o FORCE
 	$(call if_changed,vdso32ld_and_check)
 $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE
 	$(call if_changed,vdso64ld_and_check)
@@ -76,6 +77,8 @@ $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o
 # assembly rules for the .S files
 $(obj-vdso32): %-32.o: %.S FORCE
 	$(call if_changed_dep,vdso32as)
+$(obj)/crtsavres-32.o: %-32.o: $(srctree)/arch/powerpc/lib/crtsavres.S FORCE
+	$(call if_changed_dep,vdso32as)
 $(obj)/vgettimeofday-32.o: %-32.o: %.c FORCE
 	$(call if_changed_dep,vdso32cc)
 $(obj-vdso64): %-64.o: %.S FORCE
diff --git a/arch/powerpc/kernel/vdso/gettimeofday.S b/arch/powerpc/kernel/vdso/gettimeofday.S
index 48fc6658053a..67254ac9c8bb 100644
--- a/arch/powerpc/kernel/vdso/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso/gettimeofday.S
@@ -118,16 +118,3 @@ V_FUNCTION_END(__kernel_clock_getres)
 V_FUNCTION_BEGIN(__kernel_time)
 	cvdso_call __c_kernel_time call_time=1
 V_FUNCTION_END(__kernel_time)
-
-/* Routines for restoring integer registers, called by the compiler. */
-/* Called with r11 pointing to the stack header word of the caller of the */
-/* function, just beyond the end of the integer restore area. */
-#ifndef __powerpc64__
-_GLOBAL(_restgpr_31_x)
-_GLOBAL(_rest32gpr_31_x)
-	lwz	r0,4(r11)
-	lwz	r31,-4(r11)
-	mtlr	r0
-	mr	r1,r11
-	blr
-#endif
--
2.44.0
[PATCH v3 1/5] mm: Define VM_DROPPABLE for powerpc/32
Commit 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always
lazily freeable mappings") only adds VM_DROPPABLE for 64-bit
architectures.

In order to also use the getrandom vDSO implementation on powerpc/32,
use VM_ARCH_1 for VM_DROPPABLE on powerpc/32. This is possible because
VM_ARCH_1 is used for VM_SAO on powerpc and VM_SAO is only for
powerpc/64.

Signed-off-by: Christophe Leroy
---
v3: Fixed build failure reported by robots.
---
 fs/proc/task_mmu.c             | 4 +++-
 include/linux/mm.h             | 4 +++-
 include/trace/events/mmflags.h | 4 ++--
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5f171ad7b436..3a07e13e2f81 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -987,8 +987,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 #ifdef CONFIG_X86_USER_SHADOW_STACK
 		[ilog2(VM_SHADOW_STACK)] = "ss",
 #endif
-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 		[ilog2(VM_DROPPABLE)] = "dp",
+#endif
+#ifdef CONFIG_64BIT
 		[ilog2(VM_SEALED)] = "sl",
 #endif
 	};
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6549d0979b28..028847f39442 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -359,7 +359,7 @@ extern unsigned int kobjsize(const void *objp);

 #if defined(CONFIG_X86)
 # define VM_PAT	VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 # define VM_SAO	VM_ARCH_1	/* Strong Access Ordering (powerpc) */
 #elif defined(CONFIG_PARISC)
 # define VM_GROWSUP	VM_ARCH_1
@@ -409,6 +409,8 @@ extern unsigned int kobjsize(const void *objp);
 #ifdef CONFIG_64BIT
 #define VM_DROPPABLE_BIT	40
 #define VM_DROPPABLE		BIT(VM_DROPPABLE_BIT)
+#elif defined(CONFIG_PPC32)
+#define VM_DROPPABLE		VM_ARCH_1
 #else
 #define VM_DROPPABLE		VM_NONE
 #endif
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index b63d211bd141..37265977d524 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -143,7 +143,7 @@ IF_HAVE_PG_ARCH_X(arch_3)

 #if defined(CONFIG_X86)
 #define __VM_ARCH_SPECIFIC_1 {VM_PAT, "pat" }
-#elif defined(CONFIG_PPC)
+#elif defined(CONFIG_PPC64)
 #define __VM_ARCH_SPECIFIC_1 {VM_SAO, "sao" }
 #elif defined(CONFIG_PARISC)
 #define __VM_ARCH_SPECIFIC_1 {VM_GROWSUP, "growsup" }
@@ -165,7 +165,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
 # define IF_HAVE_UFFD_MINOR(flag, name)
 #endif

-#ifdef CONFIG_64BIT
+#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
 #else
 # define IF_HAVE_VM_DROPPABLE(flag, name)
--
2.44.0
[PATCH v3 0/5] Wire up getrandom() vDSO implementation on powerpc
This series wires up getrandom() vDSO implementation on powerpc.

Tested on PPC32 on real hardware.
Tested on PPC64 (both BE and LE) on QEMU:

Performance on powerpc 885:
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 62.938002291 seconds
	   libc: 2500 times in 535.581916866 seconds
	syscall: 2500 times in 531.525042806 seconds

Performance on powerpc 8321:
	~# ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 16.899318858 seconds
	   libc: 2500 times in 131.050596522 seconds
	syscall: 2500 times in 129.794790389 seconds

Performance on QEMU pseries:
	~ # ./vdso_test_getrandom bench-single
	   vdso: 2500 times in 4.97162 seconds
	   libc: 2500 times in 75.516749981 seconds
	syscall: 2500 times in 86.842242014 seconds

In order to run selftests, some fixes are needed, see
https://lore.kernel.org/linuxppc-dev/6c5da802e72befecfa09046c489aa45d934d611f.1725020674.git.christophe.le...@csgroup.eu/

Those selftest fixes are independent and are not required to apply and
use this series.

Changes in v3:
- Rebased on recent random git tree (0c7e00e22c21)
- Fixed build failures reported by robots around VM_DROPPABLE
- Fixed crash on PPC64 due to clobbered r13 by not using r13 anymore
  (saving it was not enough for signals).
- Split final patch in two, first for PPC32, second for PPC64
- Moved selftest fixes out of this series

Changes in v2:
- Define VM_DROPPABLE for powerpc/32
- Fixed generic vDSO getrandom headers to enable CONFIG_COMPAT build.
- Fixed size of generation counter
- Fixed selftests to work on non x86 architectures

Christophe Leroy (5):
  mm: Define VM_DROPPABLE for powerpc/32
  powerpc/vdso32: Add crtsavres
  powerpc/vdso: Refactor CFLAGS for CVDSO build
  powerpc/vdso: Wire up getrandom() vDSO implementation on PPC32
  powerpc/vdso: Wire up getrandom() vDSO implementation on PPC64

 arch/powerpc/Kconfig                         |   1 +
 arch/powerpc/include/asm/asm-compat.h        |   8 +
 arch/powerpc/include/asm/mman.h              |   2 +-
 arch/powerpc/include/asm/vdso/getrandom.h    |  54
 arch/powerpc/include/asm/vdso/vsyscall.h     |   6 +
 arch/powerpc/include/asm/vdso_datapage.h     |   2 +
 arch/powerpc/kernel/asm-offsets.c            |   1 +
 arch/powerpc/kernel/vdso/Makefile            |  57 ++--
 arch/powerpc/kernel/vdso/getrandom.S         |  58
 arch/powerpc/kernel/vdso/gettimeofday.S      |  13 -
 arch/powerpc/kernel/vdso/vdso32.lds.S        |   1 +
 arch/powerpc/kernel/vdso/vdso64.lds.S        |   1 +
 arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 299 +++
 arch/powerpc/kernel/vdso/vgetrandom.c        |  14 +
 fs/proc/task_mmu.c                           |   4 +-
 include/linux/mm.h                           |   4 +-
 include/trace/events/mmflags.h               |   4 +-
 tools/arch/powerpc/vdso                      |   1 +
 tools/testing/selftests/vDSO/Makefile        |   4 +
 19 files changed, 492 insertions(+), 42 deletions(-)
 create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h
 create mode 100644 arch/powerpc/kernel/vdso/getrandom.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c
 create mode 12 tools/arch/powerpc/vdso

--
2.44.0
[PATCH 5/5] selftests: vdso: Use parse_vdso.h in vdso_test_abi
Don't duplicate parse_vdso function prototypes, include the header
instead.

Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
Signed-off-by: Christophe Leroy
---
 tools/testing/selftests/vDSO/vdso_test_abi.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/testing/selftests/vDSO/vdso_test_abi.c b/tools/testing/selftests/vDSO/vdso_test_abi.c
index 00034208c4c6..a54424e2336f 100644
--- a/tools/testing/selftests/vDSO/vdso_test_abi.c
+++ b/tools/testing/selftests/vDSO/vdso_test_abi.c
@@ -21,10 +21,7 @@
 #include "../kselftest.h"
 #include "vdso_config.h"
 #include "vdso_call.h"
-
-extern void *vdso_sym(const char *version, const char *name);
-extern void vdso_init_from_sysinfo_ehdr(uintptr_t base);
-extern void vdso_init_from_auxv(void *auxv);
+#include "parse_vdso.h"

 static const char *version;
 static const char **name;
--
2.44.0
[PATCH 4/5] selftests: vdso: Fix the way vDSO functions are called for powerpc
The vdso_test_correctness test fails on powerpc:

~ # ./vdso_test_correctness
...
[RUN]	Testing clock_gettime for clock CLOCK_REALTIME_ALARM (8)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22
[RUN]	Testing clock_gettime for clock CLOCK_BOOTTIME_ALARM (9)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22
[RUN]	Testing clock_gettime for clock CLOCK_SGI_CYCLE (10)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22
...
[RUN]	Testing clock_gettime for clock invalid (-1)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22
[RUN]	Testing clock_gettime for clock invalid (-2147483648)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22
[RUN]	Testing clock_gettime for clock invalid (2147483647)...
[FAIL]	No such clock, but __vdso_clock_gettime returned 22

On powerpc, a call to a VDSO function is not an ordinary C function
call. Unlike several architectures, which return a negative error code
in case of an error, powerpc sets CR[SO] and returns the error code as
a positive value.

Define and use a macro called VDSO_CALL() which takes a pointer to the
function to call, the number of arguments and the arguments.

Also update ABI vdso documentation to reflect this subtlety.

Provide a specific version of VDSO_CALL() for powerpc that negates the
error code on return when CR[SO] is set.
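The return convention described above can be modelled in portable C.
This is only a sketch of the idea, not the VDSO_CALL() macro from this
patch: the helper name ppc_vdso_result() is hypothetical, and the CR[SO]
flag is passed in as a plain boolean because reading the real condition
register needs powerpc inline assembly.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: convert a powerpc-style vDSO result (positive
 * error code in r3, failure signalled through CR[SO]) into the
 * "negative error code on failure" form the selftests expect.
 */
static long ppc_vdso_result(long r3, bool cr_so)
{
	return cr_so ? -r3 : r3;
}
```

Applied to the failing output above, a raw return value of 22 (EINVAL)
with CR[SO] set becomes -22, while a successful call is passed through
unchanged.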
Fixes: c7e5789b24d3 ("kselftest: Move test_vdso to the vDSO test suite") Fixes: 2e9a97256616 ("selftests: vdso: Add a selftest for vDSO getcpu()") Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest") Fixes: b2f1c3db2887 ("kselftest: Extend vdso correctness test to clock_gettime64") Fixes: 4920a2590e91 ("selftests/vDSO: add tests for vgetrandom") Signed-off-by: Christophe Leroy --- Documentation/ABI/stable/vdso | 8 ++- tools/testing/selftests/vDSO/vdso_call.h | 70 +++ tools/testing/selftests/vDSO/vdso_test_abi.c | 9 +-- .../selftests/vDSO/vdso_test_correctness.c| 15 ++-- .../testing/selftests/vDSO/vdso_test_getcpu.c | 3 +- .../selftests/vDSO/vdso_test_getrandom.c | 5 +- .../selftests/vDSO/vdso_test_gettimeofday.c | 3 +- 7 files changed, 95 insertions(+), 18 deletions(-) create mode 100644 tools/testing/selftests/vDSO/vdso_call.h diff --git a/Documentation/ABI/stable/vdso b/Documentation/ABI/stable/vdso index 951838d42781..85dbb6a160df 100644 --- a/Documentation/ABI/stable/vdso +++ b/Documentation/ABI/stable/vdso @@ -9,9 +9,11 @@ maps an ELF DSO into that program's address space. This DSO is called the vDSO and it often contains useful and highly-optimized alternatives to real syscalls. -These functions are called just like ordinary C function according to -your platform's ABI. Call them from a sensible context. (For example, -if you set CS on x86 to something strange, the vDSO functions are +These functions are called according to your platform's ABI. On many +platforms they are called just like ordinary C function. On other platforms +(ex: powerpc) they are called with the same convention as system calls which +is different from ordinary C functions. Call them from a sensible context. +(For example, if you set CS on x86 to something strange, the vDSO functions are within their rights to crash.) In addition, if you pass a bad pointer to a vDSO function, you might get SIGSEGV instead of -EFAULT. 
diff --git a/tools/testing/selftests/vDSO/vdso_call.h b/tools/testing/selftests/vDSO/vdso_call.h new file mode 100644 index ..bb237d771051 --- /dev/null +++ b/tools/testing/selftests/vDSO/vdso_call.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Macro to call vDSO functions + * + * Copyright (C) 2024 Christophe Leroy , CS GROUP France + */ +#ifndef __VDSO_CALL_H__ +#define __VDSO_CALL_H__ + +#ifdef __powerpc__ + +#define LOADARGS_1(fn, __arg1) do {\ + _r0 = fn; \ + _r3 = (long)__arg1; \ +} while (0) + +#define LOADARGS_2(fn, __arg1, __arg2) do {\ + _r0 = fn; \ + _r3 = (long)__arg1; \ + _r4 = (long)__arg2; \ +} while (0) + +#define LOADARGS_3(fn, __arg1, __arg2, __arg3) do {\ + _r0 = fn; \ + _r3 = (long)__arg1; \ + _r4 = (long)__arg2; \ + _r5 = (long)__arg3; \ +} while (0) + +#define LOADARGS_5(fn, __arg1, __arg2, __arg3, __arg4, __arg5) do {\ + _r0 = fn; \ + _r3 = (long)__arg1;
[PATCH 3/5] selftests: vdso: Fix vDSO symbols lookup for powerpc64
On powerpc64, the following tests fail to locate vDSO functions:

~ # ./vdso_test_abi
TAP version 13
1..16
# [vDSO kselftest] VDSO_VERSION: LINUX_2.6.15
# Couldn't find __kernel_gettimeofday
ok 1 # SKIP __kernel_gettimeofday
# clock_id: CLOCK_REALTIME
# Couldn't find __kernel_clock_gettime
ok 2 # SKIP __kernel_clock_gettime CLOCK_REALTIME
# Couldn't find __kernel_clock_getres
ok 3 # SKIP __kernel_clock_getres CLOCK_REALTIME
...
# Couldn't find __kernel_time
ok 16 # SKIP __kernel_time
# Totals: pass:0 fail:0 xfail:0 xpass:0 skip:16 error:0

~ # ./vdso_test_getrandom
__kernel_getrandom is missing!

~ # ./vdso_test_gettimeofday
Could not find __kernel_gettimeofday

~ # ./vdso_test_getcpu
Could not find __kernel_getcpu

On powerpc64, as shown below by readelf, vDSO function symbols have
type NOTYPE, so also accept that type when looking for symbols.

$ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg
ELF Header:
  Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
  Class:       ELF64
  Data:        2's complement, big endian
  Version:     1 (current)
  OS/ABI:      UNIX - System V
  ABI Version: 0
  Type:        DYN (Shared object file)
  Machine:     PowerPC64
  Version:     0x1
...

Symbol table '.dynsym' contains 12 entries:
   Num: Value  Size Type    Bind   Vis      Ndx Name
     0:           0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0524     84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     2: 05f0     36 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     3: 0578     68 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     4:           0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
     5: 06c0     48 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     6: 0614    172 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     7: 06f0     84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     8: 047c     84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
     9: 0454     12 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
    10: 04d0     84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
    11: 05bc     52 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15

Symbol table '.symtab' contains 56 entries:
   Num: Value  Size Type    Bind   Vis      Ndx Name
...
    45:           0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
    46: 06c0     48 NOTYPE  GLOBAL DEFAULT    8 __kernel_getcpu
    47: 0524     84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_getres
    48: 05f0     36 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_tbfreq
    49: 047c     84 NOTYPE  GLOBAL DEFAULT    8 __kernel_gettimeofday
    50: 0614    172 NOTYPE  GLOBAL DEFAULT    8 __kernel_sync_dicache
    51: 06f0     84 NOTYPE  GLOBAL DEFAULT    8 __kernel_getrandom
    52: 0454     12 NOTYPE  GLOBAL DEFAULT    8 __kernel_sigtram[...]
    53: 0578     68 NOTYPE  GLOBAL DEFAULT    8 __kernel_time
    54: 04d0     84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_g[...]
    55: 05bc     52 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_sys[...]

Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser")
Signed-off-by: Christophe Leroy
---
 tools/testing/selftests/vDSO/parse_vdso.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vDSO/parse_vdso.c b/tools/testing/selftests/vDSO/parse_vdso.c
index 4ae417372e9e..d9ccc5acac18 100644
--- a/tools/testing/selftests/vDSO/parse_vdso.c
+++ b/tools/testing/selftests/vDSO/parse_vdso.c
@@ -216,7 +216,8 @@ void *vdso_sym(const char *version, const char *name)
 		ELF(Sym) *sym = &vdso_info.symtab[chain];

 		/* Check for a defined global or weak function w/ right name. */
-		if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC)
+		if (ELF64_ST_TYPE(sym->st_info) != STT_FUNC &&
+		    ELF64_ST_TYPE(sym->st_info) != STT_NOTYPE)
 			continue;
 		if (ELF64_ST_BIND(sym->st_info) != STB_GLOBAL &&
 		    ELF64_ST_BIND(sym->st_info) != STB_WEAK)
--
2.44.0
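The symbol filter changed by this patch can be tried in isolation. The
following is a minimal sketch using the standard <elf.h> macros; the
helper names vdso_sym_type_ok() and vdso_sym_bind_ok() are hypothetical
and do not exist in parse_vdso.c, which does these checks inline.

```c
#include <elf.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the type check after this patch: accept
 * function symbols and also untyped (STT_NOTYPE) ones, as found in the
 * powerpc64 vDSO.
 */
static bool vdso_sym_type_ok(const Elf64_Sym *sym)
{
	unsigned char type = ELF64_ST_TYPE(sym->st_info);

	return type == STT_FUNC || type == STT_NOTYPE;
}

/* Only global or weak symbols are candidates, as in the existing code. */
static bool vdso_sym_bind_ok(const Elf64_Sym *sym)
{
	unsigned char bind = ELF64_ST_BIND(sym->st_info);

	return bind == STB_GLOBAL || bind == STB_WEAK;
}
```

With these checks, the LINUX_2.6.15 version symbol from the readelf
output (type OBJECT) is still rejected by the type test, and the
undefined entry 0 (NOTYPE but LOCAL) is rejected by the binding test.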
[PATCH 2/5] selftests: vdso: Fix vdso_config for powerpc
Running vdso_test_correctness on powerpc64 gives the following warning:

~ # ./vdso_test_correctness
Warning: failed to find clock_gettime64 in vDSO

This is because vdso_test_correctness was built with VDSO_32BIT defined.

The __powerpc__ macro is defined on both powerpc32 and powerpc64, so
__powerpc64__ needs to be checked first in vdso_config.h.

Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
Signed-off-by: Christophe Leroy
---
 tools/testing/selftests/vDSO/vdso_config.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vDSO/vdso_config.h b/tools/testing/selftests/vDSO/vdso_config.h
index 7b543e7f04d7..00bfed6e4922 100644
--- a/tools/testing/selftests/vDSO/vdso_config.h
+++ b/tools/testing/selftests/vDSO/vdso_config.h
@@ -18,13 +18,13 @@
 #elif defined(__aarch64__)
 #define VDSO_VERSION		3
 #define VDSO_NAMES		0
-#elif defined(__powerpc__)
+#elif defined(__powerpc64__)
 #define VDSO_VERSION		1
 #define VDSO_NAMES		0
-#define VDSO_32BIT		1
-#elif defined(__powerpc64__)
+#elif defined(__powerpc__)
 #define VDSO_VERSION		1
 #define VDSO_NAMES		0
+#define VDSO_32BIT		1
 #elif defined (__s390__)
 #define VDSO_VERSION		2
 #define VDSO_NAMES		0
--
2.44.0
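The ordering problem fixed here can be reproduced on any host with
stand-in macros. This is only an illustration: FAKE_POWERPC and
FAKE_POWERPC64 are hypothetical names playing the roles of __powerpc__
and __powerpc64__ on a 64-bit build, and the SKETCH_* macros and the two
functions are made up for this example.

```c
/* Stand-ins for what a powerpc64 compiler predefines (both are set). */
#define FAKE_POWERPC	1	/* role of __powerpc__: 32-bit and 64-bit */
#define FAKE_POWERPC64	1	/* role of __powerpc64__: 64-bit only */

/* Ordering before the patch: the generic macro is tested first. */
#if defined(FAKE_POWERPC)
#define SKETCH_OLD_VDSO_32BIT	1
#elif defined(FAKE_POWERPC64)
#define SKETCH_OLD_VDSO_32BIT	0
#else
#define SKETCH_OLD_VDSO_32BIT	0
#endif

/* Ordering after the patch: the more specific macro is tested first. */
#if defined(FAKE_POWERPC64)
#define SKETCH_NEW_VDSO_32BIT	0
#elif defined(FAKE_POWERPC)
#define SKETCH_NEW_VDSO_32BIT	1
#else
#define SKETCH_NEW_VDSO_32BIT	0
#endif

/* Report whether each ordering would treat this 64-bit build as 32-bit. */
static int old_order_is_32bit(void) { return SKETCH_OLD_VDSO_32BIT; }
static int new_order_is_32bit(void) { return SKETCH_NEW_VDSO_32BIT; }
```

Because an #elif chain stops at the first branch that matches, the old
ordering never reaches the 64-bit branch when both macros are defined.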
[PATCH 1/5] selftests: vdso: Fix vDSO name for powerpc
The following error occurs when running vdso_test_correctness on powerpc:

~ # ./vdso_test_correctness
[WARN]	failed to find vDSO
[SKIP]	No vDSO, so skipping clock_gettime() tests
[SKIP]	No vDSO, so skipping clock_gettime64() tests
[RUN]	Testing getcpu...
[OK]	CPU 0: syscall: cpu 0, node 0

On powerpc, vDSO is neither called linux-vdso.so.1 nor linux-gate.so.1
but linux-vdso32.so.1 or linux-vdso64.so.1.

Also search those two names before giving up.

Fixes: c7e5789b24d3 ("kselftest: Move test_vdso to the vDSO test suite")
Signed-off-by: Christophe Leroy
---
 tools/testing/selftests/vDSO/vdso_test_correctness.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/vDSO/vdso_test_correctness.c b/tools/testing/selftests/vDSO/vdso_test_correctness.c
index e691a3cf1491..cdb697ae8343 100644
--- a/tools/testing/selftests/vDSO/vdso_test_correctness.c
+++ b/tools/testing/selftests/vDSO/vdso_test_correctness.c
@@ -114,6 +114,12 @@ static void fill_function_pointers()
 	if (!vdso)
 		vdso = dlopen("linux-gate.so.1",
 			      RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
+	if (!vdso)
+		vdso = dlopen("linux-vdso32.so.1",
+			      RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
+	if (!vdso)
+		vdso = dlopen("linux-vdso64.so.1",
+			      RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
 	if (!vdso) {
 		printf("[WARN]\tfailed to find vDSO\n");
 		return;
--
2.44.0
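The patch extends the existing chain of "if (!vdso)" fallbacks. As an
alternative way to look at the same search order, here is a table-driven
sketch. It is not what the selftest does: vdso_names[], vdso_open_fn,
open_first_vdso() and fake_ppc64_open() are all hypothetical, and the
opener is passed as a callback so the sketch does not depend on a real
vDSO being mapped.

```c
#include <stddef.h>
#include <string.h>

/* Candidate vDSO names, in the order tried after this patch. */
static const char *const vdso_names[] = {
	"linux-vdso.so.1",
	"linux-gate.so.1",
	"linux-vdso32.so.1",	/* powerpc 32-bit vDSO */
	"linux-vdso64.so.1",	/* powerpc 64-bit vDSO */
};

/* Hypothetical callback type: returns a handle, or NULL if not found. */
typedef void *(*vdso_open_fn)(const char *name);

/* Try each name in turn and return the first handle that opens. */
static void *open_first_vdso(vdso_open_fn try_open, const char **found)
{
	size_t i;

	for (i = 0; i < sizeof(vdso_names) / sizeof(vdso_names[0]); i++) {
		void *handle = try_open(vdso_names[i]);

		if (handle) {
			if (found)
				*found = vdso_names[i];
			return handle;
		}
	}
	return NULL;
}

/* Stand-in opener that behaves like a powerpc64 system, for illustration. */
static void *fake_ppc64_open(const char *name)
{
	static int dummy_handle;

	return strcmp(name, "linux-vdso64.so.1") == 0 ? &dummy_handle : NULL;
}
```

In a real test the callback would wrap dlopen() with the same
RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD flags used in the diff above.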
Re: [PATCH v2 05/17] vdso: Avoid call to memset() by getrandom
On 29/08/2024 at 20:02, Segher Boessenkool wrote:
> On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
>> On 28/08/2024 at 19:25, Segher Boessenkool wrote:
>>> Not sure about static binaries, though: do those even use the VDSO?
>>> With "static binary" people usually mean "a binary not using any DSOs",
>>> I think the VDSO is a DSO, also in this respect? As always, -static
>>> builds are *way* less problematic (and faster and smaller :-) )
>>
>> AFAIK on powerpc even static binaries use the vDSO, otherwise signals
>> don't work.
>
> How can that work? Non-dynamic binaries do not use ld.so (that is the
> definition of a dynamic binary, even). So they cannot link (at runtime)
> to any DSO (unless that is done manually?!)
>
> Maybe there is something at a fixed offset in the vDSO, or something
> like that? Is this documented somewhere?

You've got some explanation here:
https://github.com/torvalds/linux/blob/master/Documentation/ABI/stable/vdso
Re: [PATCH v2 05/17] vdso: Avoid call to memset() by getrandom
On 28/08/2024 at 19:25, Segher Boessenkool wrote:
> Not sure about static binaries, though: do those even use the VDSO?
> With "static binary" people usually mean "a binary not using any DSOs",
> I think the VDSO is a DSO, also in this respect? As always, -static
> builds are *way* less problematic (and faster and smaller :-) )

AFAIK on powerpc even static binaries use the vDSO, otherwise signals
don't work.

Christophe
Re: [PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK
Hi Vincenzo,

On 29/08/2024 at 14:01, Vincenzo Frascino wrote:
> Hi Christophe,
>
> On 27/08/2024 18:14, Christophe Leroy wrote:
>> On 27/08/2024 at 18:05, Vincenzo Frascino wrote:
>>> Hi Christophe,
>>>
>>> On 27/08/2024 11:49, Christophe Leroy wrote:
>>> ...
>>> ...
>>> Could you please clarify where minmax is needed? I tried to build
>>> Jason's master tree for x86, commenting out the header, and it seems
>>> to build fine. I might be missing something.
>>
>> Without it:
>>
>> VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o
>> In file included from /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:11,
>> from :
>> ...
>>
>>>> Same for ARRAY_SIZE(->reserved) by the way, easy to open-code, we
>>>> also have it only once
>>>
>>> I have a similar issue to figure out why linux/array_size.h and
>>> uapi/linux/random.h are needed. It seems that I can build the object
>>> without them. Could you please explain?
>>
>> Without linux/array_size.h:
>>
>> VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o
>> In file included from :
>> /home/chleroy/linux-powerpc/lib/vdso/getrandom.c: In function '__cvdso_getrandom_data':
>> /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:89:40: error: implicit
>
> If this is the case, those headers should be defined for the powerpc
> implementation only. The generic implementation should be interpreted
> as the minimum common denominator in between all the architectures for
> what concerns the headers.

Sorry, I disagree. You can't rely on necessary headers being included
indirectly by other arch specific headers. getrandom.c uses
ARRAY_SIZE(), it must include the header that defines ARRAY_SIZE().
At the moment, on x86 you get linux/array_size.h by chance through the following chain, which is the reason why the build doesn't break:

In file included from ./include/linux/kernel.h:16,
 from ./include/linux/cpumask.h:11,
 from ./arch/x86/include/asm/cpumask.h:5,
 from ./arch/x86/include/asm/msr.h:11,
 from ./arch/x86/include/asm/vdso/gettimeofday.h:19,
 from ./include/vdso/datapage.h:164,
 from arch/x86/entry/vdso/../../../../lib/vdso/getrandom.c:9,

From my point of view you can't expect such a chain from each architecture.

Christophe
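The open-coding option raised in this thread (dropping the dependency on the ARRAY_SIZE() and min_t() helpers) could look roughly like the sketch below. It is only an illustration: `struct demo_params`, its field sizes and both function names are made up for this example and are not the real vDSO getrandom definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a parameter block with a reserved area. */
struct demo_params {
	unsigned int size;
	unsigned int reserved[4];
};

/*
 * Open-coded ARRAY_SIZE(): the element count is derived from sizeof
 * alone, so no helper header is needed.
 * Returns 1 when every reserved word is zero, 0 otherwise.
 */
static int demo_reserved_is_zero(const struct demo_params *params)
{
	for (size_t i = 0; i < sizeof(params->reserved) / sizeof(params->reserved[0]); ++i) {
		if (params->reserved[i])
			return 0;
	}
	return 1;
}

/* Open-coded min_t(size_t, len, limit). */
static size_t demo_clamp_len(size_t len, size_t limit)
{
	return len < limit ? len : limit;
}
```

The trade-off is that the open-coded form loses the compile-time array check that the kernel's ARRAY_SIZE() performs, which may be acceptable when there is a single call site.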
Re: [PATCH net-next 5/6] net: ethernet: fs_enet: fcc: use macros for speed and duplex values
Le 28/08/2024 à 11:51, Maxime Chevallier a écrit : The PHY speed and duplex should be manipulated using the SPEED_XXX and DUPLEX_XXX macros available. Use it in the fcc, fec and scc MAC for fs_enet. Signed-off-by: Maxime Chevallier Reviewed-by: Christophe Leroy --- drivers/net/ethernet/freescale/fs_enet/mac-fcc.c | 4 ++-- drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 2 +- drivers/net/ethernet/freescale/fs_enet/mac-scc.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c index add062928d99..056909156b4f 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c @@ -361,7 +361,7 @@ static void restart(struct net_device *dev) /* adjust to speed (for RMII mode) */ if (fpi->use_rmii) { - if (dev->phydev->speed == 100) + if (dev->phydev->speed == SPEED_100) C8(fcccp, fcc_gfemr, 0x20); else S8(fcccp, fcc_gfemr, 0x20); @@ -387,7 +387,7 @@ static void restart(struct net_device *dev) S32(fccp, fcc_fpsmr, FCC_PSMR_RMII); /* adjust to duplex mode */ - if (dev->phydev->duplex) + if (dev->phydev->duplex == DUPLEX_FULL) S32(fccp, fcc_fpsmr, FCC_PSMR_FDE | FCC_PSMR_LPB); else C32(fccp, fcc_fpsmr, FCC_PSMR_FDE | FCC_PSMR_LPB); diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c index f75acb3b358f..855ee9e3f042 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c @@ -309,7 +309,7 @@ static void restart(struct net_device *dev) /* * adjust to duplex mode */ - if (dev->phydev->duplex) { + if (dev->phydev->duplex == DUPLEX_FULL) { FC(fecp, r_cntrl, FEC_RCNTRL_DRT); FS(fecp, x_cntrl, FEC_TCNTRL_FDEN); /* FD enable */ } else { diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c index 29ba0048396b..9e5e29312c27 100644 --- 
a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c @@ -338,7 +338,7 @@ static void restart(struct net_device *dev) W16(sccp, scc_psmr, SCC_PSMR_ENCRC | SCC_PSMR_NIB22); /* Set full duplex mode if needed */ - if (dev->phydev->duplex) + if (dev->phydev->duplex == DUPLEX_FULL) S16(sccp, scc_psmr, SCC_PSMR_LPB | SCC_PSMR_FDE); /* Restore multicast and promiscuous settings */
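As a side illustration of why the explicit comparison is the more robust form: a plain truthiness test on the duplex field treats any non-zero value as full duplex, including DUPLEX_UNKNOWN. The sketch below is user-space C, not driver code; the constants mirror the values in include/uapi/linux/ethtool.h, and the two helper names are invented for this example.

```c
#include <stdbool.h>

/* Values mirrored from include/uapi/linux/ethtool.h for this sketch. */
#define SPEED_100	100
#define DUPLEX_HALF	0x00
#define DUPLEX_FULL	0x01
#define DUPLEX_UNKNOWN	0xff

/* Old style: any non-zero duplex value is taken as full duplex. */
static bool demo_is_full_duplex_truthy(int duplex)
{
	return duplex != 0;
}

/* New style: only DUPLEX_FULL selects full duplex. */
static bool demo_is_full_duplex_explicit(int duplex)
{
	return duplex == DUPLEX_FULL;
}
```

Both helpers agree for DUPLEX_HALF and DUPLEX_FULL; they differ only if the field ever holds DUPLEX_UNKNOWN, in which case the explicit test falls back to the half-duplex branch.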
Re: [PATCH net-next 4/6] net: ethernet: fs_enet: drop unused phy_info and mii_if_info
Le 28/08/2024 à 11:51, Maxime Chevallier a écrit : There's no user of the struct phy_info, the 'phy' field and the mii_if_info in the fs_enet driver, probably dating back when phylib wasn't as widely used. Drop these from the driver code. Seems like they haven't been used since commit 5b4b8454344a ("[PATCH] FS_ENET: use PAL for mii management") Reviewed-by: Christophe Leroy Signed-off-by: Maxime Chevallier --- drivers/net/ethernet/freescale/fs_enet/fs_enet.h | 11 --- 1 file changed, 11 deletions(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h index abe4dc97e52a..781f506c933c 100644 --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h @@ -92,14 +92,6 @@ struct fs_ops { void (*tx_restart)(struct net_device *dev); }; -struct phy_info { - unsigned int id; - const char *name; - void (*startup) (struct net_device * dev); - void (*shutdown) (struct net_device * dev); - void (*ack_int) (struct net_device * dev); -}; - /* The FEC stores dest/src/type, data, and checksum for receive packets. */ #define MAX_MTU 1508 /* Allow fullsized pppoe packets over VLAN */ @@ -153,10 +145,7 @@ struct fs_enet_private { cbd_t __iomem *cur_rx; cbd_t __iomem *cur_tx; int tx_free; - const struct phy_info *phy; u32 msg_enable; - struct mii_if_info mii_if; - unsigned int last_mii_status; int interrupt; int oldduplex, oldspeed, oldlink; /* current settings */
Re: [PATCH net-next 3/6] net: ethernet: fs_enet: drop the .adjust_link custom fs_ops
On 28/08/2024 at 11:50, Maxime Chevallier wrote:

There's no in-tree user for the fs_ops .adjust_link() function, so we can always use the generic one in fs_enet-main.

Signed-off-by: Maxime Chevallier
Reviewed-by: Christophe Leroy
---
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 7 +--
 drivers/net/ethernet/freescale/fs_enet/fs_enet.h | 1 -
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 2b48a2a5e32d..caca81b3ccb6 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -649,12 +649,7 @@ static void fs_adjust_link(struct net_device *dev)
 	unsigned long flags;

 	spin_lock_irqsave(&fep->lock, flags);
-
-	if (fep->ops->adjust_link)
-		fep->ops->adjust_link(dev);
-	else
-		generic_adjust_link(dev);
-
+	generic_adjust_link(dev);
 	spin_unlock_irqrestore(&fep->lock, flags);
 }

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
index 21c07ac05225..abe4dc97e52a 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
@@ -77,7 +77,6 @@ struct fs_ops {
 	void (*free_bd)(struct net_device *dev);
 	void (*cleanup_data)(struct net_device *dev);
 	void (*set_multicast_list)(struct net_device *dev);
-	void (*adjust_link)(struct net_device *dev);
 	void (*restart)(struct net_device *dev);
 	void (*stop)(struct net_device *dev);
 	void (*napi_clear_event)(struct net_device *dev);
Re: [PATCH net-next 2/6] net: ethernet: fs_enet: cosmetic cleanups
Le 28/08/2024 à 11:50, Maxime Chevallier a écrit : Due to the age of the driver and the slow recent activity on it, the code has taken some layers of dust. Clean the main driver file up so that it passes checkpatch and also conforms with the net coding style. Changes include : - Re-ordering of the variable declarations for RCT - Fixing the comment styles to either one-line comments, or net-style comments - Adding braces around single-statement 'else' clauses - Aligning function/macro parameters on the opening parenthesis - Simplifying checks for NULL pointers - Splitting cascaded assignments into individual assignments - Fixing some typos - Fixing whitespace issues This is a cosmetic change and doesn't introduce any change in behaviour. Signed-off-by: Maxime Chevallier Reviewed-by: Christophe Leroy --- .../ethernet/freescale/fs_enet/fs_enet-main.c | 220 +++--- 1 file changed, 89 insertions(+), 131 deletions(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c index 5bfdd43ffdeb..2b48a2a5e32d 100644 --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c @@ -81,15 +81,14 @@ static void skb_align(struct sk_buff *skb, int align) static int fs_enet_napi(struct napi_struct *napi, int budget) { struct fs_enet_private *fep = container_of(napi, struct fs_enet_private, napi); - struct net_device *dev = fep->ndev; const struct fs_platform_info *fpi = fep->fpi; - cbd_t __iomem *bdp; + struct net_device *dev = fep->ndev; + int curidx, dirtyidx, received = 0; + int do_wake = 0, do_restart = 0; + int tx_left = TX_RING_SIZE; struct sk_buff *skb, *skbn; - int received = 0; + cbd_t __iomem *bdp; u16 pkt_len, sc; - int curidx; - int dirtyidx, do_wake, do_restart; - int tx_left = TX_RING_SIZE; spin_lock(&fep->tx_lock); bdp = fep->dirty_tx; @@ -97,7 +96,6 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) /* clear status bits for napi*/ 
(*fep->ops->napi_clear_event)(dev); - do_wake = do_restart = 0; while (((sc = CBDR_SC(bdp)) & BD_ENET_TX_READY) == 0 && tx_left) { dirtyidx = bdp - fep->tx_bd_base; @@ -106,12 +104,9 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) skb = fep->tx_skbuff[dirtyidx]; - /* -* Check for errors. -*/ +/* Check for errors. */ if (sc & (BD_ENET_TX_HB | BD_ENET_TX_LC | BD_ENET_TX_RL | BD_ENET_TX_UN | BD_ENET_TX_CSL)) { - if (sc & BD_ENET_TX_HB) /* No heartbeat */ dev->stats.tx_heartbeat_errors++; if (sc & BD_ENET_TX_LC) /* Late collision */ @@ -127,16 +122,16 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) dev->stats.tx_errors++; do_restart = 1; } - } else + } else { dev->stats.tx_packets++; + } if (sc & BD_ENET_TX_READY) { dev_warn(fep->dev, "HEY! Enet xmit interrupt and TX_READY.\n"); } - /* -* Deferred means some collisions occurred during transmit, + /* Deferred means some collisions occurred during transmit, * but we eventually sent the packet OK. */ if (sc & BD_ENET_TX_DEF) @@ -150,25 +145,20 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) dma_unmap_single(fep->dev, CBDR_BUFADDR(bdp), CBDR_DATLEN(bdp), DMA_TO_DEVICE); - /* -* Free the sk buffer associated with this last transmit. -*/ + /* Free the sk buffer associated with this last transmit. */ if (skb) { dev_kfree_skb(skb); fep->tx_skbuff[dirtyidx] = NULL; } - /* -* Update pointer to next buffer descriptor to be transmitted. + /* Update pointer to next buffer descriptor to be transmitted. */ if ((sc & BD_ENET_TX_WRAP) == 0) bdp++; else bdp = fep->tx_bd_base; - /* -* Since we have freed up a buffer, the ring is no longer -* full. + /* Since we have freed up a buffer, the ring is no longer full. */ if (++fep-&
Re: [PATCH net-next 1/6] net: ethernet: fs_enet: convert to SPDX
Le 28/08/2024 à 11:50, Maxime Chevallier a écrit : The ENET driver has SPDX tags in the header files, but they were missing in the C files. Change the licence information to SPDX format. AFAIK you have to CC linux-s...@vger.kernel.org for this kind of change. Signed-off-by: Maxime Chevallier Reviewed-by: Christophe Leroy --- drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 5 + drivers/net/ethernet/freescale/fs_enet/mac-fcc.c | 5 + drivers/net/ethernet/freescale/fs_enet/mac-fec.c | 5 + drivers/net/ethernet/freescale/fs_enet/mac-scc.c | 5 + drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c | 5 + drivers/net/ethernet/freescale/fs_enet/mii-fec.c | 5 + 6 files changed, 6 insertions(+), 24 deletions(-) diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c index cf392faa6105..5bfdd43ffdeb 100644 --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * Combined Ethernet driver for Motorola MPC8xx and MPC82xx. * @@ -9,10 +10,6 @@ * * Heavily based on original FEC driver by Dan Malek * and modifications by Joakim Tjernlund - * - * This file is licensed under the terms of the GNU General Public License - * version 2. This program is licensed "as is" without any warranty of any - * kind, whether express or implied. */ #include diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c index e2ffac9eb2ad..add062928d99 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * FCC driver for Motorola MPC82xx (PQ2). * @@ -6,10 +7,6 @@ * * 2005 (c) MontaVista Software, Inc. * Vitaly Bordug - * - * This file is licensed under the terms of the GNU General Public License - * version 2. 
This program is licensed "as is" without any warranty of any - * kind, whether express or implied. */ #include diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c index cdc89d83cf07..f75acb3b358f 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-fec.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * Freescale Ethernet controllers * @@ -6,10 +7,6 @@ * * 2005 (c) MontaVista Software, Inc. * Vitaly Bordug - * - * This file is licensed under the terms of the GNU General Public License - * version 2. This program is licensed "as is" without any warranty of any - * kind, whether express or implied. */ #include diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c index 9e89ac2b6ce3..29ba0048396b 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c +++ b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * Ethernet on Serial Communications Controller (SCC) driver for Motorola MPC8xx and MPC82xx. * @@ -6,10 +7,6 @@ * * 2005 (c) MontaVista Software, Inc. * Vitaly Bordug - * - * This file is licensed under the terms of the GNU General Public License - * version 2. This program is licensed "as is" without any warranty of any - * kind, whether express or implied. */ #include diff --git a/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c b/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c index f965a2329055..2e210a003558 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c +++ b/drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * Combined Ethernet driver for Motorola MPC8xx and MPC82xx. * @@ -6,10 +7,6 @@ * * 2005 (c) MontaVista Software, Inc. 
* Vitaly Bordug - * - * This file is licensed under the terms of the GNU General Public License - * version 2. This program is licensed "as is" without any warranty of any - * kind, whether express or implied. */ #include diff --git a/drivers/net/ethernet/freescale/fs_enet/mii-fec.c b/drivers/net/ethernet/freescale/fs_enet/mii-fec.c index 7bb69727952a..93d91e8ad0de 100644 --- a/drivers/net/ethernet/freescale/fs_enet/mii-fec.c +++ b/drivers/net/ethernet/freescale/fs_enet/mii-fec.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0-only /* * Combined Ethernet driver for Motorola MPC8xx and MPC82xx. * @@ -6,10 +7,6 @@ * * 2005 (c) MontaVista Software, Inc. * Vitaly Bordug - * - * This file is licensed under the terms of the GNU General Public License - * version 2. This program is licensed "as is" with
Re: [PATCH net-next 0/6] net: ethernet: fs_enet: Cleanup and phylink conversion
On 28/08/2024 at 11:50, Maxime Chevallier wrote:

This series aims at improving the fs_enet code and porting its PHY handling from direct phylib access to phylink.

Although this driver is quite old, there are still some users out there, running an upstream kernel. The development I'm doing is on an MPC885 device, which uses fs_enet, as well as an MPC866-based device.

The main motivation for that work is to eventually support ethernet interfaces that have more than one PHY attached to the MAC upstream, for which phylink might be a prerequisite. That work isn't submitted yet, and the final solution might not even require phylink.

Regardless, I do believe that this series is relevant, as it does some cleanup to the driver, and having it use phylink brings some nice improvements: it simplifies the DT parsing and fixed-link handling, and removes code in that driver that predates even phylib itself.

The series is structured in the following way:

- Patches 1 and 2 are cosmetic changes. The former converts the source to SPDX, while the latter has fs_enet-main.c pass checkpatch. Patch 2 is really not mandatory in this series, and I understand that this isn't the easiest or most pleasant patch to review. OTOH, this allows getting a clean checkpatch output for the main part of the driver.
- Patches 3, 4 and 5 drop some leftovers from back when the driver didn't use phylib, and bring in the use of phylib macros.
- Patch 6 is the actual phylink port, which also cleans the bits of code that become irrelevant when using phylink.

Testing was done on an MPC866 and an MPC885; any test on other platforms that use fs_enet is more than welcome.
Thanks,

Maxime

Maxime Chevallier (6):
  net: ethernet: fs_enet: convert to SPDX
  net: ethernet: fs_enet: cosmetic cleanups
  net: ethernet: fs_enet: drop the .adjust_link custom fs_ops
  net: ethernet: fs_enet: drop unused phy_info and mii_if_info
  net: ethernet: fs_enet: fcc: use macros for speed and duplex values
  net: ethernet: fs_enet: phylink conversion

For the series,

Acked-by: Christophe Leroy # LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX

 .../net/ethernet/freescale/fs_enet/Kconfig| 2 +-
 .../ethernet/freescale/fs_enet/fs_enet-main.c | 421 --
 .../net/ethernet/freescale/fs_enet/fs_enet.h | 24 +-
 .../net/ethernet/freescale/fs_enet/mac-fcc.c | 16 +-
 .../net/ethernet/freescale/fs_enet/mac-fec.c | 14 +-
 .../net/ethernet/freescale/fs_enet/mac-scc.c | 10 +-
 .../ethernet/freescale/fs_enet/mii-bitbang.c | 5 +-
 .../net/ethernet/freescale/fs_enet/mii-fec.c | 5 +-
 8 files changed, 209 insertions(+), 288 deletions(-)
Re: [PATCH v1 2/2] powerpc/debug: hook to user return notifier infrastructure
On 28/08/2024 at 08:50, Luming Yu wrote:

On Wed, Aug 28, 2024 at 07:46:52AM +0200, Christophe Leroy wrote:

Hi,

On 28/08/2024 at 05:17, 虞陆铭 wrote:

Hi, it appears this little feature might require a bit more work to show the value of the patch. Using the following debug module, some debugging shows that the TIF_USER_RETURN_NOTIFY bit is propagated in __switch_to among tasks, but the USER_RETURN_NOTIFY callback seems to be dropped somewhere for a task that carries the bit when it returns to user space.

Side note: there is an issue that the module symbols are not appended to /sys/kernel/debug/tracing/available_filter_functions, which should be solved first to make further debugging easier.

As far as I can see, the user return notifier infrastructure was implemented in 2009 for KVM on x86, see https://lore.kernel.org/all/1253105134-8862-1-git-send-email-avi@redhat.com/

Can you explain how your patch uses that infrastructure? You are talking about debug: what is the added value, and what is it used for?

One example: I was thinking of live patching the kernel at the moment when all CPUs are either returning to user space or going idle. But I'm not sure it is truly valuable. Secondly, it can help us get more accurate user/system time accounting via tracing rather than through sampling. Thirdly, it could have similar uses in KVM for ppc as on x86 for tsc_aux, etc. :-)

Thanks.

Don't we already have very accurate user/system time accounting with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE?

Christophe
Re: [PATCH 07/16] powerpc: mm: Support MAP_BELOW_HINT
Hi Charlie,

On 28/08/2024 at 07:49, Charlie Jenkins wrote:

Add support for MAP_BELOW_HINT to arch_get_mmap_base() and arch_get_mmap_end().

Signed-off-by: Charlie Jenkins
---
 arch/powerpc/include/asm/task_size_64.h | 36 +++--
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/task_size_64.h b/arch/powerpc/include/asm/task_size_64.h
index 239b363841aa..a37a5a81365d 100644
--- a/arch/powerpc/include/asm/task_size_64.h
+++ b/arch/powerpc/include/asm/task_size_64.h
@@ -72,12 +72,36 @@
 #define STACK_TOP_MAX TASK_SIZE_USER64
 #define STACK_TOP (is_32bit_task() ? STACK_TOP_USER32 : STACK_TOP_USER64)

-#define arch_get_mmap_base(addr, len, base, flags) \
-	(((addr) > DEFAULT_MAP_WINDOW) ? (base) + TASK_SIZE - DEFAULT_MAP_WINDOW : (base))
+#define arch_get_mmap_base(addr, len, base, flags) \

This macro looks quite big for a macro, can it be a static inline function instead? Same for the other macro below.

+({ \
+	unsigned long mmap_base; \
+	typeof(flags) _flags = (flags); \
+	typeof(addr) _addr = (addr); \
+	typeof(base) _base = (base); \
+	typeof(len) _len = (len); \
+	unsigned long rnd_gap = DEFAULT_MAP_WINDOW - (_base); \
+	if (_flags & MAP_BELOW_HINT && _addr != 0 && ((_addr + _len) > BIT(VA_BITS - 1)))\
+		mmap_base = (_addr + _len) - rnd_gap; \
+	else \
+		mmap_end = ((_addr > DEFAULT_MAP_WINDOW) ? \
+			_base + TASK_SIZE - DEFAULT_MAP_WINDOW : \
+			_base); \
+	mmap_end; \

mmap_end doesn't exist, did you mean mmap_base?

+})

-#define arch_get_mmap_end(addr, len, flags) \
-	(((addr) > DEFAULT_MAP_WINDOW) || \
-	(((flags) & MAP_FIXED) && ((addr) + (len) > DEFAULT_MAP_WINDOW)) ? TASK_SIZE : \
-	DEFAULT_MAP_WINDOW)
+#define arch_get_mmap_end(addr, len, flags) \
+({ \
+	unsigned long mmap_end; \
+	typeof(flags) _flags = (flags); \
+	typeof(addr) _addr = (addr); \
+	typeof(len) _len = (len); \
+	if (_flags & MAP_BELOW_HINT && _addr != 0 && ((_addr + _len) > BIT(VA_BITS - 1)))\
+		mmap_end = (_addr + _len); \
+	else \
+		mmap_end = (((_addr) > DEFAULT_MAP_WINDOW) || \
+			(((_flags) & MAP_FIXED) && ((_addr) + (_len) > DEFAULT_MAP_WINDOW))\
+			? TASK_SIZE : DEFAULT_MAP_WINDOW) \
+	mmap_end; \
+})

 #endif /* _ASM_POWERPC_TASK_SIZE_64_H */
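To make the review suggestion concrete, here is a rough user-space sketch of how the two macros might read as static inline functions. It is not the proposed kernel code: every DEMO_ constant is a hypothetical placeholder (DEMO_HINT_LIMIT stands in for BIT(VA_BITS - 1), and the DEMO_MAP_BELOW_HINT value is invented), and the function names are made up for this example.

```c
/* Hypothetical placeholder values, chosen only to make the sketch compile. */
#define DEMO_TASK_SIZE		(1ULL << 52)
#define DEMO_DEFAULT_MAP_WINDOW	(1ULL << 47)
#define DEMO_HINT_LIMIT		(1ULL << 46)	/* stands in for BIT(VA_BITS - 1) */
#define DEMO_MAP_FIXED		0x10UL
#define DEMO_MAP_BELOW_HINT	0x8000000UL	/* invented flag value */

/* Same decision logic as the arch_get_mmap_end() macro in the patch. */
static inline unsigned long long demo_get_mmap_end(unsigned long long addr,
						   unsigned long long len,
						   unsigned long flags)
{
	if ((flags & DEMO_MAP_BELOW_HINT) && addr != 0 &&
	    addr + len > DEMO_HINT_LIMIT)
		return addr + len;

	if (addr > DEMO_DEFAULT_MAP_WINDOW ||
	    ((flags & DEMO_MAP_FIXED) && addr + len > DEMO_DEFAULT_MAP_WINDOW))
		return DEMO_TASK_SIZE;

	return DEMO_DEFAULT_MAP_WINDOW;
}

/* Same decision logic as arch_get_mmap_base(), with a single result variable. */
static inline unsigned long long demo_get_mmap_base(unsigned long long addr,
						    unsigned long long len,
						    unsigned long long base,
						    unsigned long flags)
{
	unsigned long long rnd_gap = DEMO_DEFAULT_MAP_WINDOW - base;

	if ((flags & DEMO_MAP_BELOW_HINT) && addr != 0 &&
	    addr + len > DEMO_HINT_LIMIT)
		return addr + len - rnd_gap;

	if (addr > DEMO_DEFAULT_MAP_WINDOW)
		return base + DEMO_TASK_SIZE - DEMO_DEFAULT_MAP_WINDOW;

	return base;
}
```

Written this way, each argument is evaluated exactly once without the typeof() copies, and a misnamed result variable such as the mmap_end/mmap_base mix-up cannot occur because each branch returns directly.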
Re: [PATCH v1 2/2] powerpc/debug: hook to user return notifier infrastructure
Hi,

On 28/08/2024 at 05:17, 虞陆铭 wrote:

Hi, it appears this little feature might require a bit more work to show the value of the patch. Using the following debug module, some debugging shows that the TIF_USER_RETURN_NOTIFY bit is propagated in __switch_to among tasks, but the USER_RETURN_NOTIFY callback seems to be dropped somewhere for a task that carries the bit when it returns to user space.

Side note: there is an issue that the module symbols are not appended to /sys/kernel/debug/tracing/available_filter_functions, which should be solved first to make further debugging easier.

As far as I can see, the user return notifier infrastructure was implemented in 2009 for KVM on x86, see https://lore.kernel.org/all/1253105134-8862-1-git-send-email-...@redhat.com/

Can you explain how your patch uses that infrastructure? You are talking about debug: what is the added value, and what is it used for?

Thanks
Christophe

[root@localhost linux]# cat lib/user-return-test.c
#include
#include
#include
#include
#include
#include
#include
#include

MODULE_LICENSE("GPL");

struct test_user_return {
	struct user_return_notifier urn;
	bool registered;
	int urn_value_changed;
	struct task_struct *worker;
};

static struct test_user_return __percpu *user_return_test;

static void test_user_return_cb(struct user_return_notifier *urn)
{
	struct test_user_return *tur = container_of(urn, struct test_user_return, urn);
	unsigned long flags;

	local_irq_save(flags);
	tur->urn_value_changed++;
	local_irq_restore(flags);
	return;
}

static int test_user_return_worker(void *tur)
{
	struct test_user_return *t;

	t = (struct test_user_return *) tur;
	preempt_disable();
	user_return_notifier_register(&t->urn);
	preempt_enable();
	t->registered = true;
	while (!kthread_should_stop()) {
		static int err_rate = 0;

		msleep (1000);
		if (!test_thread_flag(TIF_USER_RETURN_NOTIFY) && (err_rate == 0)) {
			pr_err("TIF_USER_RETURN_NOTIFY is lost");
			err_rate++;
		}
	}
	return 0;
}

static int init_test_user_return(void)
{
	int r = 0;

	user_return_test = alloc_percpu(struct test_user_return);
	if (!user_return_test) {
		pr_err("failed to allocate percpu test_user_return\n");
		r = -ENOMEM;
		goto exit;
	}
	{
		unsigned int cpu;
		struct task_struct *task;
		struct test_user_return *tur;

		for_each_online_cpu(cpu) {
			tur = per_cpu_ptr(user_return_test, cpu);
			if (!tur->registered) {
				tur->urn.on_user_return = test_user_return_cb;
				task = kthread_create(test_user_return_worker, tur, "test_user_return");
				if (IS_ERR(task))
					pr_err("no test_user_return kthread created for cpu %d",cpu);
				else {
					tur->worker = task;
					wake_up_process(task);
				}
			}
		}
	}
exit:
	return r;
}

static void exit_test_user_return(void)
{
	struct test_user_return *tur;
	int i,ret=0;

	for_each_online_cpu(i) {
		tur = per_cpu_ptr(user_return_test, i);
		if (tur->registered) {
			pr_info("[cpu=%d, %d] ", i, tur->urn_value_changed);
			user_return_notifier_unregister(&tur->urn);
			tur->registered = false;
		}
		if (tur->worker) {
			ret = kthread_stop(tur->worker);
			if (ret)
				pr_err("can't stop test_user_return kthread for cpu %d", i);
		}
	}
	free_percpu(user_return_test);
	return;
}

module_init(init_test_user_return);
module_exit(exit_test_user_return);

-- Original --
From: "Christophe Leroy";
Date: Tue, Feb 20, 2024 05:02 PM
To: "mpe"; "Aneesh Kumar K.V"; "虞陆铭"; "linuxppc-dev"; "linux-kernel"; "npiggin";
Cc: "shenghui...@shingroup.cn"; "dawei...@shingroup.cn"; "ke.z...@shingroup.cn"; "luming.yu";
Subject: Re: [PATCH v1 2/2] powerpc/debug: hook to user return notif
Re: [PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK
Le 27/08/2024 à 18:05, Vincenzo Frascino a écrit : Hi Christophe, On 27/08/2024 11:49, Christophe Leroy wrote: ... These are still two headers outside of the vdso/ namespace. For arm64 we had concluded that this is never safe, and any vdso header should only include other vdso headers so we never pull in anything that e.g. depends on memory management headers that are in turn broken for the compat vdso. The array_size.h header is really small, so that one could probably just be moved into the vdso/ namespace. The minmax.h header is already rather complex, so it may be better to just open-code the usage of MIN/MAX where needed? It is used at two places only so yes can to that. Could you please clarify where minmax is needed? I tried to build Jason's master tree for x86, commenting the header and it seems building fine. I might be missing something. Without it: VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o In file included from /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:11, from : ./arch/powerpc/include/asm/vdso/getrandom.h: In function '__arch_get_vdso_rng_data': ./arch/powerpc/include/asm/vdso/getrandom.h:46:9: error: implicit declaration of function 'BUILD_BUG' [-Werror=implicit-function-declaration] 46 | BUILD_BUG(); | ^ ./arch/powerpc/include/asm/vdso/getrandom.h:47:1: error: no return statement in function returning non-void [-Werror=return-type] 47 | } | ^ /home/chleroy/linux-powerpc/lib/vdso/getrandom.c: In function '__cvdso_getrandom_data': /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:76:23: error: implicit declaration of function 'min_t' [-Werror=implicit-function-declaration] 76 | ssize_t ret = min_t(size_t, INT_MAX & PAGE_MASK /* = MAX_RW_COUNT */, len); | ^ /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:76:29: error: expected expression before 'size_t' 76 | ssize_t ret = min_t(size_t, INT_MAX & PAGE_MASK /* = MAX_RW_COUNT */, len); | ^~ In file included from ./include/linux/array_size.h:5, from 
/home/chleroy/linux-powerpc/lib/vdso/getrandom.c:6: ./include/linux/compiler.h:243:33: error: implicit declaration of function 'BUILD_BUG_ON_ZERO' [-Werror=implicit-function-declaration] 243 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) | ^ ./include/linux/array_size.h:11:59: note: in expansion of macro '__must_be_array' 11 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) | ^~~ /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:89:40: note: in expansion of macro 'ARRAY_SIZE' 89 | for (size_t i = 0; i < ARRAY_SIZE(params->reserved); ++i) |^~ /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:196:27: error: expected expression before 'size_t' 196 | batch_len = min_t(size_t, sizeof(state->batch) - state->pos, len); | ^~ /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:247:9: error: implicit declaration of function 'BUILD_BUG_ON' [-Werror=implicit-function-declaration] 247 | BUILD_BUG_ON(sizeof(state->batch_key) % CHACHA_BLOCK_SIZE != 0); | ^~~~ cc1: some warnings being treated as errors make[2]: *** [arch/powerpc/kernel/vdso/Makefile:93: arch/powerpc/kernel/vdso/vgetrandom-32.o] Error 1 make[1]: *** [arch/powerpc/Makefile:388: vdso_prepare] Error 2 make: *** [Makefile:224: __sub-make] Error 2 Same for ARRAY_SIZE(->reserved) by the way, easy to do opencode, we also have it only once I have a similar issue to figure out why linux/array_size.h and uapi/linux/random.h are needed. It seems that I can build the object without them. Could you please explain? 
Without linux/array_size.h: VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o In file included from : /home/chleroy/linux-powerpc/lib/vdso/getrandom.c: In function '__cvdso_getrandom_data': /home/chleroy/linux-powerpc/lib/vdso/getrandom.c:89:40: error: implicit declaration of function 'ARRAY_SIZE' [-Werror=implicit-function-declaration] 89 | for (size_t i = 0; i < ARRAY_SIZE(params->reserved); ++i) |^~ cc1: some warnings being treated as errors make[2]: *** [arch/powerpc/kernel/vdso/Makefile:93: arch/powerpc/kernel/vdso/vgetrandom-32.o] Error 1 make[1]: *** [arch/powerpc/Makefile:388: vdso_prepare] Error 2 make: *** [Makefile:224: __sub-make] Error 2 Without uapi/linux/random.h: VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o In file included from : /home/chleroy/linux-powerpc/l
Re: [PATCH] powerpc: Use printk instead of WARN in change_memory_attr
On 27/08/2024 at 11:12, Ritesh Harjani (IBM) wrote:

Use pr_warn_once instead of WARN_ON_ONCE as discussed here [1] for printing possible use of set_memory_* on linear map on Hash.

[1]: https://lore.kernel.org/all/877cc2fpi2.fsf@mail.lhotse/#t

Signed-off-by: Ritesh Harjani (IBM)
---
 arch/powerpc/mm/pageattr.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index ac22bf28086f..c8c2d664c6f3 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -94,8 +94,11 @@ int change_memory_attr(unsigned long addr, int numpages, long action)
 	if (!radix_enabled()) {
 		int region = get_region_id(addr);

-		if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
+		if (region != VMALLOC_REGION_ID && region != IO_REGION_ID) {
+			pr_warn_once("%s: possible use of set_memory_* on linear map on Hash from (%ps)\n",
+				     __func__, __builtin_return_address(0));

Is it really only the linear map? What about "possible use of set_memory_* outside of vmalloc or io region"?

Maybe a show_stack() would also be worth it?

But in principle I think it would be better to keep the WARN_ON_ONCE until we can add __must_check to the set_memory_xxx() functions, to be sure all callers check the result, as mandated by https://github.com/KSPP/linux/issues/7

Christophe
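For readers unfamiliar with __must_check: in the kernel it maps to the compiler's warn_unused_result attribute, so a caller that ignores the return value gets a build-time warning. The sketch below is a user-space approximation under stated assumptions: `demo_must_check`, `enum demo_region` and `demo_change_memory_attr()` are invented names, and returning -EINVAL for a region outside vmalloc/IO is an assumption of this sketch rather than something shown in the quoted hunk.

```c
#include <errno.h>

/* User-space stand-in for the kernel's __must_check (GCC/Clang attribute). */
#define demo_must_check __attribute__((warn_unused_result))

/* Hypothetical region identifiers for this sketch. */
enum demo_region { DEMO_LINEAR, DEMO_VMALLOC, DEMO_IO };

/*
 * Refuses to operate outside the vmalloc and IO regions.
 * Because of demo_must_check, a caller that drops the result
 * triggers a compiler warning instead of failing silently.
 */
static demo_must_check int demo_change_memory_attr(enum demo_region region)
{
	if (region != DEMO_VMALLOC && region != DEMO_IO)
		return -EINVAL;
	return 0;
}
```

A caller would then be expected to write something like `int err = demo_change_memory_attr(region); if (err) ...`, which is the property the reply wants guaranteed before the loud warning is downgraded.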
Re: [PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK
Le 27/08/2024 à 11:59, Arnd Bergmann a écrit : On Tue, Aug 27, 2024, at 10:40, Jason A. Donenfeld wrote: I don't love this, but it might be the lesser of evils, so sure, let's do it. I think I'll combine these header fixups so that the whole operation is a bit more clear. The commit is still pretty small. Something like below: From 0d9a3d68cd6222395a605abd0ac625c41d4cabfa Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Tue, 27 Aug 2024 09:31:47 +0200 Subject: [PATCH] random: vDSO: clean header inclusion in getrandom Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel is problematic when some system headers are included. Minimise the amount of headers by moving needed items, such as __{get,put}_unaligned_t, into dedicated common headers and in general use more specific headers, similar to what was done in commit 8165b57bca21 ("linux/const.h: Extract common header for vDSO") and commit 8c59ab839f52 ("lib/vdso: Enable common headers"). On some architectures this results in missing PAGE_SIZE, as was described by commit 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64"), so define this if necessary, in the same way as done prior by commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h"). Removing linux/time64.h leads to missing 'struct timespec64' in x86's asm/pvclock.h. Add a forward declaration of that struct in that file. Signed-off-by: Christophe Leroy Signed-off-by: Jason A. Donenfeld This is clearly better, but there are still a couple of inaccuracies that may end up biting us again later. Not sure whether it's worth trying to fix it all at once or if we want to address them when that happens: #include -#include -#include -#include +#include These are still two headers outside of the vdso/ namespace. For arm64 we had concluded that this is never safe, and any vdso header should only include other vdso headers so we never pull in anything that e.g. 
depends on memory management headers that are in turn broken for the compat vdso. The array_size.h header is really small, so that one could probably just be moved into the vdso/ namespace. The minmax.h header is already rather complex, so it may be better to just open-code the usage of MIN/MAX where needed? It is used in two places only, so yes, we can do that. Same for ARRAY_SIZE(->reserved) by the way, it is easy to open-code and we also have it only once. #include #include +#include #include -#include -#include #include +#include + +#undef PAGE_SIZE +#undef PAGE_MASK +#define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE - 1)) Since these are now the same across all architectures, maybe we can just have the PAGE_SIZE definitions in a vdso header instead and include that from asm/page.h. I gave it a quick look yesterday, there are still some subtleties between architectures. For instance, most architectures use 1UL for the shift but powerpc uses 1 and has the following comment: /* * Subtle: (1 << PAGE_SHIFT) is an int, not an unsigned long. So if we * assign PAGE_MASK to a larger type it gets extended the way we want * (i.e. with 1s in the high bits) */ So we'll have to look at all this carefully when we want something common, or am I missing something? Including uapi/linux/mman.h may still be problematic on some architectures if they change it in a way that is incompatible with compat vdso, but at least that can't accidentally rely on CONFIG_64BIT or something else that would be wrong there. Yes, that one is tricky: uapi/linux/mman.h includes asm/mman.h with the intention of including uapi/asm/mman.h, but when built from the kernel you actually get arch/powerpc/include/asm/mman.h, so I had to add some ifdefery to kick out the kernel oddities it contains that pull in additional kernel headers. 
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 17a77d47ed6d..42a51a993d94 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -6,7 +6,7 @@ #include -#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) && !defined(BUILD_VDSO) #include #include Christophe
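The int versus unsigned long subtlety quoted from the powerpc comment can be reproduced in plain user-space C. The following is only a sketch: DEMO_PAGE_SHIFT and the helper names are hypothetical stand-ins (not kernel identifiers), and fixed-width 32-bit types are used to imitate what a 32-bit build would see. It shows that a mask built from an int sign-extends when widened to 64 bits, while a mask built from a 32-bit unsigned type zero-extends.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* hypothetical stand-in for CONFIG_PAGE_SHIFT */

/* Mask computed in a signed 32-bit int, as (1 << PAGE_SHIFT) would be. */
static inline uint64_t demo_mask_from_int(void)
{
	int32_t page_size = 1 << DEMO_PAGE_SHIFT;
	int32_t mask = ~(page_size - 1);	/* negative value, bits 0xfffff000 */

	return (uint64_t)mask;			/* sign-extended: high bits are 1s */
}

/* Mask computed in an unsigned 32-bit type, like 1UL on a 32-bit build. */
static inline uint64_t demo_mask_from_u32(void)
{
	uint32_t page_size = UINT32_C(1) << DEMO_PAGE_SHIFT;
	uint32_t mask = ~(page_size - 1);	/* 0xfffff000, unsigned */

	return (uint64_t)mask;			/* zero-extended: high bits are 0s */
}

/* Usage: the two masks differ only in the upper 32 bits. */
static inline void demo_page_mask_check(void)
{
	assert(demo_mask_from_int() == UINT64_C(0xfffffffffffff000));
	assert(demo_mask_from_u32() == UINT64_C(0x00000000fffff000));
}
```

Whether this difference matters depends on whether PAGE_MASK is ever applied to a type wider than unsigned long, which is the case the powerpc comment is guarding against.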
Re: [PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK
On 27/08/2024 at 10:40, Jason A. Donenfeld wrote: I don't love this, but it might be the lesser of evils, so sure, let's do it. I don't love it either but I still prefer it to: #ifndef PAGE_SIZE #define PAGE_SIZE #define PAGE_MASK #endif At least we are sure that every architecture gets the same definitions, independent of the selected options, and we know what we get. For instance, on x86, when you select CONFIG_HYPERV_TIMER you get asm/hyperv_timer.h, which indirectly pulls in page.h. But when CONFIG_HYPERV_TIMER is not selected you don't get asm/hyperv_timer.h. I think I'll combine these header fixups so that the whole operation is a bit more clear. The commit is still pretty small. Something like below: Looks good to me. Thanks Christophe From 0d9a3d68cd6222395a605abd0ac625c41d4cabfa Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Tue, 27 Aug 2024 09:31:47 +0200 Subject: [PATCH] random: vDSO: clean header inclusion in getrandom Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel is problematic when some system headers are included. Minimise the amount of headers by moving needed items, such as __{get,put}_unaligned_t, into dedicated common headers and in general use more specific headers, similar to what was done in commit 8165b57bca21 ("linux/const.h: Extract common header for vDSO") and commit 8c59ab839f52 ("lib/vdso: Enable common headers"). On some architectures this results in missing PAGE_SIZE, as was described by commit 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64"), so define this if necessary, in the same way as done prior by commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h"). Removing linux/time64.h leads to missing 'struct timespec64' in x86's asm/pvclock.h. Add a forward declaration of that struct in that file. Signed-off-by: Christophe Leroy Signed-off-by: Jason A. 
Donenfeld --- arch/x86/include/asm/pvclock.h | 1 + include/asm-generic/unaligned.h | 11 +-- include/vdso/helpers.h | 1 + include/vdso/unaligned.h| 15 +++ lib/vdso/getrandom.c| 13 - 5 files changed, 26 insertions(+), 15 deletions(-) create mode 100644 include/vdso/unaligned.h diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h index 0c92db84469d..6e4f8fae3ce9 100644 --- a/arch/x86/include/asm/pvclock.h +++ b/arch/x86/include/asm/pvclock.h @@ -5,6 +5,7 @@ #include #include +struct timespec64; /* some helper functions for xen and kvm pv clock sources */ u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src); u64 pvclock_clocksource_read_nowd(struct pvclock_vcpu_time_info *src); diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h index a84c64e5f11e..95acdd70b3b2 100644 --- a/include/asm-generic/unaligned.h +++ b/include/asm-generic/unaligned.h @@ -8,16 +8,7 @@ */ #include #include - -#define __get_unaligned_t(type, ptr) ({ \ - const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ - __pptr->x; \ -}) - -#define __put_unaligned_t(type, val, ptr) do { \ - struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ - __pptr->x = (val); \ -} while (0) +#include #define get_unaligned(ptr)__get_unaligned_t(typeof(*(ptr)), (ptr)) #define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr)) diff --git a/include/vdso/helpers.h b/include/vdso/helpers.h index 73501149439d..3ddb03bb05cb 100644 --- a/include/vdso/helpers.h +++ b/include/vdso/helpers.h @@ -4,6 +4,7 @@ #ifndef __ASSEMBLY__ +#include #include static __always_inline u32 vdso_read_begin(const struct vdso_data *vd) diff --git a/include/vdso/unaligned.h b/include/vdso/unaligned.h new file mode 100644 index ..eee3d2a4dbe4 --- /dev/null +++ b/include/vdso/unaligned.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __VDSO_UNALIGNED_H +#define __VDSO_UNALIGNED_H + +#define __get_unaligned_t(type, ptr) ({ \ + 
const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x; \ +}) + +#define __put_unaligned_t(type, val, ptr) do { \ + struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x = (val); \ +} while (0) + +#endif /* __VDSO_UNALIGNED_H */ diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c index 1281fa3546c2..938ca539aaa6 100644 --- a/lib/vdso/getrandom.c +++
Re: [PATCH 2/4] random: vDSO: Don't use PAGE_SIZE and PAGE_MASK
On 27/08/2024 at 09:49, Jason A. Donenfeld wrote: On Tue, Aug 27, 2024 at 09:31:48AM +0200, Christophe Leroy wrote: - ssize_t ret = min_t(size_t, INT_MAX & PAGE_MASK /* = MAX_RW_COUNT */, len); + const unsigned long page_size = 1UL << CONFIG_PAGE_SHIFT; + const unsigned long page_mask = ~(page_size - 1); + ssize_t ret = min_t(size_t, INT_MAX & page_mask /* = MAX_RW_COUNT */, len); I'm really not a fan of making the code less idiomatic... Ok, I have another idea, let's give it a try. An easy solution would be to define PAGE_SIZE and PAGE_MASK in vDSO when they do not exist already, but this can be misleading. Why would what tglx and I suggested be misleading? That seems pretty normal... Are you worried they might mismatch somehow? (If so, why?) All architectures have their own definition; they are all based on CONFIG_PAGE_SHIFT and should give the same value, but with some subtleties. The best would be to have an asm-generic definition of PAGE_SIZE and PAGE_MASK that all architectures use, but that's another level of work. tglx or you suggested putting the definition in one of the vdso headers instead of directly in getrandom.c. This is too fragile because PAGE_SIZE might be absent in that header but arrive in getrandom.c through another header. Christophe
[PATCH] random: vDSO: Redefine PAGE_SIZE and PAGE_MASK
Using PAGE_SIZE and PAGE_MASK in VDSO requires inclusion of page.h and it creates several problems, see commit 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64") and commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h"). Redefine PAGE_SIZE and PAGE_MASK based on CONFIG_PAGE_SHIFT. Signed-off-by: Christophe Leroy --- v3: Use local consts instead of _PAGE_SIZE and _PAGE_MASK macros that are already defined by some architectures. v4: undefine and redefine PAGE_SIZE and PAGE_MASK --- lib/vdso/getrandom.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c index f1643656d0b0..e5968ed141cb 100644 --- a/lib/vdso/getrandom.c +++ b/lib/vdso/getrandom.c @@ -14,6 +14,11 @@ #include #include +#undef PAGE_SIZE +#undef PAGE_MASK +#define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE - 1)) + #define MEMCPY_AND_ZERO_SRC(type, dst, src, len) do { \ while (len >= sizeof(type)) { \ __put_unaligned_t(type, __get_unaligned_t(type, src), dst); \ -- 2.44.0
[PATCH 4/4] random: vDSO: don't use 64 bits atomics on 32 bits architectures
Performing SMP atomic operations on u64 fails on powerpc32: CC drivers/char/random.o In file included from : drivers/char/random.c: In function 'crng_reseed': ././include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_391' declared with attribute error: Need native word sized stores/loads for atomicity. 510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ ././include/linux/compiler_types.h:491:25: note: in definition of macro '__compiletime_assert' 491 | prefix ## suffix(); \ | ^~ ././include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert' 510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~ ././include/linux/compiler_types.h:513:9: note: in expansion of macro 'compiletime_assert' 513 | compiletime_assert(__native_word(t), \ | ^~ ./arch/powerpc/include/asm/barrier.h:74:9: note: in expansion of macro 'compiletime_assert_atomic_type' 74 | compiletime_assert_atomic_type(*p); \ | ^~ ./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release' 172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0) | ^~~ drivers/char/random.c:286:9: note: in expansion of macro 'smp_store_release' 286 | smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1); | ^ In the random driver, the generation is handled as an unsigned long, not a u64; see for instance base_crng or struct crng. In the vDSO, however, it needs to be a u64 and not just an unsigned long, because a 32-bit vDSO can be used with a 64-bit kernel. On the random driver side it is an unsigned long, hence a 32-bit value on 32-bit architectures, so just cast it to unsigned long for the smp_store_release(). A side effect is that on big-endian architectures the store will be performed in the upper 32 bits. This is not an issue on its own because the vDSO side does not care about the value, it only checks for differences. 
Just make sure that the vDSO side checks the full 64 bits; for that, the local current_generation has to be a u64 as well. Signed-off-by: Christophe Leroy --- v3: Cast to unsigned long in random and use u64 in vDSO instead of changing generation field to unsigned long --- drivers/char/random.c | 9 - lib/vdso/getrandom.c | 2 +- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 77968309e2c2..dc9bab51e74d 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -282,8 +282,15 @@ static void crng_reseed(struct work_struct *work) * former to arrive at the latter. Use smp_store_release so that this * is ordered with the write above to base_crng.generation. Pairs with * the smp_rmb() before the syscall in the vDSO code. +* +* Cast to unsigned long for 32 bits architectures as atomic 64 bits +* operations are not supported on those architectures. Anyway +* base_crng.generation is a 32 bits value so it is ok. On big endian +* architectures it will be stored in the upper 32 bits but that's ok +* because the vDSO side only checks whether the value changed, it +* doesn't use or interpret the value. */ - smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1); + smp_store_release((unsigned long *)&__arch_get_k_vdso_rng_data()->generation, next_gen + 1); #endif if (!static_branch_likely(&crng_is_ready)) crng_init = CRNG_READY; diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c index 5d79663b026b..8027b2711b69 100644 --- a/lib/vdso/getrandom.c +++ b/lib/vdso/getrandom.c @@ -69,7 +69,7 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_ struct vgetrandom_state *state = opaque_state; size_t batch_len, nblocks, orig_len = len; bool in_use, have_retried = false; - unsigned long current_generation; + u64 current_generation; void *orig_buffer = buffer; u32 counter[2] = { 0 }; -- 2.44.0
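The reasoning in this commit message (a 32-bit store into a 64-bit field is fine as long as the reader only compares the full 64 bits) can be illustrated with a user-space sketch. Everything here is a hypothetical stand-in: struct demo_rng_data and the demo_* helpers are not kernel names, memcpy() replaces the cast so the example stays within ISO C aliasing rules, and the memory ordering provided by smp_store_release() is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the shared vDSO data; only this field matters. */
struct demo_rng_data {
	uint64_t generation;
};

/*
 * Writer side: store only 32 bits at the start of the 64-bit field.
 * On little endian this lands in the low half, on big endian in the
 * upper half, which is the side effect the commit message describes.
 */
static inline void demo_store_generation(struct demo_rng_data *d, uint32_t gen)
{
	memcpy(&d->generation, &gen, sizeof(gen));
}

/* Reader side: compare the whole 64-bit field with the cached copy. */
static inline bool demo_generation_changed(const struct demo_rng_data *d,
					   uint64_t cached)
{
	return d->generation != cached;
}

/* Usage: a change is detected on either endianness, the value is never interpreted. */
static inline void demo_generation_check(void)
{
	struct demo_rng_data d = { 0 };
	uint64_t cached = d.generation;

	demo_store_generation(&d, 1);
	assert(demo_generation_changed(&d, cached));

	cached = d.generation;
	assert(!demo_generation_changed(&d, cached));

	demo_store_generation(&d, 2);
	assert(demo_generation_changed(&d, cached));
}
```

The sketch deliberately never asserts the numeric value of the 64-bit field, since that value depends on byte order; only the changed or unchanged outcome is portable.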
[PATCH 3/4] random: vDSO: Clean header inclusion in getrandom
Building a VDSO32 on a 64 bits kernel is problematic when some system headers are included. See commit 8c59ab839f52 ("lib/vdso: Enable common headers") for more details. Minimise the amount of headers by moving needed items into dedicated common headers. Removing linux/time64.h leads to missing 'struct timespec64' in x86's asm/pvclock.h. Add a forward declaration of that struct in that file. Signed-off-by: Christophe Leroy --- v3: Split PAGE_SIZE/PAGE_MASK subject in another patch and explained the forward declaration of 'struct timespec64' in commit message. --- arch/x86/include/asm/pvclock.h | 1 + include/vdso/helpers.h | 1 + lib/vdso/getrandom.c | 8 +++- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h index 0c92db84469d..6e4f8fae3ce9 100644 --- a/arch/x86/include/asm/pvclock.h +++ b/arch/x86/include/asm/pvclock.h @@ -5,6 +5,7 @@ #include #include +struct timespec64; /* some helper functions for xen and kvm pv clock sources */ u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src); u64 pvclock_clocksource_read_nowd(struct pvclock_vcpu_time_info *src); diff --git a/include/vdso/helpers.h b/include/vdso/helpers.h index 73501149439d..3ddb03bb05cb 100644 --- a/include/vdso/helpers.h +++ b/include/vdso/helpers.h @@ -4,6 +4,7 @@ #ifndef __ASSEMBLY__ +#include #include static __always_inline u32 vdso_read_begin(const struct vdso_data *vd) diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c index 5874e3072bfe..5d79663b026b 100644 --- a/lib/vdso/getrandom.c +++ b/lib/vdso/getrandom.c @@ -4,15 +4,13 @@ */ #include -#include -#include -#include +#include #include #include +#include #include -#include -#include #include +#include #define MEMCPY_AND_ZERO_SRC(type, dst, src, len) do { \ while (len >= sizeof(type)) { \ -- 2.44.0
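The forward declaration added to asm/pvclock.h relies on a general C rule: a prototype that only passes a pointer to a struct does not need the struct's full definition. A minimal sketch of that rule follows; struct demo_timespec64 and demo_is_set() are hypothetical names chosen for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_timespec64;	/* forward declaration: the type is incomplete here */

/* A prototype may take a pointer to an incomplete type. */
int demo_is_set(const struct demo_timespec64 *ts);

/* Only code that dereferences the pointer needs the full definition. */
struct demo_timespec64 {
	long long tv_sec;
	long tv_nsec;
};

int demo_is_set(const struct demo_timespec64 *ts)
{
	return ts != NULL && (ts->tv_sec != 0 || ts->tv_nsec != 0);
}

/* Usage example for the declaration above. */
static inline void demo_forward_decl_check(void)
{
	struct demo_timespec64 ts = { 1, 0 };

	assert(demo_is_set(&ts));
	assert(!demo_is_set(NULL));
}
```

This is why a header can drop a heavy include and keep compiling, provided it only declares functions that take pointers to the type.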
[PATCH 2/4] random: vDSO: Don't use PAGE_SIZE and PAGE_MASK
Using PAGE_SIZE and PAGE_MASK in VDSO requires inclusion of page.h and it creates several problems, see commit 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64") and commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h"). An easy solution would be to define PAGE_SIZE and PAGE_MASK in vDSO when they do not exist already, but this can be misleading. So follow the same approach as commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h") and exclusively use CONFIG_PAGE_SHIFT. To avoid too much ugliness, define local consts that contain the calculated page size and page mask. Signed-off-by: Christophe Leroy --- v3: Use local consts instead of _PAGE_SIZE and _PAGE_MASK macros that are already defined by some architectures. --- lib/vdso/getrandom.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c index f1643656d0b0..5874e3072bfe 100644 --- a/lib/vdso/getrandom.c +++ b/lib/vdso/getrandom.c @@ -65,7 +65,9 @@ static __always_inline ssize_t __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_t len, unsigned int flags, void *opaque_state, size_t opaque_len) { - ssize_t ret = min_t(size_t, INT_MAX & PAGE_MASK /* = MAX_RW_COUNT */, len); + const unsigned long page_size = 1UL << CONFIG_PAGE_SHIFT; + const unsigned long page_mask = ~(page_size - 1); + ssize_t ret = min_t(size_t, INT_MAX & page_mask /* = MAX_RW_COUNT */, len); struct vgetrandom_state *state = opaque_state; size_t batch_len, nblocks, orig_len = len; bool in_use, have_retried = false; @@ -84,7 +86,7 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_ } /* The state must not straddle a page, since pages can be zeroed at any time. 
*/ - if (unlikely(((unsigned long)opaque_state & ~PAGE_MASK) + sizeof(*state) > PAGE_SIZE)) + if (unlikely(((unsigned long)opaque_state & ~page_mask) + sizeof(*state) > page_size)) return -EFAULT; /* Handle unexpected flags by falling back to the kernel. */ -- 2.44.0
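The two expressions touched by this patch, the length clamp and the page-straddle check, can be tried outside the kernel. The sketch below assumes a 4 KiB page through a hypothetical DEMO_PAGE_SHIFT macro standing in for CONFIG_PAGE_SHIFT, and the demo_* function names are illustrative only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* hypothetical stand-in for CONFIG_PAGE_SHIFT */

/* True if an object of obj_size bytes starting at addr crosses a page boundary. */
static inline bool demo_straddles_page(uintptr_t addr, size_t obj_size)
{
	const unsigned long page_size = 1UL << DEMO_PAGE_SHIFT;
	const unsigned long page_mask = ~(page_size - 1);

	/* addr & ~page_mask is the offset of addr inside its page. */
	return (addr & ~page_mask) + obj_size > page_size;
}

/* Clamp a request to INT_MAX rounded down to a page boundary. */
static inline size_t demo_clamp_len(size_t len)
{
	const unsigned long page_size = 1UL << DEMO_PAGE_SHIFT;
	const unsigned long page_mask = ~(page_size - 1);
	const size_t max_rw_count = INT_MAX & page_mask;

	return len < max_rw_count ? len : max_rw_count;
}

/* Usage: objects ending exactly on a page boundary do not straddle. */
static inline void demo_page_check(void)
{
	assert(!demo_straddles_page(0x1ff0, 16));
	assert(demo_straddles_page(0x1ff0, 17));
	assert(demo_clamp_len(100) == 100);
}
```

With a 32-bit int and a 4 KiB page, the clamp evaluates to 0x7ffff000, the largest page-aligned value that still fits in an int.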
[PATCH 1/4] asm-generic/unaligned.h: Extract common header for vDSO
The getrandom vDSO implementation requires __get_unaligned_t() and __put_unaligned_t(), but including asm-generic/unaligned.h pulls in too many other headers. Follow the same approach as for most things in include/vdso/, see for instance commit 8165b57bca21 ("linux/const.h: Extract common header for vDSO"): Move __get_unaligned_t and __put_unaligned_t into a new unaligned.h living in the vdso/ include directory. Signed-off-by: Christophe Leroy --- include/asm-generic/unaligned.h | 11 +-- include/vdso/unaligned.h| 15 +++ 2 files changed, 16 insertions(+), 10 deletions(-) create mode 100644 include/vdso/unaligned.h diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h index a84c64e5f11e..95acdd70b3b2 100644 --- a/include/asm-generic/unaligned.h +++ b/include/asm-generic/unaligned.h @@ -8,16 +8,7 @@ */ #include #include - -#define __get_unaligned_t(type, ptr) ({ \ - const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ - __pptr->x; \ -}) - -#define __put_unaligned_t(type, val, ptr) do { \ - struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ - __pptr->x = (val); \ -} while (0) +#include #define get_unaligned(ptr) __get_unaligned_t(typeof(*(ptr)), (ptr)) #define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr)) diff --git a/include/vdso/unaligned.h b/include/vdso/unaligned.h new file mode 100644 index ..eee3d2a4dbe4 --- /dev/null +++ b/include/vdso/unaligned.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __VDSO_UNALIGNED_H +#define __VDSO_UNALIGNED_H + +#define __get_unaligned_t(type, ptr) ({ \ + const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x; \ +}) + +#define __put_unaligned_t(type, val, ptr) do { \ + struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ + __pptr->x = (val); \ +} while (0) + +#endif /* __VDSO_UNALIGNED_H */ -- 2.44.0
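For readers unfamiliar with the two macros being moved, here is a user-space sketch of how they are used. It is not the kernel header: __packed is defined locally as an assumption, __typeof__ is substituted for typeof so the code also builds in strict ISO modes, and GNU C extensions (statement expressions, the packed attribute) are required, so GCC or Clang is assumed. The demo_roundtrip() helper is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the kernel defines __packed elsewhere; define it here if missing. */
#ifndef __packed
#define __packed __attribute__((__packed__))
#endif

/* Same shape as the macros in the patch, with __typeof__ instead of typeof. */
#define __get_unaligned_t(type, ptr) ({						\
	const struct { type x; } __packed *__pptr = (__typeof__(__pptr))(ptr);	\
	__pptr->x;								\
})

#define __put_unaligned_t(type, val, ptr) do {					\
	struct { type x; } __packed *__pptr = (__typeof__(__pptr))(ptr);	\
	__pptr->x = (val);							\
} while (0)

/* Store a 32-bit value at an odd address and read it back. */
static inline uint32_t demo_roundtrip(uint32_t value)
{
	unsigned char buf[8] = { 0 };

	__put_unaligned_t(uint32_t, value, buf + 1);	/* misaligned store */
	return __get_unaligned_t(uint32_t, buf + 1);	/* misaligned load */
}

/* Usage example. */
static inline void demo_unaligned_check(void)
{
	assert(demo_roundtrip(UINT32_C(0xdeadbeef)) == UINT32_C(0xdeadbeef));
}
```

The packed single-member struct tells the compiler that the member may sit at any byte offset, so it emits whatever access sequence the target needs instead of assuming natural alignment.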
[PATCH 0/4] Fixups for random vDSO
This small series is an extract of the fixups for the generic part of the random vDSO, in preparation for implementing vDSO getrandom for powerpc. See the latest version of the full series at: https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1724309198.git.christophe.le...@csgroup.eu/ This series is based on top of: https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git master Christophe Leroy (4): asm-generic/unaligned.h: Extract common header for vDSO random: vDSO: Don't use PAGE_SIZE and PAGE_MASK random: vDSO: Clean header inclusion in getrandom random: vDSO: don't use 64 bits atomics on 32 bits architectures arch/x86/include/asm/pvclock.h | 1 + drivers/char/random.c | 9 - include/asm-generic/unaligned.h | 11 +-- include/vdso/helpers.h | 1 + include/vdso/unaligned.h| 15 +++ lib/vdso/getrandom.c| 16 6 files changed, 34 insertions(+), 19 deletions(-) create mode 100644 include/vdso/unaligned.h -- 2.44.0