Christophe Leroy <christophe.le...@csgroup.eu> writes: > Le 10/10/2024 à 11:12, Thomas Weißschuh a écrit : >> On Thu, Oct 10, 2024 at 11:00:15AM +0200, Christophe Leroy wrote: >>> Hi Thomas, >>> >>> Le 10/10/2024 à 10:20, Thomas Weißschuh a écrit : >>>> On Wed, Oct 02, 2024 at 10:39:29AM +0200, Christophe Leroy wrote: >>>>> VDSO time functions do not call any other function, so they don't >>>>> need to save/restore LR. However, retrieving the address of VDSO data >>>>> page requires using LR hence saving then restoring it, which can be >>>>> heavy on some CPUs. On the other hand, VDSO functions on powerpc are >>>>> not standard functions and require a wrapper function to call C VDSO >>>>> functions. And that wrapper has to save and restore LR in order to >>>>> call the C VDSO function, so retrieving VDSO data page address in that >>>>> wrapper doesn't require additional save/restore of LR. >>>>> >>>>> For random VDSO functions it is a bit different. Because the function >>>>> calls __arch_chacha20_blocks_nostack(), it saves and restores LR. >>>>> Retrieving VDSO data page address can then be done there without >>>>> additional save/restore of LR. >>>>> >>>>> So lets implement __arch_get_vdso_rng_data() and simplify the wrapper. >>>>> >>>>> It starts paving the way for the day powerpc will implement a more >>>>> standard ABI for VDSO functions. >>>>> >>>>> Signed-off-by: Christophe Leroy <christophe.le...@csgroup.eu> >>>>> --- >>>>> arch/powerpc/include/asm/vdso/getrandom.h | 15 +++++++++++++-- >>>>> arch/powerpc/kernel/asm-offsets.c | 1 - >>>>> arch/powerpc/kernel/vdso/getrandom.S | 1 - >>>>> arch/powerpc/kernel/vdso/vgetrandom.c | 4 ++-- >>>>> 4 files changed, 15 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/arch/powerpc/include/asm/vdso/getrandom.h >>>>> b/arch/powerpc/include/asm/vdso/getrandom.h >>>>> index 501d6bb14e8a..4302e7c67aa5 100644 >>>>> --- a/arch/powerpc/include/asm/vdso/getrandom.h >>>>> +++ b/arch/powerpc/include/asm/vdso/getrandom.h >>>>> @@ -7,6 +7,8 @@ >>>>> #ifndef __ASSEMBLY__ >>>>> +#include <asm/vdso_datapage.h> >>>>> + >>>>> static __always_inline int do_syscall_3(const unsigned long _r0, const >>>>> unsigned long _r3, >>>>> const unsigned long _r4, const >>>>> unsigned long _r5) >>>>> { >>>>> @@ -43,11 +45,20 @@ static __always_inline ssize_t getrandom_syscall(void >>>>> *buffer, size_t len, unsig >>>>> static __always_inline struct vdso_rng_data >>>>> *__arch_get_vdso_rng_data(void) >>>>> { >>>>> - return NULL; >>>>> + struct vdso_arch_data *data; >>>>> + >>>>> + asm( >>>>> + " bcl 20, 31, .+4\n" >>>>> + "0: mflr %0\n" >>>>> + " addis %0, %0, (_vdso_datapage - 0b)@ha\n" >>>>> + " addi %0, %0, (_vdso_datapage - 0b)@l\n" >>>>> + : "=r" (data) :: "lr"); >>>>> + >>>>> + return &data->rng_data; >>>>> } >>>> >>>> Did you also try something like this: >>>> >>>> extern struct vdso_arch_data _vdso_datapage >>>> __attribute__((visibility("hidden"))); >>>> >>>> static __always_inline struct vdso_rng_data *__arch_get_vdso_rng_data(void) >>>> { >>>> return &_vdso_datapage.rng_data; >>>> } >>>> >>>> Not knowing much about ppc asm the resulting assembly looks simpler. >>>> And it would be more in line with what other archs are doing. >>> >>> Did you build it ? >> >> Yes, I couldn't have looked at the asm otherwise. >> >>> I get : >>> >>> VDSO32C arch/powerpc/kernel/vdso/vgetrandom-32.o >>> VDSO32L arch/powerpc/kernel/vdso/vdso32.so.dbg >>> arch/powerpc/kernel/vdso/vdso32.so.dbg: dynamic relocations are not >>> supported >>> make[2]: *** [arch/powerpc/kernel/vdso/Makefile:75: >>> arch/powerpc/kernel/vdso/vdso32.so.dbg] Error 1 >> >> I forgot to enable CONFIG_COMPAT. >> It's only broken for the 32 bit case. >> >>> Current solution gives: >>> >>> 24: 42 9f 00 05 bcl 20,4*cr7+so,28 >>> <__c_kernel_getrandom+0x28> >>> 28: 7e a8 02 a6 mflr r21 >>> 2c: 3e b5 00 00 addis r21,r21,0 >>> 2e: R_PPC_REL16_HA _vdso_datapage+0x6 >>> 30: 3a b5 00 00 addi r21,r21,0 >>> 32: R_PPC_REL16_LO _vdso_datapage+0xa >>> >>> >>> Your solution gives: >>> >>> 60: 3e e0 00 00 lis r23,0 >>> 62: R_PPC_ADDR16_HA _vdso_datapage >>> 64: 3a f7 00 00 addi r23,r23,0 >>> 66: R_PPC_ADDR16_LO _vdso_datapage >>> >>> >>> So yes your solution looks simpler, but relies on absolute addresses set up >>> through dynamic relocation which is not possible because different processes >>> map the same VDSO datapage at different addresses. >> >> Due to visibility("hidden"), the compiler should not emit absolute >> references but PC-relative ones. >> This is how it works for most other architectures, see >> include/vdso/datapage.h. >> >> I'll try to see why this doesn't work for ppc32. > > PC-rel instructions only exist on very very recent powerpc CPUs (power10 ?) Yeah power10 or later.
> On PPC64, ELF ABI v2 requires caller to put called function address in > r12 and it looks like GCC uses that: > > 0000000000000000 <__c_kernel_getrandom>: > 0: 3c 4c 00 00 addis r2,r12,0 > 2: R_PPC64_REL16_HA .TOC.+0x2 > 4: 38 42 00 00 addi r2,r2,0 > 6: R_PPC64_REL16_LO .TOC.+0x6 > ... > 64: 3d 22 00 00 addis r9,r2,0 > 66: R_PPC64_TOC16_HA _vdso_datapage+0x100 > 68: 89 29 00 00 lbz r9,0(r9) > 6a: R_PPC64_TOC16_LO _vdso_datapage+0x100 Setting up r12 is only required for calls to the global entry point (offset 0), local calls can be made to offset 8 and use/don't require r12 to be set. That's because local calls should already have the correct toc pointer in r2. But that's not true in VDSO code. > Which after final link results in: > > 0000000000001060 <__c_kernel_getrandom>: > 1060: 3c 4c 00 01 addis r2,r12,1 > 1064: 38 42 8e a0 addi r2,r2,-29024 > ... > 10c4: 3d 22 ff fc addis r9,r2,-4 > 10c8: 89 29 62 00 lbz r9,25088(r9) The call to __c_kernel_getrandom skips over the r2 setup because it's a local call, even though we haven't setup r2 correctly: 0000000000000758 <__kernel_getrandom>: 758: 91 ff 21 f8 stdu r1,-112(r1) 75c: a6 02 08 7c mflr r0 760: 91 ff 21 f8 stdu r1,-112(r1) 764: 80 00 01 f8 std r0,128(r1) 768: 88 00 41 f8 std r2,136(r1) 76c: 05 00 9f 42 bcl 20,4*cr7+so,770 <__kernel_getrandom+0x18> 770: a6 02 08 7d mflr r8 774: fe ff 08 3d addis r8,r8,-2 778: 90 f8 08 39 addi r8,r8,-1904 77c: fc 00 68 81 lwz r11,252(r8) 780: ff 7f 6b 6d xoris r11,r11,32767 784: ff ff 6b 69 xori r11,r11,65535 788: 34 00 6b 7d cntlzw r11,r11 78c: de 5b 6b 55 rlwinm r11,r11,11,15,15 790: 14 5a 08 7d add r8,r8,r11 794: d8 02 08 39 addi r8,r8,728 798: 41 09 00 48 bl 10d8 <__c_kernel_getrandom+0x8> We could setup r2, but that would only help 64-bit. This also makes me notice that we have a mixture of ELF ABI v1 and v2 code in the VDSO, depending on whether the kernel is building itself ABI v1 or v2: arch/powerpc/kernel/vdso/cacheflush-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/datapage-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/getcpu-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/getrandom-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/gettimeofday-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/note-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/sigtramp64-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/vgetrandom-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, OpenPOWER ELF V2 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/vgetrandom-chacha-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped arch/powerpc/kernel/vdso/vgettimeofday-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, OpenPOWER ELF V2 ABI, version 1 (SYSV), not stripped All the asm files are ABI v1 because they historically were, and don't say otherwise. The C code comes out as ABI v1 or v2 depending on what we're building the kernel as. Which is a bit fishy. cheers