Re: arm64 BTI support for mpg123
Mark Kettenis wrote: > I'm not sure to what extent this makes IBT less effective. Can the > retpolines be used as gadgets to bypass IBT? Should we stop enabling > retpolines by default? > > What *is* obvious is that retpolines are incompatible wuth shadow > stacks. Is there an alternative that doesn't replace the indirect > branch with a return instruction? It is clear however that both AMD and Intel have seperate (yet compatible) strategies to encourage you to buy newer chips.
Re: arm64 BTI support for mpg123
> Date: Tue, 25 Jul 2023 16:51:18 +0200 > From: Christian Weisgerber > > Christian Weisgerber: > > > Because amd64 should suffer from the same problem: > > > > fr->cpu_opts.the_dct36 = dct36_avx; > > ... > > fr->cpu_opts.the_dct36 = dct36_x86_64; > > > > dct36_avx and dct36_x86_64 are assembly routines that lack the > > endbr64 landing pad. And yet, on my IBT-enabled amd64 laptop, > > mpg123 plays just fine with both the avx and x86_64 decoders. > > I have examined the generated assembly on the calling side. There > is no "jmp *%r11" or such. Instead, calling the function pointer > goes through __llvm_retpoline_r11: > > <__llvm_retpoline_r11>: >0: e8 0b 00 00 00 callq 10 <__llvm_retpoline_r11+0x10> >5: f3 90 pause >7: 0f ae e8lfence %eax >a: e9 f6 ff ff ff jmpq 5 <__llvm_retpoline_r11+0x5> >f: cc int3 > 10: 4c 89 1c 24 mov%r11,(%rsp) > 14: c3 retq > 15: 0f ae e8lfence %eax Thanks! That means that retpolines explain why you're not seeing any crashes.
Re: arm64 BTI support for mpg123
Christian Weisgerber: > Because amd64 should suffer from the same problem: > > fr->cpu_opts.the_dct36 = dct36_avx; > ... > fr->cpu_opts.the_dct36 = dct36_x86_64; > > dct36_avx and dct36_x86_64 are assembly routines that lack the > endbr64 landing pad. And yet, on my IBT-enabled amd64 laptop, > mpg123 plays just fine with both the avx and x86_64 decoders. I have examined the generated assembly on the calling side. There is no "jmp *%r11" or such. Instead, calling the function pointer goes through __llvm_retpoline_r11: <__llvm_retpoline_r11>: 0: e8 0b 00 00 00 callq 10 <__llvm_retpoline_r11+0x10> 5: f3 90 pause 7: 0f ae e8lfence %eax a: e9 f6 ff ff ff jmpq 5 <__llvm_retpoline_r11+0x5> f: cc int3 10: 4c 89 1c 24 mov%r11,(%rsp) 14: c3 retq 15: 0f ae e8lfence %eax -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: arm64 BTI support for mpg123
> From: "Theo de Raadt" > Date: Tue, 25 Jul 2023 08:23:14 -0600 > > Christian Weisgerber wrote: > > > Mark Kettenis: > > > > > This port has some infrastructure to use an optimized function that > > > uses a function pointer. Not sure why for arm64 it actually uses that > > > infrastructure, since the only alternative is the generic C > > > implementation. But adding a BTI instruction is the easiest fix. > > > > ok naddy@ > > > > The question is whether there are any others hiding in there. How > > did you find this in the first place? It broke on the M2? Yes, it broke on the M2. Very few people are actually running packages on M2, so there may be more packages that are broken. > > Because amd64 should suffer from the same problem: > > > > fr->cpu_opts.the_dct36 = dct36_avx; > > ... > > fr->cpu_opts.the_dct36 = dct36_x86_64; > > > > dct36_avx and dct36_x86_64 are assembly routines that lack the > > endbr64 landing pad. And yet, on my IBT-enabled amd64 laptop, > > mpg123 plays just fine with both the avx and x86_64 decoders. > > Maybe it requires specific files to go via those code paths? > > Ayways, how will we know all the locations? I think we will eventually > know based upon user reports. > > It feels like there is a compilers or linker option missing to proactively > identify these problems. Please don't bring up LTO. Well, this is probably the wrong place to bring this up, but... ...the most likely reason why this doesn't break is because we enable repolines by default. If you look at the retpoline implementation: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/retpoline-branch-target-injection-mitigation.html you'll see that it replaces an indirect branch ("jmp *%rax") with a sequence that effectively pushes the address to the stack and returns to that address. I'm not sure to what extent this makes IBT less effective. Can the retpolines be used as gadgets to bypass IBT? Should we stop enabling retpolines by default? What *is* obvious is that retpolines are incompatible wuth shadow stacks. Is there an alternative that doesn't replace the indirect branch with a return instruction? I mentioned the retpolines at some point before but I don't think the right people saw that, so adding tech@ and guenther@.
Re: arm64 BTI support for mpg123
Christian Weisgerber wrote: > Mark Kettenis: > > > This port has some infrastructure to use an optimized function that > > uses a function pointer. Not sure why for arm64 it actually uses that > > infrastructure, since the only alternative is the generic C > > implementation. But adding a BTI instruction is the easiest fix. > > ok naddy@ > > The question is whether there are any others hiding in there. How > did you find this in the first place? It broke on the M2? > > Because amd64 should suffer from the same problem: > > fr->cpu_opts.the_dct36 = dct36_avx; > ... > fr->cpu_opts.the_dct36 = dct36_x86_64; > > dct36_avx and dct36_x86_64 are assembly routines that lack the > endbr64 landing pad. And yet, on my IBT-enabled amd64 laptop, > mpg123 plays just fine with both the avx and x86_64 decoders. Maybe it requires specific files to go via those code paths? Ayways, how will we know all the locations? I think we will eventually know based upon user reports. It feels like there is a compilers or linker option missing to proactively identify these problems. Please don't bring up LTO.
Re: arm64 BTI support for mpg123
Mark Kettenis: > This port has some infrastructure to use an optimized function that > uses a function pointer. Not sure why for arm64 it actually uses that > infrastructure, since the only alternative is the generic C > implementation. But adding a BTI instruction is the easiest fix. ok naddy@ The question is whether there are any others hiding in there. How did you find this in the first place? It broke on the M2? Because amd64 should suffer from the same problem: fr->cpu_opts.the_dct36 = dct36_avx; ... fr->cpu_opts.the_dct36 = dct36_x86_64; dct36_avx and dct36_x86_64 are assembly routines that lack the endbr64 landing pad. And yet, on my IBT-enabled amd64 laptop, mpg123 plays just fine with both the avx and x86_64 decoders. -- Christian "naddy" Weisgerber na...@mips.inka.de
arm64 BTI support for mpg123
This port has some infrastructure to use an optimized function that uses a function pointer. Not sure why for arm64 it actually uses that infrastructure, since the only alternative is the generic C implementation. But adding a BTI instruction is the easiest fix. ok? Index: audio/mpg123/Makefile === RCS file: /cvs/ports/audio/mpg123/Makefile,v retrieving revision 1.116 diff -u -p -r1.116 Makefile --- audio/mpg123/Makefile 30 Mar 2023 12:35:39 - 1.116 +++ audio/mpg123/Makefile 24 Jul 2023 19:23:10 - @@ -1,6 +1,7 @@ COMMENT= fast console MPEG audio player and decoder library DISTNAME= mpg123-1.31.3 +REVISION= 0 SHARED_LIBS= mpg123 7.2 # 47.0 SHARED_LIBS+= out123 2.1 # 4.7 SHARED_LIBS+= syn123 0.1 # 1.5 Index: audio/mpg123/patches/patch-src_libmpg123_dct36_neon64_S === RCS file: audio/mpg123/patches/patch-src_libmpg123_dct36_neon64_S diff -N audio/mpg123/patches/patch-src_libmpg123_dct36_neon64_S --- /dev/null 1 Jan 1970 00:00:00 - +++ audio/mpg123/patches/patch-src_libmpg123_dct36_neon64_S 24 Jul 2023 19:23:10 - @@ -0,0 +1,11 @@ +Index: src/libmpg123/dct36_neon64.S +--- src/libmpg123/dct36_neon64.S.orig src/libmpg123/dct36_neon64.S +@@ -49,6 +49,7 @@ dct36_aarch64_COS9: + .type ASM_NAME(dct36_neon64), %function + #endif + ASM_NAME(dct36_neon64): ++ bti c + adrpx5, AARCH64_PCREL_HI(dct36_aarch64_COS9) + add x5, x5, AARCH64_PCREL_LO(dct36_aarch64_COS9) + cmeqv28.16b, v28.16b, v28.16b