https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121013
Bug ID: 121013
Summary: Possible miscompilation triggered by __builtin_stack_address() at `-O3`.
Product: gcc
Version: 15.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: moorabbit at proton dot me
Target Milestone: ---

While working on implementing __builtin_stack_address() for Clang, I noticed
that GCC can produce different results for this builtin depending on whether
optimizations are enabled.

Context
----------

$ ~/gcc-15.1.0/bin/gcc -v
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=~/gcc-15.1.0 --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.1.0 (GCC)

$ cat /tmp/main.c
extern void f(int, int, long, long, long, long, long, long);

void *a() {
    f(1, 2, 3, 4, 5, 6, 7, 8);
    return __builtin_stack_address();
}

With optimizations disabled (-O0)
-------------------------------------------

$ ~/gcc-15.1.0/bin/gcc -O0 -c /tmp/main.c -o /tmp/main.o && objdump -d /tmp/main.o

/tmp/main.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <a>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   6a 08                   push   $0x8
   6:   6a 07                   push   $0x7
   8:   41 b9 06 00 00 00       mov    $0x6,%r9d
   e:   41 b8 05 00 00 00       mov    $0x5,%r8d
  14:   b9 04 00 00 00          mov    $0x4,%ecx
  19:   ba 03 00 00 00          mov    $0x3,%edx
  1e:   be 02 00 00 00          mov    $0x2,%esi
  23:   bf 01 00 00 00          mov    $0x1,%edi
  28:   e8 00 00 00 00          call   2d <a+0x2d>
  2d:   48 83 c4 10             add    $0x10,%rsp
  31:   48 89 e0                mov    %rsp,%rax
  34:   c9                      leave
  35:   c3                      ret

With optimizations enabled (-O3)
-------------------------------------------

$ ~/gcc-15.1.0/bin/gcc -O3 -c /tmp/main.c -o /tmp/main.o && objdump -d /tmp/main.o

/tmp/main.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <a>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   41 b9 06 00 00 00       mov    $0x6,%r9d
   a:   41 b8 05 00 00 00       mov    $0x5,%r8d
  10:   b9 04 00 00 00          mov    $0x4,%ecx
  15:   6a 08                   push   $0x8
  17:   ba 03 00 00 00          mov    $0x3,%edx
  1c:   be 02 00 00 00          mov    $0x2,%esi
  21:   bf 01 00 00 00          mov    $0x1,%edi
  26:   6a 07                   push   $0x7
  28:   e8 00 00 00 00          call   2d <a+0x2d>
  2d:   48 89 e0                mov    %rsp,%rax
  30:   48 83 c4 18             add    $0x18,%rsp
  34:   c3                      ret

Issue
-------

The issue is the order of the `mov %rsp, %rax` and `add _, %rsp` instructions.

At -O0, we first adjust the stack pointer by adding $0x10 to %rsp at <a+0x2d>.
We then save %rsp to %rax at <a+0x31> and return from the procedure.

At -O3, we first save %rsp to %rax at <a+0x2d>. We then adjust the stack
pointer by adding $0x18 to %rsp at <a+0x30> and return from the procedure.

What's the right behavior?
---------------------------------

The comment within the `static rtx expand_builtin_stack_address()` procedure
located in gcc/builtins.cc states:

```[...] the outgoing on-stack arguments pushed temporarily for a call are regarded as part of the callee's stack range, rather than the caller's.```

This makes the codegen at -O3 incorrect because, at the moment when %rsp is
saved (<a+0x2d>), the stack still includes the space used by the temporarily
pushed on-stack arguments. That's not the case at -O0.

-O1 and -O2 have the same problem as -O3.
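
The difference between the two listings can be checked with a little arithmetic. The sketch below is only a model, not GCC code: it tracks %rsp as an offset from its value on entry to a(), assumes the call to f leaves %rsp unchanged on return, and uses function names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets are relative to %rsp on entry to a(); negative means a lower
 * address. The call to f is assumed to be balanced (return address pushed
 * by call, popped by ret), so it does not appear in the model. */

/* -O0 listing: value copied into %rax at <a+0x31>. */
int64_t rax_offset_O0(void)
{
    int64_t rsp = 0;
    rsp -= 8;       /* push %rbp          */
    rsp -= 8;       /* push $0x8  (arg 8) */
    rsp -= 8;       /* push $0x7  (arg 7) */
    rsp += 0x10;    /* add  $0x10,%rsp: pops both stack arguments */
    return rsp;     /* mov  %rsp,%rax     */
}

/* -O3 listing: value copied into %rax at <a+0x2d>. */
int64_t rax_offset_O3(void)
{
    int64_t rsp = 0;
    rsp -= 8;       /* sub  $0x8,%rsp     */
    rsp -= 8;       /* push $0x8  (arg 8) */
    rsp -= 8;       /* push $0x7  (arg 7) */
    return rsp;     /* mov  %rsp,%rax: taken BEFORE add $0x18,%rsp */
}

/* Hypothetical self-check of the model. */
void check_model(void)
{
    assert(rax_offset_O0() == -8);
    assert(rax_offset_O3() == -24);
    /* The gap is exactly the two 8-byte stack arguments. */
    assert(rax_offset_O0() - rax_offset_O3() == 16);
}
```

Under this model the -O3 result is 16 bytes lower than the -O0 result, which matches the size of the two arguments pushed for f and is consistent with the claim that the saved %rsp still covers the outgoing argument area.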