| Field | Value |
| --- | --- |
| Issue | 60972 |
| Summary | [LLC] Tail-call optimization on arm64 macOS generates incorrect spill & reload |
| Labels | new issue |
| Assignees | |
| Reporter | Baltoli |
# Description
I maintain a language runtime that relies on the guaranteed tail-call optimization provided by `llc`. On arm64 Macs, a seemingly correct program that runs successfully on other platforms (x86 Linux and macOS) segfaults because of state corrupted by steps executed earlier in the program[^1].
I believe the issue can be narrowed down to incorrect stack-spilling code around a call to a function with more than 8 arguments. I have minimized it to two LLVM programs that differ only in the number of arguments passed to a tail-called `fastcc` function (8 and 9 respectively). With 8 arguments the program executes correctly; with 9 it segfaults.
With reference to the (heavily minimized) LLVM code and reproduction steps below, I observe the following snippets of assembly code being generated for the failing case:
```asm
_foo:                       ; @foo
    add sp, sp, #16
    ret
_baz:
    ...                     ; abbreviated prelude
    bl  _foo
    str x0, [sp, #8]        ; 8-byte Folded Spill
    sub sp, sp, #16
    bl  _bar
    ldr x0, [sp, #8]        ; 8-byte Folded Reload
    ...                     ; abbreviated postlude
```
Here, the return value of `_foo` is spilled to the stack at `sp + 8`, and the compiler expects to reload it into `x0` from the same slot after the call to `_bar`. **However**, the `sub sp, sp, #16` that undoes the adjustment made by `_foo` sits between the spill and the reload, so `sp + 8` refers to a different address in each.
In the successful program (8 args) below, neither `_foo` nor the shown part of `_baz` adjusts `sp`, so the identical spill and reload offsets address the same slot:
```asm
_foo:                       ; @foo
    ret
_baz:
    ...                     ; abbreviated prelude
    bl  _foo
    str x0, [sp, #8]        ; 8-byte Folded Spill
    bl  _bar
    ldr x0, [sp, #8]        ; 8-byte Folded Reload
    ...                     ; abbreviated postlude
```
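To make the difference between the two listings concrete, the address arithmetic can be modelled in a few lines of Python. This is only a sketch of my reading of the assembly above, not of what `llc` does internally: the function name, the starting `sp` value, and the assumption that `bl _bar` returns with `sp` unchanged are all illustrative.

```python
def spill_and_reload_addresses(sp_after_foo: int, adjustment: int) -> tuple[int, int]:
    """Model the spill/reload pair in _baz.

    sp_after_foo: value of sp when `bl _foo` returns (hypothetical).
    adjustment:   bytes subtracted from sp between spill and reload
                  (16 in the 9-argument listing, 0 in the 8-argument one).
    Assumes `bl _bar` returns with sp unchanged.
    """
    sp = sp_after_foo
    spill = sp + 8        # str x0, [sp, #8]
    sp -= adjustment      # sub sp, sp, #16 (absent in the 8-argument case)
    reload = sp + 8       # ldr x0, [sp, #8]
    return spill, reload


SP = 0x16FDFF000  # hypothetical stack pointer value, chosen only for illustration

bad_spill, bad_reload = spill_and_reload_addresses(SP, 16)
good_spill, good_reload = spill_and_reload_addresses(SP, 0)

print("9 args: spill and reload slots differ by", bad_spill - bad_reload, "bytes")
print("8 args: spill and reload slots match:", good_spill == good_reload)
```

Under these assumptions the 9-argument case reloads from a slot 16 bytes below the one that was written, which matches the size of the adjustment in the listing, while the 8-argument case reads back the slot it wrote.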
Please let me know if there is any more information it would be helpful for me to provide.
# Reproduction
The problem can be reproduced with the LLVM files and shell commands below. My machine is an M1 Mac Mini (full system information below), but I believe the issue can be reproduced on any arm64 Mac.
## Instructions
```console
$ llc -O0 -tailcallopt bad.ll -o bad.s
$ clang bad.s -o bad
$ ./bad
[1] 68198 segmentation fault ./bad
$ echo $?
139
```
```console
$ llc -O0 -tailcallopt good.ll -o good.s
$ clang good.s -o good
$ ./good
$ echo $?
0
```
## Code
### Failing: `bad.ll`
```llvm
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx12.0.0"

define void @bar() {
  ret void
}

define fastcc i64 @foo(i64 %0, i64 %1, i64 %2, i64 %3, i64 %4, i64 %5, i64 %6, i64 %7, i64 %8) {
  ret i64 %0
}

define fastcc i64 @baz() {
entry:
  %0 = tail call fastcc i64 @foo(i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0)
  call void @bar()
  ret i64 %0
}

define i32 @main() {
entry:
  %0 = call i64 @baz()
  %1 = trunc i64 %0 to i32
  ret i32 %1
}
```
### Successful: `good.ll`
```llvm
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx12.0.0"

define void @bar() {
  ret void
}

define fastcc i64 @foo(i64 %0, i64 %1, i64 %2, i64 %3, i64 %4, i64 %5, i64 %6, i64 %7) {
  ret i64 %0
}

define fastcc i64 @baz() {
entry:
  %0 = tail call fastcc i64 @foo(i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0, i64 0)
  call void @bar()
  ret i64 %0
}

define i32 @main() {
entry:
  %0 = call i64 @baz()
  %1 = trunc i64 %0 to i32
  ret i32 %1
}
```
## Environment
<details>
<summary>System Information</summary>
```console
$ system_profiler SPSoftwareDataType SPHardwareDataType
Software:
System Software Overview:
System Version: macOS 12.3 (21E230)
Kernel Version: Darwin 21.4.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Computer Name: 44634
User Name: administrator (administrator)
Secure Virtual Memory: Enabled
System Integrity Protection: Enabled
Time since boot: 38 days 17:45
Hardware:
Hardware Overview:
Model Name: Mac mini
Model Identifier: Macmini9,1
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 8 GB
System Firmware Version: 7459.121.3
OS Loader Version: 7459.101.2
Serial Number (system): H2WDQ1K6Q6NV
Hardware UUID: 92B3F9F2-F010-5367-8CAF-7716809951EE
Provisioning UDID: 00008103-001908282440291E
Activation Lock Status: Disabled
```
</details>
<details>
<summary>LLVM Versions</summary>
```console
$ clang --version
Homebrew clang version 15.0.7
Target: arm64-apple-darwin21.4.0
Thread model: posix
InstalledDir: /opt/homebrew/opt/llvm/bin
$ llc --version
Homebrew LLVM version 15.0.7
Optimized build.
Default target: arm64-apple-darwin21.4.0
Host CPU: apple-m1
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_32 - AArch64 (little endian ILP32)
aarch64_be - AArch64 (big endian)
amdgcn - AMD GCN GPUs
arm - ARM
arm64 - ARM64 (little endian)
arm64_32 - ARM64 (little endian ILP32)
armeb - ARM (big endian)
avr - Atmel AVR Microcontroller
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
hexagon - Hexagon
lanai - Lanai
mips - MIPS (32-bit big endian)
mips64 - MIPS (64-bit big endian)
mips64el - MIPS (64-bit little endian)
mipsel - MIPS (32-bit little endian)
msp430 - MSP430 [experimental]
nvptx - NVIDIA PTX 32-bit
nvptx64 - NVIDIA PTX 64-bit
ppc32 - PowerPC 32
ppc32le - PowerPC 32 LE
ppc64 - PowerPC 64
ppc64le - PowerPC 64 LE
r600 - AMD GPUs HD2XXX-HD6XXX
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V
sparc - Sparc
sparcel - Sparc LE
sparcv9 - Sparc V9
systemz - SystemZ
thumb - Thumb
thumbeb - Thumb (big endian)
ve - VE
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
xcore - XCore
```
</details>
<details>
<summary>Homebrew</summary>
```console
$ brew info llvm
==> Downloading https://formulae.brew.sh/api/cask.json
######################################################################## 100.0%
==> llvm: stable 15.0.7 (bottled), HEAD [keg-only]
Next-gen compiler infrastructure
https://llvm.org/
/opt/homebrew/Cellar/llvm/15.0.7_1 (6,411 files, 1.3GB)
Poured from bottle using the formulae.brew.sh API on 2023-02-17 at 11:24:42
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/llvm.rb
License: Apache-2.0 with LLVM-exception
==> Dependencies
Build: cmake ✔, swig ✘
Required: [email protected] ✘, six ✔, z3 ✔, zstd ✔
==> Options
--HEAD
Install HEAD version
==> Caveats
To use the bundled libc++ please add the following LDFLAGS:
LDFLAGS="-L/opt/homebrew/opt/llvm/lib/c++ -Wl,-rpath,/opt/homebrew/opt/llvm/lib/c++"
llvm is keg-only, which means it was not symlinked into /opt/homebrew,
because macOS already provides this software and installing another version in
parallel can cause all kinds of trouble.
If you need to have llvm first in your PATH, run:
echo 'export PATH="/opt/homebrew/opt/llvm/bin:$PATH"' >> ~/.zshrc
For compilers to find llvm you may need to set:
export LDFLAGS="-L/opt/homebrew/opt/llvm/lib"
export CPPFLAGS="-I/opt/homebrew/opt/llvm/include"
==> Analytics
install: 34,000 (30 days), 133,673 (90 days), 518,755 (365 days)
install-on-request: 29,851 (30 days), 115,410 (90 days), 396,557 (365 days)
build-error: 163 (30 days)
```
</details>
[^1]: Concretely, a function in our language can be observed to return a correct / expected value, but when the result is stored in a data structure and later retrieved, it no longer points to a valid value, and the program segfaults in library code whose invariants have been violated.