| Issue |
178268
|
| Summary |
[X86] Stack clash protection with loop-based probing omits varargs XMM register saves
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
philiptaron
|
## Summary
When `-fstack-clash-protection` triggers loop-based stack probing (stack frame ≥ 32KB on Linux x86-64), the XMM register save sequence for varargs functions is completely omitted. This causes `va_arg(ap, double)` to read uninitialized memory instead of the actual floating-point argument value.
I ran into this while compiling `mpfr` in Nixpkgs using Clang.
## Environment
- **Clang version**: 21.1.8 (also reproduced with earlier versions)
- **Target**: x86_64-unknown-linux-gnu
- **OS**: Linux (NixOS, but reproducible on other distros)
## Minimal Reproducer
```c
// t32k.c - FAILS (reads uninitialized memory, prints 0)
#include <stdarg.h>
#include <stdio.h>
void test(const char *f, ...) {
char b[32768]; // >= 32KB triggers loop-based stack probing
va_list ap;
va_start(ap, f);
double d = va_arg(ap, double);
sprintf(b, f, d);
printf("%s\n", b);
va_end(ap);
}
int main() {
test("%e", -1.25);
return 0;
}
```
```c
// t16k.c - WORKS (correctly prints -1.250000e+00)
#include <stdarg.h>
#include <stdio.h>
void test(const char *f, ...) {
char b[16384]; // < 32KB uses inline probing, works correctly
va_list ap;
va_start(ap, f);
double d = va_arg(ap, double);
sprintf(b, f, d);
printf("%s\n", b);
va_end(ap);
}
int main() {
test("%e", -1.25);
return 0;
}
```
**Compile and run:**
```bash
$ clang -fstack-clash-protection -o t32k t32k.c && ./t32k
0.000000e+00 # WRONG - should be -1.250000e+00
$ clang -fstack-clash-protection -o t16k t16k.c && ./t16k
-1.250000e+00 # Correct
```
**Workaround:**
```bash
$ clang -fno-stack-clash-protection -o t32k t32k.c && ./t32k
-1.250000e+00 # Correct with stack clash protection disabled
```
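Where disabling the flag is not an option, a source-level alternative may be to keep the frame below the 32KB threshold by moving the large buffer to the heap. The following is only a sketch based on the observation above that smaller frames avoid the loop-based probing path. `format_double_heap` and `BUF_SIZE` are hypothetical names that do not appear in the reproducer, and this has not been checked against every affected Clang version.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

enum { BUF_SIZE = 32768 };  /* same size as the failing reproducer, but on the heap */

/* Hypothetical variant of test(): formats the first variadic double into a
 * malloc'd buffer. Returns NULL on allocation failure; the caller frees. */
char *format_double_heap(const char *f, ...) {
    char *b = malloc(BUF_SIZE);
    if (b == NULL)
        return NULL;

    va_list ap;
    va_start(ap, f);
    double d = va_arg(ap, double);  /* frame is small, so no probe loop is expected */
    va_end(ap);

    snprintf(b, BUF_SIZE, f, d);
    return b;
}
```

A caller would use it as `char *s = format_double_heap("%e", -1.25);`, print `s`, and then call `free(s)`.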
## Root Cause Analysis
### Expected Behavior (16KB buffer - inline probing)
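In the x86-64 SysV ABI, the caller of a varargs function sets `%al` to an upper bound on the number of vector registers used for arguments. The callee's prologue is expected to:
1. Test `%al` to see whether any vector registers were used by the caller
2. If it is non-zero, save XMM0-XMM7 to the register save area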
With a 16KB buffer, the generated prologue correctly includes:
```asm
test %al,%al # Check if any FP args were passed
je .skip_xmm_save # Skip if none
movaps %xmm0,-0x40b0(%rbp) # Save XMM0
movaps %xmm1,-0x40a0(%rbp) # Save XMM1
# ... (remaining saves, through xmm7)
.skip_xmm_save:
```
### Actual Behavior (32KB buffer - loop-based probing)
With a 32KB buffer, stack clash protection switches to loop-based probing. The generated prologue **completely omits** the `test %al,%al` check and all XMM register saves:
```asm
# Loop-based stack probe
mov %rsp,%r11
sub $0x8000,%r11
.probe_loop:
sub $0x1000,%rsp
test %rsp,(%rsp) # Probe the stack
cmp %r11,%rsp
jne .probe_loop
# GP register saves (present)
mov %rsi,-0x80d8(%rbp)
mov %rdx,-0x80d0(%rbp)
...
# Stack canary setup
mov %fs:0x28,%rax
# NO test %al, NO movaps instructions!
# XMM register saves are completely missing
```
This causes `va_arg(ap, double)` to read from the uninitialized register save area, returning garbage (typically 0).
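To make the dependency on the register save area concrete, the sketch below inspects the SysV x86-64 `va_list` fields directly. It assumes a GCC or Clang target where `va_list` is the ABI's `__va_list_tag` structure (with `fp_offset` and `reg_save_area` members); `probe_fp_slot` and `struct fp_probe` are hypothetical names introduced for illustration. On a correctly compiled function, the hand-read slot and the `va_arg` result should agree. With the missing XMM saves described above, both would be expected to come from unwritten memory.

```c
#include <stdarg.h>
#include <string.h>

/* What one va_arg(ap, double) consumed, as seen through the va_list fields. */
struct fp_probe {
    unsigned before;  /* fp_offset right after va_start */
    unsigned after;   /* fp_offset after one va_arg(ap, double) */
    double   raw;     /* 8 bytes read by hand from the register save area */
    double   value;   /* what va_arg itself returned */
};

/* Returns 0 on success, -1 on targets without the SysV x86-64 va_list layout. */
int probe_fp_slot(struct fp_probe *out, const char *f, ...) {
#if defined(__x86_64__) && !defined(_WIN32) && !defined(__CYGWIN__)
    va_list ap;
    va_start(ap, f);
    out->before = ap->fp_offset;
    /* The slot va_arg is about to read: reg_save_area + fp_offset. */
    memcpy(&out->raw, (const char *)ap->reg_save_area + ap->fp_offset,
           sizeof out->raw);
    out->value = va_arg(ap, double);
    out->after = ap->fp_offset;
    va_end(ap);
    return 0;
#else
    (void)out;
    (void)f;
    return -1;
#endif
}
```

Calling `probe_fp_slot(&p, "%e", -1.25)` on an unaffected build should leave `p.raw` equal to `p.value`, with `fp_offset` advancing by 16 bytes (one XMM slot).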
## Impact
This bug affects any varargs function that:
1. Has a stack frame ≥ 32KB (the threshold for loop-based probing on Linux)
2. Is compiled with `-fstack-clash-protection` (default in many hardened builds)
3. Receives floating-point arguments via varargs
Real-world impact: **MPFR 4.2.2** test suite fails (`tsprintf` test) when built with Clang and stack clash protection enabled, because `mpfr_vsprintf` uses a 65KB buffer internally.
_______________________________________________
llvm-bugs mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs