| Field | Value |
|-------|-------|
| Issue | 185361 |
| Summary | [MC][AArch64] Clang crashes when assembling AArch64 `ext` with symbol as immediate operand |
| Labels | clang |
| Assignees | |
| Reporter | venkyqz |

## Summary
`llvm-mc` (debug build) crashes with an assertion failure when assembling the AArch64 Advanced SIMD `ext` (vector extract) instruction if the byte-index immediate operand is specified as a symbol reference (e.g., `f0`, `NaN`, `Inf`). The release build silently emits wrong object code with the byte-index field set to 0.
---
## Reproduction
**Godbolt Link**
+ https://godbolt.org/z/MhYY85j83
**Test File (`poc.s`):**
```asm
.text
ext v0.8b, v1.8b, v2.8b, f0
```
**Commands:**
```bash
# Debug Build - Crashes with assertion
echo -e ".text\next v0.8b, v1.8b, v2.8b, f0" | llvm-mc - \
--arch=aarch64 --triple=aarch64-linux-gnu --filetype=obj -o /dev/null
# Exit 134, Assertion failed
# Release Build - Silent miscompilation
echo -e ".text\next v0.8b, v1.8b, v2.8b, f0" | llvm-mc - \
--arch=aarch64 --triple=aarch64-linux-gnu --filetype=obj -o /dev/null
# Exit 0, byte-index field set to 0
```
**Debug Build Output:**
```
llvm-mc: AArch64MCCodeEmitter.cpp:239:
Assertion `MO.isImm() && "did not expect relocated expression"' failed.
Stack dump:
#9 (anonymous namespace)::AArch64MCCodeEmitter::getMachineOpValue(...)
AArch64MCCodeEmitter.cpp:239
#10 (anonymous namespace)::AArch64MCCodeEmitter::getBinaryCodeForInstr(...)
AArch64GenMCCodeEmitter.inc:11533
```
---
## Root Cause
The `ext` instruction is defined in `AArch64InstrFormats.td` via `BaseSIMDBitwiseExtract`:
```tablegen
class BaseSIMDBitwiseExtract<...> : I<
(outs regtype:$Rd), (ins regtype:$Rn, regtype:$Rm, i32imm:$imm), ...
> {
bits<4> imm;
let Inst{14-11} = imm;
}
```
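To make the field placement concrete, here is a minimal sketch of how the 4-bit `imm` lands in bits 14-11 of the instruction word. `encodeExt` is a hypothetical helper written for this report, not LLVM code; the field positions follow the `Inst{14-11}` assignment above and the standard EXT (vector) layout, and the two checked values are the encodings listed under Release Behavior.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): builds the 32-bit EXT (vector) word.
// Layout assumed here: fixed opcode bits 0x2E000000, Q at bit 30,
// Rm at bits 20-16, imm4 at bits 14-11, Rn at bits 9-5, Rd at bits 4-0.
constexpr std::uint32_t encodeExt(bool q, unsigned rd, unsigned rn,
                                  unsigned rm, unsigned imm4) {
  return 0x2E000000u | (static_cast<std::uint32_t>(q) << 30) |
         ((rm & 0x1Fu) << 16) | ((imm4 & 0xFu) << 11) |
         ((rn & 0x1Fu) << 5) | (rd & 0x1Fu);
}

// ext v0.8b, v1.8b, v2.8b, #2  ->  bytes [0x20,0x10,0x02,0x2e] (little-endian)
static_assert(encodeExt(false, 0, 1, 2, 2) == 0x2E021020u, "index 2");
// Same instruction with index 0  ->  bytes [0x20,0x00,0x02,0x2e]
static_assert(encodeExt(false, 0, 1, 2, 0) == 0x2E020020u, "index 0");
```

The two words differ only in bits 14-11, which is why a symbol that ends up encoded as 0 is indistinguishable from an explicit `#0`.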
The `i32imm` operand type has **no `ParserMatchClass`** restriction, so any identifier is accepted at parse time as an MCExpr symbol reference. This includes register-like names (`f0`-`f31`), C-style constants (`NaN`, `Inf`), and labels.
At encoding time, `getMachineOpValue` asserts that the operand is an immediate without first handling the `isExpr()` case:
```cpp
// AArch64MCCodeEmitter.cpp:231-240
unsigned getMachineOpValue(const MCInst &MI, const MCOperand &MO, ...) const {
if (MO.isReg())
return Ctx.getRegisterInfo()->getEncodingValue(MO.getReg());
assert(MO.isImm() && "did not expect relocated expression");
return static_cast<unsigned>(MO.getImm());
}
```
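The three operand kinds involved can be modelled outside LLVM to show which case the assertion leaves uncovered. The sketch below is not LLVM code: `RegOperand`, `ImmOperand`, `ExprOperand`, and `operandValue` are hypothetical stand-ins for `MCOperand` and `getMachineOpValue`, and returning `std::nullopt` for a symbol is only one possible way to surface the error instead of asserting.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the three MCOperand kinds discussed above.
struct RegOperand { unsigned encoding; };
struct ImmOperand { std::int64_t value; };
struct ExprOperand { std::string symbol; };
using Operand = std::variant<RegOperand, ImmOperand, ExprOperand>;

// Returns the value to place in the instruction field, or std::nullopt when
// the operand is an unresolved expression, so the caller can report an error
// in both debug and release builds.
std::optional<unsigned> operandValue(const Operand &op) {
  if (const auto *reg = std::get_if<RegOperand>(&op))
    return reg->encoding;
  if (const auto *imm = std::get_if<ImmOperand>(&op))
    return static_cast<unsigned>(imm->value);
  return std::nullopt; // ExprOperand: neither a register nor an immediate
}
```

In this model, `ext ..., #2` corresponds to `ImmOperand{2}` and yields 2, while `ext ..., f0` corresponds to `ExprOperand{"f0"}` and yields no value rather than a silently chosen one.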
---
## Impact
| Build Type | Behavior | Detectability |
|------------|----------|---------------|
| Debug | Assertion failure (Exit 134) | Detectable (non-zero exit) |
| Release | Silent miscompilation (Exit 0) | **Undetectable in CI/CD that checks only the exit code** |
**Affected Instructions:**
| Instruction | Debug exit code | Release exit code |
|-------------|-----------------|-------------------|
| `ext v0.8b, v1.8b, v2.8b, f0` | 134 | 0 |
| `ext v0.16b, v1.16b, v2.16b, f0` | 134 | 0 |
| `ext v0.8b, v1.8b, v2.8b, NaN` | 134 | 0 |
| `ext v0.8b, v1.8b, v2.8b, Inf` | 134 | 0 |
**Release Behavior:**
```
# Correct (index=2):
ext v0.8b, v1.8b, v2.8b, #2 // encoding: [0x20,0x10,0x02,0x2e]
# Wrong (index=0, symbol 'f0' treated as 0):
ext v0.8b, v1.8b, v2.8b, f0 // encoding: [0x20,0x00,0x02,0x2e]
```
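Because the release build exits with 0, the wrong index is only visible in the emitted bytes. As a minimal sketch of such a check, the hypothetical helper below (`extByteIndex`, not an LLVM or binutils function) reassembles the four little-endian instruction bytes and extracts bits 14-11. It assumes the bytes are already known to belong to an `ext` instruction.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: recovers the EXT byte index (bits 14-11) from the
// four little-endian bytes of an encoded instruction.
constexpr unsigned extByteIndex(const std::array<std::uint8_t, 4> &bytes) {
  const std::uint32_t word =
      static_cast<std::uint32_t>(bytes[0]) |
      (static_cast<std::uint32_t>(bytes[1]) << 8) |
      (static_cast<std::uint32_t>(bytes[2]) << 16) |
      (static_cast<std::uint32_t>(bytes[3]) << 24);
  return (word >> 11) & 0xFu;
}
```

Applied to the two encodings above, this returns 2 for `[0x20,0x10,0x02,0x2e]` and 0 for `[0x20,0x00,0x02,0x2e]`.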
**Affected Triples:**
- `aarch64-linux-gnu`
- `aarch64_be-linux-gnu`