sihuan wrote:
> Have you tested if this generates the expected instructions?
Yes, I have tested the codegen. For most cases, it generates the expected
instructions perfectly.
However, there are two specific cases where the generated assembly is
sub-optimal. I think these are backend lowering and optimization opportunities
rather than issues with the frontend intrinsics themselves.
#### 1. RV32 handling 64-bit vectors (e.g., `int8x8_t`):
Instead of generating a register pair instruction like `padd.db`, the backend
currently splits the operation into two 32-bit instructions.
```c
int8x8_t test_padd_i8x8(int8x8_t a, int8x8_t b) {
    return __riscv_padd_i8x8(a, b);
}
```
Compiling this with `clang -cc1 -triple riscv32 -target-feature +experimental-p -mllvm -riscv-enable-p-ext-simd-codegen -O2 -S` yields:
```assembly
test_padd_i8x8:
    padd.b a0, a2, a0
    padd.b a1, a3, a1
    ret
```
The frontend correctly emits the `<8 x i8>` addition in IR, but it seems the
backend currently lacks the patterns to lower this into register pair
instructions. This limitation is also documented in the existing backend test:
https://github.com/llvm/llvm-project/blob/0b8bb80e27c6051794873a16a0eaf63501a6a1c7/llvm/test/CodeGen/RISCV/calling-conv-p-ext-vector.ll#L27-L40
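To make the split concrete, here is a small scalar model in plain C (not the intrinsics, and not part of the patch). It assumes `padd.b` is a wrapping lane-wise 8-bit add; the `model_*` names are hypothetical helpers made up for this illustration. It shows that adding the low and high 32-bit halves independently gives the same result as a single 64-bit packed add, so the current output is correct and only costs one extra instruction.

```c
#include <stdint.h>

/* Scalar model of a 32-bit packed add: four independent 8-bit lanes,
   each wrapping modulo 256 (assumed semantics of padd.b). */
static uint32_t model_padd_b32(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int i = 0; i < 4; ++i) {
        uint8_t la = (uint8_t)(a >> (8 * i));
        uint8_t lb = (uint8_t)(b >> (8 * i));
        r |= (uint32_t)(uint8_t)(la + lb) << (8 * i);
    }
    return r;
}

/* Scalar model of a 64-bit packed add: eight independent 8-bit lanes. */
static uint64_t model_padd_b64(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        uint8_t la = (uint8_t)(a >> (8 * i));
        uint8_t lb = (uint8_t)(b >> (8 * i));
        r |= (uint64_t)(uint8_t)(la + lb) << (8 * i);
    }
    return r;
}

/* Split form: the low and high 32-bit halves are added separately,
   mirroring the two padd.b instructions in the RV32 output above. */
static uint64_t model_padd_b64_split(uint64_t a, uint64_t b) {
    uint32_t lo = model_padd_b32((uint32_t)a, (uint32_t)b);
    uint32_t hi = model_padd_b32((uint32_t)(a >> 32), (uint32_t)(b >> 32));
    return ((uint64_t)hi << 32) | lo;
}
```

Because no carry crosses a lane boundary, `model_padd_b64(a, b)` and `model_padd_b64_split(a, b)` agree for every input, which is why the backend is free to pick either form.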
#### 2. RV64 handling 32-bit vectors (e.g., `int8x4_t`):
The assembly includes redundant shift instructions.
```c
int8x4_t test_padd_i8x4(int8x4_t a, int8x4_t b) {
    return __riscv_padd_i8x4(a, b);
}
```
Compiling this with `clang -cc1 -triple riscv64 -target-feature +experimental-p -mllvm -riscv-enable-p-ext-simd-codegen -O2 -S` yields:
```assembly
test_padd_i8x4:
    padd.b a0, a0, a1
    slli a0, a0, 32
    srli a0, a0, 32
    ret
```
These shift instructions appear because the frontend uses integer coercion
(`zext i32 ... to i64`) to return the 32-bit aggregate in a 64-bit register
according to the ABI. The backend faithfully executes the zero-extension, but
misses the opportunity to optimize the shifts away.
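For reference, the effect of the `slli`/`srli` pair can be written in plain C as below. This is only a sketch; the function names are hypothetical and it models the register as a `uint64_t`. The pair clears the upper 32 bits, so it is a no-op exactly when those bits are already zero, which is the fact a backend optimization would have to establish before dropping the shifts.

```c
#include <stdint.h>

/* Mirrors `slli a0, a0, 32` followed by `srli a0, a0, 32`
   on a 64-bit register: the upper 32 bits end up zero. */
static uint64_t zext32_via_shifts(uint64_t x) {
    return (x << 32) >> 32;
}

/* Equivalent formulation: keep only the low 32 bits. */
static uint64_t zext32_via_mask(uint64_t x) {
    return x & UINT64_C(0xFFFFFFFF);
}
```

For a value whose upper half is already clear, for example `0x0000000012345678`, both functions return the input unchanged.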
Since the Clang frontend is emitting the correct vector arithmetic and ABI
coercion IR, I think it might be better to address these codegen improvements
in subsequent backend patches. What are your thoughts on this?
https://github.com/llvm/llvm-project/pull/181115