| Issue |
183265
|
| Summary |
[AArch64][SVE] SVE load/store intrinsics fail with “Calling a function with a bad signature!” for non-zero address space pointers
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
nikhil-m-k
|
I’m encountering a codegen failure when SVE load/store intrinsics are used with pointers in non-zero address spaces. Running the `interleaved-access` pass, which converts `llvm.vector.deinterleave4.nxv16f32` into SVE intrinsics, produces the following error:
`Calling a function with a bad signature!`
This happens when a `ptr addrspace(1)` value is passed to an SVE load/store intrinsic, as shown in the reproducer below.
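For context, a sketch of the IR the pass would emit for the reproducer (the `%pg`/`%p` names are illustrative; the call shape assumes the `llvm.aarch64.sve.ld4.sret` intrinsic the pass targets). The verifier rejects it because the declared signature takes a plain `ptr`:

```
; Sketch of post-pass IR: the ld4.sret intrinsic is declared with a
; default-address-space `ptr` argument, so passing `ptr addrspace(1)`
; trips "Calling a function with a bad signature!".
%ld4 = call { <vscale x 4 x float>, <vscale x 4 x float>,
              <vscale x 4 x float>, <vscale x 4 x float> }
       @llvm.aarch64.sve.ld4.sret.nxv4f32(<vscale x 4 x i1> %pg,
                                          ptr addrspace(1) %p)
```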
This is because in `llvm/include/llvm/IR/IntrinsicsAArch64.td` the SVE load intrinsics such as `AdvSIMD_1Vec_PredLoad_Intrinsic`, `AdvSIMD_2Vec_PredLoad_Intrinsic`, etc. are defined to accept only **`llvm_ptr_ty`** (address space 0) as the pointer argument.
Their NEON counterparts, such as `AdvSIMD_1Vec_Load_Intrinsic` and `AdvSIMD_2Vec_Load_Intrinsic`, instead accept **`llvm_anyptr_ty`**.
- Was this restriction intentional?
- If not, would it be acceptable to generalize SVE intrinsics to accept arbitrary address spaces (e.g., via `llvm_anyptr_ty`) and update the AArch64 ISel lowering accordingly?
I’m happy to work on a patch and tests if this is considered a supported use case.
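To illustrate the shape of the change being proposed (a sketch only; the exact class bodies in `IntrinsicsAArch64.td` may differ from what is shown here):

```
// Current (sketch): pointer fixed to address space 0.
class AdvSIMD_1Vec_PredLoad_Intrinsic
    : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                            [LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
                             llvm_ptr_ty],
                            [IntrReadMem, IntrArgMemOnly]>;

// Proposed (sketch): accept a pointer in any address space.
class AdvSIMD_1Vec_PredLoad_Intrinsic
    : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                            [LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
                             llvm_anyptr_ty],
                            [IntrReadMem, IntrArgMemOnly]>;
```

One caveat worth noting: `llvm_anyptr_ty` adds an overloaded type parameter, so the pointer type would participate in intrinsic name mangling (e.g. a `.p1` suffix for addrspace(1)), which may affect existing IR and tests that spell out the current mangled names.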
Reproducer:
```llvm
; ModuleID = 'test.ll'
source_filename = "test.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "aarch64-unknown-linux-gnu"
declare ptr @malloc_fn(i64, i8)
define void @forward() local_unnamed_addr {
%1 = call ptr @malloc_fn(i64 6656, i8 1)
%2 = call ptr @malloc_fn(i64 6666, i8 1)
%3 = addrspacecast ptr %2 to ptr addrspace(1)
%lsr.iv16089 = addrspacecast ptr %1 to ptr addrspace(1)
%promoted2396 = load float, ptr addrspace(1) %3, align 4
%wide.load4241 = load <vscale x 4 x float>, ptr addrspace(1) %lsr.iv16089, align 4
%wide.vec4243 = load <vscale x 16 x float>, ptr addrspace(1) %lsr.iv16089, align 4
%strided.vec4244 =
call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }
@llvm.vector.deinterleave4.nxv16f32(<vscale x 16 x float> %wide.vec4243)
%4 = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } %strided.vec4244, 0
%5 = fmul <vscale x 4 x float> %wide.load4241, %4
%6 = call float @llvm.vector.reduce.fadd.nxv4f32(float %promoted2396, <vscale x 4 x float> %5)
%7 = call float @llvm.vector.reduce.fadd.nxv4f32(float %6, <vscale x 4 x float> %5)
store float %7, ptr addrspace(1) %3, align 4
ret void
}
declare { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }
@llvm.vector.deinterleave4.nxv16f32(<vscale x 16 x float>)
declare float @llvm.vector.reduce.fadd.nxv4f32(float, <vscale x 4 x float>)
```
Command:
`llvm-project/llvm-build/bin/opt -passes=interleaved-access -mtriple=aarch64-linux-gnu -mattr=+sve -S test.ll`
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs