https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123824
Bug ID: 123824
Summary: riscv: Incorrect passing of VLS vectors via the stack.
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rdapp at gcc dot gnu.org
CC: kito at gcc dot gnu.org, kristerw at gcc dot gnu.org,
law at gcc dot gnu.org
Target Milestone: ---
Krister reported on IRC that we don't follow the psABI exactly:
https://godbolt.org/z/hcf3ce7Mv
typedef int v4si __attribute__ ((vector_size (16)));
int test (int accumulator, v4si v1, v4si v2, v4si v3, v4si v4)
{
  accumulator &= v4[0] & v4[1] & v4[2] & v4[3];
  return accumulator;
}
This is what we currently generate:
test:
        vsetivli        zero,4,e32,m1,ta,ma
        vle32.v v1,0(a7)
        li      a5,-1
        vmv.s.x v2,a5
        vredand.vs      v1,v1,v2
        vmv.x.s a5,v1
        and     a0,a0,a5
        sext.w  a0,a0
        ret
I.e. we pass v4 entirely via the stack and load it from memory. The psABI stipulates, though:
"Aggregates whose total size is no more than 2×XLEN bits are passed in a pair
of registers; if only one register is available, the first XLEN bits are passed
in a register and the remaining bits are passed on the stack."
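To make the quoted rule concrete, here is a minimal sketch of the split it describes. It is illustrative only and not taken from the patch. It assumes RV64 (XLEN = 64), little-endian layout, and that accumulator and v1..v3 have already used a0-a6, so only a7 is left for v4. The names split_arg, split_v4si and check_split are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical model of an argument split between the last free
   register and the stack (assumes XLEN = 64).  */
struct split_arg
{
  uint64_t in_reg;    /* First XLEN bits, would go in a7.  */
  uint64_t on_stack;  /* Remaining bits, would go on the stack.  */
};

/* Split a 128-bit vector into two XLEN-sized halves in memory order.  */
struct split_arg
split_v4si (v4si v)
{
  uint64_t halves[2];
  _Static_assert (sizeof (v4si) == 2 * sizeof (uint64_t),
                  "v4si must be exactly 2 * XLEN bits");
  memcpy (halves, &v, sizeof (halves));
  struct split_arg s = { halves[0], halves[1] };
  return s;
}

/* Self-check, assuming a little-endian host: elements 0 and 1 form
   the register half, elements 2 and 3 form the stack half.  */
void
check_split (void)
{
  v4si v4 = { 1, 2, 3, 4 };
  struct split_arg s = split_v4si (v4);
  assert (s.in_reg == (((uint64_t) 2 << 32) | 1));
  assert (s.on_stack == (((uint64_t) 4 << 32) | 3));
}
```

Under these assumptions the callee would expect v4[0] and v4[1] in a7 and v4[2] and v4[3] in the first stack slot, rather than the whole vector in memory.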
I have a patch, but the code it produces isn't exactly pretty. Going to regtest
it before continuing further.