https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123824

            Bug ID: 123824
           Summary: riscv: Incorrect passing of VLS vectors via the stack.
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rdapp at gcc dot gnu.org
                CC: kito at gcc dot gnu.org, kristerw at gcc dot gnu.org,
                    law at gcc dot gnu.org
  Target Milestone: ---

Krister reported on IRC that we don't follow the psABI exactly: 

https://godbolt.org/z/hcf3ce7Mv

typedef int v4si __attribute__ ((vector_size (16)));
int test (int accumulator, v4si v1, v4si v2, v4si v3, v4si v4)
{
  accumulator &= v4[0] & v4[1] & v4[2] & v4[3];
  return accumulator;
}

This is what we currently generate:

test:
        vsetivli        zero,4,e32,m1,ta,ma
        vle32.v v1,0(a7)
        li      a5,-1
        vmv.s.x v2,a5
        vredand.vs      v1,v1,v2
        vmv.x.s a5,v1
        and     a0,a0,a5
        sext.w  a0,a0
        ret

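i.e. we load/pass the whole of v4 via the stack, using a7 only as its
address.  On rv64 (note the sext.w), accumulator and v1 to v3 occupy a0 to a6,
so a7 is the only argument register left for v4.  The psABI stipulates, though: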

"Aggregates whose total size is no more than 2×XLEN bits are passed in a pair
of registers; if only one register is available, the first XLEN bits are passed
in a register and the remaining bits are passed on the stack."
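
To make the expected split concrete, here is a rough sketch that walks the
arguments of test() and applies only the rule quoted above.  It assumes rv64
(XLEN = 64), eight integer argument registers a0 to a7, and that the 16-byte
vector is treated as an aggregate of exactly 2×XLEN bits.  The helper names
assign_arg and walk_test_signature are made up for illustration and are not GCC
code; other psABI rules (floating-point registers, variadic arguments,
arguments larger than 2×XLEN) are ignored.

```c
#include <stdio.h>

/* Assumptions for this sketch: rv64 and the integer argument registers
   a0..a7.  */
enum { XLEN_BITS = 64, NUM_ARG_REGS = 8 };

/* Hypothetical helper (not GCC code): assign registers to one argument of
   SIZE_BITS bits (at most 2*XLEN), starting at register *NEXT_REG.  Prints
   the assignment and returns the number of bits left for the stack.  */
static int
assign_arg (const char *name, int size_bits, int *next_reg)
{
  int regs_needed = (size_bits + XLEN_BITS - 1) / XLEN_BITS;
  int regs_avail = NUM_ARG_REGS - *next_reg;
  int regs_used = regs_needed < regs_avail ? regs_needed : regs_avail;
  int stack_bits = size_bits - regs_used * XLEN_BITS;

  if (stack_bits < 0)
    stack_bits = 0;

  printf ("%-12s", name);
  for (int i = 0; i < regs_used; i++)
    printf (" a%d", *next_reg + i);
  if (stack_bits > 0)
    printf (" + %d bits on the stack", stack_bits);
  printf ("\n");

  *next_reg += regs_used;
  return stack_bits;
}

/* Walk the signature of test() from the report and return how many bits
   of v4 end up on the stack under the quoted rule.  */
static int
walk_test_signature (void)
{
  int next_reg = 0;

  assign_arg ("accumulator", 32, &next_reg);
  assign_arg ("v1", 128, &next_reg);
  assign_arg ("v2", 128, &next_reg);
  assign_arg ("v3", 128, &next_reg);
  return assign_arg ("v4", 128, &next_reg);
}
```

Calling walk_test_signature () from main should report v4 as taking a7 plus 64
bits on the stack, i.e. v4[0] and v4[1] would arrive in a7 and v4[2] and v4[3]
in memory, rather than the whole vector being read through a7 as in the
assembly above.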

I have a patch but the code it produces isn't exactly pretty.  Going to regtest
it before continuing further.
