https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125550

            Bug ID: 125550
           Summary: aarch64: wrong code for VEC_PERM_EXPR
           Product: gcc
           Version: 17.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kristerw at gcc dot gnu.org
            Blocks: 118443
  Target Milestone: ---
            Target: aarch64

The following function is miscompiled with -O3 -march=armv9.5-a:

#include <arm_sve.h>

svfloat16_t foo (float x0, float x1)
{
  return svdupq_n_f16 (x0, x1, x0, x1, x0, x1, x0, x1);
}


The generated assembly returns a vector in which the odd-indexed elements are
0 (i.e., x1 is effectively ignored). Each fcvt clears the upper bits of its
destination register, so uzp1 produces [h0, 0, h1, 0] in the low 64 bits of
v0, and the low 32 bits of this, [h0, 0], are then broadcast across all 32-bit
lanes of z0:

foo:
        fcvt    h0, s0
        fcvt    h1, s1
        uzp1    v0.4h, v0.4h, v1.4h
        mov     z0.s, s0
        ret
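
The lane contents can be traced with a small portable C model of the sequence
above. This is only an illustrative sketch, not GCC code: the helper names
(model_uzp1_4h, model_dup_s, model_generated, model_expected) are made up for
this example, lanes are modeled as raw fp16 bit patterns, and the vector length
is passed in as a number of 16-bit lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of "uzp1 vd.4h, vn.4h, vm.4h": keep the even-indexed 16-bit
   lanes of vn, followed by the even-indexed lanes of vm.  */
static void
model_uzp1_4h (uint16_t d[4], const uint16_t n[4], const uint16_t m[4])
{
  uint16_t r[4] = { n[0], n[2], m[0], m[2] };
  memcpy (d, r, sizeof r);	/* Copy via a temporary; d may alias n or m.  */
}

/* Model of "mov zd.s, sn": replicate the low 32 bits (two 16-bit lanes)
   of the source across every 32-bit lane of the destination.  */
static void
model_dup_s (uint16_t *z, size_t nlanes, const uint16_t v[4])
{
  assert (nlanes % 2 == 0);
  for (size_t i = 0; i < nlanes; i += 2)
    {
      z[i] = v[0];
      z[i + 1] = v[1];
    }
}

/* Lanes produced by the generated sequence, given the fp16 bit patterns
   h0 and h1 that the two fcvt instructions would produce.  */
static void
model_generated (uint16_t h0, uint16_t h1, uint16_t *z, size_t nlanes)
{
  /* A scalar fcvt write clears the rest of the destination register.  */
  uint16_t v0[4] = { h0, 0, 0, 0 };
  uint16_t v1[4] = { h1, 0, 0, 0 };
  model_uzp1_4h (v0, v0, v1);	/* v0 = [h0, 0, h1, 0]  */
  model_dup_s (z, nlanes, v0);	/* z  = [h0, 0, h0, 0, ...]  */
}

/* Lanes the source asks for: x0, x1 repeated across the vector.  */
static void
model_expected (uint16_t h0, uint16_t h1, uint16_t *z, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    z[i] = (i & 1) ? h1 : h0;
}
```

For example, with h0 = 0x3c00 (1.0) and h1 = 0x4000 (2.0) and eight lanes,
model_generated fills the lanes with 0x3c00, 0, 0x3c00, 0, ... whereas
model_expected gives 0x3c00, 0x4000, 0x3c00, 0x4000, ...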


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118443
[Bug 118443] [Meta bug] Bugs triggered by and blocking more smtgcc testing
