https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165

            Bug ID: 100165
           Summary: fmov could be used to zero out the upper bits instead
                    of movi/zip or movi/ins with __builtin_shuffle and
                    zero vector
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*-*

Take:
typedef double V __attribute__((vector_size(16)));
typedef long long VI __attribute__((vector_size(16)));

V
foo (V x)
{
  return __builtin_shuffle (x, (V) { 0, 0 }, (VI) {0, 3});
}

---- CUT ----
Or:
typedef float V __attribute__((vector_size(16)));
typedef int VI __attribute__((vector_size(16)));

V
foo (V x)
{
  return __builtin_shuffle (x, (V) { 0, 0, 0, 0 }, (VI) {0, 1, 4, 5});
}
---- CUT ----
Both should just produce:
fmov d0, d0
ret
---- CUT ----
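To spell out why a single fmov is enough: a two-input __builtin_shuffle picks
each result element from the concatenation of its two operands, so with a zero
second operand both masks above keep the low 64 bits of x and zero the rest,
which is what a write to d0 does to the upper half of v0. The following is only
a rough scalar model of that reasoning in portable C, not GCC code; the function
names are made up for illustration, and doubles stand in for the lanes of both
the double and the float testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of a two-input __builtin_shuffle: result lane i
   is taken from the concatenation of a and b at index mask[i] modulo 2*n.  */
static void
model_shuffle2 (const double *a, const double *b, const unsigned *mask,
                double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t idx = mask[i] % (2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}

/* Hypothetical model of "fmov d0, d0" on a 128-bit register viewed as n
   lanes: the low half (64 bits) is kept, the upper half is zeroed.  */
static void
model_keep_low_half (const double *x, double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = i < n / 2 ? x[i] : 0.0;
}

/* Returns 1 if shuffling x with a zero vector under MASK gives the same
   lanes as keeping only the low half of x.  Supports up to 4 lanes.  */
static int
shuffle_matches_low_half (const double *x, const unsigned *mask, size_t n)
{
  double zero[4] = { 0, 0, 0, 0 };
  double shuffled[4], low[4];

  assert (n <= 4);
  model_shuffle2 (x, zero, mask, shuffled, n);
  model_keep_low_half (x, low, n);
  for (size_t i = 0; i < n; i++)
    if (shuffled[i] != low[i])
      return 0;
  return 1;
}
```

With n = 2 and mask {0, 3} this models the first testcase; with n = 4 and
mask {0, 1, 4, 5} it models the second.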
The x86_64-specific version of this is PR 94680, which I confirmed today.
