https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165
Bug ID: 100165
Summary: fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-*-*
Take:
typedef double V __attribute__((vector_size(16)));
typedef long long VI __attribute__((vector_size(16)));
V
foo (V x)
{
return __builtin_shuffle (x, (V) { 0, 0, }, (VI) {0, 3});
}
----- CUT ----
Or
typedef float V __attribute__((vector_size(16)));
typedef int VI __attribute__((vector_size(16)));
V
foo (V x)
{
return __builtin_shuffle (x, (V) { 0, 0, 0, 0 }, (VI) {0, 1, 4, 5});
}
---- CUT ----
Both should just produce the following, since a write to d0 zeroes the
upper 64 bits of v0:
fmov d0, d0
ret
---- CUT ----
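As a sanity check on why a single fmov suffices, here is a minimal sketch showing that both shuffles above are equivalent to "keep the first 8 bytes of the vector, zero the remaining 8". It assumes GCC, since it uses __builtin_shuffle. The type names V2D/V2DI/V4F/V4SI and the helpers keep_low64, same_as_low64_d and same_as_low64_f are hypothetical names introduced only for this illustration; they are not part of the report or of GCC.

```c
#include <assert.h>
#include <string.h>

/* Same 16-byte vector types as in the two test cases, renamed so both
   cases can live in one translation unit (names are hypothetical). */
typedef double V2D __attribute__((vector_size(16)));
typedef long long V2DI __attribute__((vector_size(16)));
typedef float V4F __attribute__((vector_size(16)));
typedef int V4SI __attribute__((vector_size(16)));

/* First test case: lane 0 from x, lane 1 from the zero vector. */
V2D shuffle_double(V2D x)
{
  return __builtin_shuffle(x, (V2D){0, 0}, (V2DI){0, 3});
}

/* Second test case: lanes 0 and 1 from x, lanes 2 and 3 from zero. */
V4F shuffle_float(V4F x)
{
  return __builtin_shuffle(x, (V4F){0, 0, 0, 0}, (V4SI){0, 1, 4, 5});
}

/* Hypothetical reference model: copy the first 8 bytes (memory order)
   of a 16-byte vector and clear the remaining 8 bytes. */
void keep_low64(const void *in, void *out)
{
  memset(out, 0, 16);
  memcpy(out, in, 8);
}

/* Returns 1 if the double shuffle matches the reference model bit for bit. */
int same_as_low64_d(V2D x)
{
  unsigned char want[16];
  V2D got = shuffle_double(x);
  keep_low64(&x, want);
  return memcmp(&got, want, 16) == 0;
}

/* Returns 1 if the float shuffle matches the reference model bit for bit. */
int same_as_low64_f(V4F x)
{
  unsigned char want[16];
  V4F got = shuffle_float(x);
  keep_low64(&x, want);
  return memcmp(&got, want, 16) == 0;
}
```

Calling same_as_low64_d((V2D){1.5, -2.25}) or same_as_low64_f((V4F){1.0f, 2.0f, 3.0f, 4.0f}) should return 1. This only checks the semantics of the shuffles; it says nothing about which instructions a given compiler version emits for them.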
The x86_64-specific version of this is PR 94680, which I confirmed today.