https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81303

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-07-12
                 CC|                            |wilco at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
Confirmed. On AArch64, bwaves is ~20% slower in SPEC2006 and ~30% slower in
SPEC2017. There are twice as many spills (outside the inner loop), and the
vectors are created in an inefficient way:

ldr     d4, [x5,x27]
ld1r    {v6.2d}, [x5]
mov     v6.d[1], v4.d[0]
add     x5, x5, x26
fmla    v1.2d, v20.2d, v6.2d
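
To make the sequence above easier to follow, here is a small scalar model in C of what those instructions compute. It is only a sketch: the type `vec2d`, the function names, and the use of an element stride (the assembly uses a byte offset in x27) are assumptions made for illustration and are not taken from bwaves or from GCC. The model also ignores the single rounding of the fused multiply-add.

```c
#include <stddef.h>

/* Hypothetical two-lane vector of doubles, standing in for a .2d register. */
typedef struct { double lane[2]; } vec2d;

/* Build {p[0], p[stride]} the way the quoted code does: broadcast p[0]
   to both lanes, then overwrite lane 1 with the second element. */
static vec2d gather_broadcast_insert(const double *p, ptrdiff_t stride)
{
    vec2d v = { { p[0], p[0] } };  /* ld1r {v6.2d}, [x5]   */
    double hi = p[stride];         /* ldr  d4, [x5,x27]    */
    v.lane[1] = hi;                /* mov  v6.d[1], v4.d[0] */
    return v;
}

/* Same result with each lane loaded directly, so nothing is written
   to lane 1 only to be overwritten. */
static vec2d gather_direct(const double *p, ptrdiff_t stride)
{
    vec2d v = { { p[0], p[stride] } };
    return v;
}

/* Per-lane acc += a * b, modelling fmla v1.2d, v20.2d, v6.2d
   (without the fused rounding behaviour). */
static vec2d fmla2(vec2d acc, vec2d a, vec2d b)
{
    acc.lane[0] += a.lane[0] * b.lane[0];
    acc.lane[1] += a.lane[1] * b.lane[1];
    return acc;
}
```

Both gather functions return the same vector. The first mirrors the quoted code, where the broadcast into lane 1 is redundant because that lane is replaced immediately afterwards.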
