https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106038

            Bug ID: 106038
           Summary: x86_64 vectorization of ALU ops using xmm registers
                    prematurely
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: goldstein.w.n at gmail dot com
  Target Milestone: ---

See: https://godbolt.org/z/YxWEn6Y65

In essentially every case where the total amount of memory touched is <= 8
bytes (the x86_64 word size), the vectorization pass chooses to use xmm
registers to vectorize the unrolled loops, which is inefficient.

Using GPRs (as GCC <= 9.5 did) is faster and produces smaller code.
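
For illustration only, here is a minimal sketch of the kind of loop in
question. It is not copied from the godbolt link above; the function names and
the choice of AND as the ALU op are assumptions. The first function touches 8
bytes in total, so the whole operation fits in one 64-bit GPR; the second
spells out that GPR form by hand.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reproducer shape: eight 1-byte ANDs, 8 bytes touched in total.
   This is the form the vectorizer may turn into xmm loads/stores. */
void and8_bytewise(uint8_t *restrict dst, const uint8_t *restrict src)
{
    for (int i = 0; i < 8; i++)
        dst[i] &= src[i];
}

/* Hand-written GPR equivalent: one 64-bit load per operand, one AND,
   one 64-bit store. memcpy keeps the accesses alignment- and alias-safe. */
void and8_gpr(uint8_t *restrict dst, const uint8_t *restrict src)
{
    uint64_t d, s;
    memcpy(&d, dst, sizeof d);
    memcpy(&s, src, sizeof s);
    d &= s;
    memcpy(dst, &d, sizeof d);
}
```

Both functions compute the same result. Comparing their assembly at -O2 or -O3
on x86_64 should show whether a given GCC version picks xmm registers for the
byte-wise form.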


Related to: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106022
