https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110215

            Bug ID: 110215
           Summary: RA fails to allocate register when loop invariant
                    lives through EH region
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wwwhhhyyy333 at gmail dot com
  Target Milestone: ---

Created attachment 55305
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55305&action=edit
A Testcase

When compiled with -Ofast, the innermost loop generated by GCC is

.L41:
        movups  (%rax), %xmm3
        movaps  (%rsp), %xmm0
        addq    $16, %rax
        subps   %xmm3, %xmm0
        andps   %xmm2, %xmm0
        movups  %xmm0, -16(%rax)
        addps   %xmm0, %xmm1
        cmpq    %rax, %rdx
        jne     .L41

Clang, by contrast, produces

.LBB0_14:                               #   Parent Loop BB0_3 Depth=1
        movups  (%rbp,%rax), %xmm1
        movaps  %xmm3, %xmm2
        subps   %xmm1, %xmm2
        andps   %xmm4, %xmm2
        movups  %xmm2, (%rbp,%rax)
        addps   %xmm2, %xmm0
        addq    $16, %rax
        cmpq    %rax, %r12
        jne     .LBB0_14

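GCC spills the loop invariant `base` to the stack and reloads it on every
iteration (`movaps (%rsp), %xmm0`), whereas Clang keeps it in an SSE register
and only copies it register-to-register (`movaps %xmm3, %xmm2`).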
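
For readers without access to the attachment, the shape of the code can be
sketched roughly as below. This is a hypothetical reconstruction inferred from
the assembly (subtract, mask off the sign, store, accumulate), not the attached
testcase; the names `may_throw` and `accumulate_abs_diff` are made up for
illustration, and the sketch is not guaranteed to reproduce the spill.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical callee that may throw, so the call site sits in an EH region.
// noinline (a GCC/Clang attribute) keeps the call visible to the optimizer.
__attribute__((noinline)) void may_throw(std::size_t row) {
    if (row == static_cast<std::size_t>(-1))
        throw std::runtime_error("bad row");
}

// `base` is invariant in both loops and is live across the try/catch.
float accumulate_abs_diff(float* data, std::size_t rows, std::size_t cols,
                          float base) {
    assert(data != nullptr || rows * cols == 0);
    float sum = 0.0f;
    for (std::size_t r = 0; r < rows; ++r) {
        try {
            may_throw(r);            // potentially-throwing call
        } catch (const std::exception&) {
            continue;                // skip this row on failure
        }
        float* row = data + r * cols;
        // Innermost loop: the candidate for the vectorized code shown above.
        for (std::size_t c = 0; c < cols; ++c) {
            row[c] = std::fabs(base - row[c]);
            sum += row[c];
        }
    }
    return sum;
}
```

Compiling a function of this shape with `g++ -Ofast -S` and inspecting the
innermost loop is one way to check whether `base` is reloaded from the stack
on each iteration.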

Godbolt: https://godbolt.org/z/TTvG8M6E8
