https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89680

            Bug ID: 89680
           Summary: Redundant moves with -march=skylake for long long
                    shift on 32bit x86
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

--cut here--
unsigned long long
foo (unsigned long long i)
{
  return i << 3;
}
--cut here--

compiles with -O2 -march=skylake -m32 to:

        subl    $28, %esp
        movl    32(%esp), %eax
        movl    36(%esp), %edx
        movl    %eax, (%esp)
        movl    %edx, 4(%esp)
        vmovdqa (%esp), %xmm1
        addl    $28, %esp
        vpsllq  $3, %xmm1, %xmm0
        vmovd   %xmm0, %eax
        vpextrd $1, %xmm0, %edx
        ret

but with -O2 -march=haswell -m32 to:

        vmovq   4(%esp), %xmm0
        vpsllq  $3, %xmm0, %xmm0
        vmovd   %xmm0, %eax
        vpextrd $1, %xmm0, %edx
        ret
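
For reference, the value both sequences must return in %edx:%eax can be
written on 32-bit halves. This is only a sketch of the arithmetic, not a
claim about what GCC should emit; the helper name shift3_by_halves is
hypothetical and does not appear in the report.

```c
#include <stdint.h>

/* Hypothetical helper (not from the report): computes i << 3 using
   only 32-bit operations on the low and high halves. */
static uint64_t shift3_by_halves(uint64_t i)
{
    uint32_t lo = (uint32_t)i;
    uint32_t hi = (uint32_t)(i >> 32);

    /* Low half of the result: the -m32 ABI returns this in %eax. */
    uint32_t res_lo = lo << 3;
    /* High half of the result (%edx): the top 3 bits of lo carry in. */
    uint32_t res_hi = (hi << 3) | (lo >> 29);

    return ((uint64_t)res_hi << 32) | res_lo;
}

/* The testcase from the report, for comparison. */
unsigned long long foo(unsigned long long i)
{
    return i << 3;
}
```

Both functions should agree for every input, so the helper can serve as a
quick check when comparing the two code sequences.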

The difference starts in the IRA pass with:

Pass 0 for finding pseudo/allocno costs

     a0 (r88,l0) best DREG, allocno DREG
     a1 (r87,l0) best AREG, allocno AREG
     a2 (r85,l0) best NO_REX_SSE_REGS, allocno NO_REX_SSE_REGS
-    a3 (r83,l0) best NO_REX_SSE_REGS, allocno NO_REX_SSE_REGS
+    a3 (r83,l0) best NO_REGS, allocno NO_REGS

 Pass 1 for finding pseudo/allocno costs

     r88: preferred DREG, alternative GENERAL_REGS, allocno GENERAL_REGS
     r87: preferred AREG, alternative GENERAL_REGS, allocno GENERAL_REGS
     r85: preferred NO_REX_SSE_REGS, alternative NO_REGS, allocno NO_REX_SSE_REGS
-    r83: preferred NO_REX_SSE_REGS, alternative NO_REGS, allocno NO_REX_SSE_REGS
+    r83: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS

and going downhill from there.
