            Bug ID: 84757
           Summary: Useless MOVs and PUSHes to store results of MUL
           Product: gcc
           Version: 7.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot
          Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C code:

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mulSmarter(Shorter a, Shorter b, Shorter* upper)
{
    const Longer ab=(Longer)a*b;
    *upper=ab >> 8*sizeof(Shorter);
    return ab;
}
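
For anyone reproducing this, here is a minimal sketch of a sanity check for
the function, so that the same source can be used both to inspect the
generated assembly and to confirm the result is correct. The helpers checkMul
and selfTest are hypothetical names added for illustration and are not part of
the original test case. The _Static_assert states the assumption that Longer is
at least twice as wide as Shorter.

```c
#include <assert.h>
#include <limits.h>

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

/* Assumption: the widening multiply only works if Longer can hold the
   full double-width product. */
_Static_assert(sizeof(Longer) >= 2*sizeof(Shorter),
               "Longer must be at least twice as wide as Shorter");

/* Function from the report, repeated so this sketch compiles on its own. */
Shorter mulSmarter(Shorter a, Shorter b, Shorter* upper)
{
    const Longer ab=(Longer)a*b;
    *upper=ab >> 8*sizeof(Shorter);
    return ab;
}

/* Hypothetical helper: returns 1 if a*b yields the expected low and
   high halves, 0 otherwise. */
int checkMul(Shorter a, Shorter b, Shorter wantLo, Shorter wantHi)
{
    Shorter hi = 0;
    const Shorter lo = mulSmarter(a, b, &hi);
    return lo == wantLo && hi == wantHi;
}

/* Hypothetical helper: a few width-independent cases.
   With M = ULONG_MAX = 2^n - 1:
     M*M = 2^(2n) - 2^(n+1) + 1  ->  low = 1,     high = M - 1
     M*2 = 2^(n+1) - 2           ->  low = M - 1, high = 1 */
void selfTest(void)
{
    assert(checkMul(0UL, 5UL, 0UL, 0UL));
    assert(checkMul(ULONG_MAX, ULONG_MAX, 1UL, ULONG_MAX - 1UL));
    assert(checkMul(ULONG_MAX, 2UL, ULONG_MAX - 1UL, 1UL));
}
```

Calling selfTest() from a small main() and building once with -m32 and once
with -m64 exercises both typedef branches on an amd64 GCC. To look at the
assembly instead, something like "gcc -m32 -S -masm=intel" should produce
Intel-syntax output; the optimization level used for the listings in this
report is not stated, so any -O flag added here is a guess.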

On amd64 with the -m64 option I get identical assembly from both gcc 7.x and
6.3. But on x86 (or amd64 with -m32) the assembly differs, and the gcc 7.x
output is less efficient. Compare:

# gcc 6.3
  mov eax, DWORD PTR [esp+8]
  mul DWORD PTR [esp+4]
  mov ecx, edx
  mov edx, DWORD PTR [esp+12]
  mov DWORD PTR [edx], ecx
  ret

# gcc 7.3
  push esi
  push ebx
  mov eax, DWORD PTR [esp+16]
  mul DWORD PTR [esp+12]
  mov esi, edx
  mov edx, DWORD PTR [esp+20]
  mov ebx, eax
  mov eax, ebx
  mov DWORD PTR [edx], esi
  pop ebx
  pop esi
  ret

The gcc 6.3 version is already not perfect, but it's much better than that of
gcc 7.x.
