https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84756

            Bug ID: 84756
           Summary: Multiplication done twice just to get upper and lower
                    parts of product
           Product: gcc
           Version: 7.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: b7.10110111 at gmail dot com
  Target Milestone: ---

Consider the following C code, valid for both x86 and amd64 targets:

#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

Shorter mul(Shorter a, Shorter b, Shorter* upper)
{
    *upper=(Longer)a*b >> 8*sizeof(Shorter);
    return (Longer)a*b;
}

Longer lmul(Shorter a, Shorter b)
{
    return (Longer)a*b;
}

For the lmul function I get the expected good assembly:

lmul:
        mov     eax, DWORD PTR [esp+8]
        mul     DWORD PTR [esp+4]
        ret

But for mul, gcc generates two multiplications instead of one:

mul:
        push    ebx
        mov     ecx, DWORD PTR [esp+8]
        mov     ebx, DWORD PTR [esp+12]
        mov     eax, ecx
        mul     ebx
        mov     eax, DWORD PTR [esp+16]
        mov     DWORD PTR [eax], edx
        mov     eax, ecx
        imul    eax, ebx
        pop     ebx
        ret

Here `mul ebx` is used to get the upper part of the result, and `imul eax, ebx`
is supposed to get the lower part, although the lower part was already present
in the eax register right after `mul ebx`.

A similar problem occurs when I use the -m64 option for gcc to get amd64 code.
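
For comparison, here is a minimal sketch of a variant that spells the widening multiplication only once in the source and takes both halves from a single local variable. The name mul_once is hypothetical and not part of the original test case. Whether gcc 7.2.0 emits a single mul for this form is an assumption that has not been verified here; the generated assembly would need to be inspected.

```c
#ifdef __SIZEOF_INT128__
typedef __uint128_t Longer;
#else
typedef unsigned long long Longer;
#endif
typedef unsigned long Shorter;

/* Guard: the shift below is only well defined if Longer is at least
   twice as wide as Shorter. */
_Static_assert(sizeof(Longer) >= 2 * sizeof(Shorter),
               "Longer must hold the full product of two Shorter values");

/* Hypothetical variant of mul(): the product is computed once and
   both halves are extracted from the same value. */
Shorter mul_once(Shorter a, Shorter b, Shorter *upper)
{
    Longer product = (Longer)a * b;                       /* full-width product */
    *upper = (Shorter)(product >> 8 * sizeof(Shorter));   /* upper half */
    return (Shorter)product;                              /* lower half */
}
```

Semantically this is the same as mul() above, so it can serve as a second data point when checking whether the duplicated multiplication depends on how the source expresses the product.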