https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122871
Bug ID: 122871
Summary: [13/14/15/16 Regression] de-optimized synthesis of
long long shift and add
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rearnsha at gcc dot gnu.org
Target Milestone: ---
Target: arm
long long ashll_fn (long long a)
{
  long long c;
  c = a << 33;
  c += a;
  return c;
}
On a 32-bit machine this should optimize to, e.g. on Arm:

        add     r1, r1, r0, lsl #1
        bx      lr
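
This works because, split into 32-bit halves, a << 33 has a zero low word, so
the low-word addition cannot generate a carry and only the high word needs an
add. A minimal sketch of the word-level arithmetic (the function and variable
names here are illustrative, not GCC internals):

#include <stdint.h>

/* Word-level view of (a << 33) + a on a 32-bit target; names are
   illustrative only.  */
uint64_t ashll_words (uint32_t a_lo, uint32_t a_hi)
{
  uint32_t lo = a_lo;                /* low word of a << 33 is 0: no carry  */
  uint32_t hi = a_hi + (a_lo << 1);  /* high word of a << 33 is a_lo << 1   */
  return ((uint64_t) hi << 32) | lo;
}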
But instead, at -O2, we get:
        lsl     ip, r0, #11
        lsl     r2, r1, #11
        subs    ip, ip, r0
        orr     r2, r2, r0, lsr #21
        sbc     r2, r2, r1
        lsl     r3, ip, #11
        lsl     r2, r2, #11
        adds    r3, r3, r0
        orr     r2, r2, ip, lsr #21
        adc     r1, r1, r2
        lsl     r2, r1, #11
        lsl     r0, r3, #11
        adds    r0, r3, r0
        orr     r2, r2, r3, lsr #21
        adc     r1, r1, r2
        bx      lr
This is much worse than what GCC 5 used to generate:
        mov     r2, #0
        mov     r3, r0, asl #1
        adds    r0, r0, r2
        adc     r1, r1, r3
        bx      lr
The problem seems to stem from the GIMPLE optimizers 'simplifying' the code to

  return a * (2^33 + 1);

but the expand pass then fails to synthesise this with shifts and adds, as it
would for a 32-bit multiply.
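
For reference, a sketch in C of the two equivalent forms (the function names
are mine): the multiply that the middle end reduces the testcase to, and the
shift-and-add sequence that expand could fall back to, as it already does when
synthesising 32-bit multiplies by such constants:

/* The form the GIMPLE optimizers reduce the testcase to:
   a multiply by 2^33 + 1.  */
long long mul_form (long long a)
{
  return a * (long long) ((1ULL << 33) + 1);
}

/* The equivalent shift-and-add sequence expand could synthesise.  */
long long shift_add_form (long long a)
{
  return (a << 33) + a;
}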