https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69556
Bug ID: 69556
Summary: [6 Regression] forwprop4/match.pd undoing work from recip
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jgreenhalgh at gcc dot gnu.org
Target Milestone: ---
For this code compiled at -Ofast:
double bar (double, double, double, double, double);
double
foo (double a)
{
  return bar (1.0/a, 2.0/a, 4.0/a, 8.0/a, 16.0/a);
}
GCC 5 generates:
foo:
.LFB0:
.cfi_startproc
movsd .LC0(%rip), %xmm1
movsd .LC1(%rip), %xmm4
movsd .LC2(%rip), %xmm3
divsd %xmm0, %xmm1
movsd .LC3(%rip), %xmm2
mulsd %xmm1, %xmm4
movapd %xmm1, %xmm0
mulsd %xmm1, %xmm3
mulsd %xmm1, %xmm2
addsd %xmm1, %xmm1
jmp bar
(i.e. one divide, 4 multiplies; the multiply by 2.0 is emitted as addsd)
GCC trunk at revision r232907 generates:
foo:
.LFB0:
.cfi_startproc
movapd %xmm0, %xmm5
movsd .LC0(%rip), %xmm4
movsd .LC4(%rip), %xmm0
movsd .LC1(%rip), %xmm3
movsd .LC2(%rip), %xmm2
movsd .LC3(%rip), %xmm1
divsd %xmm5, %xmm0
divsd %xmm5, %xmm4
divsd %xmm5, %xmm3
divsd %xmm5, %xmm2
divsd %xmm5, %xmm1
jmp bar
(i.e. 5 divides)
This is bad for performance, since each divsd is considerably more expensive
than the mulsd it replaces.
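For illustration, the two listings correspond roughly to the source-level shapes below. This is only a sketch: quotients_recip and quotients_div are hypothetical names, and the results are stored to an array instead of being passed to bar so that the fragment stands alone.

```c
/* Shape of the GCC 5 output: one division computes the reciprocal,
   and every other quotient is derived from it by a multiplication.  */
void
quotients_recip (double a, double out[5])
{
  double r = 1.0 / a;   /* the only division */
  out[0] = r;
  out[1] = 2.0 * r;
  out[2] = 4.0 * r;
  out[3] = 8.0 * r;
  out[4] = 16.0 * r;
}

/* Shape of the trunk output: every quotient is its own division.  */
void
quotients_div (double a, double out[5])
{
  out[0] = 1.0 / a;
  out[1] = 2.0 / a;
  out[2] = 4.0 / a;
  out[3] = 8.0 / a;
  out[4] = 16.0 / a;
}
```

The two functions can round differently for general inputs, which is why the reciprocal form is only expected under fast-math style options such as -Ofast.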
The forwprop4 dump shows:
Applying pattern match.pd:453, gimple-match.c:32116
gimple_simplified to _2 = 1.6e+1 / a_1(D);
Applying pattern match.pd:453, gimple-match.c:32116
gimple_simplified to _3 = 8.0e+0 / a_1(D);
Applying pattern match.pd:453, gimple-match.c:32116
gimple_simplified to _4 = 4.0e+0 / a_1(D);
Applying pattern match.pd:453, gimple-match.c:32116
gimple_simplified to _5 = 2.0e+0 / a_1(D);
This starts with r229107, which moves the pattern that folds (C1/X)*C2 into
(C1*C2)/X from fold-const.c to match.pd.
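As a rough C rendering of that fold applied to a single use (the real pattern is written in the match.pd pattern language, not C, and the names before_fold and after_fold are hypothetical), the expression goes from a multiply on a shared reciprocal to a division of its own:

```c
/* Before the fold: c1 / x can be shared between several uses,
   so each additional use costs only a multiplication.  */
double
before_fold (double x, double c1, double c2)
{
  double r = c1 / x;
  return r * c2;
}

/* After the fold: the constants are combined into one, but this
   use now performs its own division by x.  */
double
after_fold (double x, double c1, double c2)
{
  return (c1 * c2) / x;
}
```

With c1 = 1.0 and c2 = 16.0, after_fold computes 16.0 / x, which matches the shape of the first gimple_simplified line in the dump.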