https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99068
Bug ID: 99068
Summary: Missed PowerPC lhau optimization
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: brian.grayson at sifive dot com
Target Milestone: ---
(This is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99067 but is a
distinct target optimization bug.)
This code:
#include <stdint.h>

int16_t a[1000];
int64_t N = 100;
int found_zero_ptr(int *a, int N) {
    for (int16_t* p = &a[0]; p <= &a[N]; p++) {
        if (*p == 0) return 1;
    }
    return 0;
}
generates this PowerPC assembly under -O3:
...
.L15:
        bgt 7,.L12
.L11:
        lha 10,0(9)
        addi 9,9,2
        cmpld 7,9,8
        cmpwi 0,10,0
        bne 0,.L15
...
In a minor variation of this code, the lha and addi are merged into an lhau.
Why does GCC not perform the same merge for the code shown here?
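
For reference, here is a rough C model of the merge being asked for. It is only a sketch, not the variation mentioned above (which is not shown) and not GCC's output. It assumes the Power ISA semantics of lhau (compute EA = RA + D, load the sign-extended halfword at EA, then write EA back to RA), so the pointer has to be biased by one element before the loop. The names lhau_model, found_zero_lha_addi and found_zero_lhau are made up for illustration, the registers are modelled as uintptr_t variables, and n is assumed to be non-negative with a[0..n] readable.

```c
#include <stdint.h>

/* Hypothetical model of "lhau RT,D(RA)":
   EA = RA + D; RT = sign-extended halfword at EA; RA = EA. */
static int16_t lhau_model(uintptr_t *ra, intptr_t d)
{
    *ra += (uintptr_t)d;              /* update the base register first   */
    return *(const int16_t *)*ra;     /* then load from the new address   */
}

/* Shape of the loop in the report: load, then a separate increment.
   Scans a[0] through a[n] inclusive, like "p <= &a[N]". */
int found_zero_lha_addi(const int16_t *a, int n)
{
    uintptr_t r9 = (uintptr_t)a;          /* current pointer              */
    uintptr_t r8 = (uintptr_t)(a + n);    /* address of the last element  */
    while (r9 <= r8) {
        int16_t r10 = *(const int16_t *)r9;   /* lha  10,0(9) */
        r9 += 2;                              /* addi 9,9,2   */
        if (r10 == 0)
            return 1;
    }
    return 0;
}

/* Same scan with the load and increment merged into one update-form load.
   The pointer starts one halfword before a[0]; that biased address is
   never dereferenced, only a[0..n] are read. */
int found_zero_lhau(const int16_t *a, int n)
{
    uintptr_t r9 = (uintptr_t)a - 2;      /* pre-biased pointer           */
    uintptr_t r8 = (uintptr_t)(a + n);
    while (r9 < r8) {                     /* next load address r9+2 <= r8 */
        int16_t r10 = lhau_model(&r9, 2); /* lhau 10,2(9) */
        if (r10 == 0)
            return 1;
    }
    return 0;
}
```

Both functions visit the same elements in the same order; the second one simply folds the addi into the load, at the cost of an extra pointer adjustment before the loop and a changed exit comparison.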