https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122305
Bug ID: 122305
Summary: Missed optimization: recognize dot-product in linear code sequence
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: fxue at os dot amperecomputing.com
Target Milestone: ---
In the case below, the straight-line code sequence could be mapped to a single
compact dot-product operation, but GCC fails to do so.
int foo(char *a, char *b, int i)
{
  int sum = 0;

  sum += a[i + 0] * b[i + 0];
  sum += a[i + 1] * b[i + 1];
  sum += a[i + 2] * b[i + 2];
  sum += a[i + 3] * b[i + 3];
  sum += a[i + 4] * b[i + 4];
  sum += a[i + 5] * b[i + 5];
  sum += a[i + 6] * b[i + 6];
  sum += a[i + 7] * b[i + 7];

  return sum;
}
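
For comparison, here is a minimal sketch of the same computation written as a
rolled reduction loop. The name foo_rolled is hypothetical and not part of the
report. GCC's vectorizer has a dot-product reduction pattern (DOT_PROD_EXPR)
for loops of this shape, so the rolled form may be a useful reference point.
Whether a dot-product instruction is actually emitted depends on the target and
options, which this sketch does not assume or test.

```c
/* Hypothetical rolled equivalent of foo(): the same eight multiply-adds,
   expressed as a reduction loop instead of a straight-line sequence.  */
int foo_rolled(char *a, char *b, int i)
{
  int sum = 0;

  /* Each product is computed in int after the usual integer promotions,
     exactly as in the unrolled statements of foo().  */
  for (int k = 0; k < 8; k++)
    sum += a[i + k] * b[i + k];

  return sum;
}
```

Both forms should return the same value for the same inputs; the request is
that the unrolled sequence be recognized as well.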
Things become more complicated if the sequence is inside a loop. In that case,
transforming all statements of the sequence into one dot-product via SLP should
be better than vectorizing each statement individually across loop iterations.
void foo(char *a, char *b, int * __restrict__ c, int n, int m)
{
  for (int i = 0; i < n; i++, a += m, b += m)
    {
      int sum = 0;

      sum += a[0] * b[0];
      sum += a[1] * b[1];
      sum += a[2] * b[2];
      sum += a[3] * b[3];
      sum += a[4] * b[4];
      sum += a[5] * b[5];
      sum += a[6] * b[6];
      sum += a[7] * b[7];

      c[i] = sum;
    }
}
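
The intended shape of the loop case can be sketched as follows, with the eight
statements factored into one per-iteration dot-product. The names dot8 and
foo_rows are hypothetical and only illustrate the grouping; this is not a
description of what GCC currently generates. Note that within one iteration the
eight elements are contiguous, whereas the same statement across iterations
accesses elements that are m apart.

```c
/* Hypothetical helper: 8-element dot-product over contiguous bytes.  */
static inline int dot8(const char *a, const char *b)
{
  int sum = 0;

  for (int k = 0; k < 8; k++)
    sum += a[k] * b[k];

  return sum;
}

/* Hypothetical restatement of the loop case: one dot-product per row,
   with rows of a and b starting m elements apart.  */
void foo_rows(char *a, char *b, int * __restrict__ c, int n, int m)
{
  for (int i = 0; i < n; i++, a += m, b += m)
    c[i] = dot8(a, b);
}
```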