https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #4 from Richard Biener ---
(In reply to Hongtao Liu from comment #2)
> > But if we reduce n to 4, the loop based vectorizer is not able to handle it
> > either.
>
> Do we support 1-element vectors (i.e. V1SI) in the vectorizer?
Yes, but
--- Comment #3 from Richard Biener ---
On x86_64 with -mavx2 we vectorize
t.c:7:13: note: Vectorizing SLP tree:
t.c:7:13: note: Root stmt: sum_26 = _20 + sum_25;
t.c:7:13: note: node 0x57386c0 (max_nunits=4, refcnt=1) vector(4) int
t.c:7:13:
--- Comment #2 from Hongtao Liu ---
> But if we reduce n to 4, the loop based vectorizer is not able to handle it
> either.
Do we support 1-element vectors (i.e. V1SI) in the vectorizer?
Handling that case also relies on backend support for dot_prodv4qi.
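
For reference, a scalar sketch of what a four-element signed-char dot product feeding a single int accumulator (V4QI inputs, V1SI result) would compute. This assumes both inputs are signed char and follows the usual dot-product-with-accumulator semantics; the function name is hypothetical and is not a GCC interface.

```c
#include <stdint.h>

/* Hypothetical scalar model of a V4QI dot product with a V1SI accumulator:
   widen each pair of signed chars, multiply, and add all four products
   to the incoming accumulator.  */
static int32_t
dot_prod_v4qi_sketch (const int8_t a[4], const int8_t b[4], int32_t acc)
{
  for (int i = 0; i < 4; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}
```

With n reduced to 4, the whole reduction would be a single such step, which is why the result needs only one int lane.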
--- Comment #1 from Andrew Pinski ---
The loop based vectorizer is able to do a decent job at:
```
int f(short *a, signed char *b, int n)
{
  int sum = 0;
  n = 8;
  for (int i = 0; i < n; i++)
    sum += a[i] * b[i];