[Bug tree-optimization/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-18 rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #4 from Richard Biener ---
(In reply to Hongtao Liu from comment #2)
> > But if we reduce n to 4, the loop based vectorizer is not able to handle it
> > either.
>
> Do we support 1 element vector(i.e V1SI) in vectorizer?
Yes, but

[Bug tree-optimization/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-18 rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #3 from Richard Biener ---
On x86_64 with -mavx2 we vectorize
t.c:7:13: note: Vectorizing SLP tree:
t.c:7:13: note: Root stmt: sum_26 = _20 + sum_25;
t.c:7:13: note: node 0x57386c0 (max_nunits=4, refcnt=1) vector(4) int
t.c:7:13:

[Bug tree-optimization/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-17 liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #2 from Hongtao Liu ---
> But if we reduce n to 4, the loop based vectorizer is not able to handle it
> either.
Do we support 1 element vector(i.e V1SI) in vectorizer? and it also relies on backend support of dot_prodv4qi.
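For readers following along, here is a rough sketch of the n = 4 case being discussed. It is not the testcase from the bug report. The function names dot4_loop and dot4_unrolled are made up for illustration, and both inputs are assumed to be signed char so that four 8-bit elements feed a single 32-bit sum, which is presumably what the V1SI / dot_prodv4qi question is about. The first function is the loop form, the second is the straight-line form that SLP would have to recognize.

```c
/* Hypothetical illustration only, not the testcase attached to the PR. */

/* Loop form with the trip count fixed at 4. */
int dot4_loop(const signed char *a, const signed char *b)
{
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += a[i] * b[i];   /* both operands promote to int before the multiply */
    return sum;
}

/* Straight-line form of the same reduction: four widening
   multiplies accumulated into one int. */
int dot4_unrolled(const signed char *a, const signed char *b)
{
    int sum = 0;
    sum += a[0] * b[0];
    sum += a[1] * b[1];
    sum += a[2] * b[2];
    sum += a[3] * b[3];
    return sum;
}
```

Both functions compute the same value. Whether either one gets vectorized depends on the target and on the backend providing a suitable dot-product pattern, as noted above.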

[Bug tree-optimization/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-17 pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
--- Comment #1 from Andrew Pinski ---
The loop based vectorizer is able to do a decent job at:
```
int f(short *a, signed char *b, int n)
{
  int sum = 0;
  n = 8;
  for(int i = 0;i < n; i++)
    sum += a[i]*b[i];