[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-27 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #8 from Jiu Fu Guo --- The code in comment 4 is optimized because range info is available for "_2 = l_m_34 + _54;", where _54 > 0.
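
For readers following the thread, here is a minimal sketch of why range info on an unsigned int index with an offset matters. It is not the code from the bug report: the helper names `index_narrow`, `index_wide`, and `no_wrap` are hypothetical and only illustrate the general wraparound issue. A 32-bit unsigned sum can wrap before it is widened for addressing, so the widened index is affine in the loop counter only when the sum is known (from range info or a run-time check) not to wrap.

```c
#include <stdint.h>

/* Hypothetical helper: the index as written in the source, computed in
   32-bit unsigned arithmetic and only then widened for addressing.
   The 32-bit sum wraps modulo 2^32. */
uint64_t index_narrow(uint32_t offset, uint32_t i)
{
    return (uint64_t)(uint32_t)(offset + i);
}

/* Hypothetical helper: the index widened first, so it never wraps and
   is affine in i. This is the form that is easy to analyze. */
uint64_t index_wide(uint32_t offset, uint32_t i)
{
    return (uint64_t)offset + (uint64_t)i;
}

/* Hypothetical run-time check: returns 1 when offset + i cannot wrap
   for any i in [0, count), i.e. when both forms above agree. */
int no_wrap(uint32_t offset, uint32_t count)
{
    return count == 0 || offset <= UINT32_MAX - (count - 1);
}
```

When the compiler can already prove a property like `no_wrap` from known value ranges, no run-time check of this kind is needed; otherwise a check along these lines is one way to guard a loop version that assumes the wide form.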

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-26 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #7 from Jiu Fu Guo --- (In reply to Richard Biener from comment #6) > (In reply to Andrew Pinski from comment #5) > > (In reply to Jiu Fu Guo from comment #0) > > > For the below code: > > > ---t.c > > > void > > > foo (const

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #6 from Richard Biener --- (In reply to Andrew Pinski from comment #5) > (In reply to Jiu Fu Guo from comment #0) > > For the below code: > > ---t.c > > void > > foo (const double* __restrict__ A, const double* __restrict__ B,

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #5 from Andrew Pinski --- (In reply to Jiu Fu Guo from comment #0) > For the below code: > ---t.c > void > foo (const double* __restrict__ A, const double* __restrict__ B, double* > __restrict__ C, > int n, int k, int m) > {

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-25 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #4 from Jiu Fu Guo --- Thanks, Richard! One interesting thing: below code is vectorized: void foo (const double *__restrict__ A, const double *__restrict__ B, double *__restrict__ C, int n, int k, int m) { if (n > 0 && m > 0

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 Richard Biener changed: What|Removed |Added Blocks||53947 Keywords|

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-24 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #2 from Jiu Fu Guo --- For code: for (unsigned int k = 0; k < BS; k++) { s += A[k] * B[k]; } PR48052 handles this, and for this code the additional runtime check does not seem to be required. If there is offset in code:

[Bug tree-optimization/98813] loop is sub-optimized if index is unsigned int with offset

2021-01-24 Thread guojiufu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813 --- Comment #1 from Jiu Fu Guo --- Since the run-time check has an additional cost, we only see a benefit when the upper bound `m` is large; if the upper bound is small (e.g. < 12), the vectorized code (from clang) is slower than the unvectorized binary.