[Bug rtl-optimization/99067] Missed optimization for induction variable elimination

wilson at gcc dot gnu.org via Gcc-bugs Wed, 10 Feb 2021 16:48:28 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99067


Jim Wilson <wilson at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson <wilson at gcc dot gnu.org> ---
This looks similar to an ivopts problem I looked at regarding coremark.  Given
this testcase
void matrix_add_const_unsigned(unsigned int N, short *A, short val) {
        unsigned int i,j;
        for (i=0; i<N; i++) {
                for (j=0; j<N; j++) {
                        A[i*N+j] += val;
                }
        }
}
void matrix_add_const_signed(signed int N, short *A, short val) {
        signed int i,j;
        for (i=0; i<N; i++) {
                for (j=0; j<N; j++) {
                        A[i*N+j] += val;
                }
        }
}
and compiling for 64-bit targets with -O2, we get much better code for the
second function than the first function.  For riscv64, the first function has 8
instructions in the inner loop.  The second function has 5 instructions in the
inner loop.  I reproduced this problem on multiple 64-bit targets, including
mips64, ppc64, and arm64.
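
For reference, here is a hand-written sketch of what eliminating the index
induction variable amounts to at the source level: the inner loop steps a
pointer instead of recomputing i*N+j on every iteration.  The name
matrix_add_const_ptr is hypothetical and is not part of the testcase, and this
is not claimed to be what gcc emits for either function.  It only matches the
unsigned version when i*N+j never wraps in unsigned int.

```c
#include <stddef.h>

/* Hypothetical hand-transformed variant of matrix_add_const_unsigned.
   Assumption: N*N fits in unsigned int, so the original index i*N+j
   never wraps and both versions touch the same elements. */
void matrix_add_const_ptr(unsigned int N, short *A, short val) {
        unsigned int i;
        for (i = 0; i < N; i++) {
                short *p = A + (size_t)i * N;   /* start of row i */
                short *end = p + N;             /* one past the end of row i */
                for (; p != end; p++) {
                        *p += val;              /* no per-iteration index math */
                }
        }
}
```

Calling matrix_add_const_ptr(3, a, 5) on a 9-element array adds 5 to every
element, the same as the two testcase functions do for N = 3.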

The problem I saw was that with a signed iterator, ivopts decides that we can
ignore overflow and that the induction variable is safe to eliminate.  With an
unsigned iterator, it decides that unsigned overflow can't be ignored.  Then it
looks at loop bounds.  If the loop bound is unknown, e.g. it is a function
parameter in this case, then it decides that this induction variable isn't safe
to eliminate and we get poor optimization.
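
To illustrate why unsigned overflow matters here, the following minimal sketch
compares the index as the source computes it (32-bit unsigned arithmetic, which
is defined to wrap modulo 2^32) with the same index computed in 64 bits, which
is what a 64-bit pointer stepped once per iteration would effectively follow.
The helper names index_wrapping and index_widened are made up for this
illustration and are not gcc internals.  The two only agree when the bound is
small enough, which the compiler cannot see when N is a function parameter.

```c
#include <stdint.h>

/* Index as written in the testcase: 32-bit unsigned arithmetic,
   so the result wraps modulo 2^32. */
uint32_t index_wrapping(uint32_t i, uint32_t n, uint32_t j) {
        return i * n + j;
}

/* The same index computed in 64 bits: no 32-bit wrap. */
uint64_t index_widened(uint32_t i, uint32_t n, uint32_t j) {
        return (uint64_t)i * n + j;
}
```

For i = 69999, N = 70000, j = 0 the 64-bit index is 4899930000, while the
32-bit index wraps to 604962704, so the two versions would address different
elements.  With signed int the same overflow is undefined behavior, so the
compiler may assume it never happens.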

Brian's testcase appears to be another instance of this.  With the original code
ivopts turns a[i] into an unsigned iterator and then sees that the loop bound
is a global variable and apparently decides it can't eliminate it.  With the
modified code using int16_t *p, gcc decides that it can eliminate it, and we
get better code.  This issue shows up with 32-bit targets but appears related
to the above.

I don't know if the ivopts issue can be fixed.
