https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81189

            Bug ID: 81189
           Summary: Out of bounds memory access introduced by the
                    vectoriser
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

The testcase gcc.dg/vect/O3-pr70130.c performs an out of bounds access when
vectorised on aarch64 (I didn't check other targets).
Compile with -O3.
The problematic function is Loop_err:
void
Loop_err (struct foo *img, const int s[16][2], int s0)
{
  int i, j;

  for (j = 0; j < 16; j++)
    {
      for (i=0; i < 16; i++)
        {
          img->a[0][j][i] = s[i][0];
          img->a[1][j][i] = s[j][1];
          img->a[2][j][i] = s0;
        }
    }
}

The part of the assembly code that performs the loads from s[j][1] is the
problematic one:

...
        add     x4, x1, 4 // Add a +4 offset to 's' to access s[j][1]
...
.L7:
        ldr     q0, [x4], 8  // <---- V4SI load from s[j][1] onwards
        add     x2, x2, 32
        str     q4, [x2, -32]
        cmp     x5, x2
        dup     v0.4s, v0.s[0]
        str     q3, [x2, -16]
        str     q1, [x2, 992]
        xtn     v2.4h, v0.4s
        xtn2    v2.8h, v0.4s
        str     q1, [x2, 1008]
        str     q2, [x2, 480]
        str     q2, [x2, 496]
        bne     .L7

The array passed as 's' is defined as:
const int s[16][2] = { { 1, 16 }, { 2, 15 }, { 3, 14 }, { 4, 13 },
                       { 5, 12 }, { 6, 11 }, { 7, 10 }, { 8, 9 },
                       { 9, 8 }, { 10, 7 }, { 11, 6 }, { 12, 5 },
                       { 13, 4 }, { 14, 3 }, { 15, 2 }, { 16, 1 } };

So the V4SI load marked above loads 4 ints at a time starting from the second
element in each entry of 's'.
If I step through the execution in gdb I see the loop reach iteration 15,
at which point it loads { s[14][1], s[15][0], s[15][1], s[16][0] }, where
s[16][0] is out of bounds.
GDB shows the contents of Q0 after the load as (formatted for readability):
s = {0x00020001 0x00000001 0x00000010 0x00000002}
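
To make the indexing explicit, here is a small scalar model of which elements
of 's' each V4SI load touches. It is only a sketch: it assumes the load starts
at s[j][1] and covers 4 consecutive ints, as the assembly above suggests, and
the names first_lane, lanes_past_end and first_oob_iteration are made up for
illustration (they are not part of the testcase).

```c
#include <assert.h>

/* Shape of 's' and the assumed vector width (4 x int for V4SI).  */
enum { ROWS = 16, COLS = 2, LANES = 4, ELEMS = ROWS * COLS };

/* Flat index into 's' of lane 0 of the load in outer iteration j,
   i.e. the index of s[j][1].  */
static int
first_lane (int j)
{
  assert (j >= 0 && j < ROWS);
  return j * COLS + 1;
}

/* Number of lanes that land past the end of 's' in iteration j.  */
static int
lanes_past_end (int j)
{
  int last = first_lane (j) + LANES - 1;
  return last >= ELEMS ? last - (ELEMS - 1) : 0;
}

/* First zero-based iteration whose load leaves the array, or -1.  */
static int
first_oob_iteration (void)
{
  for (int j = 0; j < ROWS; j++)
    if (lanes_past_end (j) > 0)
      return j;
  return -1;
}
```

Under this model first_oob_iteration () is 14 counting from zero, i.e. the 15th
iteration, with exactly one lane (s[16][0]) past the end. If the same load is
also executed for j = 15, the model puts three lanes past the end there.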

As you can see, the 4th element (0x00020001) is bogus (presumably from an
adjacent constant pool entry), but because the code after the load doesn't use
it (it only cares about element 0), it doesn't cause any problems in this
instance.
It is, however, an out-of-bounds access, so we should fix it.

Sorry, I can't come up with an aborting testcase. I guess that since the OOB
memory access is only 4 bytes off in the constant pool, the system doesn't trap
it.
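
One possible direction for an aborting testcase, offered only as an untested
idea, is to place a copy of the array so that it ends exactly at a page
boundary, with an inaccessible guard page right after it. The helper below is
hypothetical (copy_before_guard_page is not an existing function) and assumes a
POSIX system with mmap and mprotect. Whether the vectorised loop would then
actually fault has not been checked.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy SIZE bytes from SRC to the very end of a
   readable page that is immediately followed by a PROT_NONE page.
   Returns the address of the copy, or NULL on failure.  Any read past
   the end of the copy should then raise SIGSEGV.  */
static void *
copy_before_guard_page (const void *src, size_t size)
{
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0 || size == 0 || size > (size_t) page)
    return NULL;

  /* Two adjacent pages: data page followed by the guard page.  */
  char *base = mmap (NULL, 2 * (size_t) page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  /* Make the second page inaccessible.  */
  if (mprotect (base + page, (size_t) page, PROT_NONE) != 0)
    {
      munmap (base, 2 * (size_t) page);
      return NULL;
    }

  /* Place the copy so that its last byte is the last byte of page 1.  */
  char *dst = base + page - size;
  assert (((uintptr_t) dst + size) % (uintptr_t) page == 0);
  memcpy (dst, src, size);
  return dst;
}
```

A test could then pass the returned pointer as 's' to Loop_err instead of the
constant-pool array, so that an access to s[16][0] would touch the guard page.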
