On Wed, 25 Mar 2020, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled, because there is a buffer overflow > in push_partial_def in the little-endian case when working 64-byte vectors. > The code computes the number of bytes we need in the BUFFER: NEEDED_LEN, > which is rounded up number of bits we need. Then the code > native_encode_expr each (partially overlapping) pd into THIS_BUFFER. > If pd.offset < 0, i.e. the pd.rhs store starts at some bits before the > window we are interested in, we pass -pd.offset to native_encode_expr and > shrink the size already earlier: > HOST_WIDE_INT size = pd.size; > if (pd.offset < 0) > size -= ROUND_DOWN (-pd.offset, BITS_PER_UNIT); > On this testcase, the problem is with a store with pd.offset > 0, > in particular pd.offset 256, pd.size 512, i.e. a 64-byte store which doesn't > fit into entirely into BUFFER. > We have just: > size = MIN (size, (HOST_WIDE_INT) needed_len * BITS_PER_UNIT); > in this case for little-endian, which isn't sufficient, because needed_len > is 64, the entire BUFFER (except of the last extra byte used for shifting). > native_encode_expr fills the whole THIS_BUFFER (again, except the last extra > byte), and the code then performs memcpy (BUFFER + 32, THIS_BUFFER, 64); > which overflows BUFFER and as THIS_BUFFER is usually laid out after it, > overflows it into THIS_BUFFER. > The following patch fixes it by for pd.offset > 0 making sure size is > reduced too. For big-endian the code does things differently and already > handles this right. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK. Thanks, Richard. > 2020-03-25 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/94300 > * tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): If pd.offset > is positive, make sure that off + size isn't larger than needed_len. > > * gcc.target/i386/avx512f-pr94300.c: New test. > > --- gcc/tree-ssa-sccvn.c.jj 2020-03-12 14:28:13.365363769 +0100 > +++ gcc/tree-ssa-sccvn.c 2020-03-24 15:48:22.465737253 +0100 > @@ -2058,6 +2058,8 @@ vn_walk_cb_data::push_partial_def (pd_da > shift_bytes_in_array_left (this_buffer, len + 1, amnt); > unsigned int off = pd.offset / BITS_PER_UNIT; > gcc_assert (off < needed_len); > + size = MIN (size, > + (HOST_WIDE_INT) (needed_len - off) * BITS_PER_UNIT); > p = buffer + off; > if (amnt + size < BITS_PER_UNIT) > { > --- gcc/testsuite/gcc.target/i386/avx512f-pr94300.c.jj 2020-03-24 > 15:38:32.391643991 +0100 > +++ gcc/testsuite/gcc.target/i386/avx512f-pr94300.c 2020-03-24 > 15:39:09.874078217 +0100 > @@ -0,0 +1,21 @@ > +/* PR tree-optimization/94300 */ > +/* { dg-do run { target { avx512f } } } */ > +/* { dg-options "-O1 -mavx512f -mprefer-vector-width=512 > -mtune=skylake-avx512" } */ > + > +#include "avx512f-check.h" > + > +typedef double V __attribute__((vector_size (64))); > + > +static void > +avx512f_test (void) > +{ > + double mem[16]; > + const V a = { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 }; > + const V b = { 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0 }; > + V c; > + __builtin_memcpy (mem, &a, 64); > + __builtin_memcpy (mem + 8, &b, 64); > + __builtin_memcpy (&c, mem + 4, 64); > + if (c[5] != 9.0) > + __builtin_abort (); > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)